Sunday, June 19, 2016

How is cv_values_ computed in sklearn.linear_model.RidgeCV?


A reproducible example to frame the discussion:

    from sklearn.linear_model import RidgeCV
    from sklearn.datasets import load_boston
    from sklearn.preprocessing import scale

    boston = scale(load_boston().data)
    target = load_boston().target

    import numpy as np
    alphas = np.linspace(1.0, 200.0, 5)
    fit0 = RidgeCV(alphas=alphas, store_cv_values=True, gcv_mode='eigen').fit(boston, target)
    fit0.alpha_
    fit0.cv_values_[:,0]

The question: what formula is used to compute fit0.cv_values_?

Edit:

@Abhinav Arora's answer below seems to suggest that fit0.cv_values_[:,0][0], the first entry of fit0.cv_values_[:,0], would be

    (fit1.predict(boston[0,].reshape(1, -1)) - target[0])**2

where fit1 is a ridge regression with alpha = 1.0, fitted to the dataset from which observation 0 was removed.

Let's see:

1) create a new dataset with the first row of the original dataset removed:

    from sklearn.linear_model import Ridge
    boston1 = np.delete(boston, (0), axis=0)
    target1 = np.delete(target, (0), axis=0)

2) fit a ridge model with alpha = 1.0 on this truncated dataset:

    fit1 = Ridge(alpha=1.0).fit(boston1, target1)

3) check the squared prediction error of that model on the first data point:

    (fit1.predict(boston[0,].reshape(1, -1)) - target[0])**2

It is array([ 37.64650853]), which does not match the corresponding entry of fit0.cv_values_[:,0]:

    fit0.cv_values_[:,0][0]

which is 37.495629960571137

What gives?

2 Answers

Answer 1

Quoting from the Sklearn documentation:

Cross-validation values for each alpha (if store_cv_values=True and cv=None). After fit() has been called, this attribute will contain the mean squared errors (by default) or the values of the {loss,score}_func function (if provided in the constructor).

Since you have not provided a scoring function in the constructor, and have not provided anything for the cv argument either, this attribute stores the mean squared error for each sample under leave-one-out cross-validation. The general formula for the mean squared error is

    \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (\hat{Y}_i - Y_i)^2

where \hat{Y}_i is the prediction of your regressor and Y_i is the true value.

In your case, you are doing leave-one-out cross-validation, so every fold contains exactly one test point and thus n = 1. Doing fit0.cv_values_[:,0] therefore simply gives you the squared error for every point in your training data set, computed when that point was in the test fold and the value of alpha was 1.0.
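For concreteness, here is a minimal sketch of that computation, generalizing the question's steps 1)-3) to every observation (the variable names are mine, and this is illustrative only; RidgeCV does not literally refit n models, as the second answer shows):

    from sklearn.linear_model import Ridge
    import numpy as np

    # Naive leave-one-out: for each row i, refit Ridge(alpha=1.0) with row i
    # removed and record the squared error on the held-out row.
    naive_loo = np.zeros(len(target))
    for i in range(len(target)):
        X_i = np.delete(boston, i, axis=0)
        y_i = np.delete(target, i, axis=0)
        model = Ridge(alpha=1.0).fit(X_i, y_i)
        pred = model.predict(boston[i].reshape(1, -1))
        naive_loo[i] = (pred[0] - target[i]) ** 2

As the question's experiment already shows for row 0, these values come out close to, but not identical to, fit0.cv_values_[:,0]; that discrepancy is exactly what the question is about.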

Hope that helps.

Answer 2

Let's look; it's open source, after all.

The first call to fit makes a call upwards to its parent, _BaseRidgeCV (line 997 in that implementation). We haven't provided a cross-validation generator, so we make another call upwards to _RidgeGCV.fit. There's plenty of math in the documentation of this function, but we're so close to the source that I'll let you go and read about it.

Here's the actual source:

    v, Q, QT_y = _pre_compute(X, y)
    n_y = 1 if len(y.shape) == 1 else y.shape[1]
    cv_values = np.zeros((n_samples * n_y, len(self.alphas)))
    C = []

    scorer = check_scoring(self, scoring=self.scoring, allow_none=True)
    error = scorer is None

    for i, alpha in enumerate(self.alphas):
        weighted_alpha = (sample_weight * alpha
                          if sample_weight is not None
                          else alpha)
        if error:
            out, c = _errors(weighted_alpha, y, v, Q, QT_y)
        else:
            out, c = _values(weighted_alpha, y, v, Q, QT_y)
        cv_values[:, i] = out.ravel()
        C.append(c)

Note the unexciting _pre_compute function:

    def _pre_compute(self, X, y):
        # even if X is very sparse, K is usually very dense
        K = safe_sparse_dot(X, X.T, dense_output=True)
        v, Q = linalg.eigh(K)
        QT_y = np.dot(Q.T, y)
        return v, Q, QT_y
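Why pre-compute the eigendecomposition? Because once the Gram matrix K = X Xᵀ is factored as Q diag(v) Qᵀ, the matrix (K + alpha*I)^{-1} needed for every candidate alpha is just Q diag(1/(v + alpha)) Qᵀ, with no fresh inversion per alpha. A quick standalone check of that identity (toy data and my own variable names, not sklearn code):

    import numpy as np
    from scipy import linalg

    rng = np.random.RandomState(0)
    X = rng.randn(5, 3)
    K = X.dot(X.T)            # Gram matrix, symmetric PSD
    v, Q = linalg.eigh(K)     # one decomposition...

    alpha = 1.0               # ...reused for any alpha > 0
    direct = linalg.inv(K + alpha * np.eye(5))
    via_eig = Q.dot(np.diag(1.0 / (v + alpha))).dot(Q.T)
    print(np.allclose(direct, via_eig))   # True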

Abhinav has explained what's going on at a mathematical level: it's simply accumulating the weighted mean squared error. The details of their implementation, and where it differs from your implementation, can be evaluated step by step from the code.
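To make that step-by-step comparison concrete, here is a sketch (mine, not sklearn's code; names like loo_sq_errors are made up) applying the classic ridge leave-one-out shortcut to the question's data. For ridge, the LOO residual for row i is c_i / G_ii, where G = (K + alpha*I)^{-1} and c = G y, which appears to be exactly what _errors computes from v, Q and QT_y. One reading of the source that would explain the gap in the question: _RidgeGCV centers X and y once up front to handle the intercept, so the intercept is never refit per left-out row, unlike in the question's manual experiment.

    import numpy as np
    from scipy import linalg

    # Re-derive cv_values_[:, 0] (alpha = 1.0) via the LOO shortcut,
    # assuming the centering + _errors logic read out of the source above.
    alpha = 1.0
    Xc = boston - boston.mean(axis=0)   # boston is already scaled, so a no-op
    yc = target - target.mean()         # centering y: the intercept is fixed
                                        # once, never refit per left-out row

    K = Xc.dot(Xc.T)
    v, Q = linalg.eigh(K)
    QT_y = Q.T.dot(yc)

    w = 1.0 / (v + alpha)
    c = Q.dot(w * QT_y)                 # c = (K + alpha*I)^{-1} y
    G_diag = (Q ** 2).dot(w)            # diag of (K + alpha*I)^{-1}

    loo_sq_errors = (c / G_diag) ** 2
    print(loo_sq_errors[0])             # expect something near 37.4956...

If this prints something matching fit0.cv_values_[:,0][0] (37.4956...) rather than the naive refit's 37.6465..., then the intercept handling is the answer to "What gives?".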
