Data Science: Asked on December 27, 2020
Below is the linear regression model I fitted. I'm not sure whether I am doing this the right way, since I am getting nearly 99% accuracy.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

ln_regressor = LinearRegression()

# Cross-validated MSE on the training data (scores are negated MSE)
mse = cross_val_score(ln_regressor, X_train, Y_train, scoring='neg_mean_squared_error', cv=5)
mean_mse = np.mean(mse)
print(mean_mse)

ln_regressor.fit(X_train, Y_train)
**MSE score = -6.612466691367042e-06**
# Predict on the test set, then cross-validate using those predictions as the target
y_pred = ln_regressor.predict(X_test)
mse2 = cross_val_score(ln_regressor, X_test, y_pred, scoring='neg_mean_squared_error', cv=5)
mean_mse2 = np.mean(mse2)
print(mean_mse2)
**MSE score = -4.645751512870382e-31**
Please note: my data is on a log scale and was standard-scaled afterwards.
R2 = cross_val_score(ln_regressor, X_test, y_pred, cv=10)
R2.mean()
R2 mean is 0.9999030728571852
So, first things first: accuracy is a classification concept; you can't say you have 99% accuracy for a regression problem.
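For a regression model the usual summary numbers are the MSE and the coefficient of determination $R^2$, computed against the true held-out targets. A minimal sketch along those lines (this assumes a Y_test array of true test targets, which your post does not show):

from sklearn.metrics import mean_squared_error, r2_score

# Y_test is assumed to hold the true held-out targets (not shown in the question)
y_pred = ln_regressor.predict(X_test)
print("Test MSE:", mean_squared_error(Y_test, y_pred))
print("Test R^2:", r2_score(Y_test, y_pred))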
Your code is mostly fine, but note that in your second block you cross-validate against y_pred, the model's own predictions, rather than against the true test targets; since those predictions are an exact linear function of X_test, a linear model recovers them almost perfectly, which is why that score is essentially zero. Cross-validation is also not strictly necessary here, since you are not doing any hyper-parameter tuning or model selection. The training MSE is indeed very low, so I would suggest going back and checking your normalization: if your target $y$ has a very small span, i.e. a low $\sigma$ in the Gaussian case, a meaninglessly low MSE is guaranteed.
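To make the scale point concrete, here is a rough sketch, reusing only names already defined in your code (recall that mean_mse is a negated MSE from cross_val_score); it puts the cross-validated training MSE next to the variance of the training target, since a raw MSE is only meaningful relative to the target's own spread:

import numpy as np

# mean_mse comes from the code above and is a negated MSE, so flip its sign
cv_mse = -mean_mse
target_var = np.var(Y_train)   # sigma^2 of the training targets
print("Target standard deviation:", np.sqrt(target_var))
print("CV MSE as a fraction of target variance:", cv_mse / target_var)

This ratio is roughly $1 - R^2$, so it stays comparable across differently scaled targets, unlike the raw MSE.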
Answered by plpopk on December 27, 2020