Data Science Asked on February 8, 2021
I am checking whether anyone in this community can help with a problem I also posted on Cross Validated.
The detailed question is below:
                            OLS Regression Results
===============================================================================
Dep. Variable:     Losses in Thousands   R-squared:                       0.305
Model:                             OLS   Adj. R-squared:                  0.304
Method:                  Least Squares   F-statistic:                     1171.
Date:                 Fri, 20 Dec 2019   Prob (F-statistic):               0.00
Time:                         11:12:52   Log-Likelihood:                -72503.
No. Observations:                10703   AIC:                         1.450e+05
Df Residuals:                    10698   BIC:                         1.451e+05
Df Model:                            4
Covariance Type:             nonrobust
======================================================================================
                         coef    std err          t      P>|t|     [0.025     0.975]
--------------------------------------------------------------------------------------
const                539.6565      7.950     67.884      0.000    524.074    555.239
Age                   -6.1490      0.112    -54.971      0.000     -6.368     -5.930
Number of Vehicles    -1.7906      2.151     -0.832      0.405     -6.007      2.426
M                     97.2349      4.094     23.750      0.000     89.210    105.260
Single               136.7923      4.094     33.410      0.000    128.767    144.818
==============================================================================
Omnibus:                      7898.559   Durbin-Watson:                   2.010
Prob(Omnibus):                   0.000   Jarque-Bera (JB):           403312.043
Skew:                            3.029   Prob(JB):                         0.00
Kurtosis:                       32.456   Cond. No.                         187.
==============================================================================
Shown above are the results of an OLS model I ran in Python.
My understanding of the diagnostics is as follows:
Omnibus: the statistic should be close to zero if the errors are normally distributed.
Prob(Omnibus): the p-value of the Omnibus test; a value close to 1 is consistent with normally distributed errors.
Skew: same as Omnibus, it should be close to zero.
Condition Number: indicates multicollinearity, so it should be a relatively small number, something below 30. In these results it is 187, well above 30, but using the correlation function I couldn't see any strong correlation among the remaining predictors (I found one correlated pair, but I dropped one of those variables, so none is left now).
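One way to probe the condition number concern is to check whether it reflects collinearity or only the scale of the predictors. The sketch below uses synthetic data, not the dataset from this post: the column ranges loosely follow the describe() output further down, the predictors are independent by construction, and every variable name is made up for illustration. It also assumes that the reported Cond. No. behaves like the ratio of the largest to the smallest singular value of the design matrix, which is what np.linalg.cond computes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Synthetic, independent predictors (assumed ranges, not the real data).
age = rng.integers(16, 71, size=n).astype(float)
vehicles = rng.integers(1, 5, size=n).astype(float)
male = rng.integers(0, 2, size=n).astype(float)
single = rng.integers(0, 2, size=n).astype(float)

predictors = np.column_stack([age, vehicles, male, single])
const = np.ones((n, 1))

# Design matrix with raw (unscaled) predictors plus an intercept column.
X_raw = np.hstack([const, predictors])

# Same predictors, standardized to zero mean and unit variance.
standardized = (predictors - predictors.mean(axis=0)) / predictors.std(axis=0)
X_scaled = np.hstack([const, standardized])

# 2-norm condition number: largest singular value / smallest singular value.
cond_raw = np.linalg.cond(X_raw)
cond_scaled = np.linalg.cond(X_scaled)

print(f"condition number, raw predictors:          {cond_raw:.1f}")
print(f"condition number, standardized predictors: {cond_scaled:.1f}")
```

If the raw figure is large while the standardized one is small, even though nothing here is correlated, then a value above 30 on unscaled data may say more about units (Age in years next to 0/1 dummies) than about multicollinearity.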
Results after a logarithmic transformation of the y variable:
Dep. Variable:     Losses in Thousands   R-squared:                       0.326
Model:                             OLS   Adj. R-squared:                  0.326
Method:                  Least Squares   F-statistic:                     1295.
Date:                 Fri, 20 Dec 2019   Prob (F-statistic):               0.00
Time:                         14:34:13   Log-Likelihood:                -9712.2
No. Observations:                10703   AIC:                         1.943e+04
Df Residuals:                    10698   BIC:                         1.947e+04
Df Model:                            4
Covariance Type:             nonrobust
======================================================================================
                         coef    std err          t      P>|t|     [0.025     0.975]
--------------------------------------------------------------------------------------
const                  6.3490      0.023    281.983      0.000      6.305      6.393
Age                   -0.0203      0.000    -64.137      0.000     -0.021     -0.020
Number of Vehicles     0.0007      0.006      0.118      0.906     -0.011      0.013
M                      0.2137      0.012     18.429      0.000      0.191      0.236
Single                 0.3159      0.012     27.240      0.000      0.293      0.339
==============================================================================
Omnibus:                      1231.182   Durbin-Watson:                   1.998
Prob(Omnibus):                   0.000   Jarque-Bera (JB):             1943.765
Skew:                           -0.825   Prob(JB):                         0.00
Kurtosis:                        4.279   Cond. No.                         187.
==============================================================================
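The drop in residual skew between the two summaries (3.029 before, -0.825 after) is the kind of change a log transform can produce on a right-skewed response. Here is a minimal sketch of that effect on synthetic data. The coefficients, noise level, and variable names are illustrative assumptions, not values estimated from the data in this post, and the model is fitted with plain least squares rather than the library used for the summaries above.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Synthetic data: a loss that is log-linear in age with multiplicative noise.
# All numbers here are assumptions chosen for illustration only.
age = rng.uniform(16, 70, size=n)
log_loss = 6.3 - 0.02 * age + rng.normal(0.0, 0.6, size=n)
loss = np.exp(log_loss)

X = np.column_stack([np.ones(n), age])  # intercept + one predictor


def residual_skew(X, y):
    """Fit OLS by least squares and return the sample skewness of the residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    centered = resid - resid.mean()
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5


skew_raw = residual_skew(X, loss)
skew_log = residual_skew(X, np.log(loss))

print(f"residual skew, y modelled directly: {skew_raw:.2f}")
print(f"residual skew, log(y) modelled:     {skew_log:.2f}")
```

In this toy setup the log model is the true one, so its residual skew should land near zero. On real data the improvement can be partial, as in the second summary above.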
Correlation Matrix:
(Yr Exp = Years of Experience, No Veh = Number of Vehicles, Loss = Losses in Thousands, Loss_l = Losses in Thousands_log)
              Ac_No        Age     Yr Exp     No Veh       Loss     Loss_l
Ac_No      1.000000   0.008291   0.008437  -0.003056  -0.000794  -0.001057
Age        0.008291   1.000000   0.997161   0.008366  -0.442962  -0.509823
Yr Exp     0.008437   0.997161   1.000000   0.008545  -0.442115  -0.511495
No Veh    -0.003056   0.008366   0.008545   1.000000  -0.011553  -0.004839
Loss      -0.000794  -0.442962  -0.442115  -0.011553   1.000000   0.849515
Loss_l    -0.001057  -0.509823  -0.511495  -0.004839   0.849515   1.000000
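The one strongly correlated pair in this matrix is Age and Years of Experience (0.997161). As a small worked calculation, the correlation can be converted into a variance inflation factor (VIF). This assumes a regression containing only those two predictors, in which case the R-squared of one on the other is the squared correlation. With more predictors the VIF would have to come from a full auxiliary regression.

```python
# Correlation between Age and Years of Experience, as reported in the matrix.
r = 0.997161

# With only these two predictors, R^2 of one regressed on the other is r^2,
# so the variance inflation factor is 1 / (1 - r^2).
vif = 1.0 / (1.0 - r**2)

print(f"VIF for Age vs Years of Experience: {vif:.1f}")
```

Common rules of thumb flag VIF values above 5 or 10, so keeping both variables would inflate the coefficient variances considerably, and dropping one of them is a reasonable response.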
Describe():
                Age  Number of Vehicles             M        Single
count  10703.000000        10703.000000  10703.000000  10703.000000
mean      42.519761            2.497804      0.492292      0.490984
std       18.298802            0.951530      0.499964      0.499942
min       16.000000            1.000000      0.000000      0.000000
25%       24.000000            2.000000      0.000000      0.000000
50%       42.000000            2.000000      0.000000      0.000000
75%       61.000000            3.000000      1.000000      1.000000
max       70.000000            4.000000      1.000000      1.000000
R-squared is also very poor in this case (0.33), although the log transformation improved it slightly (from 0.31 to 0.33).
What else can I do to get a good model and to bring "Omnibus" and the other diagnostics within acceptable limits?