Data Science Asked by rocksNwaves on December 30, 2020
Question 1:
In the example of logistic regression, I often see the regularization constant and penalty methods being tuned by a grid search. However, it seems like there are a lot more options for tuning:
classifier_os.get_params()
gives:
{'C': 1.0,
'class_weight': None,
'dual': False,
'fit_intercept': True,
...
and many more!
So my question is: Are these other parameters typically not worth tuning, or are they left out in examples for another reason?
For example, I changed to solver='liblinear' and got sub-par results compared to lbfgs. Is that a coincidence, or are the other parameters already at their best by default?
Question 2:
Does performing grid search on hyper-parameters guarantee improved performance when tested on the same data set?
I ask because my intuition was “yes”, however I got slightly lower scores after tuning my regularization constant:
import numpy as np
from sklearn import metrics, model_selection

# Search over seven values of C, from 1e-3 to 1e3
p_grid = {
    'C': np.logspace(-3, 3, 7)
}
clf = model_selection.GridSearchCV(classifier_os, p_grid, scoring='accuracy')
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)
metrics.classification_report(y_pred, y_test, output_dict=True)
Gives me the following scores (C=10):
accuracy : 0.8218181818181818
macro avg:
precision : 0.8210875331564986
recall : 0.8213603058298822
f1-score : 0.8212129655428624
support : 275
As compared to before tuning (C=1):
accuracy : 0.8290909090909091
macro avg:
precision : 0.8287798408488063
recall : 0.8285358354537744
f1-score : 0.8286468069310212
The only thing that the tuning changed was to make the regularization constant C=10 instead of the default C=1.
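The cost of a grid search grows multiplicatively with the number of values tried for each hyperparameter, and therefore exponentially with the number of hyperparameters. It is best practice to keep the search space small and tune only the hyperparameters most likely to matter.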
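To make the growth concrete, here is a minimal sketch that counts grid points with scikit-learn's ParameterGrid. The larger grid and the choice of 5 cross-validation folds are hypothetical values picked for illustration, not settings taken from the question.

```python
import numpy as np
from sklearn.model_selection import ParameterGrid

# The grid from the question: a single hyperparameter with 7 values.
small_grid = {"C": np.logspace(-3, 3, 7)}

# A hypothetical larger grid (illustrative choices only).
larger_grid = {
    "C": np.logspace(-3, 3, 7),
    "penalty": ["l1", "l2"],
    "class_weight": [None, "balanced"],
    "fit_intercept": [True, False],
}

n_small = len(ParameterGrid(small_grid))   # 7 candidates
n_large = len(ParameterGrid(larger_grid))  # 7 * 2 * 2 * 2 candidates

cv_folds = 5  # assumed number of folds
total_fits = n_large * cv_folds  # one model fit per candidate per fold

print(n_small, n_large, total_fits)
```

Each added hyperparameter multiplies the number of candidates, and every candidate is fitted once per fold, so the number of model fits climbs quickly.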
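Improved performance on the training data is no guarantee of improved performance on the test set. GridSearchCV picks C by its cross-validated score on the training data only, so the chosen value can score slightly lower on held-out data. In the question, the two accuracies differ by just two of the 275 test samples (226 versus 228 correct).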
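One way to see this is to compare the cross-validated score that the search optimizes with the held-out test scores of the default and tuned models. The sketch below uses synthetic data from make_classification as a stand-in for the asker's data set, so the data, the split, and max_iter=1000 are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (assumption: the real data set is not available).
X, y = make_classification(n_samples=1100, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Baseline: default regularization strength (C=1).
baseline = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Tuned: C chosen by 5-fold cross-validation on the training data only.
p_grid = {"C": np.logspace(-3, 3, 7)}
search = GridSearchCV(
    LogisticRegression(max_iter=1000), p_grid, scoring="accuracy", cv=5
)
search.fit(X_train, y_train)

cv_score = search.best_score_               # what the search maximized
tuned_test = search.score(X_test, y_test)   # accuracy on held-out data
baseline_test = baseline.score(X_test, y_test)

print("best C:", search.best_params_["C"])
print("CV accuracy of best C:", cv_score)
print("test accuracy, tuned:", tuned_test)
print("test accuracy, default:", baseline_test)
```

The search only guarantees that the selected C has the highest mean cross-validated accuracy among the candidates. The two test accuracies can land in either order, particularly when the candidates score close to one another.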
Answered by Brian Spiering on December 30, 2020