Data Science Asked by Raphael k on February 17, 2021
I am currently using RandomizedSearchCV to optimize my hyper-parameters. However, the reported score at each iteration is very low. When I then evaluate the highest-scoring candidate I get very high accuracy (0.97), while RandomizedSearchCV reports something much lower (0.32).
search = clf_rand_search.fit(x_traintest, y_traintest)
print(search.score(x_validation, y_validation))
# 0.32
print(accuracy_score(y_validation.flatten(), search.predict(x_validation).flatten()))
# 0.9798260869565217
My input and output are both 2-D matrices, of shape (100, 9) and (100, 230) respectively for the train/test data, with fewer samples in the validation data.
Should I format my data differently for the RandomizedSearchCV?
Of the input features, the first two are normalised and the last is one-hot encoded. The output is a 0/1 classification for each of 230 nodes.
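For concreteness, a minimal sketch of dummy arrays with the shapes described above (the values themselves are made up):

import numpy as np

# hypothetical placeholder data matching the shapes described above
x_traintest = np.random.rand(100, 9)               # (100, 9) input features
y_traintest = np.random.randint(0, 2, (100, 230))  # (100, 230) binary targets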
clf = MLPClassifier(solver='adam',
                    max_iter=9999,
                    alpha=1e-5)

hidden_layers = 8
neurons = list(range(10, 210, 5))

# build one candidate per (depth, width) pair, e.g. [15, 15, 15] for depth 3, width 15
m = [0] * (hidden_layers * len(neurons))
for i in range(1, hidden_layers + 1):
    for idx, n in enumerate(neurons):
        m[(i - 1) * len(neurons) + idx] = [n] * i

param_space = {
    'hidden_layer_sizes': m,
    'activation': ['identity', 'logistic', 'tanh', 'relu'],
    'learning_rate': ['constant', 'invscaling', 'adaptive'],
    'learning_rate_init': np.arange(1e-4, 0.1 + 1e-4, 1e-4)
}

clf_rand_search = RandomizedSearchCV(clf, param_space, n_iter=10,
                                     scoring="accuracy", verbose=True, cv=2,
                                     n_jobs=-1)
UPDATE AFTER EXCHANGING COMMENTS
You might be facing issues with the computation of the accuracy.
I think the MLP with the log-loss can work well with your output data. Your output data is an $(N, 230)$ matrix, with $N$ the number of samples, containing only 1s and 0s. This is effectively a one-hot-style encoding in which each vector may contain multiple 1s (a multi-label target).
You are computing the accuracy by flattening the predictions and comparing them element-wise ($N \times 230$ elements).
For its own score, I suspect the classifier is not comparing the one-hot vectors (ground truths and predictions) element-wise, but checking, for each of the $N$ samples, whether they are identical. Note that if only one element of the 230-element one-hot vector is misclassified in one sample, this accuracy drops by $1/N \times 100$ %. If accuracy is computed element-wise, the drop is only $1/(230 N) \times 100$ %.
Try updating your accuracy computation to assign a 1 only when the predicted $(230, 1)$ vector of a sample is equal to its $(230, 1)$ ground-truth vector.
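A minimal sketch of the difference, on toy arrays (not your data): flattening measures element-wise accuracy, while accuracy_score on the 2-D multi-label arrays, which is what search.score effectively uses here, counts a sample as correct only when its whole label vector matches.

import numpy as np
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(4, 230))   # 4 samples, 230 binary labels each
y_pred = y_true.copy()
y_pred[0, 0] = 1 - y_pred[0, 0]              # flip a single label in sample 0

# element-wise accuracy after flattening: 919 of 920 elements match
print(accuracy_score(y_true.flatten(), y_pred.flatten()))  # ~0.999

# subset (exact-match) accuracy: only 3 of 4 whole vectors match
print(accuracy_score(y_true, y_pred))                      # 0.75
print(np.mean((y_true == y_pred).all(axis=1)))             # same: 0.75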
PREVIOUS ANSWER BEFORE COMMENTS
I guess you are experiencing problems with your data.
Your data does not seem to be a binary classification problem. For binary classification, with $N$ the number of samples in the split and $K$ the dimension of the input feature space, the input should be of shape $(N, K)$ and the output of shape $(N,)$ (a single label per sample).
If your output data is instead of size $(N, D)$, then it could be a regression problem that maps a feature space of dimension $K$ to one of dimension $D$. Your MLP would then try to mimic a function
$f'(X): \mathbb{R}^K \to \mathbb{R}^D$
For regression problems you must use other metrics, like MAE or MSE.
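If you did want to treat it as regression, here is a minimal sketch (MLPRegressor and the neg_mean_squared_error scorer are my substitutions, reusing the param_space defined above):

from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import RandomizedSearchCV

reg = MLPRegressor(solver='adam', max_iter=9999, alpha=1e-5)

# same search space as above, but scored with (negated) MSE instead of accuracy
reg_rand_search = RandomizedSearchCV(reg, param_space, n_iter=10,
                                     scoring="neg_mean_squared_error",
                                     cv=2, n_jobs=-1)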
You can treat the problem from various perspectives (just some ideas).
The following example works well. I know it does not answer the question (I'm just trying to help), but at least we can see from an example that RandomizedSearchCV gives the same score as accuracy_score when your code is applied to a genuine binary classification problem with an MLP:
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.metrics import accuracy_score
from sklearn.neural_network import MLPClassifier
import numpy as np
X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = MLPClassifier(solver='adam',
                    max_iter=9999,
                    alpha=1e-5)

hidden_layers = 2
neurons = list(range(10, 210, 5))

m = [0] * (hidden_layers * len(neurons))
for i in range(1, hidden_layers + 1):
    for idx, n in enumerate(neurons):
        m[(i - 1) * len(neurons) + idx] = [n] * i

param_space = {
    'hidden_layer_sizes': m,
    'activation': ['identity', 'logistic', 'tanh', 'relu'],
    'learning_rate': ['constant', 'invscaling', 'adaptive'],
    'learning_rate_init': np.arange(1e-4, 0.1 + 1e-4, 1e-4)
}

clf_rand_search = RandomizedSearchCV(clf, param_space, n_iter=2,
                                     scoring="accuracy", verbose=True, cv=2,
                                     n_jobs=-1)

search = clf_rand_search.fit(X_train, y_train)
print(search.score(X_test, y_test))
print(accuracy_score(y_test.flatten(), search.predict(X_test).flatten()))
Correct answer by ignatius on February 17, 2021