Data Science Asked by Bharathi on March 7, 2021
I am using gensim’s sklearn_api to create an estimator for a Word2Vec model so that I can pass it to sklearn’s GridSearchCV. My code is as follows:
from gensim.sklearn_api import W2VTransformer
from sklearn.model_selection import GridSearchCV
s_obj = W2VTransformer(size=100,min_count=1,window=5)
parameters = {'size':(100,150,200),'min_count':(1,2,4),'alpha':(0.025,0.015)}
s_model = GridSearchCV(s_obj,parameters,cv=2)
s_model.fit(sentences)
print(s_model.best_params_)
Running the above code, I get the following error:
"If no scoring is specified, the estimator passed should have a 'score' method. The estimator W2VTransformer(alpha=0.025, batch_words=10000, cbow_mean=1,
hashfxn=<built-in function hash>, hs=0, iter=5,
max_vocab_size=None, min_alpha=0.0001, min_count=1, negative=5,
null_word=0, sample=0.001, seed=1, sg=0, size=100,
sorted_vocab=1, trim_rule=None, window=5, workers=3) does not."
I do not know how to resolve this. I tried using scoring='accuracy' or scoring='hamming', but they don’t seem to work either.
Can someone please help me get rid of this error?
Do:
from sklearn.metrics import accuracy_score, make_scorer
s_model = GridSearchCV(s_obj,parameters,cv=2, scoring=make_scorer(accuracy_score))
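For an unsupervised estimator such as a Word2Vec transformer there are no labels to compare against, and GridSearchCV also accepts a plain callable for scoring, which it calls as scorer(estimator, X, y). The sketch below is one hypothetical choice and not part of gensim or scikit-learn: it scores a fitted model by the fraction of held-out tokens it can embed. It assumes that transform raises KeyError for an out-of-vocabulary word; the name coverage_scorer is made up for illustration.

```python
def coverage_scorer(estimator, X, y=None):
    """Hypothetical scorer: share of tokens in X that the fitted estimator can embed.

    Assumption: estimator.transform(list_of_words) raises KeyError for a word
    that is outside the learned vocabulary.
    """
    total = 0
    known = 0
    for sentence in X:
        for token in sentence:
            total += 1
            try:
                estimator.transform([token])
                known += 1
            except KeyError:
                # Token was not seen (or was pruned by min_count) during fit.
                pass
    # Avoid dividing by zero when the held-out fold is empty.
    return known / total if total else 0.0
```

It could then be passed as GridSearchCV(s_obj, parameters, cv=2, scoring=coverage_scorer). Coverage alone tends to favour a small min_count, so a measure tied to the downstream task would usually be a better criterion.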
Answered by Noah Weber on March 7, 2021
I think you don't need the full functionality of GridSearchCV, i.e. fit and K-fold cross-validation.
You can simply write a custom function that tries all the different options and reports which one gives the best score.
First, you will need to define your score, since it is what you are actually looking for, e.g. the ratio of the vector's dimension to the word count.
from gensim.sklearn_api import W2VTransformer
import itertools
def score_func(word, vector):
    # Define what you want to measure, e.g. the ratio of the vector's dimension to the word count.
    # A constant is returned here for demonstration.
    return 1.0
Then we can simply loop over all the parameter combinations to find the best one:
parm_dict = {'size':(100,150,200),'min_count':(1,2,4),'alpha':(0.025,0.015)}
def cust_param_search(parm_dict):
    score_best, parm_best = 0, ()
    # Individual parameter tuples, looked up by name so dict ordering does not matter
    size, min_count, alpha = parm_dict['size'], parm_dict['min_count'], parm_dict['alpha']
    parm_combo = list(itertools.product(size, min_count, alpha))  # Create all combinations
    for parms in parm_combo:
        s, m, a = parms
        s_obj = W2VTransformer(size=s, min_count=m, window=5, alpha=a)
        # Get whatever else is needed to call the score function
        word, vector = "Hello", ["H", "L", "O"]  # Dummy parameters
        score = score_func(word, vector)
        if score > score_best:
            score_best = score
            parm_best = parms
    print("Best score -", score_best, "Best parms - ", parm_best)

cust_param_search(parm_dict)
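The same loop can be written so that it does not hard-code the three parameter names. The following is a minimal sketch using only the standard library; grid_search and toy_score are illustrative names, and toy_score is a stand-in for whatever measure you define. In practice the score function would presumably build W2VTransformer(**params), fit it and evaluate it.

```python
import itertools


def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and return (best_score, best_params).

    score_fn is supplied by the caller: it takes a dict of parameters and
    returns a number where larger is better.
    """
    names = list(param_grid)
    best_score, best_params = float("-inf"), None
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        # Strict comparison keeps the first combination on ties.
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


# Toy score for demonstration only: prefers size near 150 and a small alpha.
def toy_score(params):
    return -abs(params["size"] - 150) - params["alpha"]


grid = {"size": (100, 150, 200), "min_count": (1, 2, 4), "alpha": (0.025, 0.015)}
best_score, best_params = grid_search(grid, toy_score)
print("Best score -", best_score, "Best parms -", best_params)
```

Starting from negative infinity rather than 0 means a score function that returns only negative values still yields a best combination.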
Answered by 10xAI on March 7, 2021