Stack Overflow Asked by Charles Biwer on December 30, 2021
I am working on a supervised binary prediction problem. The dataframe is mostly categorical, and I one-hot encoded those columns. I handled all missing values, NaN, and infinite values.
The dataframe (df) has 2 numerical features; the rest are categorical (one-hot encoded). As a reminder, the dependent variable is binary.
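import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Copy the dependent variable into its own single-column dataframe
dataset_target = df[['dependent_var']].values
dataset_target = pd.DataFrame(dataset_target)
dataset_target.columns = ['dependent_var']
regressor = RandomForestRegressor(n_estimators=500, random_state=0, n_jobs=-1)
# Train the random forest (ravel() passes the target as a 1-D array)
regressor.fit(df, dataset_target.values.ravel())
# Print the name and gini importance of each feature
for name, importance in zip(df.columns, regressor.feature_importances_):
    print(name, importance)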
The model is supposed to help me select the most important features. It runs, but the results are very unsatisfying (only 0s and a single 1). I don't understand this output, so I don't know what to change in my input.
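One pattern that produces exactly this kind of output (every importance 0 except a single 1) is the target column still being present in the feature matrix, so the forest splits on it alone. The question does not show how df was built, so this is only a possible cause. The sketch below reproduces it on synthetic data; the column names num_a, num_b, cat_x and cat_y are hypothetical stand-ins, and RandomForestClassifier is swapped in for the second model because the target is binary.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

# Synthetic stand-in for the question's df (all feature names are hypothetical).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "num_a": rng.normal(size=n),
    "num_b": rng.normal(size=n),
    "cat_x": rng.integers(0, 2, size=n),
    "cat_y": rng.integers(0, 2, size=n),
})
# Binary target driven by num_a and cat_x plus some noise.
noise = rng.normal(scale=0.5, size=n)
df["dependent_var"] = ((df["num_a"] + df["cat_x"] + noise) > 0.5).astype(int)
y = df["dependent_var"]

# Case 1: the target is left inside the features (assumed to mirror the question).
leaky = RandomForestRegressor(n_estimators=100, random_state=0)
leaky.fit(df, y)
leaky_importances = pd.Series(leaky.feature_importances_, index=df.columns)

# Case 2: the target is dropped from the features before fitting.
X = df.drop(columns=["dependent_var"])
clean = RandomForestClassifier(n_estimators=100, random_state=0)
clean.fit(X, y)
clean_importances = pd.Series(clean.feature_importances_, index=X.columns)

print(leaky_importances.sort_values(ascending=False))
print(clean_importances.sort_values(ascending=False))
```

In the first case a single split on dependent_var already separates the classes perfectly, so that column absorbs essentially all of the importance. In the second case the importances are spread across the real features. If df in the question does not contain the target, this explanation does not apply.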
This is my first prediction project at my internship as a DA, so I would be very glad of any help.
This is a snippet of the random forest's output: