TransWikia.com

Sklearn ValueError: X has 2 features per sample; expecting 11

Data Science Asked by tbone on May 4, 2021

I try to visualizing multiple logistic regression but I get the above error.

I’m practicing on red wine quality data set from kaggle.

Here is a full traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-88-230199fd3a97> in <module>
      4 X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
      5                      np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
----> 6 plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
      7              alpha = 0.75, cmap = ListedColormap(('red', 'green')))
      8 plt.xlim(X1.min(), X1.max())

/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/base.py in predict(self, X)
    287             Predicted class label per sample.
    288         """
--> 289         scores = self.decision_function(X)
    290         if len(scores.shape) == 1:
    291             indices = (scores > 0).astype(np.int)

/usr/local/lib/python3.6/dist-packages/sklearn/linear_model/base.py in decision_function(self, X)
    268         if X.shape[1] != n_features:
    269             raise ValueError("X has %d features per sample; expecting %d"
--> 270                              % (X.shape[1], n_features))
    271 
    272         scores = safe_sparse_dot(X, self.coef_.T,

ValueError: X has 2 features per sample; expecting 11

Here is the code:

#Split the variables
X = dataset.iloc[:, :11].values
y = dataset.iloc[:, -1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Fitting Logistic Regression to the Training set
from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix, accuracy_score
cm = confusion_matrix(y_test, y_pred)
acc_score = accuracy_score(y_test, y_pred)
print(acc_score*100)

Below is the visualization code:

# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Logistic Regression (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

I know that the error is that the model has been trained using 11 functions, but it is envisaged to use 2 functions, but I do not know exactly what to change.

3 Answers

X = dataset.iloc[:, :11].values

It is asking for 2 features per sample. But you are passing the 11. Do it like this

X = dataset.iloc[:, :2].values

Correct answer by Ajay Thakur on May 4, 2021

You have to make combination of the code you wrote. You can only plot 3 axis at a time. x,y is what you give and z is the contour plane. Think of it visually. SO figure out what you want each axis to be and make one for x1,y,z next will be x2,y,z then x3,y,z and so on and so forth. I believe you would want index, variant, prediction as the 3 axis.

Answered by mystichrome_blue on May 4, 2021

Just do one thing

From sklearn import *

The problem will be fixed

Answered by Nusaiba Noushad on May 4, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP