TransWikia.com

Error while plotting Logistic Regression Classification

Data Science Asked on August 15, 2021

I was trying to plot the decision regions of a logistic regression classifier using the following code:

# Visualising the Training set results
from matplotlib.colors import ListedColormap
X_set, y_set = X_train, Y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                c = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Logistic Regression (Training set)')
plt.xlabel('Age')
plt.ylabel('Estimated Salary')
plt.legend()
plt.show()

Running it raises the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-af47df65cc77> in <module>
      2 from matplotlib.colors import ListedColormap
      3 X_set, y_set = X_train, Y_train
----> 4 X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
      5                      np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
      6 plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   2978             if self.columns.nlevels > 1:
   2979                 return self._getitem_multilevel(key)
-> 2980             indexer = self.columns.get_loc(key)
   2981             if is_integer(indexer):
   2982                 indexer = [indexer]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2895                 )
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:
   2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(slice(None, None, None), 0)' is an invalid key

Datasets:

X_train

     Pclass  Sex        Age     Fare  Embarked
0         3    1  22.000000   7.2500         2
1         1    0  38.000000  71.2833         0
2         3    0  26.000000   7.9250         2
3         1    0  35.000000  53.1000         2
4         3    1  35.000000   8.0500         2
..      ...  ...        ...      ...       ...
886       2    1  27.000000  13.0000         2
887       1    0  19.000000  30.0000         2
888       3    0  29.699118  23.4500         2
889       1    1  26.000000  30.0000         0
890       3    1  32.000000   7.7500         1

[891 rows x 5 columns]

Y_train

     Survived
0           0
1           1
2           1
3           1
4           0
..        ...
886         0
887         1
888         0
889         1
890         0

[891 rows x 1 columns]

One Answer

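The error occurs because X_train and Y_train are pandas DataFrames, but the code indexes them with NumPy-style syntax such as X_set[:, 0]. On a DataFrame, the [] operator looks up column labels, so the tuple (slice(None, None, None), 0) is rejected as an invalid key. There are two solutions: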

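  1. Convert X_train and Y_train to NumPy arrays before using them:

X_set, y_set = X_train.values, Y_train.values.ravel()

     The ravel() call flattens the single-column Y_train from shape (891, 1) to (891,), so that y_set == j is a one-dimensional mask that can select rows of X_set.

  2. Keep the DataFrames and index them by position with iloc: replace each X_set[...] with X_set.iloc[...], for example X_set.iloc[:, 0].min().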
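
Both options can be tried out on a small example. The sketch below is only an illustration: it builds a four-row stand-in for X_train and Y_train from the Age, Fare and Survived values shown in the question (the reduced two-column frame is an assumption, not the asker's actual data), and it leaves out the classifier and the plotting calls. The exact exception type raised by X_train[:, 0] depends on the pandas version, so the sketch catches a generic Exception.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data (first four rows, two columns only).
X_train = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0],
                        "Fare": [7.25, 71.2833, 7.925, 53.1]})
Y_train = pd.DataFrame({"Survived": [0, 1, 1, 1]})

# NumPy-style indexing on a DataFrame is treated as a column-label lookup and fails.
try:
    X_train[:, 0]
    numpy_style_failed = False
except Exception:  # TypeError in the question; other pandas versions raise other types
    numpy_style_failed = True

# Option 1: convert to NumPy arrays; ravel() turns the (n, 1) labels into shape (n,).
X_set = X_train.values
y_set = Y_train.values.ravel()
first_col = X_set[:, 0]               # all values of the first feature
survivor_ages = X_set[y_set == 1, 0]  # first feature for rows labelled 1

# Option 2: keep the DataFrame and index by position with iloc.
age_min = X_train.iloc[:, 0].min()
age_max = X_train.iloc[:, 0].max()
```

Note that the plotting code in the question builds a two-dimensional grid, so it presumably expects a classifier fitted on exactly two feature columns; with the five-column X_train shown above, the predict call on the grid would likely need a model trained on just the two plotted columns.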

Answered by bkshi on August 15, 2021
