TransWikia.com

ValueError: incompatible row dimensions when training a model

Data Science Asked on June 3, 2021

I am fitting a decision tree on a dataset. Before that, I want to transform one text column with CountVectorizer, so I am using a ColumnTransformer inside a pipeline to keep things simple.

But fitting the pipeline raises a ValueError about incompatible row dimensions.

code

# Import the libraries
from sklearn.feature_extraction.text import CountVectorizer as cv
from sklearn.preprocessing import OneHotEncoder as ohe
from sklearn.compose import ColumnTransformer as ct
from sklearn.pipeline import make_pipeline as mp
from sklearn.tree import DecisionTreeClassifier as dtc
from sklearn.model_selection import train_test_split as tts  # used below as tts(...)

transformer=ct(transformers=[('review_counts',cv(),['verified_reviews']),
                             ('variation_dummies', ohe(),['variation'])
                            ],remainder='passthrough')

pipe= mp(transformer,dtc(random_state=42))

x= data[['rating','variation','verified_reviews']].copy()
y= data.feedback

x_train,x_test,y_train,y_test= tts(x,y,test_size=0.3,random_state=42,stratify=y)
print(x_train.shape,y_train.shape)             # ((2205, 3), (2205,))

pipe.fit(x_train,y_train)                       # Error on this line

Error

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-79-a981c354b190> in <module>()
----> 1 pipe.fit(x_train,y_train)

7 frames
/usr/local/lib/python3.6/dist-packages/scipy/sparse/construct.py in bmat(blocks, format, dtype)
    584                                                     exp=brow_lengths[i],
    585                                                     got=A.shape[0]))
--> 586                     raise ValueError(msg)
    587 
    588                 if bcol_lengths[j] == 0:

ValueError: blocks[0,:] has incompatible row dimensions. Got blocks[0,1].shape[0] == 2205, expected 1.


Questions

  1. How does this incompatible row dimension error arise?
  2. How can it be solved?

One Answer

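As per the ColumnTransformer documentation, when a transformer expects a 1D array as input, the column must be specified as a string ("xxx"). For transformers that expect 2D data, the column must be specified as a list of strings (["xxx"]).

CountVectorizer expects a 1D iterable of documents. When it is given the one-column 2D selection ['verified_reviews'], it iterates over the column labels instead of the rows, so it sees a single document and returns a single row. That one row cannot be stacked next to the 2205 rows produced by OneHotEncoder, which is the "Got ... 2205, expected 1" in the error message.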

So the code below will work.

## Important: the column is passed as a string to CV and as a list of columns to OHE

transformer=ct(transformers=[('review_counts',cv(),'verified_reviews'), 
                             ('variation_dummies', ohe(),['variation'])
                            ],remainder='passthrough')
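
For anyone who wants to check this without the original dataset, here is a minimal sketch. The `toy` DataFrame and its values are made up for illustration (they only reuse the column names from the question), and the full class names are used instead of the aliases above. It first shows that CountVectorizer returns one row for a 2D selection and one row per sample for a 1D selection, then fits the pipeline with the string selector.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data standing in for the asker's `data` frame.
toy = pd.DataFrame({
    "rating": [5, 1, 4, 2],
    "variation": ["black", "white", "black", "white"],
    "verified_reviews": ["love it", "stopped working", "works great", "poor sound"],
    "feedback": [1, 0, 1, 0],
})
X = toy[["rating", "variation", "verified_reviews"]]
y = toy["feedback"]

# 2D selection: iterating a DataFrame yields its column labels,
# so CountVectorizer sees a single "document".
two_d = CountVectorizer().fit_transform(X[["verified_reviews"]])
# 1D selection: iterating a Series yields one review per row.
one_d = CountVectorizer().fit_transform(X["verified_reviews"])
print(two_d.shape, one_d.shape)

# String selector for CountVectorizer, list selector for OneHotEncoder.
transformer = ColumnTransformer(
    transformers=[
        ("review_counts", CountVectorizer(), "verified_reviews"),
        ("variation_dummies", OneHotEncoder(), ["variation"]),
    ],
    remainder="passthrough",
)
pipe = make_pipeline(transformer, DecisionTreeClassifier(random_state=42))
pipe.fit(X, y)

# Every block now has one row per sample, so stacking succeeds.
features = pipe.named_steps["columntransformer"].transform(X)
print(features.shape)
```

The same rule applies to any transformer that consumes raw text, such as TfidfVectorizer: select its column with a plain string, not a list.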

Credit goes to Another man who helped me on this.

Answered by tedd on June 3, 2021
