Whether to replace NaN values in result column

Question

I have a training dataset where we have to predict "Result" based on features "A", "B", "C" and "D" using machine learning. For a few rows, the "Result" is empty (7/19612).
For the other features I have filled the NaN values with their mean, but I don't understand whether to do the same with the result column. Would it be better to drop those rows entirely instead?

prashant0598 · Answer

There is no single correct way to deal with NaN values; it depends on the dataset. But rather than single imputation, I would suggest multiple imputation using IterativeImputer in sklearn.
Since very few target values are missing (7 of 19612), it won't make a significant impact, unless the problem is sensitive to outliers, as in fraud detection.
Refer to: Iterative Imputation
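
For illustration, here is a minimal sketch of how this could look. It is not tied to the asker's actual data: the DataFrame is synthetic, the column values and the positions of the missing entries are made up, and `max_iter=10` and `random_state=0` are arbitrary choices. It takes the asker's second option for the target (dropping the few rows where "Result" is missing, so that no labels are invented) and uses IterativeImputer only on the features "A" to "D".

```python
import numpy as np
import pandas as pd
# IterativeImputer is still experimental in scikit-learn, so this
# import is required before it can be imported from sklearn.impute.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Synthetic stand-in for the real dataset (hypothetical values).
rng = np.random.default_rng(0)
n_rows = 200
features = ["A", "B", "C", "D"]
df = pd.DataFrame(rng.normal(size=(n_rows, 4)), columns=features)
df["Result"] = df["A"] + 0.5 * df["B"] + rng.normal(scale=0.1, size=n_rows)

# Knock out a few feature values and a few target values.
df.loc[[3, 17, 42], "B"] = np.nan
df.loc[[5, 99], "D"] = np.nan
df.loc[[10, 20, 30], "Result"] = np.nan

# Step 1: drop the rows whose target is missing.
labeled = df.dropna(subset=["Result"]).copy()

# Step 2: fill the remaining gaps in the features, modelling each
# feature from the other features instead of using a column mean.
imputer = IterativeImputer(max_iter=10, random_state=0)
labeled[features] = imputer.fit_transform(labeled[features])

print(len(df), "rows before,", len(labeled), "rows after dropping")
print("missing feature values left:", int(labeled[features].isna().sum().sum()))
```

Note that with its default settings IterativeImputer returns one imputed dataset. The scikit-learn documentation describes getting multiple imputations by setting `sample_posterior=True` and running the imputer several times with different `random_state` values.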
