TransWikia.com

How can we identify meaningful features in a large dataset?

Data Science Asked by Vikas Ukani on March 7, 2021

In machine learning, we work with all kinds of datasets.

A dataset can contain a large number of records and features (features are sometimes called columns).

So the main problem for a data scientist is to understand the behavior of the dataset and extract meaningful insights from it.

Let's take an example from the Kaggle platform: there is a house price prediction dataset, where the goal is to predict the price of a house based on its features.

Here is the link to the dataset:
House Price Prediction Advance Regression Machine Learning Problem

So, the question is: how do we identify the meaningful features in the dataset?

One Answer

I don't think there is one correct way, but here is what you can do:

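  1. Use PCA if you have many features. It reduces dimensionality by projecting the data onto a smaller number of components that capture most of the variance. Note that these components are combinations of the original features, not a subset of them. You may also use other dimensionality reduction techniques.
  2. You can train models like LightGBM or random forest and inspect which features they rank as important.
  3. You may use Lasso regression for feature selection, since it shrinks the coefficients of unhelpful features to zero.
  4. You may use intuition to drop features that simply do not make sense for the problem.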
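As a rough illustration of the PCA suggestion, here is a minimal sketch using scikit-learn. The data is synthetic (generated with `make_regression`) and only stands in for a real table such as the house price dataset; the 90% variance threshold is an arbitrary choice for the example, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a tabular dataset: 200 rows, 10 numeric features.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# PCA is sensitive to feature scale, so standardize first.
X_scaled = StandardScaler().fit_transform(X)

# A float n_components keeps just enough components to explain
# that fraction of the variance (here 90%, an assumed threshold).
pca = PCA(n_components=0.90, svd_solver="full")
X_reduced = pca.fit_transform(X_scaled)

print("components kept:", pca.n_components_)
print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))
```

Keep in mind that `X_reduced` holds new components, so this tells you how many dimensions the data needs, not which original columns matter.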

These are some of the methods for finding out which features are important. You may read this article: https://towardsdatascience.com/the-5-feature-selection-algorithms-every-data-scientist-need-to-know-3a6b566efd2
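The model-based and Lasso suggestions could look roughly like the sketch below. It uses scikit-learn's random forest rather than LightGBM to avoid an extra dependency, and again runs on synthetic data where only the first three columns actually drive the target. The feature names `f0` to `f9` are made up for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

# Synthetic data; coef=True also returns the true coefficients,
# so we know which features are really informative.
X, y, true_coef = make_regression(n_samples=300, n_features=10,
                                  n_informative=3, noise=5.0,
                                  shuffle=False, coef=True, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

# Random forest: rank features by impurity-based importance.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
rf_ranking = sorted(zip(feature_names, forest.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)

# Lasso: features whose coefficient is shrunk to exactly zero are dropped.
# Standardize first so the penalty treats all features equally.
X_scaled = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=5, random_state=0).fit(X_scaled, y)
lasso_selected = [name for name, c in zip(feature_names, lasso.coef_)
                  if c != 0]

print("random forest ranking:", rf_ranking[:5])
print("features kept by Lasso:", lasso_selected)
```

Impurity-based importances can favor features with many distinct values, so it is worth cross-checking the ranking against another method before dropping columns.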

Correct answer by Hitesh Somani on March 7, 2021
