TransWikia.com

Can we have a dataset with slight difference in target values for same value of feature variable?

Data Science Asked on October 4, 2021

I am trying to generate a dataset which involves 1 feature variable(X) and 1 target variable(y).

The feature variable represents values on the X-axis on the graph and target variable represents values on Y-axis.

Datatype of X: integer

Datatype of y: floating point

I have N such graphs for same values of X, but a slight variation in y values.

One of the graph is as follows:

enter image description here

I want to fit the data into a regression.

Now, my question is how to generate the dataset for this use case. Should I include values from all graphs into a single dataset? But, in this case, for every unique value of X, I will have N rows with same value of X and a different value of y?

I am doubtful about this approach.

Any help is greatly appreciated!

One Answer

I'm not sure what's the context of your question but there's no problem with the approach you've outlined. Different values of y for the same unique value of x (over different rows, such that for example you have: x = {1, 1}, y = {1, 2}) are a natural result of the noise usually assumed in the model you fit (e.g. $y = x + epsilon$).

Hope this helps.

Answered by Iyar Lin on October 4, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP