Data Science Asked on October 4, 2021
I am trying to generate a dataset which involves 1 feature variable (X) and 1 target variable (y).
The feature variable represents values on the X-axis of the graph and the target variable represents values on the Y-axis.
Datatype of X: integer
Datatype of y: floating point
I have N such graphs for the same values of X, but with a slight variation in the y values.
One of the graphs is as follows:
I want to fit the data into a regression.
Now, my question is how to generate the dataset for this use case. Should I include values from all graphs in a single dataset? In that case, for every unique value of X, I will have N rows with the same value of X and a different value of y.
I am doubtful about this approach.
Any help is greatly appreciated!
I'm not sure what's the context of your question but there's no problem with the approach you've outlined. Different values of y for the same unique value of x (over different rows, such that for example you have: x = {1, 1}, y = {1, 2}) are a natural result of the noise usually assumed in the model you fit (e.g. $y = x + \epsilon$).
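As a rough illustration of stacking the graphs into one dataset, here is a minimal sketch. Everything in it is assumed for the sake of the example: the number of graphs, the X range, the synthetic linear relationship and noise level, and the choice of NumPy's `np.polyfit` for a straight-line fit. It is not the asker's actual data or model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: N graphs that share the same integer X values.
N = 5
x = np.arange(10)  # integer feature, identical for every graph

# Synthetic y curves (illustrative only): y = 2x + 1 plus small noise,
# one row per graph, one column per X value.
curves = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=(N, x.size))

# Stack all graphs into one long dataset: each X value appears N times,
# each time paired with the y value from a different graph.
X_long = np.tile(x, N)   # x repeated N times, shape (N * len(x),)
y_long = curves.ravel()  # row-major flattening matches the np.tile order

# Ordinary least-squares straight-line fit on the stacked data.
slope, intercept = np.polyfit(X_long, y_long, deg=1)

# With the same number of rows for every X, fitting the per-X mean curve
# gives the same least-squares line as fitting all rows at once.
slope_mean, intercept_mean = np.polyfit(x, curves.mean(axis=0), deg=1)

print(slope, intercept)
print(slope_mean, intercept_mean)
```

Keeping all N rows per X value, rather than averaging first, also preserves the spread of y at each X, which is what you would use to estimate the noise level.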
Hope this helps.
Answered by Iyar Lin on October 4, 2021