Data Science Asked by user94586 on July 26, 2021
I have manually created a random data set around some mean value, and I have tried to use gradient descent linear regression to predict this simple mean value.
I did exactly what the manual describes, yet for some reason my predictor coefficients diverge to infinity, even though the same approach worked for another case.
Why, in this case, can it not predict a simple value of 1.4?
clear all;
n = 10000;
t = 1.4;                                % target mean value
sigma_R = t*0.001;
min_value_t = t - sigma_R;
max_value_t = t + sigma_R;
y_data = min_value_t + (max_value_t - min_value_t) * rand(n,1);   % uniform noise around t
x_data = [1:10000]';
m = 0;                                  % slope
c = 0;                                  % intercept
L = 0.0001;                             % learning rate
epochs = 1000;                          % iterations
for i = 1:epochs
    y_pred = m.*x_data + c;
    D_m = (-2/n)*sum(x_data.*(y_data - y_pred));   % gradient w.r.t. m
    D_c = (-2/n)*sum(y_data - y_pred);             % gradient w.r.t. c
    m = m - L*D_m;
    c = c - L*D_c;
end
plot(x_data,y_data,'.')
hold on;
grid;
plot(x_data,y_pred)
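For reference, D_m and D_c in the loop are the partial derivatives of the mean-squared-error cost with respect to the slope and the intercept:

$$J(m,c)=\frac{1}{n}\sum_{i=1}^{n}\big(y_i-(m\,x_i+c)\big)^2,\qquad \frac{\partial J}{\partial m}=-\frac{2}{n}\sum_{i=1}^{n}x_i\,(y_i-\hat{y}_i),\qquad \frac{\partial J}{\partial c}=-\frac{2}{n}\sum_{i=1}^{n}(y_i-\hat{y}_i)$$

with the updates $m \leftarrow m - L\,\partial J/\partial m$ and $c \leftarrow c - L\,\partial J/\partial c$.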
%%%%%%%%%%%%%%%%%%%%%%%%%%%%
question update:
Hello, I have tried to rewrite your code in MATLAB, the language I am more familiar with.
My feature matrix of the form N×2, [1, x_data], is called x_mat.
I followed every step when converting the code, and both theta coefficients come out as NaN.
Where did I go wrong?
%%start Matlab code
n = 1000;
t = 1.4;
sigma_R = t*0.001;
min_value_t = t - sigma_R;
max_value_t = t + sigma_R;
y_data = min_value_t + (max_value_t - min_value_t) * rand(n,1);
x_data = [1:1000];
L = 0.0001;                         % learning rate
%plot(x_data,y_data);
itter = 1000;                       % iterations
theta_0 = 0;
theta_1 = 0;
theta = [theta_0; theta_1];
for i = 1:itter
    onss = ones(1,1000);
    x_mat = [onss; x_data]';        % N-by-2 feature matrix [1, x_data]
    pred = x_mat*theta;
    residuals = (pred - y_data);
    for k = 1:2                     % start theta loop
        partial = 2*dot(residuals, x_mat(:,k));
        theta(k) = theta(k) - L*partial;
    end                             % end theta loop
end                                 % end iteration loop
%%end matlab code
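For reference, with $X$ denoting the $N \times 2$ matrix x_mat, the inner loop above computes the partial derivative of the sum-of-squared-errors cost for each coefficient and applies a plain gradient step:

$$J(\theta)=\sum_{i=1}^{N}\big(X_{i,:}\,\theta-y_i\big)^2,\qquad \frac{\partial J}{\partial \theta_k}=2\sum_{i=1}^{N}\big(X_{i,:}\,\theta-y_i\big)\,X_{i,k},\qquad \theta_k \leftarrow \theta_k - L\,\frac{\partial J}{\partial \theta_k}$$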
Although this answer shows code in Python, the theory is exactly the same. Two pieces of advice when running a linear regression via gradient descent from scratch, both reflected in the code below: standardize the features, and track the cost at every iteration so you can check that it actually decreases.
Your data plotted with matplotlib:
import numpy as np
from matplotlib import pyplot

n = 1000
t = 1.4
sigma_R = t*0.001
min_value_t = t - sigma_R
max_value_t = t + sigma_R
y_data = min_value_t + (max_value_t - min_value_t) * np.random.rand(n, 1)
x_data = np.arange(1000)

pyplot.scatter(x_data, y_data)
pyplot.show()
Build your dataset to train the model:
import pandas as pd

final_df_DICT = {'X': x_data}
H = pd.DataFrame(final_df_DICT)
feature_matrix = np.zeros((n, 2))
feature_matrix[:, 0] = 1          # intercept column
feature_matrix[:, 1] = H['X']     # actual x values
# standardize features (note: this scales the whole matrix, including the constant column)
feature_matrix = (feature_matrix - feature_matrix.mean()) / feature_matrix.std()
target_data = y_data.reshape(len(y_data), )
Look at the feature matrix: before standardization, the first column is filled with 1's and the second column holds the actual x_data.
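As a side note, a common variant (an assumption on my part, not what the snippet above does) is to standardize only the x column, so that the intercept column stays at exactly 1:

# Hypothetical alternative: standardize only the x column,
# leaving the intercept column untouched.
feature_matrix_alt = np.zeros((n, 2))
feature_matrix_alt[:, 0] = 1
feature_matrix_alt[:, 1] = (H['X'] - H['X'].mean()) / H['X'].std()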
Model training with your hyperparameters:
w = np.zeros(2)   # [intercept coefficient, slope coefficient]
L = 0.0001
epochs = 1000
iteration = 0
cost = []
while iteration < epochs:
    pred = np.dot(feature_matrix, w)
    residuals = pred - target_data
    # we calculate the gradient for the 2 coefficients with the scalar product
    for i in range(len(w)):
        partial = 2*np.dot(residuals, feature_matrix[:, i])
        w[i] = w[i] - L*partial
    iteration += 1
    computed_cost = np.sum(np.power((pred - target_data), 2)) / n
    cost.append(computed_cost)

print('coef: {}'.format(w))
print('cost: {}'.format(cost[-1]))
Result:
coef: [-1.80963253e+00 -6.15189807e-06]
cost: 6.466287828899486e-07
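Since the cost is stored for every epoch, a quick extra sanity check (a minimal sketch using the variables above) is to plot it and confirm it decreases:

# Plot the per-epoch MSE cost to verify that the optimization converged.
pyplot.plot(cost)
pyplot.xlabel('epoch')
pyplot.ylabel('MSE cost')
pyplot.show()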
Let's plot the fitted regression model predictions over the original dataset (we trained on the whole dataset, not taking into account any validation set in this case...):
my_predictions = np.dot(feature_matrix, w)
pyplot.scatter(feature_matrix[:, 1], target_data)
pyplot.scatter(feature_matrix[:, 1], my_predictions, color='r')
pyplot.show()
Answered by German C M on July 26, 2021