Data Science Asked on November 13, 2021
I am trying to implement a simple multivariate linear regression model without using any built-in machine learning libraries. So far, my gradient descent implementation gives a training root mean squared error (RMSE) of about $2.93$, while the model from the normal (closed-form) equation achieves a training RMSE of about $2.3$. I am looking for ways to improve my implementation of the gradient descent algorithm. Below is my implementation.
My gradient descent method looks like this:
$$\theta = \theta - \frac{\alpha}{2N} \, X^\top (X\theta - Y)$$
where $\theta$ is the vector of model parameters, $N$ is the number of training examples, $X$ is the input matrix, $Y$ is the vector of targets, and $\alpha$ is the step size.
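For reference, with the cost function used further below, $J(\theta) = \frac{1}{2N}\lVert X\theta - Y\rVert^2$, the gradient works out to
$$\nabla_\theta J(\theta) = \frac{1}{N} X^\top (X\theta - Y),$$
i.e. the factor of $2$ cancels, which is why the code divides by len(self.X) (that is, $N$) rather than $2N$.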
def gradientDescent(self):
    errors = []
    for i in range(self.iters):
        # theta = theta - (alpha/N) * X^T (X theta - Y)
        self.theta = self.theta - (self.alpha / len(self.X)) * np.sum(
            self.X * (self.X @ self.theta.T - self.Y), axis=0)
        # Track the cost at each iteration so convergence can be inspected
        errors.append(self.error_function())
    return errors
I set $\alpha$ to $0.1$ and the number of iterations to 1000. Gradient descent reaches convergence at around 700-800 iterations (checked).
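A convergence check can be done by plotting the per-iteration cost returned by gradientDescent; a minimal sketch (model stands for an instance of the class):

    import matplotlib.pyplot as plt

    # 'model' is a hypothetical instance of the regression class above
    errors = model.gradientDescent()
    plt.plot(range(len(errors)), errors)
    plt.xlabel("Iteration")
    plt.ylabel("Cost (1/2N * sum of squared errors)")
    plt.title("Gradient descent convergence")
    plt.show()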
My error function is:
def error_function(self):
    # Cost: (1/2N) * sum((X theta - Y)^2), where theta is the parameter vector
    error_values = np.power((self.X @ self.theta.T) - self.Y, 2)
    return np.sum(error_values) / (2 * len(self.X))
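(For reference, this cost is half the mean squared error, so an RMSE comparable to the figures above can be recovered from it; model again stands for an instance of the class.)

    # Cost = (1/2N) * sum of squared errors = MSE / 2, so RMSE = sqrt(2 * cost)
    # 'model' is a hypothetical instance of the regression class above
    rmse = np.sqrt(2 * model.error_function())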
I was expecting the training errors from gradient descent and the normal equation to be similar, but they differ quite a lot. So I wanted to know whether I am doing anything wrong.
P.S. I have not normalized the data yet; normalizing it leads to a much lower RMSE (~$0.22$).
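A typical way to normalize is to standardize each feature to zero mean and unit variance before running gradient descent; a minimal sketch (X here is the raw feature matrix, before any bias column):

    # Standardize each feature (column) of X to zero mean and unit variance.
    # Illustrative only: variable names are not from the original code.
    X_mean = X.mean(axis=0)
    X_std = X.std(axis=0)
    X_scaled = (X - X_mean) / X_std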
That could be due to many different reasons. The most important one is that your gradient descent may not have fully converged to the minimum of the cost function (for linear regression the cost is convex, so there are no local minima, but a poor learning rate or initialization can still leave you far from the optimum). To address this, you can use a different learning rate or change your initialization of the coefficients.
There might be a problem in your code for updating weights or calculating the gradient.
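One way to rule that out is a quick numerical gradient check: compare your analytic gradient against a finite-difference estimate of the cost. A minimal sketch, assuming a 1-D parameter vector and a cost function cost_fn of your choosing:

    import numpy as np

    def numerical_gradient(cost_fn, theta, eps=1e-6):
        '''Finite-difference estimate of the gradient of cost_fn at theta (1-D).'''
        grad = np.zeros_like(theta, dtype=float)
        for j in range(theta.size):
            step = np.zeros_like(theta, dtype=float)
            step[j] = eps
            grad[j] = (cost_fn(theta + step) - cost_fn(theta - step)) / (2 * eps)
        return grad

    # If the analytic gradient is implemented correctly, it should agree with
    # numerical_gradient(cost_fn, theta) to within roughly 1e-6.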
However, I used both methods on a simple linear regression problem and got the same results, as shown below:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import make_regression
# generate regression dataset
X, y = make_regression(n_samples=100, n_features=1, noise=30)
def cost_MSE(y_true, y_pred):
    '''
    Cost function (mean squared error)
    '''
    # Number of samples
    n = y_true.shape[0]
    # Error
    error = y_true - y_pred
    # Cost
    mse = np.dot(error, error) / n
    return mse
def cost_derivative(X, y_true, y_pred):
    '''
    Compute the derivative of the loss function
    '''
    # Number of samples
    n = y_true.shape[0]
    # Error
    error = y_true - y_pred
    # Derivative
    der = -2 / n * np.dot(X, error)
    return der
# Let's run an example
X_new = np.concatenate((np.ones(X.shape), X), axis=1)
learning_rate = 0.1
X_new_T = X_new.T
n_iters = 100
mse = []

# Initialize the weight vector
alpha = np.array([0, np.random.rand()])

for _ in range(n_iters):
    # Compute the predicted y
    y_pred = np.dot(X_new, alpha)
    # Compute the MSE
    mse.append(cost_MSE(y, y_pred))
    # Compute the derivative
    der = cost_derivative(X_new_T, y, y_pred)
    # Update the weights
    alpha -= learning_rate * der

alpha  # learned coefficients
For gradient descent, the coefficients were:
array([-3.36575322, 28.06370831])
Here is the code for the closed-form solution:
np.dot(np.linalg.inv(np.dot(X_new_T,X_new)), np.dot(X_new_T, y))
And the coefficients for the closed-form solution:
array([-3.36575322, 28.06370831])
Since the coefficients are equal, the RMSE, MSE, and $R^2$ are equal as well.
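As an extra sanity check, both can also be compared with scikit-learn's LinearRegression fitted on the same data, which should give the same intercept and slope up to numerical precision:

    from sklearn.linear_model import LinearRegression

    # Fit on the original (un-augmented) X; sklearn adds the intercept itself
    lr = LinearRegression().fit(X, y)
    print(lr.intercept_, lr.coef_)  # should match the coefficients above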
Answered by nimar on November 13, 2021