Asked on April 22, 2021
I am trying to implement a gradient descent algorithm for the simple linear function

$y(x) = x$

where the initial hypothesis function is

$h(x) = 0.5x$

and the learning rate is $\alpha = 0.1$. (In the plot, the target function is blue and the hypothesis is green.)
Cost function:

$$J(q) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(h(x_i) - y(x_i)\bigr)^2$$

Gradient descent update:

$$q := q - \frac{\alpha}{m}\sum_{i=1}^{m}\bigl(h(x_i) - y(x_i)\bigr)\,x_i$$
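For reference, the $x_i$ factor in this update comes from differentiating $J$ with respect to $q$: with $h(x) = qx$, the chain rule gives $\partial h(x_i)/\partial q = x_i$, so

$$\frac{\partial J}{\partial q} = \frac{1}{m}\sum_{i=1}^{m}\bigl(h(x_i) - y(x_i)\bigr)\,x_i$$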
My implementation does not converge:
import numpy as np
import matplotlib.pyplot as plt  # used for plotting the two functions


def y(x):
    return x


def get_h(q):
    """Create a hypothesis function.

    Args:
        q: coefficient to multiply x with
    Returns:
        h(x), the hypothesis function
    """
    return lambda x: q * x


def j(x, y, h):
    """Calculate a single value of the cost function.

    Args:
        x: target function argument values
        y: target function
        h: hypothesis function
    Returns:
        Value of the cost function for the given hypothesis function
    """
    m = len(x)
    return (1 / (2 * m)) * np.sum(np.power(y(x) - h(x), 2))


def df(h, y, xs):
    """Calculate the gradient of the cost function.

    Args:
        h: hypothesis function
        y: target function
        xs: x values
    Returns:
        Derivative of the cost function for a hypothesis with the given q
    """
    return np.sum((h(xs) - y(xs)) * xs) / len(xs)


xs = np.array(range(100))
ys = y(xs)

costs = []
qs = []
q = 0.5
alpha = 0.1
h = get_h(q)  # initial hypothesis function
iters = 10

for i in range(iters):
    cost = j(xs, y, h)
    costs.append(cost)
    qs.append(q)
    print("q: {} --- cost: {}".format(q, cost))
    df_cost = df(h, y, xs)
    q = q - alpha * df_cost  # update coefficient
    h = get_h(q)             # new hypothesis
What am I doing wrong? Should I include an intercept term $q_0$ even though my target function's intercept is zero?
Update
Answer: https://stats.stackexchange.com/questions/484750/linear-function-gradient-descent
As you and @gunes pointed out in this post, the formulas are correct, but the hyperparameters $\alpha$ and the number of iterations were not well adjusted.
Answered by etiennedm on April 22, 2021
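For concreteness, here is a minimal sketch of that adjustment (my own illustration, not part of the linked answer; the values of alpha and the iteration count are assumptions). With xs = 0..99, the gradient of $J$ with respect to $q$ scales with mean(xs**2), which is about 3283, so the one-parameter update only contracts when $\alpha < 2/\mathrm{mean}(x^2) \approx 0.0006$. Shrinking alpha and running more iterations makes the same scheme converge:

import numpy as np

xs = np.arange(100, dtype=float)

q = 0.5
alpha = 1e-4  # assumed value, well below the 2 / mean(xs**2) stability bound
for i in range(100):
    grad = np.mean((q * xs - xs) * xs)  # dJ/dq for h(x) = q*x, y(x) = x
    q -= alpha * grad

print(q)  # converges to ~1.0

Alternatively, rescaling the inputs (e.g. xs / xs.max()) brings mean(xs**2) down to order 1, so the original alpha = 0.1 converges as well, just more slowly.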