Asked by revy on September 18, 2020
I was trying to implement logistic regression from scratch in Python to better understand how it works under the hood. In particular, I am following this video tutorial from Andrew Ng.
This is the dataset I am using for testing the algorithm: marks.txt
I've found that without normalizing the data, the algorithm does not converge: the loss does not decrease and is sometimes NaN.
This is the implementation:
import numpy as np
import pandas as pd


def normalize(X, axis=0):
    # Standardize each feature to zero mean and unit variance.
    return (X - np.mean(X, axis=axis)) / np.std(X, axis=axis)


class LogisticRegression():
    def __init__(self, num_epochs=10000, lr=0.001):
        self.num_epochs = num_epochs
        self.lr = lr

    def __sigmoid(self, Z):
        return 1 / (1 + np.exp(-Z))

    def __loss(self, y, A):
        # Binary cross-entropy, averaged over the samples.
        return - np.mean(y * np.log(A) + (1 - y) * np.log(1 - A))

    def fit(self, X, y):
        # X has shape (n_features, n_samples), y has shape (1, n_samples).
        n, m = X.shape
        print(f'Number of features = {n}')
        print(f'Number of samples = {m}')
        W = np.zeros((n, 1))
        b = 0
        for epoch in range(1, self.num_epochs + 1):
            # Forward pass.
            Z = np.dot(W.T, X) + b
            A = self.__sigmoid(Z)
            # Gradients of the loss w.r.t. W and b.
            dZ = A - y
            dW = 1 / m * np.dot(X, dZ.T)
            db = 1 / m * np.sum(dZ)
            # Gradient descent update.
            W -= self.lr * dW
            b -= self.lr * db
            if epoch == 1 or epoch % 100 == 0:
                J = self.__loss(y, A)
                print(f'Epoch {epoch} - Loss = {J}')


columns = ['mark_1', 'mark_2', 'y']
data = pd.read_csv('marks.txt', names=columns, header=None)
X = data.iloc[:, :-1].values
y = data.iloc[:, -1:].values

lr = LogisticRegression(num_epochs=10000, lr=0.01)
lr.fit(X.T, y.T)  # no normalization in this run
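For reference, the update the training loop implements is the usual binary cross-entropy gradient descent step (this is just a summary of the code above in Andrew Ng's notation, with $X$ of shape $n \times m$ and $y$ of shape $1 \times m$):

$$
\begin{aligned}
Z &= W^{T} X + b, \qquad A = \sigma(Z) = \frac{1}{1 + e^{-Z}} \\
J &= -\frac{1}{m} \sum_{i=1}^{m} \Big[ y^{(i)} \log a^{(i)} + \big(1 - y^{(i)}\big) \log\big(1 - a^{(i)}\big) \Big] \\
dW &= \frac{1}{m}\, X\, (A - y)^{T}, \qquad db = \frac{1}{m} \sum_{i=1}^{m} \big(a^{(i)} - y^{(i)}\big)
\end{aligned}
$$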
If I execute this, I get the following output:
Number of features = 2
Number of samples = 100
Epoch 1 - Loss = 0.6931471805599453
Epoch 100 - Loss = nan
Epoch 200 - Loss = 4.804976603222295
Epoch 300 - Loss = 7.859811065112183
Epoch 400 - Loss = nan
Epoch 500 - Loss = 4.7897185742553186
Epoch 600 - Loss = 7.836867515204696
Epoch 700 - Loss = nan
Epoch 800 - Loss = 4.774454897975551
Epoch 900 - Loss = 7.813880674612202
Epoch 1000 - Loss = nan
Epoch 1100 - Loss = 4.759205019552172
Epoch 1200 - Loss = 7.790844866695895
Epoch 1300 - Loss = nan
Epoch 1400 - Loss = 4.743971469023722
Epoch 1500 - Loss = 7.7677542901506875
Epoch 1600 - Loss = nan
Epoch 1700 - Loss = 4.728752106484584
Epoch 1800 - Loss = 7.744603001274255
Epoch 1900 - Loss = nan
Epoch 2000 - Loss = 4.713554131813897
Epoch 2100 - Loss = 7.721384895223854
Epoch 2200 - Loss = nan
Epoch 2300 - Loss = 4.698361852011675
Epoch 2400 - Loss = 7.698093686352158
Epoch 2500 - Loss = nan
Epoch 2600 - Loss = 4.683196015386
Epoch 2700 - Loss = 7.674722887733128
Epoch 2800 - Loss = nan
Epoch 2900 - Loss = 4.6680544815204925
Epoch 3000 - Loss = 7.6512657900116805
Epoch 3100 - Loss = nan
Epoch 3200 - Loss = 4.65294492110793
Epoch 3300 - Loss = 7.627715439737382
Epoch 3400 - Loss = nan
Epoch 3500 - Loss = 4.637870088341966
Epoch 3600 - Loss = 7.604064617373076
Epoch 3700 - Loss = nan
Epoch 3800 - Loss = 4.62281379407897
Epoch 3900 - Loss = 7.580305815203287
Epoch 4000 - Loss = nan
Epoch 4100 - Loss = 4.60781250038029
Epoch 4200 - Loss = 7.556431215405509
Epoch 4300 - Loss = nan
Epoch 4400 - Loss = 4.592835472351133
Epoch 4500 - Loss = 7.532432668589291
__main__:19: RuntimeWarning: divide by zero encountered in log
__main__:19: RuntimeWarning: invalid value encountered in multiply
Epoch 4600 - Loss = nan
Epoch 4700 - Loss = 4.57789045326783
Epoch 4800 - Loss = 7.508301673152992
Epoch 4900 - Loss = nan
Epoch 5000 - Loss = 4.563010160089178
Epoch 5100 - Loss = 7.48402935585585
Epoch 5200 - Loss = nan
Epoch 5300 - Loss = 4.548178514140011
Epoch 5400 - Loss = 7.459606454052721
Epoch 5500 - Loss = nan
Epoch 5600 - Loss = 4.533383810118562
Epoch 5700 - Loss = 7.4350233000888215
Epoch 5800 - Loss = nan
Epoch 5900 - Loss = 4.518654642394596
Epoch 6000 - Loss = 7.4102698084014715
Epoch 6100 - Loss = nan
Epoch 6200 - Loss = 4.503979863783454
Epoch 6300 - Loss = 7.385335465922019
Epoch 6400 - Loss = nan
Epoch 6500 - Loss = 4.489373494021938
Epoch 6600 - Loss = 7.3602093264129
Epoch 6700 - Loss = nan
Epoch 6800 - Loss = 4.474823890959029
Epoch 6900 - Loss = 7.334880009407734
Epoch 7000 - Loss = nan
Epoch 7100 - Loss = 4.46034588627989
Epoch 7200 - Loss = 7.309335704445365
Epoch 7300 - Loss = nan
Epoch 7400 - Loss = 4.445943673235942
Epoch 7500 - Loss = 7.283564181295987
Epoch 7600 - Loss = nan
Epoch 7700 - Loss = 4.431627025808597
Epoch 7800 - Loss = 7.257552806867497
Epoch 7900 - Loss = nan
Epoch 8000 - Loss = 4.4174006755844095
Epoch 8100 - Loss = 7.231288569447483
Epoch 8200 - Loss = nan
Epoch 8300 - Loss = 4.403243132948716
Epoch 8400 - Loss = 7.2047581108777345
Epoch 8500 - Loss = nan
Epoch 8600 - Loss = 4.3891854547270475
Epoch 8700 - Loss = 7.177947767170446
Epoch 8800 - Loss = nan
Epoch 8900 - Loss = 4.3752278129468944
Epoch 9000 - Loss = 7.150843617955713
Epoch 9100 - Loss = nan
Epoch 9200 - Loss = 4.361364188554799
Epoch 9300 - Loss = 7.123431544995584
Epoch 9400 - Loss = nan
Epoch 9500 - Loss = 4.34760573682999
Epoch 9600 - Loss = 7.095697299812474
Epoch 9700 - Loss = nan
Epoch 9800 - Loss = 4.333968348729086
Epoch 9900 - Loss = 7.067626580257764
Epoch 10000 - Loss = nan
Otherwise, if I normalize the data (zero mean, unit variance) before fitting the model, it seems to work correctly and I can see the loss decreasing:
def normalize(X, axis=0):
    return (X - np.mean(X, axis=axis)) / np.std(X, axis=axis)

# [...]

data = pd.read_csv('data/marks.csv', names=columns, header=0)
X = data.iloc[:, :-1].values
y = data.iloc[:, -1:].values

X = normalize(X)  # standardize the features

lr = LogisticRegression(num_epochs=10000, lr=0.01)
lr.fit(X.T, y.T)
Output:
Number of features = 2
Number of samples = 100
Epoch 1 - Loss = 0.6931471805599453
Epoch 100 - Loss = 0.5733935847364559
Epoch 200 - Loss = 0.4967653811151946
Epoch 300 - Loss = 0.4456019909522728
Epoch 400 - Loss = 0.4094643825544129
Epoch 500 - Loss = 0.38268993233906584
Epoch 600 - Loss = 0.3620711093033572
Epoch 700 - Loss = 0.34569287144258726
Epoch 800 - Loss = 0.33235335562007345
Epoch 900 - Loss = 0.3212643235482712
Epoch 1000 - Loss = 0.3118887927384652
Epoch 1100 - Loss = 0.3038488197031603
Epoch 1200 - Loss = 0.29687082702689194
Epoch 1300 - Loss = 0.29075191833313574
Epoch 1400 - Loss = 0.28533840424281665
Epoch 1500 - Loss = 0.28051169317796093
Epoch 1600 - Loss = 0.2761787696062344
Epoch 1700 - Loss = 0.27226561328471377
Epoch 1800 - Loss = 0.2687125532958896
Epoch 1900 - Loss = 0.2654709248024843
Epoch 2000 - Loss = 0.2625006214821513
Epoch 2100 - Loss = 0.25976827555417215
Epoch 2200 - Loss = 0.25724588518032776
Epoch 2300 - Loss = 0.25490976581258473
Epoch 2400 - Loss = 0.2527397395020597
Epoch 2500 - Loss = 0.2507185013239025
Epoch 2600 - Loss = 0.24883111923883924
Epoch 2700 - Loss = 0.24706463561630515
Epoch 2800 - Loss = 0.24540774701833637
Epoch 2900 - Loss = 0.24385054481312596
Epoch 3000 - Loss = 0.24238430349560172
Epoch 3100 - Loss = 0.24100130673780307
Epoch 3200 - Loss = 0.23969470351293196
Epoch 3300 - Loss = 0.2384583883670421
Epoch 3400 - Loss = 0.23728690121408835
Epoch 3500 - Loss = 0.23617534301823132
Epoch 3600 - Loss = 0.23511930448367577
Epoch 3700 - Loss = 0.23411480545586144
Epoch 3800 - Loss = 0.23315824319134465
Epoch 3900 - Loss = 0.2322463480086729
Epoch 4000 - Loss = 0.23137614511221863
Epoch 4100 - Loss = 0.23054492160266965
Epoch 4200 - Loss = 0.2297501978647279
Epoch 4300 - Loss = 0.22898970266443042
Epoch 4400 - Loss = 0.22826135140291767
Epoch 4500 - Loss = 0.22756322706622129
Epoch 4600 - Loss = 0.22689356348620354
Epoch 4700 - Loss = 0.22625073058962492
Epoch 4800 - Loss = 0.22563322136316274
Epoch 4900 - Loss = 0.22503964030418896
Epoch 5000 - Loss = 0.22446869316192905
Epoch 5100 - Loss = 0.22391917780259743
Epoch 5200 - Loss = 0.22338997605632013
Epoch 5300 - Loss = 0.2228800464239599
Epoch 5400 - Loss = 0.22238841753904443
Epoch 5500 - Loss = 0.2219141822944289
Epoch 5600 - Loss = 0.22145649255554345
Epoch 5700 - Loss = 0.22101455439246653
Epoch 5800 - Loss = 0.22058762377191163
Epoch 5900 - Loss = 0.22017500265778595
Epoch 6000 - Loss = 0.21977603547546334
Epoch 6100 - Loss = 0.21939010590048816
Epoch 6200 - Loss = 0.21901663393723186
Epoch 6300 - Loss = 0.21865507325717176
Epoch 6400 - Loss = 0.2183049087700573
Epoch 6500 - Loss = 0.21796565440434787
Epoch 6600 - Loss = 0.21763685107601996
Epoch 6700 - Loss = 0.2173180648272089
Epoch 6800 - Loss = 0.2170088851182173
Epoch 6900 - Loss = 0.2167089232582348
Epoch 7000 - Loss = 0.2164178109617001
Epoch 7100 - Loss = 0.21613519901863631
Epoch 7200 - Loss = 0.21586075606851618
Epoch 7300 - Loss = 0.21559416746830468
Epoch 7400 - Loss = 0.21533513424628176
Epoch 7500 - Loss = 0.2150833721340996
Epoch 7600 - Loss = 0.21483861067028356
Epoch 7700 - Loss = 0.2146005923690525
Epoch 7800 - Loss = 0.21436907194893295
Epoch 7900 - Loss = 0.21414381561617263
Epoch 8000 - Loss = 0.21392460039843075
Epoch 8100 - Loss = 0.21371121352464903
Epoch 8200 - Loss = 0.21350345184738453
Epoch 8300 - Loss = 0.213301121304228
Epoch 8400 - Loss = 0.21310403641523415
Epoch 8500 - Loss = 0.21291201981356775
Epoch 8600 - Loss = 0.21272490180681325
Epoch 8700 - Loss = 0.2125425199666215
Epoch 8800 - Loss = 0.21236471874456622
Epoch 8900 - Loss = 0.21219134911226462
Epoch 9000 - Loss = 0.21202226822398146
Epoch 9100 - Loss = 0.21185733910008378
Epoch 9200 - Loss = 0.21169643032984978
Epoch 9300 - Loss = 0.21153941579225452
Epoch 9400 - Loss = 0.21138617439347157
Epoch 9500 - Loss = 0.211236589819924
Epoch 9600 - Loss = 0.21109055030581808
Epoch 9700 - Loss = 0.21094794841416828
Epoch 9800 - Loss = 0.2108086808304075
Epoch 9900 - Loss = 0.21067264816773867
Epoch 10000 - Loss = 0.21053975478345366
I can see two possible reasons: either there is a bug in my implementation, or normalization is genuinely required here. Could anyone give a hint as to whether there are issues in the implementation, or any other reason why the algorithm does not work without normalization?
Normalization/standardization is suggested because it makes convergence easier and faster. Andrew Ng explains the process and the reasoning in his course with a bowl-shaped two-dimensional loss surface: with unscaled features the bowl is elongated, so gradient descent makes very slow progress. If you do not normalize, you need a much smaller learning rate and many more epochs.
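To see why the raw data also produces NaN losses (a minimal sketch, reusing the unnormalized X loaded in the question): with unscaled features the gradient steps can make W large enough that Z = W.T @ X saturates the sigmoid to exactly 0.0 or 1.0 in float64, so np.log(A) or np.log(1 - A) returns -inf, which is where the "divide by zero encountered in log" warnings come from.

    # Sketch: inspect the scale of the raw features (X as loaded in the question).
    # Large per-feature means and standard deviations are what push W.T @ X
    # into the saturated region of the sigmoid during training.
    print(X.mean(axis=0))   # per-feature mean of the raw data
    print(X.std(axis=0))    # per-feature standard deviation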
Without normalization, I found this combination works:
lr = LogisticRegression(num_epochs=500000, lr=0.001)
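Put together, a minimal sketch of the two options, reusing the normalize helper and the LogisticRegression class from the question:

    # Option 1: standardize the features, then the larger learning rate
    # from the question converges quickly.
    X_std = normalize(X)  # zero mean, unit variance per column
    LogisticRegression(num_epochs=10000, lr=0.01).fit(X_std.T, y.T)

    # Option 2: keep the raw features, but use a much smaller learning rate
    # and far more epochs.
    LogisticRegression(num_epochs=500000, lr=0.001).fit(X.T, y.T)

As a side note, the NaN in the printed loss can also be avoided by clipping the probabilities before the log, e.g. np.clip(A, 1e-15, 1 - 1e-15) inside __loss; this only changes the reported loss, not the gradient.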
Answered by 10xAI on September 18, 2020