Data Science Asked by 324 on July 1, 2021
Suppose I have a dataframe like the following:
CustID ... Class
1 ... C0
2 ... C1
3 ... C1
4 ... C0
5 ... C0
6 ... C1
I am trying to find the information gain of splitting on Customer ID, and I was wondering whether this is the correct process.
I began by finding the information (entropy) of the parent node:
$$I(\mathrm{parent}) = -(3/6) \log_2 (3/6) - (3/6) \log_2 (3/6) = 0.5 + 0.5 = 1.$$
Then I computed the information of the child nodes after splitting on Customer ID:
$$I(\mathrm{children}) = -(1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) = 6 \times \left(-(1/6) \log_2 (1/6)\right) = 2.584963$$
However, $$\Delta = 1 - 2.584963 = -1.584963.$$ Is it possible to have a negative information gain?
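For comparison, here is a minimal Python sketch of the calculation. It assumes the usual textbook definition, in which information gain is the parent entropy minus the size-weighted average of the entropies of the class labels inside each child node. The function names `entropy` and `information_gain` are hypothetical helpers, not part of any library, and only the CustID and Class columns shown in the table are used.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (base 2) of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """Parent entropy minus the size-weighted entropy of the child nodes.

    Assumes one child node per distinct feature value (a multiway split).
    """
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    weighted_children = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted_children


# The two columns shown in the question.
cust_id = [1, 2, 3, 4, 5, 6]
classes = ["C0", "C1", "C1", "C0", "C0", "C1"]

parent = entropy(classes)                   # entropy of the class labels
split_info = entropy(cust_id)               # entropy of the CustID values themselves
gain = information_gain(cust_id, classes)   # gain of splitting on CustID

print(parent, split_info, gain)
```

Under this definition, each CustID child holds a single row, so its class entropy is 0 and the weighted child entropy is 0. The value 2.584963 computed above matches the entropy of the CustID values themselves (often called the split information), not the entropy of the class labels within the children.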