TransWikia.com

Information Gain of a Customer ID Attribute

Data Science Asked by 324 on July 1, 2021

Suppose I have a dataframe like the following:

CustID  ...  Class
1       ...     C0
2       ...     C1
3       ...     C1
4       ...     C0
5       ...     C0
6       ...     C1

I am trying to find the information gain of the CustID attribute, and I was wondering whether the following process is correct.

I began by finding the information (entropy) of the parent node:
$$I(\mathrm{parent}) = -(3/6) \log_2 (3/6) - (3/6) \log_2 (3/6) = 0.5 + 0.5 = 1.$$
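The parent-node value above can be checked numerically. The following is a minimal sketch, assuming base-2 Shannon entropy over the six class labels in the table; the helper name `entropy` and the variable names are illustrative, not taken from any particular library.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of class labels."""
    n = len(labels)
    # Sum -p * log2(p) over the relative frequency p of each distinct label.
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Class column from the table: three C0 and three C1 records.
classes = ["C0", "C1", "C1", "C0", "C0", "C1"]

parent_entropy = entropy(classes)
print(parent_entropy)  # 1.0
```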

Then, the information of the child nodes produced by splitting on Customer ID:
$$I(\mathrm{children}) = -(1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) - (1/6) \log_2 (1/6) = 6 \times (-(1/6) \log_2 (1/6)) = 2.584963$$

However, $$\Delta = 1 - 2.584963 = -1.584963.$$ Is it possible to have a negative information gain?
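For comparison, here is a minimal sketch that assumes the usual decision-tree definition, in which information gain is the parent entropy minus the size-weighted average of the child entropies. Under that assumption each CustID value isolates a single record, so every child node is pure and the gain comes out as 1 rather than a negative number. The 2.584963 figure matches the entropy of the CustID values themselves, the quantity C4.5 calls split information. The function and variable names below are illustrative.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (base 2) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(attribute, labels):
    """Parent entropy minus the size-weighted average entropy of the children."""
    # Group the class labels by the attribute value they fall under.
    groups = defaultdict(list)
    for value, label in zip(attribute, labels):
        groups[value].append(label)
    n = len(labels)
    weighted_children = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - weighted_children

# Columns from the table in the question.
cust_id = [1, 2, 3, 4, 5, 6]
classes = ["C0", "C1", "C1", "C0", "C0", "C1"]

gain = information_gain(cust_id, classes)
split_info = entropy(cust_id)  # entropy of the CustID values, equal to log2(6)

print(gain)        # 1.0
print(split_info)  # approximately 2.584963
```

With this weighting the child term can never exceed the parent entropy, so the gain stays non-negative. Dividing `gain` by `split_info` gives the gain ratio, which is one common way to penalize an attribute such as an ID that has a distinct value per record.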
