TransWikia.com

Update a pandas data frame column using Apply,Lambda and Group by Functions

Data Science Asked by Srujan K.N. on November 30, 2020

I have a data frame in the format mentioned in the screenshot below. Column ‘Candidate Won‘ has only ‘loss‘ as the column value for all the rows. I want to update the Column ‘Candidate Won’ to a value ‘won’ if the corresponding row’s ‘% of Votes‘ is maximum when grouped by ‘Constituency‘ Column otherwise the value should be ‘loss’. I want to achieve the result by using a combination of apply, lambda, and group by, instead of using loops/iterations.

DataFrame : (df_andhrapradesh)

enter image description here

Code below works for a specific constituency in the data frame :

df_amalapuram=df_andhrapradesh[df_andhrapradesh['Constituency']=='Amalapuram']
df_amalapuram['Candidate Won']=df_amalapuram['% of Votes'].apply(lambda x:"Won" if x==df_amalapuram['% of Votes'].max() else "Loss")

Tried something like below to make it work for the entire data frame which has different constituencies but it failed:

df_andhrapradesh['Candidate Won']=df_andhrapradesh['% of Votes'].apply(lambda x:"Won" if x==df_andhrapradesh.groupby('Constituency')['% of Votes'].max() else "Loss")

One Answer

I used 'Apply' function to every row in the pandas data frame and created a custom function to return the value for the 'Candidate Won' Column using data frame,row-level 'Constituency','% of Votes'

Custom Function Code:

def update_candidateresult(df,a,b):
max_voteshare=df.groupby(df['Constituency']==a)['% of Votes'].max()[True]
if b==max_voteshare:
    return "won"
else:
    return "loss"

Final Code :

 df_andhrapradesh['Candidate Won']=df_andhrapradesh.apply(lambda row:update_candidateresult(df_andhrapradesh,row['Constituency'],row['% of Votes']),axis=1)

Answered by Srujan K.N. on November 30, 2020

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP