Stack Overflow Asked by toby chamberlain on January 10, 2021
My initial dataframe is:
Name Info1 Info2
0 Name1 Name1-Info1 Name1-Info2
1 Name1 Name1-Info1 Name1-Info2
2 Name1 Name1-Info1 Name1-Info2
3 Name2 Name2-Info1 Name2-Info2
4 Name2 Name2-Info1 Name2-Info2
and I would like to return the number of repetitions of each row, like so:
Name Info1 Info2 Count
0 Name1 Name1-Info1 Name1-Info2 3
1 Name2 Name2-Info1 Name2-Info2 2
How can I count duplicate rows in a pandas dataframe?
df.groupby(['Name', 'Info1', 'Info2']).size().reset_index(name='Count')
Correct answer by Tom Ron on January 10, 2021
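For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the same groupby. The variable name `counts` and the column name `Count` are only chosen to match the desired output in the question.

```python
import pandas as pd

# Rebuild the example dataframe from the question
df = pd.DataFrame({
    "Name":  ["Name1"] * 3 + ["Name2"] * 2,
    "Info1": ["Name1-Info1"] * 3 + ["Name2-Info1"] * 2,
    "Info2": ["Name1-Info2"] * 3 + ["Name2-Info2"] * 2,
})

# Group on every column so that only fully identical rows are counted together
counts = df.groupby(list(df.columns)).size().reset_index(name="Count")
print(counts)
```

Note that groupby skips rows with NaN in any key column by default; passing dropna=False (pandas 1.1+) should keep them if that matters for your data.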
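# Assumes Info1 and Info2 are identical for all rows sharing a Name
size = df.groupby('Name', sort=False).size().tolist()
# head(1) keeps groups in order of first appearance, matching sort=False above
df = df.groupby('Name', sort=False).head(1).reset_index(drop=True)
df['Count'] = size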
Answered by Sam S on January 10, 2021
Given your example df:
Name Info1 Info2
0 Name1 Name1-Info1 Name1-Info2
1 Name1 Name1-Info1 Name1-Info2
2 Name1 Name1-Info1 Name1-Info2
3 Name2 Name1-Info2 Name1-Info2
4 Name2 Name1-Info2 Name1-Info2
The following:
df.pivot_table(index=list(df), aggfunc='size')
Will return what you're after:
Name Info1 Info2
Name1 Name1-Info1 Name1-Info2 3
Name2 Name1-Info2 Name1-Info2 2
Answered by JPI93 on January 10, 2021
Add a 'count' column filled with 1, then group by the other columns and sum it:
df['count'] = 1
df.groupby(['Name', 'Info1', 'Info2'])['count'].sum().reset_index()
Answered by EddyG on January 10, 2021
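As a quick sanity check, the helper-column approach can be compared against a plain size(). This sketch assumes the sample data from the question and uses assign so that the original df is left unmodified; that choice and the names `by_sum` and `by_size` are mine, not part of the answer above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Name":  ["Name1"] * 3 + ["Name2"] * 2,
    "Info1": ["Name1-Info1"] * 3 + ["Name2-Info1"] * 2,
    "Info2": ["Name1-Info2"] * 3 + ["Name2-Info2"] * 2,
})
keys = ["Name", "Info1", "Info2"]

# Sum a temporary column of ones per group (assign avoids mutating df)
by_sum = df.assign(count=1).groupby(keys)["count"].sum().reset_index()

# Count rows per group directly, for comparison
by_size = df.groupby(keys).size().reset_index(name="count")

# Both approaches should report the same counts per group
print(by_sum["count"].tolist() == by_size["count"].tolist())
```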