Stack Overflow Asked by Nathaniel on December 14, 2020
Hi, I'm trying to have my script count how many times the same words appear in specified columns. Some of those columns hold multiple values separated by commas.
For example:
Labels    Labs
a1, b3    1
a2        3
b3        1
I would want two outputs.
Labels    # of labels
a1        1
b3        2
Labels    Lab1    Lab3
a1        1       0
a2        0       1
b3        2       0
I was trying to use groupby to count, but the only output I get in Excel is below, and I can't tell which labels the numbers belong to:
20
2
1
7
7
I have been playing with the line below but keep getting the same result shown above:
df1 = df.groupby('Labs').count()
Keys
Setup
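import io
import pandas as pd

# Columns are separated by two or more spaces so that "a1, b3" stays in one field
df = pd.read_csv(io.StringIO("""
Labels  Labs
a1, b3  1
a2      3
b3      1
"""), sep=r"\s{2,}", engine="python")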
# split string into list (assume consistent separator pattern)
df["Labels"] = df["Labels"].str.split(", ")
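If the separator is not perfectly consistent (for example "a1,b3" in some rows and "a1, b3" in others), one option is to split on the bare comma and strip whitespace afterwards. This is only a sketch: the DataFrame below is hypothetical sample data mirroring the question, and `long_df` is a name introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the question, with one inconsistent separator
df = pd.DataFrame({
    "Labels": ["a1,b3", "a2", " b3"],
    "Labs": [1, 3, 1],
})

# Split on the comma only, then give every list element its own row.
# explode repeats the Labs value (and the index) for each label in a row.
long_df = df.assign(Labels=df["Labels"].str.split(",")).explode("Labels")

# Remove stray whitespace left around each label
long_df["Labels"] = long_df["Labels"].str.strip()

print(long_df)
```

Each row of `long_df` then holds exactly one label, which is the shape both outputs below rely on.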
First output:
df.explode("Labels").groupby("Labels").size()
Out[69]:
Labels
a1 1
a2 1
b3 2
dtype: int64
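In the result above the labels live in the index, which may be why the exported numbers appeared without anything to identify them. A possible way to keep them visible is to turn the index into a regular column before exporting. This is a sketch under that assumption; the sample data is rebuilt here so the snippet runs on its own, and `counts` is a name chosen for illustration.

```python
import pandas as pd

# Hypothetical sample data mirroring the question
df = pd.DataFrame({"Labels": ["a1, b3", "a2", "b3"], "Labs": [1, 3, 1]})
long_df = df.assign(Labels=df["Labels"].str.split(", ")).explode("Labels")

# Count rows per label, then move the label from the index into a named column
counts = (
    long_df.groupby("Labels")
    .size()
    .reset_index(name="# of labels")
)

print(counts)
```

From there, something like `counts.to_excel("counts.xlsx", index=False)` should write both columns (the file name is made up, and an Excel writer such as openpyxl needs to be installed).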
Second output:
(df.explode("Labels")
   .pivot_table(index="Labels", columns="Labs", aggfunc="size")
   .fillna(0).astype(int))
Out[70]:
Labs 1 3
Labels
a1 1 0
a2 0 1
b3 2 0
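As an alternative to pivot_table, `pd.crosstab` should give the same counts and fills missing combinations with 0 on its own. The sketch below also renames the columns to Lab1 and Lab3 to match the layout asked for in the question. The sample data is rebuilt so the snippet is self-contained, and `table` is an illustrative name.

```python
import pandas as pd

# Hypothetical sample data mirroring the question
df = pd.DataFrame({"Labels": ["a1, b3", "a2", "b3"], "Labs": [1, 3, 1]})

# One label per row; reset the index so it no longer contains duplicates
long_df = (
    df.assign(Labels=df["Labels"].str.split(", "))
    .explode("Labels")
    .reset_index(drop=True)
)

# Cross-tabulate label against lab number, then prefix the column labels with "Lab"
table = pd.crosstab(long_df["Labels"], long_df["Labs"]).add_prefix("Lab")

print(table)
```

Whether crosstab or pivot_table reads better is mostly a matter of taste; crosstab just saves the fillna/astype step.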
Correct answer by Bill Huang on December 14, 2020