Stack Overflow Asked on January 1, 2022
I have a dataframe like this:
Time Name Value
2007Q1 A 30
2007Q2 A 35
2007Q3 A 28
...
2007Q1 B 31
2007Q2 B 50
2007Q3 B 60
...
2007Q1 C 20
2007Q2 C 15
2007Q3 C 30
I want to add another column called Results, calculated between consecutive rows for each Name. For each quarter, I want the value divided by the previous quarter's value, minus 1, e.g. Value(Q2)/Value(Q1) - 1. The calculation should be grouped by Name, so that it only uses rows with the same Name. The result should look like:
Time Name Value Results
2007Q1 A 30
2007Q2 A 35 0.1667
2007Q3 A 28 -0.2
...
2007Q1 B 31
2007Q2 B 50 0.6129
2007Q3 B 60 0.2
...
2007Q1 C 20
2007Q2 C 15 -0.25
2007Q3 C 30 1
The starting time period for each ‘Name’ should have no value for Results.
Thanks to everyone who can help!
Use DataFrame.groupby on Name and groupby.shift to shift the column Value by one row within each group, then use Series.div to divide Value by the shifted values, and finally use Series.sub to subtract 1:
df['Results'] = df['Value'].div(df.groupby('Name')['Value'].shift()).sub(1)
Result:
print(df)
Time Name Value Results
0 2007Q1 A 30 NaN
1 2007Q2 A 35 0.166667
2 2007Q3 A 28 -0.200000
3 2007Q1 B 31 NaN
4 2007Q2 B 50 0.612903
5 2007Q3 B 60 0.200000
6 2007Q1 C 20 NaN
7 2007Q2 C 15 -0.250000
8 2007Q3 C 30 1.000000
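For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds only the rows shown in the question (the elided "..." rows are left out), and the names prev and alt are just illustrative. The pct_change comparison at the end is an addition, not part of the answer above; it should give the same numbers for this data.

```python
import pandas as pd

# Sample data rebuilt from the rows shown in the question.
df = pd.DataFrame({
    "Time": ["2007Q1", "2007Q2", "2007Q3"] * 3,
    "Name": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Value": [30, 35, 28, 31, 50, 60, 20, 15, 30],
})

# Previous row's Value within each Name; the first row of each group becomes NaN.
prev = df.groupby("Name")["Value"].shift()

# Value / previous Value - 1, same expression as in the answer.
df["Results"] = df["Value"].div(prev).sub(1)

# Possible alternative (an assumption that it suits your data):
# groupby + pct_change computes the same per-group ratio in one call.
alt = df.groupby("Name")["Value"].pct_change()

print(df)
```

Note that shift works on row order, so this assumes the rows are already in chronological order within each Name. If they are not, sorting first with df.sort_values(['Name', 'Time']) would be needed.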
Answered by Shubham Sharma on January 1, 2022