Stack Overflow Asked by Tyler Klein on November 24, 2021
All,
I have a dataframe with repeated index values. For every row sharing an index value, I'm trying to update the row using a value looked up by that index. Here is an example of what I have:
name x
t
0 A 5
0 B 2
1 A 7
2 A 5
2 B 9
2 C 3
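For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is only a sketch: the constructor call is an assumption about how the data is laid out (an index named "t" with repeated values, plus the columns "name" and "x"), not code from the original post.

```python
import pandas as pd

# Assumed reconstruction of the example frame: "t" is the (repeated) index,
# "name" and "x" are ordinary columns.
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)
print(df)
```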
"A" is present at every timestamp. I want to replace "x" with the current value of "x" minus the value of "x" for "A" at that timestamp. The tricky part is coming up with an array or dataframe that is, in this case
array([5, 5, 7, 5, 5, 5])
which is the value for "A", repeated for each row at that timestamp. I can then subtract this from df['x']. My working solution is below.
temp = df[df['name'] == 'A']
d = dict(zip(temp.index, temp['x']))
df['x'] = df['x'] - df.index.to_frame()['t'].replace(d)
name x
t
0 A 0
0 B -3
1 A 0
2 A 0
2 B 4
2 C -2
This works, but feels a bit hacky, and I can't help but think there is a better (and much faster) solution…
Use groupby with .cumsum() on the mask where name == 'A', then subtract the first value in each group from the rest. This relies on the "A" row coming first within each timestamp.
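# group_keys=False keeps the original index on the result in newer pandas versions
df['x']=df.groupby((df.name=='A').cumsum(), group_keys=False)['x'].apply(lambda s:s.sub(s.iloc[0]))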
name x
t
0 A 0
0 B -3
1 A 0
2 A 0
2 B 4
2 C -2
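A variant of the same cumsum idea can be sketched with transform("first") instead of apply. This is not the code from the answer above, just an illustration under the same assumption that the "A" row is the first row of every timestamp. The names group_id and baseline are made up for the example, and the frame is rebuilt from the question's data.

```python
import pandas as pd

# Example frame rebuilt from the question (construction is an assumption).
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)

# Every "A" row starts a new group, so the running count of "A" rows
# labels the groups. Only valid if "A" comes first within each timestamp.
group_id = (df["name"] == "A").cumsum().to_numpy()

# transform("first") broadcasts each group's first x onto all rows of the group.
baseline = df.groupby(group_id)["x"].transform("first")

# Subtract positionally to sidestep label alignment on the repeated index.
df["x"] = df["x"] - baseline.to_numpy()
print(df)
```

If the rows were not ordered with "A" first, the group labels would be wrong, so sorting or a lookup by timestamp would be needed instead.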
Answered by wwnde on November 24, 2021
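I would use reindex. The "A" rows have exactly one entry per timestamp, so reindexing them against df.index repeats each value for every row with that timestamp: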
df.x-=df.loc[df.name=='A','x'].reindex(df.index).values
df
Out[362]:
name x
t
0 A 0
0 B -3
1 A 0
2 A 0
2 B 4
2 C -2
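To make the intermediate step visible, here is a small sketch of the reindex approach split into pieces. The names a_values and baseline are made up for illustration, and the frame is rebuilt from the question's data. The uniqueness check is there because reindex raises on a source index with duplicate labels, so this approach assumes exactly one "A" row per timestamp.

```python
import pandas as pd

# Example frame rebuilt from the question (construction is an assumption).
df = pd.DataFrame(
    {"name": ["A", "B", "A", "A", "B", "C"], "x": [5, 2, 7, 5, 9, 3]},
    index=pd.Index([0, 0, 1, 2, 2, 2], name="t"),
)

# One "A" value per timestamp; its index is 0, 1, 2.
a_values = df.loc[df["name"] == "A", "x"]

# reindex needs unique labels on the source side.
assert a_values.index.is_unique, "expected exactly one 'A' row per timestamp"

# Reindexing against the repeated index repeats each "A" value per row.
baseline = a_values.reindex(df.index).to_numpy()
print(baseline)  # [5 5 7 5 5 5]

result = df["x"] - baseline
print(result.tolist())  # [0, -3, 0, 0, 4, -2]
```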
Answered by BENY on November 24, 2021