Stack Overflow Asked by Alon Tru on December 11, 2021
I’m trying to sum the rows of my DataFrame element-wise.
Say I have the df below (each cell in a row contains a vector/list of the same size).
In the real problem the number of columns is large and can vary, but I do have a list containing the names of those columns.
import pandas as pd

df = pd.DataFrame([
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
], columns=['a', 'b', 'c'])
I’m trying to create a new column containing the element-wise sum of all the vectors in each row, as adding np.array objects would do, so that I get the following vectors as a result:
[3,6,9]
[3,3,3]
[6,6,6]
and not the concatenated lists that .sum(axis=1) gives:
[1,2,3,1,2,3,1,2,3]
[1,1,1,1,1,1,1,1,1]
[2,2,2,2,2,2,2,2,2]
Can anyone think of a way to do this? Thanks in advance 🙂
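The difference between the two results above can be reproduced without pandas. The sketch below assumes the concatenation comes from Python's `+` on plain lists, which joins them, whereas `+` on NumPy arrays adds position by position; the names `a`, `b`, `c` simply mirror the first row of the example.

```python
import numpy as np

# First row of the example DataFrame, as plain Python lists.
a, b, c = [1, 2, 3], [1, 2, 3], [1, 2, 3]

# "+" on lists concatenates them.
concatenated = a + b + c

# "+" on NumPy arrays adds element by element.
elementwise = (np.array(a) + np.array(b) + np.array(c)).tolist()

print(concatenated)
print(elementwise)
```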
Another way, using pd.Series.explode:
df['sum'] = df.apply(pd.Series.explode).sum(axis=1).groupby(level=0).agg(list)
Output:
a b c sum
0 [1, 2, 3] [1, 2, 3] [1, 2, 3] [3.0, 6.0, 9.0]
1 [1, 1, 1] [1, 1, 1] [1, 1, 1] [3.0, 3.0, 3.0]
2 [2, 2, 2] [2, 2, 2] [2, 2, 2] [6.0, 6.0, 6.0]
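To see what the one-liner is doing, it may help to split it into its three steps. This is only an illustrative breakdown, not part of the original answer: the intermediate names `exploded`, `elementwise` and `row_sums` are made up here, and the `astype(int)` cast is an added precaution so that the sum runs on a numeric dtype (it also yields integers rather than the floats shown above).

```python
import pandas as pd

df = pd.DataFrame([
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
], columns=['a', 'b', 'c'])

# Step 1: explode every column so each list element gets its own row.
# The original row label is repeated in the index: 0, 0, 0, 1, 1, 1, ...
exploded = df.apply(pd.Series.explode)

# Step 2: add across the columns, one element position per row.
# The cast to int is an assumption added for this sketch.
elementwise = exploded.astype(int).sum(axis=1)

# Step 3: regroup by the original row label and collect the sums into lists.
row_sums = elementwise.groupby(level=0).agg(list)

print(row_sums)
```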
Answered by Scott Boston on December 11, 2021
If all the lists have the same length, convert them to a single NumPy array and sum along the column axis for better performance:
import numpy as np

df['Sum'] = np.array(df.to_numpy().tolist()).sum(axis=1).tolist()
print(df)
a b c Sum
0 [1, 2, 3] [1, 2, 3] [1, 2, 3] [3, 6, 9]
1 [1, 1, 1] [1, 1, 1] [1, 1, 1] [3, 3, 3]
2 [2, 2, 2] [2, 2, 2] [2, 2, 2] [6, 6, 6]
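Since the question mentions having a list of the relevant column names, the same idea can be limited to those columns. The following is a minimal sketch under that assumption: `cols` is a hypothetical name for that list, and `stacked` is an intermediate introduced only to show the array's shape.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    [[1, 2, 3], [1, 2, 3], [1, 2, 3]],
    [[1, 1, 1], [1, 1, 1], [1, 1, 1]],
    [[2, 2, 2], [2, 2, 2], [2, 2, 2]],
], columns=['a', 'b', 'c'])

# Hypothetical list holding the names of the columns to sum.
cols = ['a', 'b', 'c']

# Build a 3-D array of shape (n_rows, n_cols, vector_length).
# This only works if every list has the same length.
stacked = np.array(df[cols].to_numpy().tolist())

# Summing over axis 1 collapses the columns and leaves one vector per row.
df['Sum'] = stacked.sum(axis=1).tolist()

print(df)
```

Selecting `df[cols]` first also means that rerunning the code does not try to include the newly created Sum column in the array.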
Answered by jezrael on December 11, 2021