Stack Overflow Asked by Gonzalo Polo on October 10, 2020

I would like to know if there is an efficient way (avoiding for loops) of doing a `serie.cumsum()` but with a **shift of n**.

In the same way that `serie.cumsum()` can be seen as the inverse of `serie.diff(1)`, I am looking for an inverse of `diff(n)` that could be called `cumsum_shift`. (I know that a proper inverse needs the initial values, but for simplicity I ignore them here.)

More explicitly, implementing it with a **for loop** (which I would like to avoid):

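```
import numpy as np
import pandas as pd

def cumsum_shift(s, shift=1, init_values=(0,)):
    s_cumsum = pd.Series(np.zeros(len(s)))
    # Seed the first `shift` positions with the given initial values.
    for i in range(shift):
        s_cumsum.iloc[i] = init_values[i]
    # Each later value adds s[i] to the result `shift` positions back.
    for i in range(shift, len(s)):
        s_cumsum.iloc[i] = s_cumsum.iloc[i - shift] + s.iloc[i]
    return s_cumsum
```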

With `shift = 1`, this code does the same as the pandas `s.cumsum()` method (up to the initial value), but of course the **pandas method does it in C code** (I guess), so it is much faster. You should always use the pandas `s.cumsum()` method rather than implementing it yourself with a for loop.

My question then is

**What would be the way of doing cumsum_shift avoiding a for loop with pandas methods?**

Adding an example of input and output

If you call it with:

```
s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff(3)
out[26]
0 NaN
1 NaN
2 NaN
3 1.0
4 10.0
5 100.0
6 3.0
7 30.0
8 300.0
dtype: float64
```

With this input, the output of `cumsum_shift(s.diff(3), shift = 3, init_values = [1,10,100])` is again the original series `s`. Notice the shift of 3: a plain `cumsum()`, e.g. `s.diff(3).cumsum()`, would not recover the original `s`:

```
cumsum_shift(s.diff(3), shift = 3, init_values= [1,10,100])
out[27]
0 1.0
1 10.0
2 100.0
3 2.0
4 20.0
5 200.0
6 5.0
7 50.0
8 500.0
dtype: float64
```
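For contrast, here is a small sketch of what the plain `cumsum()` gives on the same differenced series. The variable name `mixed` is only illustrative.

```python
import pandas as pd

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])

# Plain cumsum adds up every previous difference, regardless of which
# lag-3 "lane" (index modulo 3) the difference belongs to.
mixed = s.diff(3).cumsum()

print(mixed.tolist())
```

The first three entries stay NaN and the rest accumulate across all three lanes (1, 11, 111, 114, 144, 444), so the result is not `s`.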

But let me emphasize that the initial values are not a big deal; a constant difference is not a problem. I would like to know **how to perform a cumsum of a shifted, differenced series without having to use a for loop**.

In the same way that if you do a `diff()` and then a `cumsum()`, you get back the original series up to the initial value:

```
s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff().cumsum()
out[28]
0 NaN
1 9.0
2 99.0
3 1.0
4 19.0
5 199.0
6 4.0
7 49.0
8 499.0
dtype: float64
```
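The constant offset in that output is just the first element of `s`. A minimal sketch of putting it back, assuming the first value is known (the `fillna` step is an illustrative addition, not something the question requires):

```python
import pandas as pd

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])

# diff() leaves NaN in position 0; seeding that slot with the first value
# of s makes the running sum start from the right level.
recovered = s.diff().fillna(s.iloc[0]).cumsum()

print(recovered.tolist())
```

With the seed in place, `recovered` matches `s` (as floats).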

I would like to know if there is some clever way of doing something like `s.diff(n).cumsum(n)` that returns something correct up to some constant initial values.

**EDIT 2 – Reverse a Moving Average**

Thinking of an application of the "shifted cumsum", I found this other question on SO about how to **reverse a moving average**, which I have answered using my `cumsum_shift` function. I think it clarifies what I am asking here.

You can use the pandas `rolling()` method along with `sum()`:

```
s.rolling(shift).sum()
```

However, you may want to fill the leading NaN values (the positions before the window of size `shift` is full) with values from the original series.

Answered by Elif on October 10, 2020
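As a point of comparison with the loop in the question, here is one possible loop-free sketch. It relies on the observation that position `i` only ever depends on positions `i - shift`, `i - 2*shift`, and so on, so an ordinary cumulative sum can be run within each group of positions sharing the same `i % shift`. The function name `cumsum_shift_vectorized` and the handling of `init_values` are assumptions made for illustration; this is not taken from the answer above.

```python
import numpy as np
import pandas as pd

def cumsum_shift_vectorized(s, shift=1, init_values=None):
    """Sketch: cumulative sum within each residue class i % shift."""
    values = s.to_numpy(dtype=float).copy()
    if init_values is None:
        init_values = np.zeros(shift)
    # Overwrite the first `shift` entries (NaN after diff(shift)) with the seeds.
    n = min(shift, len(values))
    values[:n] = np.asarray(init_values, dtype=float)[:n]
    # Positions i, i - shift, i - 2*shift, ... share the same group label.
    groups = np.arange(len(values)) % shift
    return pd.Series(values, index=s.index).groupby(groups).cumsum()

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])
restored = cumsum_shift_vectorized(s.diff(3), shift=3, init_values=[1, 10, 100])
print(restored.tolist())
```

On the example from the question this gives back the original values as floats. With `shift=1` and the first element as the seed, it reduces to the ordinary `s.cumsum()`.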
