Stack Overflow Asked on December 3, 2021
The data frame contains two variables (time
and rate
) and 10 observations
time <- seq(1:10)
rate <- 1-(0.99^time)
dat <- data.frame(time, rate)
I need to add a new column (called new_rate
).
new_rate
is defined as follows
Note: new_rate_1
is the first observation of new the column new_rate, etc.
new_rate_1 = rate_1
new_rate_2 = (1-rate_1)*rate_2
new_rate_3 = (1-rate_1)*(1-rate_2)*rate_3
new_rate_4 = (1-rate_1)*(1-rate_2)*(1-rate_3)*rate_4
...
new_rate_10 = (1-rate_1)*(1-rate_2)*(1-rate_3)*(1-rate_4)*(1-rate_5)*(1-rate_6)*(1-rate_7)*(1-rate_8)*(1-rate_9)*rate_10
How this can be done in base R or dplyr
?
A straightforward math approach using cumprod
should work
> c(1, head(cumprod(1 - rate), -1)) * rate
[1] 0.01000000 0.01970100 0.02881885 0.03709807 0.04432372 0.05033049
[7] 0.05500858 0.05830607 0.06022773 0.06083074
If you want to practice with recursions, you can try the method below
f <- function(v, k = length(v)) {
if (k == 1) {
return(v[k])
}
u <- f(v, k - 1)
c(u, tail(u, 1) * (1 / v[k - 1] - 1) * v[k])
}
such that
> f(rate)
[1] 0.01000000 0.01970100 0.02881885 0.03709807 0.04432372 0.05033049
[7] 0.05500858 0.05830607 0.06022773 0.06083074
Answered by ThomasIsCoding on December 3, 2021
In case you are still interested in how to do it with purrr::reduce
family of functions. Here are two solutions:
library(purrr)
accumulate2(dat$rate[-nrow(dat)], dat$rate[-1], .init = dat$rate[1],
~ ..1 * (1/..2 - 1) * ..3) %>%
simplify()
[1] 0.01000000 0.01970100 0.02881885 0.03709807 0.04432372 0.05033049 0.05500858 0.05830607
[9] 0.06022773 0.06083074
And also in base R we could do the following:
Reduce(function(x, y) {
x * (1/dat$rate[y - 1] - 1) * dat$rate[y]
}, init = dat$rate[1],
seq_len(nrow(dat))[-1], accumulate = TRUE)
[1] 0.01000000 0.01970100 0.02881885 0.03709807 0.04432372 0.05033049 0.05500858 0.05830607
[9] 0.06022773 0.06083074
Answered by Anoushiravan R on December 3, 2021
cumprod
to the rescue (hat-tip to @Cole for simplifying the code):
dat$rate * c(1, cumprod(1 - head(dat$rate, -1)))
The logic is that you are essentially doing a cum
ulative prod
uct of 1 - dat$rate
, multiplied by the current step.
At the first step, you can just keep the existing value, but then you need to offset the two vectors so that the multiplication gives the desired result.
Proof:
out <- c(
dat$rate[1],
(1-dat$rate[1])*dat$rate[2],
(1-dat$rate[1])*(1-dat$rate[2])*dat$rate[3],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*dat$rate[4],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*dat$rate[5],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*dat$rate[6],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*dat$rate[7],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*dat$rate[8],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*(1-dat$rate[8])*dat$rate[9],
(1-dat$rate[1])*(1-dat$rate[2])*(1-dat$rate[3])*(1-dat$rate[4])*(1-dat$rate[5])*(1-dat$rate[6])*(1-dat$rate[7])*(1-dat$rate[8])*(1-dat$rate[9])*dat$rate[10]
)
all.equal(
dat$rate * c(1, cumprod(1 - head(dat$rate, -1))),
out
)
#[1] TRUE
Answered by thelatemail on December 3, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP