Justas Mundeikis
Justas Mundeikis

Reputation: 995

R dataframe cumsum over rows

I have a data frame like this

df <- data.frame(var=c("A", "B", "C"), 
             "2020-01-01"=c(1,2,3),
             "2020-01-02"=c(1,2,3),
             "2020-01-03"=c(1,2,3))

and I would like to calculate the cumulative sum of each variable (on each row), but if I run apply(df, 1, cumsum) it coerces values to NAs and if I run apply(df[,2:4], 1, cumsum) it gives my back only the matrix output of selected rows and columns (and transposes the output).

Could anybody suggest simple way of transformation such data frames (that have in some columns character values) with cumsum? Im working with tidyverse, so any solutions that work with pyping stream, are welcome even more!

So my intended output should look like this:

var X2020.01.01 X2020.01.02 X2020.01.03 1 A 1 2 2 2 B 2 4 6 3 C 3 6 9

Thanks a lot

Upvotes: 0

Views: 376

Answers (3)

akrun
akrun

Reputation: 887851

We can use rowCumsums from matrixStats

library(matrixStats)
df[-1] <- rowCumsums(as.matrix(df[-1]))

Upvotes: 0

rg255
rg255

Reputation: 4169

Using datatable to melt then dcast the data (effectively transposing it) then cumsum on each column, then transpose it back:

library(data.table)
dt1 <- dcast(melt(
     dcast(melt(dt1, id.vars = "var"), ... ~ var)[, 
       LETTERS[1:3] := cumsum(.SD), .SDcols = LETTERS[1:3]], 
     id.vars = "variable"), ... ~ variable)

Gives:

> dt1
   variable.1 2020-01-01 2020-01-02 2020-01-03
1:          A          1          2          3
2:          B          2          4          6
3:          C          3          6          9

Upvotes: 0

jay.sf
jay.sf

Reputation: 73622

As you have already noticed yourself, apply yields a transposed output, so just transpose back again. You could add the result with data.frame to a joined data frame; the column names of the cumsums will thereby be extended with .1.

df <- data.frame(df, t(apply(df[-1], 1, cumsum)))
df
# var X2020.01.01 X2020.01.02 X2020.01.03 X2020.01.01.1 X2020.01.02.1 X2020.01.03.1
# 1   A           1           1           1             1             2             3
# 2   B           2           2           2             2             4             6
# 3   C           3           3           3             3             6             9

To get the result of your edit to your question, just do

df <- data.frame(t(apply(df[-1], 1, cumsum)))
df
#   X2020.01.01 X2020.01.02 X2020.01.03
# 1           1           2           3
# 2           2           4           6
# 3           3           6           9

Upvotes: 2

Related Questions