How to sum up identical cells in a data frame?

Question

I am working on defining a temporal network with Pajek software.
Below are the data and code that I am using:

library(data.table)
Aggregated <- fread("
    act1_1 act1_2 act1_3 act1_4 act1_5
    2        1      3      2    6
    1        2      2      1  1
    1        4      2      2  3
    ")


cols <- names(Aggregated)
n <- length(cols)

# All windows of length 2 to 5 for each row, keeping only those that fit within the columns
vi <- CJ(rn = 1:nrow(Aggregated), len = 2:5, start = 1:n)[
  , end := start + len - 1L][
    end <= n]

# Long format: one row per (row number, column position)
dl <- melt(setDT(Aggregated)[, rn := .I], id.vars = "rn", variable.name = "pos",
           variable.factor = TRUE)[
             , pos := as.integer(pos)][]

# Non-equi join collects the values inside each window; then count identical (values, position) pairs
result <- dl[vi, on = .(rn, pos >= start, pos <= end),
             .(rn, values = toString(value), position = toString(cols[x.pos])),
             by = .EACHI, nomatch = 0L][
               , .(freq = .N), by = .(values, position)]

result[order(nchar(values), values)]

Below is the outcome:

           values                               position freq
 1:          1, 1                         act1_4, act1_5    1
 2:          1, 2                         act1_1, act1_2    1
 3:          1, 3                         act1_2, act1_3    1
 4:          1, 4                         act1_1, act1_2    1
 5:          2, 1                         act1_1, act1_2    1
 6:          2, 1                         act1_3, act1_4    1
 7:          2, 2                         act1_2, act1_3    1
 8:          2, 2                         act1_3, act1_4    1
 9:          2, 3                         act1_4, act1_5    1
10:          2, 6                         act1_4, act1_5    1
11:          3, 2                         act1_3, act1_4    1
12:          4, 2                         act1_2, act1_3    1
13:       1, 2, 2                 act1_1, act1_2, act1_3    1
14:       1, 3, 2                 act1_2, act1_3, act1_4    1
15:       1, 4, 2                 act1_1, act1_2, act1_3    1
16:       2, 1, 1                 act1_3, act1_4, act1_5    1
17:       2, 1, 3                 act1_1, act1_2, act1_3    1
18:       2, 2, 1                 act1_2, act1_3, act1_4    1
19:       2, 2, 3                 act1_3, act1_4, act1_5    1
20:       3, 2, 6                 act1_3, act1_4, act1_5    1
21:       4, 2, 2                 act1_2, act1_3, act1_4    1
22:    1, 2, 2, 1         act1_1, act1_2, act1_3, act1_4    1
23:    1, 3, 2, 6         act1_2, act1_3, act1_4, act1_5    1
24:    1, 4, 2, 2         act1_1, act1_2, act1_3, act1_4    1
25:    2, 1, 3, 2         act1_1, act1_2, act1_3, act1_4    1
26:    2, 2, 1, 1         act1_2, act1_3, act1_4, act1_5    1
27:    4, 2, 2, 3         act1_2, act1_3, act1_4, act1_5    1
28: 1, 2, 2, 1, 1 act1_1, act1_2, act1_3, act1_4, act1_5    1
29: 1, 4, 2, 2, 3 act1_1, act1_2, act1_3, act1_4, act1_5    1
30: 2, 1, 3, 2, 6 act1_1, act1_2, act1_3, act1_4, act1_5    1

My question is how to create another column that sums the frequencies of rows with the same values, such as:

           values                               position freq Sum of freq
 5:          2, 1                         act1_1, act1_2    1           2
 6:          2, 1                         act1_3, act1_4    1
 7:          2, 2                         act1_2, act1_3    1           2
 8:          2, 2                         act1_3, act1_4    1

s__ · Accepted Answer

Maybe this could be helpful:

library(data.table)
# this is the last line of your code, with the result assigned to df
df <- result[order(nchar(values), values)]
df[,summed:=sum(freq), by=values]

 df
           values                               position freq summed
 1:          1, 1                         act1_4, act1_5    1      1
 2:          1, 2                         act1_1, act1_2    1      1
 3:          1, 3                         act1_2, act1_3    1      1
 4:          1, 4                         act1_1, act1_2    1      1
 5:          2, 1                         act1_1, act1_2    1      2
 6:          2, 1                         act1_3, act1_4    1      2
 7:          2, 2                         act1_2, act1_3    1      2
 8:          2, 2                         act1_3, act1_4    1      2
 9:          2, 3                         act1_4, act1_5    1      1
10:          2, 6                         act1_4, act1_5    1      1
11:          3, 2                         act1_3, act1_4    1      1
...
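
To try the grouped sum without rerunning the whole pipeline, here is a minimal sketch on a small hand-made table. The `toy` table and its contents are made up for illustration and only mimic a few rows of `result`; the `:=` with `by` idiom is the one used above.

```r
library(data.table)

# Hypothetical stand-in for a few rows of `result`
toy <- data.table(
  values   = c("1, 1", "2, 1", "2, 1", "2, 2", "2, 2"),
  position = c("act1_4, act1_5", "act1_1, act1_2", "act1_3, act1_4",
               "act1_2, act1_3", "act1_3, act1_4"),
  freq     = c(1L, 1L, 1L, 1L, 1L)
)

# Add the per-`values` total of `freq` to every row, by reference
toy[, summed := sum(freq), by = values]
print(toy)
```

Because `:=` modifies the table in place, no reassignment is needed; every row of a group receives the same total.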

EDIT: To show the sum only on the first row of each group of identical values, you can try this:

df$sm <- ifelse(duplicated(df$values), NA, df$summed)
df
           values                               position freq summed sm
 1:          1, 1                         act1_4, act1_5    1      1  1
 2:          1, 2                         act1_1, act1_2    1      1  1
 3:          1, 3                         act1_2, act1_3    1      1  1
 4:          1, 4                         act1_1, act1_2    1      1  1
 5:          2, 1                         act1_1, act1_2    1      2  2
 6:          2, 1                         act1_3, act1_4    1      2 NA
 7:          2, 2                         act1_2, act1_3    1      2  2
 8:          2, 2                         act1_3, act1_4    1      2 NA
 9:          2, 3                         act1_4, act1_5    1      1  1
10:          2, 6                         act1_4, act1_5    1      1  1
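
If one row per distinct `values` string is enough, a possible alternative (not part of the answer above, just a sketch) is to aggregate instead of adding a column. The `toy` table and the output column names `n_positions` and `total_freq` are assumptions chosen for illustration.

```r
library(data.table)

# Hypothetical stand-in for a few rows of `result`
toy <- data.table(
  values   = c("1, 1", "2, 1", "2, 1"),
  position = c("act1_4, act1_5", "act1_1, act1_2", "act1_3, act1_4"),
  freq     = c(1L, 1L, 1L)
)

# Collapse to one row per `values`: how many positions it occurs at, and the summed freq
collapsed <- toy[, .(n_positions = .N, total_freq = sum(freq)), by = values]
print(collapsed)
```

This drops the `position` column, so it only fits if the individual positions are not needed downstream.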
