merge data tables in R

Question

My apologies for this simple question. Basically, I want to make three separate cumsum() tables and merge them together, aligned on the values in the first table. For example:

a <- cumsum(table(df$variable))
b <- cumsum(table(df$variable[c(TRUE, FALSE)]))
c <- cumsum(table(df$variable[c(FALSE, TRUE)]))

Here a is the cumsum of the entire vector df$variable, b is the cumsum of the values at odd-numbered positions of df$variable, and c is the cumsum of the values at even-numbered positions. Another way of putting it is that the frequency counts behind b and c add up to those behind a.
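As a small illustration of the indexing, using only the first six values of the vector shown below: a logical index shorter than the vector is recycled, so c(TRUE, FALSE) keeps positions 1, 3, 5, ... and c(FALSE, TRUE) keeps positions 2, 4, 6, ... (x, odd and even are names used just for this sketch).

```r
# first six values of the vector, used only to show how the logical index is recycled
x <- c(18, 17, 15, 10, 5, 0)

odd  <- x[c(TRUE, FALSE)]   # elements at positions 1, 3, 5
even <- x[c(FALSE, TRUE)]   # elements at positions 2, 4, 6

print(odd)
print(even)

# the two halves together contain every element of x exactly once
stopifnot(length(odd) + length(even) == length(x))
```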

This is the entire vector of numbers.

  [1] 18 17 15 10  5  0 10 10  0 10 15  5  5  5 25 15 13  0  0  0 25 18 15 15  1  4  5
 [28]  5  5 15  5 12 15  0  3 12 20  0  5  5 13 10 10 10  3 15 13 20 12 60 10 10  2  0
 [55]  5 10  8  4  0 15  5  5 15  5  0  5  2  8  5  5  5  5  9  9  3  7 20 25  5  4 10
 [82] 10  2  4  5  5 18  8  0 10  5  5  7 12  5 13 26 20 13 21  5 15 10 10  5 15  5 15
[109]  0  1 13 21 25 25  5 14  5 15 10  0  5 15  3  4  5 15 15  5 25 25  5 15  0  2 13
[136] 22  2 10  3  3 15 11  0  2 40 35 24 24  5  5 10  5 16  0 17 19 20  5  5  5  0 15
[163]  3 13 20  4  5  5  3 19 25 25  0 15  5  3 22 22 25  5 15 15  5 15 17  9  5  5 15
[190] 10

For a, I used cbind(cumsum(table(df$variable)))

For b, I used cbind(cumsum(table(df$variable[c(TRUE, FALSE)])))

For c, I used cbind(cumsum(table(df$variable[c(FALSE, TRUE)])))

In frequency form, the distributions should look something like this.

    a   b   c
0   18  10  8
1   2   1   1
2   6   4   2
3   9   7   2
4   6   0   6
5   47  28  19
7   2   1   1
8   3   1   2
9   3   1   2
10  19  7   12
11  1   0   1
12  4   1   3
13  8   6   2
14  1   0   1
15  25  9   16
16  1   1   0
17  3   2   1
18  3   2   1
19  2   0   2
20  6   4   2
21  2   0   2
22  3   1   2
24  2   1   1
25  10  6   4
26  1   1   0
35  1   0   1
40  1   1   0
60  1   0   1
    190 95  95

But I want it in cumsum() form, so it should look something like this. I wrote out the first 7 rows as an illustration.

    a   b   c
0   18  10  8
1   20  11  9
2   26  15  11
3   35  22  13
4   41  22  19
5   88  50  38
7   90  51  39

The problem I've been having is that the subsets b and c don't contain all the values (i.e. some values have 0 frequency), which shortens the resulting vectors; as a result, I'm unable to properly merge or cbind() them.
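A minimal reproduction of the problem, assuming a made-up vector (x is hypothetical, not my data):

```r
# made-up vector: the odd positions never contain 1, the even positions never contain 0
x <- c(0, 1, 4, 4, 0, 1)

a <- cumsum(table(x))
b <- cumsum(table(x[c(TRUE, FALSE)]))
c <- cumsum(table(x[c(FALSE, TRUE)]))

# the three results have different names and different lengths,
# so they cannot simply be lined up row by row
print(names(a))
print(names(b))
print(names(c))
print(sapply(list(a = a, b = b, c = c), length))
```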

Any suggestion is greatly appreciated.

thelatemail · Accepted Answer

You could probably get there using match quite easily. Assuming your data is:

set.seed(1)
df <- data.frame(variable=rbinom(10,prob=0.5,size=3))

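With a, b and c computed as in the question, something like this seems to work:

# line b and c up with the names of a; values absent from a subset become NA
out <- data.frame(a, b = b[match(names(a), names(b))], c = c[match(names(a), names(c))])
# fill those NAs with 0
replace(out, is.na(out), 0)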

#   a b c
#0  1 0 1
#1  4 2 2
#2  7 4 3
#3 10 5 5
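
One thing to watch with the zero fill: 0 is the right cumulative count only when the missing value is smaller than everything in the subset, as it is in this example. If a value is missing in the middle of the range (like 4 in the question's b column), the expected cumulative count is the previous row's value, not 0. A possible sketch of an alternative, which is a different technique from match: convert the variable to a factor with a shared set of levels, so that table() reports zero counts and cumsum() carries the running total forward. The names lev and f are introduced here purely for illustration.

```r
set.seed(1)
df <- data.frame(variable = rbinom(10, prob = 0.5, size = 3))

# shared levels: every distinct value in the full vector (lev is an illustrative name)
lev <- sort(unique(df$variable))
f <- factor(df$variable, levels = lev)

# table() on a factor keeps zero-count levels, and subsetting a factor keeps
# all of its levels, so the three vectors have the same length and names
a <- cumsum(table(f))
b <- cumsum(table(f[c(TRUE, FALSE)]))
c <- cumsum(table(f[c(FALSE, TRUE)]))

out <- data.frame(a, b, c)
print(out)
```

Because all three columns share the same levels, no match or NA replacement step is needed, and in every row the b and c columns add up to the a column.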
