user11418708
user11418708

Reputation: 902

Data frame grouping rows based on condition

For some this is an easy exercise for myself is a little bit tricky. I would like to calculate the amount of time persons with the same id spend in time t. Basically I would like to know the total amount of time people spend together and alone based on gender.

Input

id     DMSex       t1  t2  t3  t4  t5  t6 
12       M         15  0   0   15  15  15
12       F         0   15  15  0   0   15  
13       F         15  0   15  0   0   0
13       M         15  15   0  15  0   15

Output:

 id  Together  Male only  Female only
 12    15           45       15
 13    15           45       30

Upvotes: 0

Views: 63

Answers (2)

G. Grothendieck
G. Grothendieck

Reputation: 270248

Create male and female matrices and then use the indicated computations.

ix <- -(1:2)
males <- as.matrix(subset(DF, DMSex == "M")[ix])
females <- as.matrix(subset(DF, DMSex == "F")[ix])

data.frame(id = unique(DF$id),
  together = rowSums(pmin(females, males)),
  males_only = rowSums(pmax(males - females, 0)),
  females_only = rowSums(pmax(females - males, 0)))

giving:

  id together males_only females_only
2 12       15         45           30
3 13       15         45           15

Note

Lines <- "id     DMSex       t1  t2  t3  t4  t5  t6 
12       M         15  0   0   15  15  15
12       F         0   15  15  0   0   15  
13       F         15  0   15  0   0   0
13       M         15  15   0  15  0   15"
DF <- read.table(text = Lines, header = TRUE)

Upvotes: 1

B. Christian Kamgang
B. Christian Kamgang

Reputation: 6529

You can also solve it as follows:

df <- read.table(text = "id     DMSex       t1  t2  t3  t4  t5  t6 
12       M         15  0   0   15  15  15
12       F         0   15  15  0   0   15  
13       F         15  0   15  0   0   0
13       M         15  15   0  15  0   15", header = TRUE)

library(data.table)

setDT(df)[order(DMSex), {
  pos <- sapply(.SD, function(x) all(x > 0))
  comm <- sum(.SD[1, pos, with = FALSE])
  onlyFM <- rowSums(.SD[, !pos, with = FALSE])
  .(together = comm, males_only = onlyFM[2], females_only = onlyFM[1])
}, by = id, .SDcols = t1:t6]

#       id together males_only females_only
# 1:    12       15         45           30
# 2:    13       15         45           15

Upvotes: 3

Related Questions