YefR
YefR

Reputation: 369

Subtract specific rows

I have data that looks the following way:

Participant  Round  Total 
1        100     5
1        101     8
1        102     12
1        200     42      
2        100     14 
2        101     71
40       100     32   
40       101     27
40       200     18

I want to get a table with the Total of last Round (200) minus the Total of first Round (100) ;

For example - for Participant 1 - it is 42 - 5 = 37.

The final output should look like:

Participant  Total 
1         37
2       
40       -14

Upvotes: 6

Views: 1752

Answers (4)

grrgrrbla
grrgrrbla

Reputation: 2589

try this:

df <- read.table(header = TRUE, text = "
Participant Round Total
                 1        100     5
1        101     8
1        102     12 
1        200     42      
2        100     14 
2        101     71
2 200 80
40       100     32   
40       101     27
40       200     18")

library(data.table)
setDT(df)[ , .(Total = Total[Round == 200] - Total[Round == 100]), by = Participant]

Upvotes: 2

David Arenburg
David Arenburg

Reputation: 92282

With base R

aggregate(Total ~ Participant, df[df$Round %in% c(100, 200), ], diff) 
#   Participant Total
# 1           1    37
# 2           2      
# 3          40   -14

Or similarly combined with subset

aggregate(Total ~ Participant, df, subset = Round %in% c(100, 200), diff) 

Or with data.table

library(data.table) ;
setDT(df)[Round %in% c(100, 200), diff(Total), by = Participant]
#    Participant  V1
# 1:           1  37
# 2:          40 -14

Or using binary join

setkey(setDT(df), Round)
df[.(c(100, 200)), diff(Total), by = Participant]
#    Participant  V1
# 1:           1  37
# 2:          40 -14

Or with dplyr

library(dplyr)
df %>%
  group_by(Participant) %>%
  filter(Round %in% c(100, 200)) %>%
  summarise(Total = diff(Total))
# Source: local data table [2 x 2]
# 
#   Participant Total
# 1           1    37
# 2          40   -14

Upvotes: 12

Zfunk
Zfunk

Reputation: 1193

Everyone loves a bit of sqldf, so if your requirement isn't to use apply then try this:

Firstly some test data:

df <- read.table(header = TRUE, text = "
Participant Round Total
                 1        100     5
1        101     8
1        102     12 
1        200     42      
2        100     14 
2        101     71
2 200 80
40       100     32   
40       101     27
40       200     18")

Next use SQL to create 2 columns - one for the 100 round and one for the 200 round and subtract them

rolled <- sqldf("
  SELECT tab_a.Participant AS Participant
        ,tab_b.Total_200 - tab_a.Total_100 AS Difference
    FROM (
        SELECT Participant
            ,Total AS Total_100
        FROM df
        WHERE Round = 100
        ) tab_a
    INNER JOIN (
        SELECT Participant
            ,Total AS Total_200
        FROM df
        WHERE Round = 200
        ) tab_b ON (tab_a.Participant = tab_b.Participant)
    ")

Upvotes: 1

Mamoun Benghezal
Mamoun Benghezal

Reputation: 5314

you can try this

library(dplyr)
group_by(df, Participant) %>%
    filter(row_number()==1 | row_number()==max(row_number())) %>%
    mutate(df = diff(Total)) %>%
    select(Participant, df) %>%
    unique()
Source: local data frame [3 x 2]
Groups: Participant

  Participant  df
1           1  37
2           2  57
3          40 -14

Upvotes: 2

Related Questions