Reputation: 369
I have data that looks the following way:
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
40 100 32
40 101 27
40 200 18
I want to get a table with the Total
of last Round
(200) minus the Total
of first Round
(100) ;
For example - for Participant 1 - it is 42 - 5 = 37
.
The final output should look like:
Participant Total
1 37
2
40 -14
Upvotes: 6
Views: 1752
Reputation: 2589
try this:
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
2 200 80
40 100 32
40 101 27
40 200 18")
library(data.table)
setDT(df)[ , .(Total = Total[Round == 200] - Total[Round == 100]), by = Participant]
Upvotes: 2
Reputation: 92282
With base R
aggregate(Total ~ Participant, df[df$Round %in% c(100, 200), ], diff)
# Participant Total
# 1 1 37
# 2 2
# 3 40 -14
Or similarly combined with subset
aggregate(Total ~ Participant, df, subset = Round %in% c(100, 200), diff)
Or with data.table
library(data.table) ;
setDT(df)[Round %in% c(100, 200), diff(Total), by = Participant]
# Participant V1
# 1: 1 37
# 2: 40 -14
Or using binary join
setkey(setDT(df), Round)
df[.(c(100, 200)), diff(Total), by = Participant]
# Participant V1
# 1: 1 37
# 2: 40 -14
Or with dplyr
library(dplyr)
df %>%
group_by(Participant) %>%
filter(Round %in% c(100, 200)) %>%
summarise(Total = diff(Total))
# Source: local data table [2 x 2]
#
# Participant Total
# 1 1 37
# 2 40 -14
Upvotes: 12
Reputation: 1193
Everyone loves a bit of sqldf, so if your requirement isn't to use apply then try this:
Firstly some test data:
df <- read.table(header = TRUE, text = "
Participant Round Total
1 100 5
1 101 8
1 102 12
1 200 42
2 100 14
2 101 71
2 200 80
40 100 32
40 101 27
40 200 18")
Next use SQL to create 2 columns - one for the 100 round and one for the 200 round and subtract them
rolled <- sqldf("
SELECT tab_a.Participant AS Participant
,tab_b.Total_200 - tab_a.Total_100 AS Difference
FROM (
SELECT Participant
,Total AS Total_100
FROM df
WHERE Round = 100
) tab_a
INNER JOIN (
SELECT Participant
,Total AS Total_200
FROM df
WHERE Round = 200
) tab_b ON (tab_a.Participant = tab_b.Participant)
")
Upvotes: 1
Reputation: 5314
you can try this
library(dplyr)
group_by(df, Participant) %>%
filter(row_number()==1 | row_number()==max(row_number())) %>%
mutate(df = diff(Total)) %>%
select(Participant, df) %>%
unique()
Source: local data frame [3 x 2]
Groups: Participant
Participant df
1 1 37
2 2 57
3 40 -14
Upvotes: 2