Reputation: 11
Here's some example code to help everyone understand what I am trying to do here:
ID PicName Rating Correct
1 flower 4
1 Answer 1
2 cat 5
2 Answer 1
What I'm trying to do is link up the "Answer" rows with the corresponding data in the row above them, then delete those "Answer" rows. So namely, I need the contents of the "Correct" column on "Answer" rows moved to the cell in the same column of the previous row. My instinct on this problem was initially to try to index those answer rows and search through the dataframe this way, but I'm not sure how to implement a command to take the data out of these specific cells and move it to other specific cells. Any insight on this problem would be greatly appreciated, thanks!
Upvotes: 1
Views: 1309
Reputation: 28685
Using the data.table
package you can go through each ID
, select the first value of PicName
and Rating
, and the second value of Correct
.
library(data.table)
setDT(df) # give df class "data.table"
df[, .( PicName = PicName[1]
, Rating = Rating[1]
, Correct = Correct[2])
, by = ID]
Returns
ID PicName Rating Correct
1: 1 flower 4 1
2: 2 cat 5 1
Another option is to merge
the Answer rows with the others (gives same result)
setkey(df, ID)
merge(df[PicName != 'Answer', -'Correct']
, df[PicName == 'Answer', .(ID, Correct)])
Data Used:
df <- fread(
"ID PicName Rating Correct
1 flower 4 NA
1 Answer NA 1
2 cat 5 NA
2 Answer NA 1")
===================
If you have more than one pair per ID
group you can use the below code. This picks out rows where PicName == 'Answer'
is FALSE
, uses those rows for PicName
and Rating
columns, then uses the other rows for the Correct
column.
library(data.table)
library(magrittr)
setDT(df) # give df class "data.table"
df[, (PicName == 'Answer')
%>% {.(PicName = PicName[!.]
, Rating = Rating[!.]
, Correct = Correct[.])}
, by = ID]
Upvotes: 1
Reputation: 2335
@Renu's approach is great, however, I prefer dplyr
/ tidyverse
:
df %>%
group_by(ID) %>%
summarise(
PicName = first(PicName),
Rating = first(Rating),
Correct = nth(Correct, 2))
returns:
# A tibble: 2 x 4
ID PicName Rating Correct
<int> <fct> <int> <int>
1 1 flower 4 1
2 2 cat 5 1
Pretty much the same thing though.
Upvotes: 1