user275801
user275801

Reputation: 25

How to subset multiple times from larger data frame in R

I have the following wide data in a .csv:

Subj Show1_judgment1 Show1_judgment2 Show1_judgment3 Show1_judgment4 Show2_judgment1 Show2_judgment2 Show2_judgment3 Show2_judgment4
1 3 2 5 6 3 2 5 7
2 1 3 5 4 3 4 1 6
3 1 2 6 2 3 7 2 6

The columns keep going for the same four judgments for a series of 130 different shows.

I want to change this data into long form so that it looks like this:

Subj show judgment1 judgment2 judgment3 judgment4
1 show1 2 5 6 1
1 show2 3 5 4 4
1 show3 2 6 2 5

Usually, I would use base r to subset the columns into their own dataframes and then used rbind to put them into one dataframe.

But since there are so many different shows, it will be very inefficient to do it like that. I am relatively novice at R, so I can only do very basic for loops, but I think a for loop that subsets the subject column (first column in data) and then groups of 4 sequential columns would do this much more efficiently.

Can anyone help me create a for loop for this?

Thank you in advance for your help!

Upvotes: 0

Views: 94

Answers (1)

r2evans
r2evans

Reputation: 160407

No for loop required, this is transforming or "pivoting" from wide to long format.

tidyr

tidyr::pivot_longer(dat, -Subj, names_pattern = "(.*)_(.*)", names_to = c("show", ".value"))
# # A tibble: 6 x 6
#    Subj show  judgment1 judgment2 judgment3 judgment4
#   <int> <chr>     <int>     <int>     <int>     <int>
# 1     1 Show1         3         2         5         6
# 2     1 Show2         3         2         5         7
# 3     2 Show1         1         3         5         4
# 4     2 Show2         3         4         1         6
# 5     3 Show1         1         2         6         2
# 6     3 Show2         3         7         2         6

data.table

Requires data.table-1.14.3, relatively new (or from github).

data.table::melt(
  dat, id.vars = "Subj",
  measure.vars = measure(show, value.name, sep = "_"))

Upvotes: 2

Related Questions