Reputation: 85

Taking variable names out of column and creating new columns in R

I am trying to take a dataframe like this

    name        response
 1   Phil        Exam
 2   Terry       Test
 3   Simmon      Exam
 4   Brad        Quiz

And turn it into this

    name        response    Exam    Test    Quiz
 1   Phil        Exam        Exam
 2   Terry       Test                Test
 3   Simmon      Exam        Exam
 4   Brad        Quiz                        Quiz

I tried to use a for loop that extracts each row, checks whether a column named after that row's response already exists, and creates the column if it does not. I couldn't get this close to working and am unsure how to do it.

Upvotes: 0

Answers (3)

akrun

Reputation: 887951

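We can use dcast from data.table. The formula keeps name and response as row identifiers and spreads the values of response into new columns; fill = "" leaves the non-matching cells empty.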

library(data.table)
dcast(setDT(df1), name + response ~ response, value.var = 'response', fill = "")
#     name response Exam Quiz Test
#1:   Brad     Quiz      Quiz     
#2:   Phil     Exam Exam          
#3: Simmon     Exam Exam          
#4:  Terry     Test           Test

Upvotes: 0

www

Reputation: 39174

A base R solution. We can write a function that replaces every word that does not match the target word, and then use it to add one new column per word to the data frame.

# Create example data frame
dt <- read.table(text = "    name        response
 1   Phil        Exam
 2   Terry       Test
 3   Simmon      Exam
 4   Brad        Quiz", 
                 header = TRUE, stringsAsFactors = FALSE)

# A function to create a new column based on the word in response:
# entries that do not equal `word` are replaced with `fill` (NA by default)
create_Col <- function(word, df, fill = NA){
  new <- df$response
  new[new != word] <- fill
  return(new)
}

# Apply this function
for (i in unique(dt$response)){
  dt[[i]] <- create_Col(word = i, df = dt)
}

dt
#     name response Exam Test Quiz
# 1   Phil     Exam Exam <NA> <NA>
# 2  Terry     Test <NA> Test <NA>
# 3 Simmon     Exam Exam <NA> <NA>
# 4   Brad     Quiz <NA> <NA> Quiz
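
The same idea can probably be written without an explicit loop. The following is only a sketch in base R: it rebuilds the example data, and the names words, new_cols and result are placeholders introduced here for illustration, not part of the code above.

```r
# Example data, rebuilt so the sketch is self-contained
dt <- data.frame(
  name = c("Phil", "Terry", "Simmon", "Brad"),
  response = c("Exam", "Test", "Exam", "Quiz"),
  stringsAsFactors = FALSE
)

# Distinct words, in order of first appearance
words <- unique(dt$response)

# Build one column per word: keep the word where it matches, NA elsewhere.
# sapply() returns a character matrix whose column names are the words.
new_cols <- sapply(words, function(w) {
  ifelse(dt$response == w, dt$response, NA_character_)
})

# Bind the new columns onto the original data frame
result <- cbind(dt, new_cols, stringsAsFactors = FALSE)
result
```

Swapping NA_character_ for "" should give blank cells instead of NA, closer to the layout shown in the question.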

Upvotes: 0

jdobres

Reputation: 11957

This can be accomplished in a few ways. It might be a good opportunity to get to know the tidyverse:

library(tidyverse)
new.df <- spread(old.df, response, response)

This is an unusual use of tidyr::spread(). In this case, it constructs new column names from the values in "response", and also fills those columns with the values in "response". The fill argument can be used to change what goes in the resulting blank cells.
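
For what it's worth, newer versions of tidyr mark spread() as superseded in favour of pivot_wider(). The sketch below is one way the same reshaping might look with that function, not a tested drop-in replacement. It assumes a reasonably recent tidyr (one where values_fill accepts a single value), and the helper columns key and value, as well as the name wide.df, are made up here so that the original response column is kept as an identifier.

```r
library(tidyr)

# Example data from the question, rebuilt so the sketch is self-contained
old.df <- data.frame(
  name = c("Phil", "Terry", "Simmon", "Brad"),
  response = c("Exam", "Test", "Exam", "Quiz"),
  stringsAsFactors = FALSE
)

# Hypothetical helper columns: copies of response that supply the
# new column names (key) and the cell contents (value)
old.df$key <- old.df$response
old.df$value <- old.df$response

# name and response stay as identifier columns; cells with no match
# are filled with an empty string instead of NA
wide.df <- pivot_wider(
  old.df,
  names_from = key,
  values_from = value,
  values_fill = ""
)
wide.df
```

Leaving out values_fill should leave NA in the non-matching cells, which plays the same role as the fill argument of spread().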

Upvotes: 2
