dmunslow

Reputation: 149

Reshaping Data in R - Creating new columns based on values in an existing column

I am working on a problem in R where I have a data frame with a column that contains a series of variable names:

*Name*   *id_key*   *detail*    *var_names*  *values*
 Jose      123        red         foo          abc
 Jose      123        blue        foo          abc
 Jose      123        green       foo          abc
 Mel       456        red         bar          555
 Mel       456        green       bar          555
 Dom       789        yellow      choo         fjfj55bar

What I would like to achieve is the following:

*Name*   *id_key*   *detail*   *foo*    *bar*   *choo*
 Jose      123        red       abc      NA      NA
 Jose      123        blue      abc      NA      NA 
 Jose      123        green     abc      NA      NA
 Mel       456        red       NA       555     NA
 Mel       456        green     NA       555     NA
 Dom       789        yellow    NA       NA      fjfj55bar

I tried using dcast from the reshape2 package with the following command - but it did not produce the desired results:

toy_data_unmelt <- dcast(toy_data, formula = name~var_names, value.var = "values")

Any help would be greatly appreciated!

Upvotes: 1

Views: 88

Answers (2)

Deepak Rajendran

Reputation: 368

You will want to use the spread function from the tidyr package for this:

library(tidyr)

toy_data = data.frame(Name = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"), 
                      id_key = c(123, 123, 123, 456, 456, 789),
                      detail = c("red", "blue", "green", "red", "green", "yellow"), 
                      var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
                      values = c("abc", "abc", "abc", "555", "555", "fjfj55bar"))

toy_data %>% spread(var_names, values, fill = NA)

Output:

#  Name id_key detail  bar      choo  foo
#1  Dom    789 yellow <NA> fjfj55bar <NA>
#2 Jose    123   blue <NA>      <NA>  abc
#3 Jose    123  green <NA>      <NA>  abc
#4 Jose    123    red <NA>      <NA>  abc
#5  Mel    456  green  555      <NA> <NA>
#6  Mel    456    red  555      <NA> <NA>
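
On tidyr 1.0.0 or later, spread() is superseded by pivot_wider(). Here is a minimal sketch of the same reshape with the newer function, assuming the toy_data columns from the question. The name toy_data_wide and the stringsAsFactors = FALSE setting are my own choices for illustration, not part of the original post.

```r
library(tidyr)

# Same toy data as in the question; strings are kept as character vectors
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# One new column per distinct value of var_names, filled from values.
# All remaining columns (Name, id_key, detail) identify the rows, and
# combinations that never occur are left as NA.
toy_data_wide <- pivot_wider(
  toy_data,
  names_from  = var_names,
  values_from = values
)

toy_data_wide
```

The arguments names_from and values_from play the same roles as the key and value arguments of spread().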

Upvotes: 1

Melissa Key

Reputation: 4551

reshape2 has been superseded by tidyr. (reshape2 is still available, but I would make the switch to keep your code current.) Here's the tidyr solution:

library(tidyr)
library(readr)  # read_table() comes from readr, not tidyr
toy_data <- read_table("*Name*   *id_key*   *detail*    *var_names*  *values*
 Jose      123        red         foo          abc
  Jose      123        blue        foo          abc
  Jose      123        green       foo          abc
  Mel       456        red         bar          555
  Mel       456        green       bar          555
  Dom       789        yellow      choo         fjfj55bar")
toy_data_wide <- spread(toy_data, `*var_names*`, `*values*`)

Or, using the pipe operator:

toy_data_wide <- toy_data %>%
  spread(`*var_names*`, `*values*`)
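
If you would rather stay with reshape2, the dcast call from the question can be made to work. The sketch below assumes the plain column names from the question (no asterisks) and rebuilds the data inline. As far as I can tell, the original attempt fails for two reasons: the column is called Name, not name (R is case-sensitive), and id_key and detail are missing from the left-hand side of the formula, so dcast would have to collapse several rows into one cell.

```r
library(reshape2)

# Toy data with the plain column names used in the question
toy_data <- data.frame(
  Name      = c("Jose", "Jose", "Jose", "Mel", "Mel", "Dom"),
  id_key    = c(123, 123, 123, 456, 456, 789),
  detail    = c("red", "blue", "green", "red", "green", "yellow"),
  var_names = c("foo", "foo", "foo", "bar", "bar", "choo"),
  values    = c("abc", "abc", "abc", "555", "555", "fjfj55bar"),
  stringsAsFactors = FALSE
)

# Every column that should stay as-is goes on the left of ~;
# the column whose values become new column names goes on the right.
toy_data_unmelt <- dcast(
  toy_data,
  Name + id_key + detail ~ var_names,
  value.var = "values"
)

toy_data_unmelt
```

Because each Name/id_key/detail combination appears only once, no aggregation function is needed and the missing combinations come back as NA.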

Upvotes: 1
