Reputation: 351
Using the below data, I would like to make something generalized to convert a single column with factors into separate linked columns. Here the 'letters' column would create two new columns 'a' and 'b'.
data <- data.frame(letters = c("a", "a", "b", "c"),ints = c(1, 2, 1, 1), reals = c(.01, .22, .01, .02))
letters ints reals
a 1 0.01
a 2 0.22
b 1 0.01
b 1 0.02
Would like to return something like this.
ints a b
1 .01 NA
2 .22 NA
1 NA .01
1 NA .02
Is there a way to do this using tidytext for example? Or perhaps expanding on something like this would work:
model.matrix( ints ~ letters + reals, data = data )
Upvotes: 1
Views: 86
Reputation: 3502
There should not be c in your letters
variable as you showed in your data.frame.
You can do it using dplyr
and tidyr
if you have many letters
data1 <- data %>%
dplyr::mutate(id=seq_len(nrow(.))) %>%
tidyr::spread(letters, reals) %>%
dplyr::arrange(id)
> data1
ints id a b
1 1 1 0.01 NA
2 2 2 0.22 NA
3 1 3 NA 0.01
4 1 4 NA 0.02
Or if you have only two letters a and b, you can also do it using ifelse.
library(dplyr)
data1 <- data %>%
dplyr::mutate(a = ifelse(letters=="a"&ints%in%c(1,2), reals, NA),
b = ifelse(letters=="b"&ints==1, reals, NA))
> data1
letters ints reals a b
1 a 1 0.01 0.01 NA
2 a 2 0.22 0.22 NA
3 b 1 0.01 NA 0.01
4 b 1 0.02 NA 0.02
Upvotes: 1
Reputation: 887501
We can do this with dcast
from data.table
library(data.table)
dcast(setDT(data), ints +rowid(ints) ~ letters, value.var = "reals")[order(ints_1)
][, ints_1 := NULL][]
# ints a b
#1: 1 0.01 NA
#2: 2 0.22 NA
#3: 1 NA 0.01
#4: 1 NA 0.02
Upvotes: 3
Reputation: 93908
I think this is what you want, assuming a typo in your example:
reshape(
transform(data,id=seq_len(nrow(data))),
idvar=c("id","ints"), timevar="letters", direction="wide"
)
# ints id reals.a reals.b
#1 1 1 0.01 NA
#2 2 2 0.22 NA
#3 1 3 NA 0.01
#4 1 4 NA 0.02
Upvotes: 4