Reputation: 29
I have a dataframe that looks like this:
ID Region
1 a b c d
2 b c d e g
3 a c e f g h
.
.
.
I want to put it into a csv file that looks like this
ID a b c d e f g h
1 1 1 1 1 NA NA NA NA
2 NA 1 1 1 1 NA 1 NA
3 1 NA 1 NA 1 1 1 1
.
.
.
I can convert the dataframe into a csv file, but don't know how to format it this way. Any help would be greatly appreciated!
Upvotes: 0
Views: 144
Reputation: 887008
An option is also with cSplit_e
library(splitstackshape)
out <- cSplit_e(df, 'Region', type = 'character', drop = TRUE, sep=" ")
names(out) <- sub("Region_", "", names(out))
-output
> out
ID a b c d e f g h
1 1 1 1 1 1 NA NA NA NA
2 2 NA 1 1 1 1 NA 1 NA
3 3 1 NA 1 NA 1 1 1 1
df <- structure(list(ID = 1:3, Region = c("a b c d", "b c d e g", "a c e f g h"
)), class = "data.frame", row.names = c(NA, -3L))
Upvotes: 1
Reputation: 10375
Using reshape2
library(reshape2)
df2=melt(strsplit(df$Region,""))
df2=df2[df2$value!=" ",]
df2=dcast(df2,L1~value)
ifelse(is.na(df2),NA,1)
L1 a b c d e f g h
[1,] 1 1 1 1 1 NA NA NA NA
[2,] 1 NA 1 1 1 1 NA 1 NA
[3,] 1 1 NA 1 NA 1 1 1 1
Upvotes: 1
Reputation: 173793
If your data frame looks like this:
df <- data.frame(ID = 1:3, Region = c("a b c d", "b c d e g", "a c e f g h"))
df
#> ID Region
#> 1 1 a b c d
#> 2 2 b c d e g
#> 3 3 a c e f g h
And you have a vector of all possible regions you want to define:
all_regions <- c("a", "b", "c", "d", "e", "f", "g", "h")
Then you can do:
cbind(ID = df$ID,
setNames(as.data.frame(t(sapply(strsplit(df$Region, " "),
function(x) as.numeric(all_regions %in% x)))),
all_names))
#> ID a b c d e f g h
#> 1 1 1 1 1 1 0 0 0 0
#> 2 2 0 1 1 1 1 0 1 0
#> 3 3 1 0 1 0 1 1 1 1
Upvotes: 1