Reputation: 1293
I have a data frame with three variables: two factors encoded as binary indicator columns and one integer data column:
DATA YEAR1 YEAR2 REGION1 REGION2
OBS1 X 1 0 1 0
OBS2 Y 1 0 0 1
OBS3 Z 0 1 1 0
etc.
Now I want to transform it into something like this:
YEAR1_REGION1 YEAR1_REGION2 YEAR2_REGION1 YEAR2_REGION2
OBS1 X 0 0 0
OBS2 0 Y 0 0
OBS3 0 0 Z 0
Basic matrix multiplication is not what I'm after. I would like to find a neat way to do this that also renames the columns automatically. My actual data has three factor dimensions with 20, 8 and 6 levels, so there will be 20*8*6 = 960 columns in total.
Upvotes: 2
Views: 113
Reputation: 18437
Here's another approach, based on outer and similar to @Roland's answer.
year <- grep("YEAR", names(DF), value = TRUE)
region <- grep("REGION", names(DF), value = TRUE)
data <- as.character(DF$DATA)
df <- outer(year, region, function(x, y) DF[,x] * DF[,y])
colnames(df) <- outer(year, region, paste, sep = "_")
df <- as.data.frame(df)
# replace each 1 with the DATA value of the corresponding row
for (i in seq_len(ncol(df)))
  df[as.logical(df[,i]), i] <- data[as.logical(df[,i])]
df
## YEAR1_REGION1 YEAR2_REGION1 YEAR1_REGION2 YEAR2_REGION2
## OBS1 X 0 0 0
## OBS2 0 0 Y 0
## OBS3 0 Z 0 0
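The question mentions three factor dimensions, and the same idea extends to any number of dummy groups. The following is only a minimal sketch under some assumptions: each group's columns share a common name prefix, every indicator is 0/1, and `spread_by_dummies` is a hypothetical helper name, not a base R function.

```r
# Example data from the question
DF <- read.table(text = "     DATA YEAR1 YEAR2 REGION1 REGION2
OBS1 X 1 0 1 0
OBS2 Y 1 0 0 1
OBS3 Z 0 1 1 0", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: one output column per combination of dummy columns,
# taking one dummy column from each prefix group.
spread_by_dummies <- function(df, value_col, prefixes, fill = "0") {
  # Column names belonging to each group (assumes a shared prefix per group)
  groups <- lapply(prefixes, function(p) {
    grep(paste0("^", p), names(df), value = TRUE)
  })
  # All combinations; the first group varies fastest
  combos <- expand.grid(groups, stringsAsFactors = FALSE)
  values <- as.character(df[[value_col]])

  cols <- lapply(seq_len(nrow(combos)), function(k) {
    picked <- unlist(combos[k, ], use.names = FALSE)
    # A row matches when all picked indicators are 1
    hit <- Reduce(`&`, lapply(df[picked], as.logical))
    ifelse(hit, values, fill)
  })
  names(cols) <- do.call(paste, c(combos, sep = "_"))
  as.data.frame(cols, row.names = rownames(df), stringsAsFactors = FALSE)
}

wide <- spread_by_dummies(DF, "DATA", c("YEAR", "REGION"))
```

For a third group, a call such as `spread_by_dummies(DF, "DATA", c("YEAR", "REGION", "TYPE"))` would apply, where the `TYPE` prefix is purely illustrative. The column order follows `expand.grid`, so it matches the output above rather than the order shown in the question.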
Upvotes: 4
Reputation: 132626
Maybe others will come up with a more succinct possibility, but this creates the expected result:
DF <- read.table(text=" DATA YEAR1 YEAR2 REGION1 REGION2
OBS1 X 1 0 1 0
OBS2 Y 1 0 0 1
OBS3 Z 0 1 1 0", header=TRUE)
DF[,-1] <- lapply(DF[,-1], as.logical)
DF[,1] <- as.character(DF[,1])
res <- apply(expand.grid(2:3, 4:5), 1, function(i) {
  tmp <- rep("0", length(DF[,1]))
  # rows where both selected indicator columns are TRUE
  ind <- do.call(`&`,DF[,i])
  tmp[ind] <- DF[ind,1]
  tmp <- list(tmp)
  names(tmp) <- paste0(names(DF)[i], collapse="_")
  tmp
})
res <- as.data.frame(res)
rownames(res) <- rownames(DF)
res
# YEAR1_REGION1 YEAR2_REGION1 YEAR1_REGION2 YEAR2_REGION2
# OBS1 X 0 0 0
# OBS2 0 0 Y 0
# OBS3 0 Z 0 0
However, I suspect there is a much better way to achieve what you actually want to do, without creating a huge wide-format data.frame.
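One possible reading of that suggestion is to keep the data in long format, with one factor column per dimension instead of one column per combination. This is only a sketch, assuming each row has exactly one 1 within each dummy group; the names `long` and `CELL` are made up for illustration.

```r
# Example data from the question
DF <- read.table(text = "     DATA YEAR1 YEAR2 REGION1 REGION2
OBS1 X 1 0 1 0
OBS2 Y 1 0 0 1
OBS3 Z 0 1 1 0", header = TRUE, stringsAsFactors = FALSE)

year_cols   <- grep("^YEAR",   names(DF), value = TRUE)
region_cols <- grep("^REGION", names(DF), value = TRUE)

# max.col() gives, per row, the index of the column holding the maximum,
# i.e. the position of the single 1 (assumption: exactly one 1 per group)
year_idx   <- max.col(as.matrix(DF[year_cols]),   ties.method = "first")
region_idx <- max.col(as.matrix(DF[region_cols]), ties.method = "first")

long <- data.frame(
  DATA   = DF$DATA,
  YEAR   = factor(year_cols[year_idx],     levels = year_cols),
  REGION = factor(region_cols[region_idx], levels = region_cols),
  row.names = rownames(DF),
  stringsAsFactors = FALSE
)

# The combined cell label is still available when needed
long$CELL <- interaction(long$YEAR, long$REGION, sep = "_")
```

With 20, 8 and 6 levels this keeps three factor columns rather than 960 mostly-zero ones, and most modelling and aggregation functions can work with the factors directly.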
Upvotes: 4