Reputation: 11
How can I rearrange the following data in R? I would like to go from
AA 100 NA
BB 200 300
CC 300 NA
DD 100 400
to
AA 100 0 0 0
BB 0 200 300 0
CC 0 0 300 0
DD 100 0 0 400
or, alternatively, to an indicator matrix:
100 200 300 400
AA 1 0 0 0
BB 0 1 1 0
CC 0 0 1 0
DD 1 0 0 1
Upvotes: 1
Views: 133
Reputation: 762
To get the values, one could also use the reshape2 package:
DF <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")
library(reshape2)
dfm <- melt(DF, id = "V1")
dcast(dfm, V1 ~ factor(value), fill = 0)[, -6]
V1 100 200 300 400
1 AA 100 0 0 0
2 BB 0 200 300 0
3 CC 0 0 300 0
4 DD 100 0 0 400
The [, -6] drops the last column of the dcast() result: NA is one of the values in dfm$value, so it gets its own (sixth) column in the cast data frame.
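As a possible variation on the above, the NA values could be dropped while melting, so that no NA column is created and nothing has to be removed by position afterwards. This is only a sketch: it assumes reshape2 is installed, relies on the na.rm argument of melt(), and the names DF, dfm and wide are just illustrative.

```r
library(reshape2)

DF <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")

# Drop the NA entries while melting, so they never become a column
dfm <- melt(DF, id = "V1", na.rm = TRUE)

# Same cast as above, but without the trailing [, -6]
wide <- dcast(dfm, V1 ~ factor(value), value.var = "value", fill = 0)
wide
```

Passing value.var explicitly also avoids the message dcast() prints when it has to guess the value column.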
Upvotes: 0
Reputation: 162311
df <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")
table(data.frame(letters = df[,1], numbers = unlist(df[,-1])))
# numbers
# letters 100 200 300 400
# AA 1 0 0 0
# BB 0 1 1 0
# CC 0 0 1 0
# DD 1 0 0 1
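If the first layout from the question (the values themselves rather than 0/1) is wanted, one way to get there from this table is to multiply each column by the number in its column name. The following is a minimal base R sketch, assuming the column names are always numeric; the names counts, ind and values are illustrative only.

```r
df <- read.table(text = "AA 100 NA
BB 200 300
CC 300 NA
DD 100 400")

counts <- table(data.frame(letters = df[, 1], numbers = unlist(df[, -1])))

# Strip the "table" class to get a plain integer matrix
ind <- unclass(counts)

# Multiply every column by its own label; (ind > 0) keeps a value from
# being doubled if it happened to occur twice in the same row
values <- sweep(ind > 0, 2, as.numeric(colnames(ind)), "*")
values
```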
Upvotes: 6
Reputation: 55340
# SAMPLE DATA
myDF <- structure(list(V2 = c(100L, 200L, 300L, 100L), V3 = c(NA, 300L, NA, 400L)), .Names = c("V2", "V3"), class = "data.frame", row.names = c("AA", "BB", "CC", "DD"))
Assuming myDF is your original data frame:
# create columns sequence
Columns <- seq(100, 400, by=100)
newMat <- sapply(Columns, function(c) rowSums(c == myDF, na.rm = TRUE))
# assign names
colnames(newMat) <- Columns
newMat
# 100 200 300 400
# AA 1 0 0 0
# BB 0 1 1 0
# CC 0 0 1 0
# DD 1 0 0 1
c == myDF gives a matrix of TRUE/FALSE values. If you perform arithmetic on TRUE/FALSE, they are treated as 1/0. Thus, we can take the rowSums() for each row AA, BB, etc., which tells us how many times each row is equal to c.
We use sapply to iterate over each column value, 100, 200, etc. lapply would return a list; sapply takes that list and simplifies it into a nice matrix.
We then clean up the names to make things pretty.
Upvotes: 3