Reputation: 45
I'm struggling to find the right way to transform some data I'm analyzing without resorting to a scripting language.
The data looks similar to the following:
data.frame(Group=LETTERS[1:3],Total=c(100,120,130),Modified=c(12,15,32))
  Group Total Modified
1     A   100       12
2     B   120       15
3     C   130       32
I'd like the resulting data frame to look like this:
+-------+----------+
| Group | Modified |
+-------+----------+
| A | Y |
| A | Y |
| A | Y |
| . | . |
| . | . |
| . | . |
| A | N |
| A | N |
| B | Y |
| B | Y |
| . | . |
| . | . |
| . | . |
| B | N |
+-------+----------+
There should be 12 rows with Group A and Modified = Y and 88 rows with Group A and Modified = N. The same goes for B, C, etc.
In most cases there are additional columns that will need to be repeated on each row along with the Group info.
Upvotes: 4
Views: 174
Reputation: 109864
Slightly different approach:
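dat <- data.frame(Group=LETTERS[1:3],Total=c(100,120,130),Modified=c(12,15,32))
# number of rows per group that were not modified
dat$diff <- dat$Total - dat$Modified
library(reshape2)
# long format: one row per Group and count type (Total is dropped)
dat2 <- melt(dat[, -2], id.vars = "Group")
dat2 <- dat2[order(dat2$Group), ]
# relabel the count types: Modified -> "Y", diff -> "N"
levels(dat2$variable) <- c("Y", "N")
# repeat each row `value` times, then drop the count column
dat2 <- dat2[rep(seq_len(nrow(dat2)), dat2$value), -3]
names(dat2)[2] <- "Modified"
rownames(dat2) <- NULL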
Upvotes: 0
Reputation: 93813
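Code to convert (test is the data frame from the question):
test <- data.frame(Group=LETTERS[1:3],Total=c(100,120,130),Modified=c(12,15,32))
result <- do.call(rbind,
  by(test,
     test$Group,
     function(x)
       # one block per group: Modified rows of "Y", then the remaining rows of "N"
       data.frame(
         Group=x$Group[1],
         Modified=rep(c("Y","N"),c(x$Modified,x$Total - x$Modified))
       )
  )
)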
Output like:
> head(result)
    Group Modified
A.1     A        Y
A.2     A        Y
A.3     A        Y
A.4     A        Y
A.5     A        Y
A.6     A        Y
Checking it worked:
> with(result,table(Group,Modified))
     Modified
Group   N   Y
    A  88  12
    B 105  15
    C  98  32
Upvotes: 3
Reputation: 115392
You can use rep with the appropriate times argument.
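As a rough base R sketch of that idea (not necessarily the tidiest way): it assumes the question's data sits in a data frame called DF, and the Region column is made up here purely to stand in for the "additional columns" the question mentions.

```r
# Example data from the question, plus a hypothetical extra column (Region)
DF <- data.frame(Group = LETTERS[1:3],
                 Total = c(100, 120, 130),
                 Modified = c(12, 15, 32),
                 Region = c("north", "south", "east"))

# For each input row: Modified copies of "Y" followed by Total - Modified copies of "N"
flags <- unlist(
  Map(function(m, t) rep(c("Y", "N"), times = c(m, t - m)),
      DF$Modified, DF$Total),
  use.names = FALSE
)

# Repeat every original row Total times so the extra columns are carried along
expanded <- DF[rep(seq_len(nrow(DF)), times = DF$Total), c("Group", "Region")]
expanded$Modified <- flags
rownames(expanded) <- NULL

# Cross-tabulate to check the counts per group
table(expanded$Group, expanded$Modified)
```

Because the rows are indexed with rep, any other columns kept in the column selection are repeated along with Group, which is the part the question asks about.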
A data.table solution for coding elegance:
library(data.table)
# your data is in the data.frame DF
DF <- as.data.table(DF)
flags <- c('Y', 'N')
# per Group: Modified copies of 'Y' followed by Total - Modified copies of 'N'
DF[, list(Modified = rep(flags, c(Modified, Total - Modified))), by = Group]
Upvotes: 10