Reputation: 48
I have a much larger version of the following data set.
mydata <- data.frame("C1" = c("EB", "L", 1, 2, 3) ,
"C2" = c("WB", "T", 4, 5, 6),
"C3" = c("NB", "R", 7, 8, 9),
"C4" = c("SB", "Total", 12, 15, 18),
"C5" = c("", "L",1, 2, 3),
"C6" = c("", "TR", 4, 5, 6),
"C7" = c("", "Total", 5, 7, 9),
"C8" = c("", "L",1, 2, 3),
"C9" = c("", "R", 4, 5, 6),
"C10" = c("", "Total", 5, 7, 9),
"C11" = c("", "L",1, 2, 3),
"C12" = c("", "TR", 4, 5, 6),
"C13" = c("", "Total", 5, 7, 9), stringsAsFactors = FALSE)
> mydata
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13
1 EB WB NB SB
2 L T R Total L TR Total L R Total L TR Total
3 1 4 7 12 1 4 5 1 4 5 1 4 5
4 2 5 8 15 2 5 7 2 5 7 2 5 7
5 3 6 9 18 3 6 9 3 6 9 3 6 9
I want a combination of the first two rows so the resulting data set will look something like this
> myresult
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13
1 EBL EBT EBR EBTotal WBL WBTR WBTotal NBL NBR NBTotal SBL SBTR SBTotal
2 1 4 7 12 1 4 5 1 4 5 1 4 5
3 2 5 8 15 2 5 7 2 5 7 2 5 7
4 3 6 9 18 3 6 9 3 6 9 3 6 9
Basically, I want to keep the rest of the data frame as it is.
Please note: this is a situation from a very large data set with 1,000 records and 18 cols. Any help will be much appreciated. Thanks in advance!
Update: I've tried
mydata[1,]= paste0(rep(mydata[1,], each = length(mydata[2,])), mydata[2,])
but this just repeats one value from the first row
C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13
1 EBL EBT EBR EBTotal EBL EBTR EBTotal EBL EBR EBTotal EBL EBTR EBTotal
2 L T R Total L TR Total L R Total L TR Total
3 1 4 7 12 1 4 5 1 4 5 1 4 5
4 2 5 8 15 2 5 7 2 5 7 2 5 7
5 3 6 9 18 3 6 9 3 6 9 3 6 9
Upvotes: 3
Views: 135
Reputation: 388982
Looking at the data it looks like you want to paste the first row combination with second row till the value "Total"
comes in second row. We can extract first and second row separately, split
them based on occurrence of "Total"
and paste them with first_row
to get new values.
first_row <- mydata[1, ]
first_row <- first_row[first_row != ""]
second_row <- as.character(mydata[2, ])
mydata[1, ] <- unlist(Map(paste0,first_row, split(second_row,
c(0, cumsum(second_row == "Total")[-length(second_row)]))))
mydata <- mydata[-2, ]
mydata
# C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13
#1 EBL EBT EBR EBTotal WBL WBTR WBTotal NBL NBR NBTotal SBL SBTR SBTotal
#3 1 4 7 12 1 4 5 1 4 5 1 4 5
#4 2 5 8 15 2 5 7 2 5 7 2 5 7
#5 3 6 9 18 3 6 9 3 6 9 3 6 9
Upvotes: 2
Reputation: 2364
You have to repeat 'EB' four times and the others only three times. I think now it is correct:
colnames(mydata) <- paste0(c(rep(mydata[1,1], 4), rep(mydata[1, 2:4], each = 3)), mydata[2,])
mydata2 <- apply(mydata[-c(1:2),], 2, as.numeric)
> mydata2
EBL EBT EBR EBTotal WBL WBTR WBTotal NBL NBR NBTotal SBL SBTR SBTotal
[1,] 1 4 7 12 1 4 5 1 4 5 1 4 5
[2,] 2 5 8 15 2 5 7 2 5 7 2 5 7
[3,] 3 6 9 18 3 6 9 3 6 9 3 6 9
Upvotes: 1