Reputation: 3369
I have a data frame that comes in the following format
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 D
4 X 4 D
5 A 5 D
6 X 6 D
7 X 7 D
8 A 8 D
I need to make it so that any rows with "X" in c1
are merged into c3
of the row above, like below
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
4 A 5 DX6DX7D
5 A 8 D
Any ideas?
Upvotes: 2
Views: 168
Reputation: 8413
though already answered , i would like to explain stepwise my approach :
for this I'm using a different data :
# c1 c2 c3
# A 1 D
# X 2 D
# A 3 D
# X 4 D
# A 5 D
# X 6 D
# X 7 D
# X 8 D
y = which(df1$c1=="X") # which rows are having "X"
z = cumsum(c(0,diff(y))!=1) # which of those are consecutive
# for the consecutive rows, paste all the columns data together
str <- sapply(unique(z), function(i) paste0(unlist(t(df1[y[z == i], ])),collapse = ""))
# which are the rows just occuring before these X's
z = unique(y[z])-1
# substitute the "pasted together" string at the rows just prior to X's
df1$c3[z] = paste(df1$c3[unique(y[z])-1],str,sep="")
# subset to have only non-X's rows
df1[df1$c1!="X",]
# c1 c2 c3
#1: A 1 DX2D
#2: A 3 DX4D
#3: A 5 DX6DX7DX8D
Upvotes: 1
Reputation: 12819
df <- read.table(text = " c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 D
4 X 4 D
5 A 5 D
6 X 6 D
7 X 7 D
8 A 8 D", stringsAsFactors = FALSE)
desired_output <- read.table(text = " c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
4 A 5 DX6DX7D
5 A 8 D", stringsAsFactors = FALSE)
rownames(desired_output) <- NULL
library(dplyr)
output <-
df %>%
mutate(to_paste = ifelse(c1 == "X", paste0(c1, c2, c3), c3)) %>%
group_by(grp = cumsum(c1 == "A")) %>%
summarise(c1 = first(c1), c2 = first(c2), c3 = paste0(to_paste, collapse = "")) %>%
select(- grp) %>%
as.data.frame()
identical(output, desired_output)
# [1] TRUE
Upvotes: 1
Reputation: 37631
Since you do not provide your data structure, it is unclear if c3 is a factor or a string. Just in case, I am converting it to a string before processing.
dat$c3 = as.character(dat$c3)
for(r in nrow(dat):2) {
if(dat[r,1] == "X") {
dat[r-1,3] = paste(dat[r-1,3], "X", dat[r,2], dat[r,3], sep="")
dat = dat[-r,]
}
}
dat
c1 c2 c3
1 A 1 D
2 A 2 D
3 A 3 DX4D
5 A 5 DX6DX7D
8 A 8 D
Upvotes: 1