Reputation: 107

Error when duplicating a row conditionally - R

I have a data frame with columns A, B, C as follows:

A <- c("NX300", "BT400", "GD200")
B <- c("M0102", "N0703", "M0405")
C <- c(NA, "M0104", "N0404")
df <- data.frame(A, B, C)

I would like to duplicate each row where the value in C is not NA. The original row should keep B and have C set to NA, and the duplicated row should keep C and have B set to NA. This is the desired output:

A1 <- c("NX300", "BT400", "BT400", "GD200", "GD200")
B1 <- c("M0102", "N0703", NA, "M0405", NA)
C1 <- c(NA, NA, "M0104", NA, "N0404")
df1 <- data.frame(A1,B1,C1)

As a first step, I tried duplicating the rows without replacing B with NA yet, but I get the following error:

rbind(df, df[,is.na(C)==FALSE])

Error: object "C" not found

Can anyone help please?

Upvotes: 1

Answers (3)

GKi

Reputation: 39657

If row order does not matter, you can continue from your first attempt:

x <- rbind(df, cbind(df[!is.na(df$C),1:2], C=NA))
x$B[!is.na(x$C)] <- NA

x
#       A     B     C
#1  NX300 M0102  <NA>
#2  BT400  <NA> M0104
#3  GD200  <NA> N0404
#21 BT400 N0703  <NA>
#31 GD200 M0405  <NA>
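
If row order does matter, one possible variation (a sketch, not part of the code above) builds the two sets of rows separately and interleaves them. The names has_c, first, second and out are only illustrative. It also shows the indexing the question's attempt was missing: the condition goes before the comma to select rows, and C has to be referenced as df$C.

```r
df <- data.frame(
  A = c("NX300", "BT400", "GD200"),
  B = c("M0102", "N0703", "M0405"),
  C = c(NA, "M0104", "N0404"),
  stringsAsFactors = FALSE
)

# Rows whose C is not NA; note df$C, not a bare C
has_c <- !is.na(df$C)

# Every original row keeps B and loses C
first <- df
first$C <- NA_character_

# Each duplicated row keeps C and loses B (row condition goes before the comma)
second <- df[has_c, ]
second$B <- NA_character_

# Stack both sets, then sort by the original row position so that each
# duplicate lands directly after the row it came from
out <- rbind(first, second)
out <- out[order(c(seq_len(nrow(df)), which(has_c))), ]
rownames(out) <- NULL
out
```

Ties in order() are resolved by position, so the row from first always precedes its duplicate from second.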

Upvotes: 1

G. Grothendieck

Reputation: 269481

Define a function newrows that accepts a one-row data frame x and returns either x unchanged (when C is NA) or the two rows that replace it, then apply it to each row of df. No packages are used.

newrows <- function(x) {
  if (is.na(x$C)) x 
  else rbind(replace(x, "C", NA), replace(x, "B", NA))
}
do.call("rbind", by(df, 1:nrow(df), newrows))

giving:

         A     B     C
1    NX300 M0102  <NA>
2.2  BT400 N0703  <NA>
2.21 BT400  <NA> M0104
3.3  GD200 M0405  <NA>
3.31 GD200  <NA> N0404
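
The row names in that output (2.2, 2.21 and so on) are generated while the pieces are bound together. If plain sequential row names are preferred, they can be reset afterwards. A small sketch, where res is just an illustrative name:

```r
df <- data.frame(
  A = c("NX300", "BT400", "GD200"),
  B = c("M0102", "N0703", "M0405"),
  C = c(NA, "M0104", "N0404")
)

# Same function as above: return the row unchanged, or split it in two
newrows <- function(x) {
  if (is.na(x$C)) x
  else rbind(replace(x, "C", NA), replace(x, "B", NA))
}

res <- do.call("rbind", by(df, 1:nrow(df), newrows))

# Drop the generated row names so they fall back to 1, 2, 3, ...
rownames(res) <- NULL
res
```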

Upvotes: 4

akrun

Reputation: 887038

An option would be

library(dplyr)
library(tidyr)  # uncount() comes from tidyr, not dplyr
df %>% 
   mutate(i1 = 1 + !is.na(C)) %>% 
   uncount(i1) %>% 
   mutate(B = replace(B, duplicated(B), NA)) %>% 
   group_by(A) %>%
   mutate(C = replace(C, duplicated(C, fromLast = TRUE), NA))
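
Note that duplicated(B) is evaluated over the whole column, so the pipeline above relies on the values in B being unique in the original data. A possible variation that avoids that assumption uses the .id argument of uncount() to label each copy. This is only a sketch; the column name copy and the object name out are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  A = c("NX300", "BT400", "GD200"),
  B = c("M0102", "N0703", "M0405"),
  C = c(NA, "M0104", "N0404"),
  stringsAsFactors = FALSE
)

out <- df %>%
  # 2 copies where C is present, 1 copy otherwise
  mutate(i1 = 1 + !is.na(C)) %>%
  # .id adds a column numbering the copies of each row (1, or 1 and 2)
  uncount(i1, .id = "copy") %>%
  # first copy keeps B and loses C; second copy keeps C and loses B
  mutate(C = replace(C, copy == 1, NA),
         B = replace(B, copy == 2, NA)) %>%
  select(-copy)

out
```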

Upvotes: 2
