Fill a column's blank spaces contingent on a second column in R

Question

I'd appreciate some help with this one. I have something similar to the data below.

df$A df$B
1    .
1    .
1    .
1    6
2    .
2    .
2    7

What I need to do is fill the blanks in df$B with the value found at the end of each run of identical values in df$A, so every row with A = 1 gets 6 and every row with A = 2 gets 7.

Any help would be welcome.

akrun · Accepted Answer

It seems the missing values are denoted by a period (.). It is better to read the dataset with na.strings="." so that the missing values become NA. For the current dataset, the 'B' column would be of character or factor class, depending on whether stringsAsFactors was FALSE or TRUE in the read.table/read.csv call (TRUE was the default before R 4.0.0).
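A minimal sketch of the na.strings suggestion, assuming the data is whitespace-separated text with a header row. The inline text string and the name df2 are stand-ins for illustration only, since the question does not show how the data was read:

```r
# Inline text stands in for the real file (an assumption; the question
# does not show the original read call).
txt <- "A B
1 .
1 .
1 .
1 6
2 .
2 .
2 7"

# na.strings = "." turns every "." into NA, so B is parsed as a number
# rather than as character/factor.
df2 <- read.table(text = txt, header = TRUE, na.strings = ".")

str(df2)
```

With B read this way, the as.numeric() conversion in the solutions below should no longer be needed.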

Using data.table, we convert the data.frame to a data.table (setDT(df1)) and change the 'character' class to 'numeric' (B := as.numeric(B)). This also coerces the . to NA (a warning will appear). Then, grouped by "A", we change the "B" values to the last element of each group (B := B[.N]).

library(data.table)
# the trailing [] prints the result, since := updates by reference and returns invisibly
setDT(df1)[, B := as.numeric(B)][, B := B[.N], by = A][]
#   A B
#1: 1 6
#2: 1 6
#3: 1 6
#4: 1 6
#5: 2 7
#6: 2 7
#7: 2 7

Or with dplyr

library(dplyr)
df1 %>%
  group_by(A) %>%
  mutate(B = as.numeric(tail(B, 1)))

Or using ave from base R

df1$B <- with(df1, as.numeric(ave(B, A, FUN=function(x) tail(x,1))))
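If the column has already been read as numeric with NA for the blanks, a variation on the ave approach could take the last non-missing value in each group instead of the last element. This is only a sketch under that assumption; df3 and last_non_na are hypothetical names introduced for illustration:

```r
# Hypothetical data with B already numeric and "." stored as NA
# (e.g. after reading with na.strings = ".").
df3 <- data.frame(A = c(1L, 1L, 1L, 1L, 2L, 2L, 2L),
                  B = c(NA, NA, NA, 6, NA, NA, 7))

# Return the last non-missing value of x. If the group has no
# non-missing value, tail() gives numeric(0), so fall back to a
# single NA to keep the assignment inside ave() valid.
last_non_na <- function(x) {
  v <- tail(x[!is.na(x)], 1)
  if (length(v) == 0) NA_real_ else v
}

# ave() applies the function within each level of A and recycles the
# single returned value across all rows of that group.
df3$B <- ave(df3$B, df3$A, FUN = last_non_na)
df3
```

Unlike tail(x, 1), this version still fills the group when the non-missing value is not on the group's final row.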

data

df1 <- structure(list(A = c(1L, 1L, 1L, 1L, 2L, 2L, 2L), B = c(".", 
".", ".", "6", ".", ".", "7")), .Names = c("A", "B"), 
class = "data.frame", row.names = c(NA, -7L))
