Rafael
Rafael

Reputation: 3196

fill in missing values based on unsorted id

I need to fill in the NA based on matching ID. However, it's impossible to sort this data.frame so that the relevant NA always happen before (or after) the unique ID so I can't use loc.na

It's hard to come up with an example that isn't my full data.frame but I think this works.. sample data after attempted sort:

df
id value col1
1  NA    ABC  
1  2     LMN  
2  3     RGG
2  NA    Z    
3  5     H  
4  7     QRS  

To look like:

id value col1
1  2     ABC  
1  2     LMN  
2  3     RGG
2  3     Z    
3  5     H  
4  7     QRS  

Or is there a way to use loc.na that does not rely on sorting?

Upvotes: 3

Views: 85

Answers (2)

moodymudskipper
moodymudskipper

Reputation: 47320

wouldn't this work ?

matching_table <- unique(df[,c("id","value"))
new_df <- merge(df[,-2],matching_table,all.x = TRUE)

Upvotes: 0

Jeppe Olsen
Jeppe Olsen

Reputation: 1008

df  <- read.table(header = TRUE,
                 text = "id value col1
                  1  NA    ABC  
                  1  2     LMN  
                  2  3     RGG
                  2  NA    Z    
                  3  5     H  
                  4  7     QRS  
")

First you could order the data frame, for example with the package data.table:

library(data.table)
setorder(df, id, value)

> df
  id value col1
1  1    NA  ABC
2  1     2  LMN
4  2    NA    Z
3  2     3  RGG
5  3     5    H
6  4     7  QRS

If you want it to be after, you could just setorder(df, id, value, na.last=T). Finally you can replace NA's:

df$value[is.na(df$value)] <- df$value[which(is.na(df$value))+1]

And then sort it again:

setorder(df, id, col1)

> print(df, row.names = F)
 id value col1
  1     2  ABC
  1     2  LMN
  2     3  RGG
  2     3    Z
  3     5    H
  4     7  QRS

Upvotes: 1

Related Questions