Reputation: 197
My data set in R looks like the one below, where I have multiple IDs and Years but not always the information for Street, State, and Country.
ID Year Street State Country
1 2000 123 Main St CA USA
1 2001 N/A N/A N/A
1 2002 N/A N/A N/A
...
1 2017 N/A N/A N/A
2 2001 123 Bloom Rd CA USA
2 2002 123 Bloom Rd CA USA
2 2003 N/A N/A N/A
...
2 2017 N/A N/A N/A
...
My goal is to fill out the N/As with the appropriate values i.e. the corresponding values per each ID. So for ID "1" there should be "123 Main Street" under the Street N/As and so on.
Thank you!
Upvotes: 0
Views: 46
Reputation: 5673
Here the solution with both data.tbale and dplyr
df <- read.table(text = "ID, Year, Street, State, Country
1, 2000, 123 Main St, CA, USA
1, 2001, N/A, N/A, N/A
1, 2002, N/A, N/A, N/A
1, 2017, N/A, N/A, N/A
2, 2001, 123 Bloom Rd, CA, USA
2, 2002, 123 Bloom Rd, CA, USA
2, 2003, N/A, N/A, N/A
2, 2017, N/A, N/A, N/A",header = T,sep = ",")
library(dplyr)
df %>%
group_by(ID) %>%
mutate_at(vars('Street', 'State', 'Country'), funs(.[!is.na(.)][1]))
library(data.table)
df <- setDT(df)
coltochange <- c("Street", "State", "Country")
df[, c(coltochange) := lapply(.SD,function(x){x[!is.na(x)][1]}),.SDcols = coltochange ,by = ID]
You can consider the na.locf function from zoo:
library(zoo)
na.locf(df)
Upvotes: 2
Reputation: 2246
Try the tidyverse
approach:
df <- read_table("ID Year Street State Country #importing the data
1 2000 123_Main_St CA USA
1 2001 N/A N/A N/A
1 2002 N/A N/A N/A
1 2017 N/A N/A N/A
2 2001 123_Bloom_Rd CA USA
2 2002 123_Bloom_Rd CA USA
2 2003 N/A N/A N/A
2 2017 N/A N/A N/A") %>%
separate("ID Year Street State Country", c("ID", "Year", "Street", "State", "Country"), sep = " ") %>% # cleaning the columns
group_by(ID) %>% # grouping by vars with same ID(Information)
mutate_at(vars('Street', 'State', 'Country'), funs(.[.!= "N/A"][1])) # replace NA with information of same ID without NA (remember NA is still a string from import)
Upvotes: 0