Reputation: 523
I am trying to merge rows in my data.frame
based on <NA>
value.
Here is my data frame.
new <- data.frame (
Location = c(rep("Loc 1", 4), rep("Loc 2", 4)),
Place = c("Powder Springs_Original", "Bridge_Other County", "Airport", "County1", "City 4 - Duplicated", "South", "County2", "Formal place"),
Val1 = c(109, 123, NA, 117, 143, NA, 151, 142),
Val2 = c(102, 115, NA, 45, 135, NA, 144, 125),
Val3 = c(99, 112, NA, 26, 127, NA, 140, 132),
Val4 = c(90, 103, NA, 57, 125, NA, 135, 201))
I am expecting something like,
Location Place Val1 Val2 Val3 Val4
Loc 1 Powder Springs - Original 109 102 99 90
Loc 1 Bridge _ Other County 123 115 112 103
Loc 1 Airport County1 117 45 26 57
Loc 2 City 4 - Duplicated 143 135 127 125
Loc 2 South County2 151 144 140 135
Loc 2 Formal place 142 125 132 201
I want to remove the NA rows and merge data with the next row. Location for these values is same. Can someone please help me here.
Thanks in advance.
Upvotes: 2
Views: 782
Reputation: 523
Thanks for your help and support. After lot of trails, I got the below required output. (As suggested by @Robert Krzyzanowski, I renamed my data.frame
to Test
).
This is what I did. Please suggest, if anything weird is observed.
> new_DF <- subset(Test, is.na(Test$Val1))
> new_DF
Location Place Val1 Val2 Val3 Val4
3 Loc 1 Airport NA NA NA NA
6 Loc 2 South NA NA NA NA
>
> row.names(new_DF)
[1] "3" "6"
> x.num <- as.numeric(row.names(new_DF))
>
> Test$Place <- as.character(Test$Place)
> Test$Place[x.num + 1] <- paste(Test$Place[x.num], Test$Place[x.num + 1])
> Test <- Test[-x.num, ]
> Test
Location Place Val1 Val2 Val3 Val4
1 Loc 1 Powder Springs_Original 109 102 99 90
2 Loc 1 Bridge_Other County 123 115 112 103
4 Loc 1 Airport County1 117 45 26 57
5 Loc 2 City 4 - Duplicated 143 135 127 125
7 Loc 2 South County2 151 144 140 135
8 Loc 2 Formal place 142 125 132 201
Once again, thank you all for your support and your time for looking into this.
Upvotes: 0
Reputation: 6449
nu <- data.frame (
Location = c(rep("Loc 1", 4), rep("Loc 2", 4)),
Place = c("Powder Springs_Original", "Bridge_Other County", "Airport", "County1", "City 4 - Duplicated", "South", "County2", "Formal place"),
Val1 = c(109, 123, NA, 117, 143, NA, 151, 142),
Val2 = c(102, 115, NA, 45, 135, NA, 144, 125),
Val3 = c(99, 112, NA, 26, 127, NA, 140, 132),
Val4 = c(90, 103, NA, 57, 125, NA, 135, 201), stringsAsFactors=FALSE)
# notice stringsAsFactors = FALSE
# if there was justice in the world, it should be FALSE by default in R
# in any case, nu$Place should be character rather than factor so in real data
# you may need to do nu$Place <- as.character(nu$Place)
ic <- which(!complete.cases(nu))
nu$Place[ic-1] <- paste(nu$Place[ic-1], nu$Place[ic])
nu <- nu[-ic,]
Does this do what you need?
Upvotes: 0
Reputation: 9344
First off, you shouldn't be using new
as your variable name since it's a built-in R function. Second, you could do something like this:
# Find which rows are NA
na_rows <- which(apply(new, 1, function(x) all("NA" == (x[paste0('Val', 1:4)]))))
# Set correct place names
new$Place <- as.character(new$Place)
new$Place[na_rows + 1] <- paste(new$Place[na_rows], new$Place[na_rows + 1])
# Remove NAs
new <- new[-na_rows, ]
# Location Place Val1 Val2 Val3 Val4
# 1 Loc 1 Powder Springs_Original 109 102 99 90
# 2 Loc 1 Bridge_Other County 123 115 112 103
# 4 Loc 1 Airport County1 117 45 26 57
# 5 Loc 2 City 4 - Duplicated 143 135 127 125
# 7 Loc 2 South County2 151 144 140 135
# 8 Loc 2 Formal place 142 125 132 201
Upvotes: 1