Reputation: 79
I have a dataframe with 2 columns that I want to merge:
Region PA
1 Mbeya Ruaha National Park
2 Mbeya Ruaha National Park
3 Mbeya Ruaha National Park
4 Mbeya Ruaha National Park
5 Mbeya Ruaha National Park
6 Mbeya Ruaha National Park
7 Mbeya NA
8 Mbeya NA
9 Mbeya NA
10 Mbeya NA
This could be merged by either taking PA values and overwriting Region values in the rows, or replacing all NA's in PA with the values in Region for that row.
I've tried:
Carcass.cleaned$New<-rowSums(Carcass.cleaned[, c("PA", "Region")], na.rm=T)
Error in base::rowSums(x, na.rm = na.rm, dims = dims, ...) :
'x' must be numeric
with(Carcass.cleaned,ifelse(is.na(PA),Region,PA))
(returns list of numbers)
and coalesce(Carcass.cleaned$PA, Carcass.cleaned$Region)
unite(Carcass.cleaned, new, PA:Region, sep='')
(both merge the columns names instead of replacing)
Upvotes: 2
Views: 59
Reputation: 388797
You can replace for NA
values in PA
with corresponding Region
value.
df$PA[is.na(df$PA)] <- df$Region[is.na(df$PA)]
df
# Region PA
#1 Mbeya Ruaha National Park
#2 Mbeya Ruaha National Park
#3 Mbeya Ruaha National Park
#4 Mbeya Ruaha National Park
#5 Mbeya Ruaha National Park
#6 Mbeya Ruaha National Park
#7 Mbeya Mbeya
#8 Mbeya Mbeya
#9 Mbeya Mbeya
#10 Mbeya Mbeya
Upvotes: 0
Reputation: 886938
We could use coalesce
from dplyr
library(dplyr)
df %>%
mutate(PA = coalesce(PA, Region))
# Region PA
#1 Mbeya Ruaha National Park
#2 Mbeya Ruaha National Park
#3 Mbeya Ruaha National Park
#4 Mbeya Ruaha National Park
#5 Mbeya Ruaha National Park
#6 Mbeya Ruaha National Park
#7 Mbeya Mbeya
#8 Mbeya Mbeya
#9 Mbeya Mbeya
#10 Mbeya Mbeya
Or using fcoalesce
in data.table
library(data.table)
setDT(df)[, PA := fcoalesce(PA, Region)]
df <- structure(list(Region = c("Mbeya", "Mbeya", "Mbeya", "Mbeya",
"Mbeya", "Mbeya", "Mbeya", "Mbeya", "Mbeya", "Mbeya"), PA = c("Ruaha National Park",
"Ruaha National Park", "Ruaha National Park", "Ruaha National Park",
"Ruaha National Park", "Ruaha National Park", NA, NA, NA, NA)), row.names = c(NA,
-10L), class = "data.frame")
Upvotes: 1
Reputation: 39585
Try with mutate()
from dplyr
:
library(tidyr)
library(dplyr)
#Code
df <- df %>% group_by(Region) %>%
mutate(PA=ifelse(is.na(PA),Region,PA))
Output:
# A tibble: 10 x 2
# Groups: Region [1]
Region PA
<chr> <chr>
1 Mbeya Ruaha National Park
2 Mbeya Ruaha National Park
3 Mbeya Ruaha National Park
4 Mbeya Ruaha National Park
5 Mbeya Ruaha National Park
6 Mbeya Ruaha National Park
7 Mbeya Mbeya
8 Mbeya Mbeya
9 Mbeya Mbeya
10 Mbeya Mbeya
Some data used:
#Data
df <- structure(list(Region = c("Mbeya", "Mbeya", "Mbeya", "Mbeya",
"Mbeya", "Mbeya", "Mbeya", "Mbeya", "Mbeya", "Mbeya"), PA = c("Ruaha National Park",
"Ruaha National Park", "Ruaha National Park", "Ruaha National Park",
"Ruaha National Park", "Ruaha National Park", NA, NA, NA, NA)), row.names = c(NA,
-10L), class = "data.frame")
Upvotes: 2
Reputation: 786
You can use a simple if else statement:
df$Region <- ifelse(is.na(df$PA), df$Region, df$PA)
Basically wherever PA is NA you leave Region untouched, and where PA has a value you overwrite the value in Region. Afterwords you can delete PA if you want
Upvotes: 2