Reputation: 377
I play with the following two simple datasets:
(myData <- data.frame(ID=c(1:7, 5), Sum=c(10, 20, 30, 40, 50, 60, 700, 200)))
# ID Sum
# 1 1 10
# 2 2 20
# 3 3 30
# 4 4 40
# 5 5 50
# 6 6 60
# 7 7 700
# 8 5 200
and
(myMap <- data.frame(ID=c(1:5, 7), Name=c("a", "b", "c", "d", "e", "g")))
# ID Name
# 1 1 a
# 2 2 b
# 3 3 c
# 4 4 d
# 5 5 e
# 6 7 g
I will map the data with the map, this way:
myData$Name<-myMap$Name[match(myData$ID, myMap$ID)]
However since there is no map entry for the ID == 6
, the output is:
ID Sum Name
1 1 10 a
2 2 20 b
3 3 30 c
4 4 40 d
5 5 50 e
6 6 60 <NA>
7 7 700 g
8 5 200 e
What I am trying to do now: in the record where Name
is NA
, the Name
should become ID
.
My attempts:
myData$Dummy<-ifelse( is.na(myData$Name),myData$ID, myData$Name)
or
for (i in 1:length(myData$Name) )
if (is.na(myData$Name[i]))
{
x <- myData$ID[i]
# print(x)
myData$Name[i]<- as.factor(x)
print(myData$Name[i])
}
are wrong. Could you please give me a hint?
Upvotes: 2
Views: 604
Reputation: 887951
An option using data.table
library(data.table)#1.9.5+
setkey(setDT(myData), ID)[myMap, Name:=i.Name][is.na(Name),
Name:= as.character(ID)]
# ID Sum Name
#1: 1 10 a
#2: 2 20 b
#3: 3 30 c
#4: 4 40 d
#5: 5 50 e
#6: 5 200 e
#7: 6 60 6
#8: 7 700 g
NOTE: As commented by @Arun, in the devel version v1.9.5
, we can also set the key as an argument inside setDT
, i.e. setDT(myData, key='ID')
Upvotes: 2
Reputation: 78842
It's the fact that the column you think is character is really a factor. Either use stringsAsFactors=FALSE
when creating the data frame or you'll need to account for it when manipulating the data. I've provided dplyr
+ piping and base R solutions below. Note the use of left_join
(dplyr
) or merge
(base) vs your subset & matching:
library(dplyr)
myData <- read.csv(text="ID;Sum
1;10
2;20
3;30
4;40
5;50
6;60
7;700
5;200", sep=";")
myMap <- read.csv(text="ID;Name
1;a
2;b
3;c
4;d
5;e
7;g", sep=";")
# dplyr -------------------------------------------------------------------
myData %>%
left_join(myMap) %>%
mutate(Name=as.character(Name),
Name=ifelse(is.na(Name), ID, Name)) -> dplyr_myData
## Joining by: "ID"
dplyr_myData
## ID Sum Name
## 1 1 10 a
## 2 2 20 b
## 3 3 30 c
## 4 4 40 d
## 5 5 50 e
## 6 6 60 6
## 7 7 700 g
## 8 5 200 e
# base --------------------------------------------------------------------
base_myData <- merge(myData, myMap, all.x=TRUE)
base_myData$Name <- as.character(base_myData$Name)
base_myData$Name <- ifelse(is.na(base_myData$Name),
base_myData$ID, base_myData$Name)
base_myData
## ID Sum Name
## 1 1 10 a
## 2 2 20 b
## 3 3 30 c
## 4 4 40 d
## 5 5 50 e
## 6 5 200 e
## 7 6 60 6
## 8 7 700 g
Upvotes: 5