Reputation: 13
I need to change individual identifiers that are currently alphabetical to numerical. I have created a data frame where each alphabetical identifier is associated with a number
individuals num.individuals (g4)
1 ZYO 64
2 KAO 24
3 MKU 32
4 SAG 42
What I need to replace ZYO with the number 64 in my main data frame (g3) and like wise for all the other codes.
My main data frame (g3) looks like this
SAG YOG GOG BES ATR ALI COC CEL DUN EVA END GAR HAR HUX ISH INO JUL
1 2
2 2 EVA
3 SAG 2 EVA
4 2
5 SAG 2
6 2
Now on a small scale I can write a code to change it like I did with ATR
g3$ATR <- as.character(g3$ATR)
g3[g3$target == "ATR" | g3$ATR == "ATR","ATR"] <- 2
But this is time consuming and increased chance of human error.
I know there are ways to do this on a broad scale with NAs
I think maybe we could do a for loop for this, but I am not good enough to write one myself.
I have also been trying to use this function which I feel like may work but I am not sure how to logically build this argument, it was posted on the questions board here Fast replacing values in dataframe in R
df <- as.data.frame(lapply(df, function(x){replace(x, x <0,0)})
I have tried to work my data into this by
df <- as.data.frame(lapply(g4, function(g3){replace(x, x <0,0)})
Upvotes: 1
Views: 476
Reputation: 4615
Here is one approach using the data.table
package:
First, create a reproducible example similar to your data:
require(data.table)
ref <- data.table(individuals=1:4,num.individuals=c("ZYO","KAO","MKU","SAG"),g4=c(64,24,32,42))
g3 <- data.table(SAG=c("","SAG","","SAG"),KAO=c("KAO","KAO","",""))
Here is the ref
table:
individuals num.individuals g4
1: 1 ZYO 64
2: 2 KAO 24
3: 3 MKU 32
4: 4 SAG 42
And here is your g3
table:
SAG KAO
1: KAO
2: SAG KAO
3:
4: SAG
And now we do our find and replacing:
g3[ , lapply(.SD,function(x) ref$g4[chmatch(x,ref$num.individuals)])]
And the final result:
SAG KAO
1: NA 24
2: 42 24
3: NA NA
4: 42 NA
And if you need more speed, the fastmatch
package might help with their fmatch
function:
require(fastmatch)
g3[ , lapply(.SD,function(x) ref$g4[fmatch(x,ref$num.individuals)])]
SAG KAO
1: NA 24
2: 42 24
3: NA NA
4: 42 NA
Upvotes: 1