Reputation: 1
I have a data frame with several columns. One of them contains plot IDs such as AEG1, AEG2, ..., AEG50, HEG1, HEG2, ..., HEG50, SEG1, SEG2, ..., SEG50, so the data frame has 150 rows. I want to change only some of these IDs so that they read AEG01, AEG02, ... instead of AEG1, AEG2, ... In other words, I just want to add a "0" to some of the column entries. I tried lapply, a for loop, and writing my own function, but nothing did the job. I always got this error message:
In if (nchar(as.character(dat_merge$EP_Plotid)) == 4)
paste(substr(dat_merge$EP_Plotid, ... :
the condition has length > 1 and only the first element will be used
So, this was my last try:
Plotid_func <- function(x) {
if(nchar(as.character(dat_merge$EP_Plotid))==4)
paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep="")
}
dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)
With this, I wanted to select only those column entries that are four characters long and add a 0 to only those entries. Can anybody help me? dat_merge is the name of my data frame and EP_Plotid is the column I want to edit. Thanks in advance.
Upvotes: 0
Views: 76
Reputation: 887118
Or use a combination of formatC with str_extract from library(stringr), with x taken from Ananda's post:
Extract the letters and the numbers separately.
Pad the numbers with leading zeros using formatC.
paste the two parts together.
library(stringr)
paste0(str_extract(x, "[[:alpha:]]+"), formatC(as.numeric(str_extract(x, "\\d+")), width = 2, flag = "0"))
#[1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06" "AEG07" "AEG08" "AEG09"
#[10] "AEG10" "AEG11" "AEG12" "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
#[19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"
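If stringr is not available, the same three steps can be sketched in base R with regmatches and regexpr. This is only an illustration and assumes every ID consists of letters followed by digits. The vector x below is a made-up sample, and the names prefix and number are hypothetical.

```r
# Hypothetical sample of plot IDs (letters followed by digits)
x <- c("AEG1", "AEG10", "HEG2", "SEG50")

# Step 1: extract the letter part and the digit part separately
prefix <- regmatches(x, regexpr("[[:alpha:]]+", x))
number <- as.integer(regmatches(x, regexpr("[0-9]+", x)))

# Step 2: pad the numbers to width 2 with leading zeros
padded <- formatC(number, width = 2, format = "d", flag = "0")

# Step 3: paste the two parts back together
result <- paste0(prefix, padded)
result
# [1] "AEG01" "AEG10" "HEG02" "SEG50"
```

Note that regmatches drops elements that do not match, so an ID without letters or without digits would shift the two vectors out of alignment.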
Upvotes: 0
Reputation: 92292
Or you can just modify your function to actually use the input variable x (which your original function does not do)
dat_merge <- data.frame(EP_Plotid = c("AEG1", "AEG2", "AEG50", "HEG1", "HEG2", "HEG50", "SEG1", "SEG2", "SEG50"))
Plotid_func <- function(x) {
if(nchar(as.character(x)) == 4){
paste(substr(x, 1, 3), "0", substr(x, 4, 4), sep="")
} else as.character(x)
}
dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)
dat_merge
# EP_Plotid Plotid
# 1 AEG1 AEG01
# 2 AEG2 AEG02
# 3 AEG50 AEG50
# 4 HEG1 HEG01
# 5 HEG2 HEG02
# 6 HEG50 HEG50
# 7 SEG1 SEG01
# 8 SEG2 SEG02
# 9 SEG50 SEG50
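To see where the "condition has length > 1" message in the question comes from, it can help to compare the condition computed on the whole column with the condition computed on a single element. A minimal sketch, using a small made-up vector ids in place of dat_merge$EP_Plotid:

```r
# Hypothetical sample standing in for dat_merge$EP_Plotid
ids <- c("AEG1", "AEG2", "AEG50")

# On the whole column, nchar() returns one value per element,
# so the comparison yields a logical vector, not a single TRUE/FALSE
whole_column <- nchar(ids) == 4
length(whole_column)
# [1] 3

# On a single element (what sapply passes in as x), the condition
# has length 1, which is what if() expects
one_element <- nchar(ids[1]) == 4
length(one_element)
# [1] 1
```

Because the original function referred to dat_merge$EP_Plotid instead of x, every call evaluated the first form.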
A vectorized version of your function (which is much better than using sapply, which is just a for loop) would be
dat_merge$Plotid <- ifelse(
  nchar(as.character(dat_merge$EP_Plotid)) == 4,
  paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep = ""),
  as.character(dat_merge$EP_Plotid)
)
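As a possible alternative, the same conditional padding can be sketched with a single vectorized sub call. This swaps in a regular expression for the length check and assumes every ID is a run of letters followed by digits. The small data frame below is a made-up sample.

```r
# Hypothetical sample data frame
dat_merge <- data.frame(
  EP_Plotid = c("AEG1", "AEG2", "AEG50", "HEG1", "SEG50"),
  stringsAsFactors = FALSE
)

# The pattern matches only IDs that end in exactly one digit.
# In the replacement, "\\1" is the letter prefix, "0" is a literal zero,
# and "\\2" is the single digit. IDs that do not match are left unchanged.
dat_merge$Plotid <- sub("^([A-Za-z]+)([0-9])$", "\\10\\2",
                        as.character(dat_merge$EP_Plotid))
dat_merge$Plotid
# [1] "AEG01" "AEG02" "AEG50" "HEG01" "SEG50"
```

Unlike the nchar check, this does not depend on the prefix being exactly three characters long.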
Upvotes: 1
Reputation: 193517
Just extract the "string" portion and the "numeric" portion and paste them back together after using sprintf on the numeric portion.
An example:
## "x" is the "column" of plot ids. Here I go up to 12
## to demonstrate the zero padding that it sounds like
## you're looking for
x <- c(paste0("AEG", 1:12), paste0("HEG", 1:12))
## Extract the string values
Strings <- gsub("([A-Z]+)(.*)", "\\1", x)
## Extract the numeric values
Nums <- gsub("([A-Z]+)(.*)", "\\2", x)
## Put them back together
paste0(Strings, sprintf("%02d", as.numeric(Nums)))
# [1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06"
# [7] "AEG07" "AEG08" "AEG09" "AEG10" "AEG11" "AEG12"
# [13] "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
# [19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"
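To apply this to a data frame column, the steps above could be wrapped in a small helper. The following is a sketch only: the function name pad_plotid and its width argument are hypothetical, and it assumes every ID is a run of letters followed by digits (anything else would produce NA with a warning).

```r
# Hypothetical helper: pad the numeric part of each ID to `width` digits
pad_plotid <- function(ids, width = 2) {
  ids <- as.character(ids)  # also handles factor columns
  pattern <- "^([A-Za-z]+)([0-9]+)$"
  prefix <- sub(pattern, "\\1", ids)              # letter part
  number <- as.integer(sub(pattern, "\\2", ids))  # numeric part
  # Build a format such as "%02d" and apply it to the whole vector
  paste0(prefix, sprintf(paste0("%0", width, "d"), number))
}

pad_plotid(c("AEG1", "HEG12", "SEG50"))
# [1] "AEG01" "HEG12" "SEG50"
```

With the data frame from the question, usage would look like dat_merge$Plotid <- pad_plotid(dat_merge$EP_Plotid).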
Upvotes: 1