user2983121

Reputation: 1

R: edit column values by using if condition

I have a data frame with several columns. One of them contains plot IDs such as AEG1, AEG2, ..., AEG50, HEG1, HEG2, ..., HEG50, SEG1, SEG2, ..., SEG50, so the data frame has 150 rows. I want to change only some of these IDs so that they read AEG01, AEG02, ... instead of AEG1, AEG2, ... In other words, I just want to insert a "0" into the entries that have a single-digit number. I tried lapply, a for loop, and writing a function, but nothing did the job. I always got this warning message:

In if (nchar(as.character(dat_merge$EP_Plotid)) == 4)
paste(substr(dat_merge$EP_Plotid,  ... :
the condition has length > 1 and only the first element will be used

So, this was my last try:

Plotid_func <- function(x) {
if(nchar(as.character(dat_merge$EP_Plotid))==4)
paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep="")
}

dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)

With this, I wanted to select only the column entries that are four characters long and add a 0 to just those entries. Can anybody help me? dat_merge is the name of my data frame and EP_Plotid is the column I want to edit. Thanks in advance.

Upvotes: 0

Views: 76

Answers (3)

akrun

Reputation: 887118

Or combine formatC with str_extract from the stringr package:

 library(stringr)

Here x is the vector from Ananda's answer. Extract the letters and the digits separately, zero-pad the digits with formatC, and paste the two parts back together:

 paste0(str_extract(x, "[[:alpha:]]+"), formatC(as.numeric(str_extract(x, "\\d+")), width = 2, flag = "0"))
 #[1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06" "AEG07" "AEG08" "AEG09"
 #[10] "AEG10" "AEG11" "AEG12" "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
 #[19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"
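
If stringr is not available, a single base R sub() call might do the same job. This is only a sketch, and it assumes every ID is a run of letters followed by digits, with padding needed only when exactly one digit follows the letters. R's replacement strings only support backreferences \1 to \9, so "\\10" here means group 1 followed by a literal 0.

```r
# Same example vector as in Ananda's answer
x <- c(paste0("AEG", 1:12), paste0("HEG", 1:12))

# Match IDs that end in exactly one digit and insert a "0" before it;
# IDs with two or more digits do not match and are returned unchanged
padded <- sub("^([[:alpha:]]+)([0-9])$", "\\10\\2", x)

padded[c(1, 9, 10, 13)]
```

Because non-matching entries pass through untouched, this can be assigned straight back to the column.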

Upvotes: 0

David Arenburg

Reputation: 92292

Or you can just modify your function to actually use its argument x, which your original function never does:

dat_merge <- data.frame(EP_Plotid = c("AEG1", "AEG2", "AEG50", "HEG1", "HEG2", "HEG50", "SEG1", "SEG2", "SEG50"))

Plotid_func <- function(x) {
  if(nchar(as.character(x)) == 4){
    paste(substr(x, 1, 3), "0", substr(x, 4, 4), sep="") 
  } else as.character(x)
}

dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)
dat_merge

#   EP_Plotid Plotid
# 1      AEG1  AEG01
# 2      AEG2  AEG02
# 3     AEG50  AEG50
# 4      HEG1  HEG01
# 5      HEG2  HEG02
# 6     HEG50  HEG50
# 7      SEG1  SEG01
# 8      SEG2  SEG02
# 9     SEG50  SEG50

A vectorized version of your function would be much better than sapply, which is essentially just a for loop:

dat_merge$Plotid <- ifelse(nchar(as.character(dat_merge$EP_Plotid)) == 4,
                           paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep = ""),
                           as.character(dat_merge$EP_Plotid))
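
To see where the original warning comes from, it may help to look at the condition on its own. The sketch below uses a small made-up vector ids (not your data): the comparison returns one logical per element, but if() expects a single TRUE or FALSE, while ifelse() works element by element. In R 4.2.0 and later, passing a longer condition to if() is an error rather than a warning.

```r
# Hypothetical sample IDs, just for illustration
ids <- c("AEG1", "AEG50", "HEG2")

# The comparison is vectorized: one TRUE/FALSE per ID
cond <- nchar(ids) == 4
length(cond)

# ifelse() picks from the "yes" or "no" vector element by element
padded <- ifelse(cond,
                 paste0(substr(ids, 1, 3), "0", substr(ids, 4, 4)),
                 ids)
padded
```

Here cond has length 3, which is exactly what if() complains about, and padded comes out as "AEG01" "AEG50" "HEG02".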

Upvotes: 1

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

Just extract the "string" portion and the "numeric" portion and paste them back together after using sprintf on the numeric portion.

An example:

## "x" is the "column" of plot ids. Here I go up to 12
##    to demonstrate the zero padding that it sounds like
##    you're looking for
x <- c(paste0("AEG", 1:12), paste0("HEG", 1:12))

## Extract the string values
Strings <- gsub("([A-Z]+)(.*)", "\\1", x)

## Extract the numeric values
Nums <- gsub("([A-Z]+)(.*)", "\\2", x)

## Put them back together
paste0(Strings, sprintf("%02d", as.numeric(Nums)))
#  [1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06"
#  [7] "AEG07" "AEG08" "AEG09" "AEG10" "AEG11" "AEG12"
# [13] "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
# [19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"
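
Applied to the data frame from the question, the same idea could be wrapped up roughly like this. The helper name pad_plotid and its width argument are made up for this sketch, and it assumes every ID is a non-digit prefix followed by a number. The as.character() call is there in case the column is a factor.

```r
# Hypothetical helper: split each ID into prefix and number, then zero-pad
pad_plotid <- function(ids, width = 2) {
  ids <- as.character(ids)                        # handles factor columns too
  prefix <- sub("[0-9]+$", "", ids)               # everything before the trailing digits
  number <- as.integer(sub("^[^0-9]*", "", ids))  # the trailing digits as an integer
  paste0(prefix, sprintf(paste0("%0", width, "d"), number))
}

# Small stand-in for the real data frame
dat_merge <- data.frame(EP_Plotid = c("AEG1", "AEG2", "AEG50", "HEG1", "SEG50"))
dat_merge$Plotid <- pad_plotid(dat_merge$EP_Plotid)
dat_merge
```

Entries that already have two digits, such as AEG50, come back unchanged, so there is no need to select the four-character IDs first.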

Upvotes: 1
