Reputation: 105
I'm trying to fix a column in a data frame, but my code is taking too long to run. I want to find the entries that are exactly 4 characters long and prepend a zero to each of them. The data frame has 2,608,475 rows.
This is the code I've written in R:
i <- NULL
for (i in 1:length(cest07$CNAE.2.0.Classe)) {
  if (nchar(cest07$CNAE.2.0.Classe[i]) == 4) {
    cest07$CNAE.2.0.Classe[i] <- paste("0", cest07$CNAE.2.0.Classe[i], sep = "")
  }
}
Could someone help?
Upvotes: 0
Views: 1252
Reputation: 68819
Here is a vectorized version, which operates on the whole vector at once instead of looping over it element by element:
### create example data set
set.seed(1)
str_len <- rpois(25, 1.2) + 1
tmp <- sapply(str_len, function(x) paste(LETTERS[seq_len(x)], collapse=""))
tmp
# [1] "A" "AB" "AB" "ABCD" "A" "ABCD" "ABCD" "AB" "AB"
# [10] "A" "A" "A" "ABC" "AB" "ABC" "AB" "ABC" "ABCDE"
# [19] "AB" "ABC" "ABCD" "A" "AB" "A" "A"
### prepend '0'
ind <- (nchar(tmp) == 4)
tmp[ind] <- paste0("0", tmp[ind])
tmp
# [1] "A" "AB" "AB" "0ABCD" "A" "0ABCD" "0ABCD" "AB" "AB"
# [10] "A" "A" "A" "ABC" "AB" "ABC" "AB" "ABC" "ABCDE"
# [19] "AB" "ABC" "0ABCD" "A" "AB" "A" "A"
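The same idea can be applied directly to the data frame column from the question. This is only a sketch: the toy `cest07` below is a made-up stand-in for the real data, and the `as.character()` conversion and the `NA` guard are assumptions, added in case the column was read in as a factor or number or contains missing values.

```r
# Hypothetical toy stand-in for the real cest07 data frame (values are made up)
cest07 <- data.frame(
  CNAE.2.0.Classe = c("1113", "47113", "1121", "10112", "862"),
  stringsAsFactors = FALSE
)

# Work on a character copy, in case the column is a factor or numeric
classe <- as.character(cest07$CNAE.2.0.Classe)

# Logical index of the entries that are exactly 4 characters long;
# missing values are excluded explicitly
ind <- !is.na(classe) & nchar(classe) == 4

# Prepend "0" to all matching entries in one vectorized step
classe[ind] <- paste0("0", classe[ind])

# Write the fixed values back to the data frame
cest07$CNAE.2.0.Classe <- classe

cest07$CNAE.2.0.Classe
# [1] "01113" "47113" "01121" "10112" "862"
```

Because `nchar()` and `paste0()` each process the whole vector in a single call, the per-row loop and the repeated modification of the data frame are avoided.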
Upvotes: 3