Reputation: 763
I have the following column from a dataframe
df <- data.frame(
crime = as.character(c(115400, 171200, 91124, 263899, 67601, 51322)),
stringsAsFactors=FALSE
)
I am using a for loop to extract the first one or two digits, depending on the length of the string, as shown below:
for (i in df$crime) {
  if (nchar(i) == 6) {
    print(substring(i, 1, 2))
  } else {
    print(substring(i, 1, 1))
  }
}
When I run this loop, I get the following output, which is what I want:
[1] "11"
[1] "17"
[1] "9"
[1] "26"
[1] "6"
[1] "5"
However, I want this to be saved as a standalone vector. How do I do that?
Upvotes: 1
Views: 809
Reputation: 6486
I can imagine some situations where keeping the extracted codes within the original data frame is useful.
I'll use the data.table package as it's fast, which may be handy if your data is big.
library(data.table)
# convert your data.frame to data.table
setDT(df)
# filter the rows where crime length is 6,
# and assign the first two characters of
# it into a new variable "extracted".
# some rows now have NAs in the new
# field. The last [] prints it to screen.
df[nchar(crime) == 6, extracted := substring(crime, 1, 2)][]
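If you also want the one-digit codes filled in and the result pulled out as a standalone vector, something like the following might work. It is only a sketch: the name codes is made up for illustration, and it replaces the row filter with the ifelse-based end position used in the other answers, so no NAs remain.

```r
library(data.table)

# Same data as in the question
df <- data.frame(
  crime = as.character(c(115400, 171200, 91124, 263899, 67601, 51322)),
  stringsAsFactors = FALSE
)
setDT(df)

# End position is 2 for six-character codes and 1 otherwise,
# so every row gets a value in the new column
df[, extracted := substring(crime, 1, ifelse(nchar(crime) == 6, 2, 1))]

# Pull the column out as a plain character vector
# ("codes" is a hypothetical name)
codes <- df$extracted
```

After this, codes is an ordinary character vector that no longer depends on the data.table.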
Upvotes: 0
Reputation: 388807
Using regex:
output <- with(df, ifelse(nchar(crime) == 6, sub("(..).*", "\\1", crime),
sub("(.).*", "\\1", crime)))
output
#[1] "11" "17" "9" "26" "6" "5"
It becomes a little simpler with str_extract from stringr:
with(df, ifelse(nchar(crime) == 6, stringr::str_extract(crime, ".."),
stringr::str_extract(crime, ".")))
Upvotes: 0
Reputation: 101034
Here is a base R solution with ifelse + substring:
res <- with(df, substring(crime,1,ifelse(nchar(crime) == 6, 2, 1)))
such that
> res
[1] "11" "17" "9" "26" "6" "5"
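To see why this works, it may help to look at the intermediate vector of end positions. The sketch below assumes the same crime values as in the question; the name last_pos is only for illustration. substring is vectorized over its last argument, so each string is paired with its own end position.

```r
crime <- c("115400", "171200", "91124", "263899", "67601", "51322")

# One end position per element: 2 for six-character strings, 1 otherwise
last_pos <- ifelse(nchar(crime) == 6, 2, 1)

# Each string is cut at its own end position
res <- substring(crime, 1, last_pos)
```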
Upvotes: 2
Reputation: 886938
substr/substring are vectorized, so we can use ifelse:
v1 <- with(df, ifelse(nchar(crime) == 6, substr(crime, 1, 2), substr(crime, 1, 1)))
v1
#[1] "11" "17" "9" "26" "6" "5"
In the OP's for loop, a vector can be initialized to store the output of each iteration:
v1 <- character(nrow(df))
for (i in seq_along(df$crime)) {
  if (nchar(df$crime[i]) == 6) {
    v1[i] <- substring(df$crime[i], 1, 2)
  } else {
    v1[i] <- substring(df$crime[i], 1, 1)
  }
}
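If you prefer to keep the per-element logic of the loop without managing the index yourself, vapply is one possible alternative. This is a sketch rather than the only way to do it; the helper name first_digits and the result name v2 are made up for illustration.

```r
crime <- c("115400", "171200", "91124", "263899", "67601", "51322")

# Hypothetical helper: same condition as the loop body, for a single string
first_digits <- function(x) {
  if (nchar(x) == 6) substring(x, 1, 2) else substring(x, 1, 1)
}

# character(1) declares that each call must return one string;
# USE.NAMES = FALSE keeps the result an unnamed vector
v2 <- vapply(crime, first_digits, character(1), USE.NAMES = FALSE)
```

Unlike print inside a loop, the value returned by vapply can be assigned directly, which is what makes it a standalone vector.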
Upvotes: 2