Reputation: 1642
How can I keep only the leading letter and the first two digits of a value that starts with a letter, and leave all other values unchanged?
Value = I240 G460 1560 S50
I would like to get:
Value = I24 G46 1560 S50
So far I am trying:
df <- df %>% mutate(Value = ifelse(str_detect(Value, '[A-Z]'~'\\d{2}')))
Upvotes: 1
Views: 559
Reputation: 101189
A base R option using gsub
> gsub("(^[[:alpha:]]\\d{2}).*","\\1",v)
[1] "I24" "G46" "1560" "S50"
data
v <- c("I240", "G460", "1560", "S50")
Upvotes: 0
Reputation: 887018
We could use coalesce with str_extract:
library(stringr)
library(dplyr)
df %>%
mutate(Value = coalesce(str_extract(Value, "^[A-Z]\\d{2}"), Value))
Or using str_remove with a regex lookaround:
df %>%
mutate(Value = str_remove(Value, "(?<=^[A-Z]\\d{2}).*"))
df <- structure(list(Value = c("I240", "G460", "1560", "S50")),
class = "data.frame", row.names = c(NA, -4L))
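The coalesce step works because str_extract returns NA when nothing matches, so the original value is used as the fallback. A rough base R sketch of that same "extract, then fall back" idea follows; the names v, matched, extracted and result are made up for illustration and are not part of the answer above.

```r
# Illustrative vector mirroring the question's data
v <- c("I240", "G460", "1560", "S50")

# TRUE where the value starts with a letter followed by two digits
matched <- grepl("^[A-Z][0-9]{2}", v)

# Mimic str_extract(): the 3-character match where it exists, NA otherwise
extracted <- ifelse(matched, substr(v, 1, 3), NA_character_)

# Mimic coalesce(): fall back to the original value wherever extraction gave NA
result <- ifelse(is.na(extracted), v, extracted)
print(result)
```

Values such as "1560" never match the pattern, so they come back untouched.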
Upvotes: 0
Reputation: 21400
A base R solution is this:
sub("^([A-Z]\\d\\d).*", "\\1", df$Value)
[1] "I24" "G46" "1560" "S50"
Here, we pick out the initial letter and the two digits following it in a capture group, which we select through the backreference \\1 in the replacement argument to sub.
Data:
df <- data.frame(Value = c("I240", "G460", "1560", "S50"))
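As a small sketch of how the capture group behaves, the snippet below runs the same sub call on a plain vector. The vector v and the extra value "AB12" are assumptions added for illustration; they show that strings not matching the pattern pass through unchanged.

```r
# Illustrative input; "AB12" is a made-up extra value, not from the question
v <- c("I240", "G460", "1560", "S50", "AB12")

# ^([A-Z]\\d\\d) captures a leading letter plus two digits; .* consumes the rest.
# Replacing the whole match with \\1 keeps only the captured part.
out <- sub("^([A-Z]\\d\\d).*", "\\1", v)

# Strings where the pattern does not match are returned as they are
print(out)
```

Because sub only rewrites strings in which the pattern matches, no separate ifelse branch is needed for values that start with a digit.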
Upvotes: 2
Reputation: 8880
library(tidyverse)
Value = c("I240", "G460", "1560", "S50")
df <- data.frame(Value = Value)
df %>%
mutate(out = ifelse(str_detect(Value, "^[A-Z]"), str_sub(Value, 1, 3), Value))
#> Value out
#> 1 I240 I24
#> 2 G460 G46
#> 3 1560 1560
#> 4 S50 S50
Created on 2021-01-28 by the reprex package (v1.0.0)
base
df$out <- with(df, ifelse(grepl("^[A-Z]", Value), substring(Value, 1, 3), Value))
df
#> Value out
#> 1 I240 I24
#> 2 G460 G46
#> 3 1560 1560
#> 4 S50 S50
Upvotes: 2
Reputation: 7385
Here's a hack solution - there is likely a better way:
library(dplyr)
df <- data.frame(Value = c("I240", "G460", "1560", "S50"))
df %>%
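mutate(temp = substr(Value, 1, 1),
# TRUE when the first character is not a digit; suppressWarnings() silences
# the "NAs introduced by coercion" warning that as.numeric() raises for letters
temp = is.na(suppressWarnings(as.numeric(temp))),
Value = ifelse(temp, substr(Value, 1, 3), Value)) %>%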
select(-temp)
This gives us:
Value
1 I24
2 G46
3 1560
4 S50
Upvotes: 1