Economist_Ayahuasca

Reputation: 1642

Keep only the leading letter and first two digits of a string that starts with a letter, and leave other values unchanged

How can I keep only the leading letter and the first two digits of a string value that starts with a letter, and leave the value unchanged otherwise?

Value = I240 G460 1560 S50

I would like to get:

Value = I24 G46 1560 S50

So far I am trying:

df <- df %>% mutate(Value = ifelse(str_detect(Value, '[A-Z]'~'\\d{2}')))

Upvotes: 1

Views: 559

Answers (5)

ThomasIsCoding

Reputation: 101189

A base R option using gsub

> gsub("(^[[:alpha:]]\\d{2}).*","\\1",v)
[1] "I24"  "G46"  "1560" "S50"

data

v <- c("I240", "G460", "1560", "S50")
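One difference from the [A-Z] patterns used in the other answers is that [[:alpha:]] also matches lowercase letters. A minimal sketch, assuming a hypothetical extra value "b789" that is not in the question's data, and using perl = TRUE so that the [A-Z] range is interpreted by PCRE rather than depending on the locale:

```r
# "b789" is a hypothetical input added for illustration only
v2 <- c("I240", "b789", "1560")

# [[:alpha:]] matches a letter of either case, so "b789" is shortened
alpha_result <- sub("^([[:alpha:]]\\d{2}).*", "\\1", v2, perl = TRUE)

# [A-Z] matches uppercase ASCII letters only, so "b789" is left unchanged
upper_result <- sub("^([A-Z]\\d{2}).*", "\\1", v2, perl = TRUE)

print(alpha_result)
print(upper_result)
```

Which class is appropriate depends on whether lowercase prefixes can occur in the real data, which the question does not say.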

Upvotes: 0

akrun

Reputation: 887018

We could use coalesce with str_extract

library(stringr)
library(dplyr)
df %>% 
    mutate(Value = coalesce(str_extract(Value, "^[A-Z]\\d{2}"), Value))

Or use str_remove with a regex lookbehind

df %>%
     mutate(Value = str_remove(Value, "(?<=^[A-Z]\\d{2}).*"))

data

df <- structure(list(Value = c("I240", "G460", "1560", "S50")),
  class = "data.frame", row.names = c(NA, -4L))
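The coalesce step is needed because str_extract returns NA for strings that do not match the pattern. A small sketch of the intermediate values, using a plain character vector instead of the data frame (the names extracted and result are only illustrative):

```r
library(stringr)
library(dplyr)

Value <- c("I240", "G460", "1560", "S50")

# str_extract() gives NA where the pattern does not match ("1560" here)
extracted <- str_extract(Value, "^[A-Z]\\d{2}")

# coalesce() takes, element by element, the first non-missing value,
# so the NA is filled with the original string
result <- coalesce(extracted, Value)

print(extracted)
print(result)
```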

Upvotes: 0

Chris Ruehlemann

Reputation: 21400

A base R solution is this:

sub("^([A-Z]\\d\\d).*", "\\1", df$Value)
[1] "I24"  "G46"  "1560" "S50" 

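Here, we capture the initial letter and the two digits following it in a capture group, and keep only that group by referring to it with the backreference \\1 in the replacement argument to sub. Strings that do not match the pattern, such as "1560", are returned unchanged.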

Data:

df <- data.frame(Value = c("I240", "G460", "1560", "S50"))
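To write the result back into the data frame, as the question asks, the same call can be wrapped in a small function. This is only a sketch, and trim_code is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper; the pattern is the one used in this answer
trim_code <- function(x) {
  # Keep the leading letter plus two digits; sub() returns strings
  # that do not match the pattern unchanged
  sub("^([A-Z]\\d\\d).*", "\\1", x)
}

df <- data.frame(Value = c("I240", "G460", "1560", "S50"))
df$Value <- trim_code(df$Value)
print(df)
```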

Upvotes: 2

Yuriy Saraykin

Reputation: 8880

library(tidyverse)

Value = c("I240", "G460", "1560", "S50")

df <- data.frame(Value = Value)

df %>% 
  mutate(out = ifelse(str_detect(Value, "^[A-Z]"), str_sub(Value, 1, 3), Value))
#>   Value  out
#> 1  I240  I24
#> 2  G460  G46
#> 3  1560 1560
#> 4   S50  S50

Created on 2021-01-28 by the reprex package (v1.0.0)

Base R version:

df$out <- with(df, ifelse(grepl("^[A-Z]", Value), substring(Value, 1, 3), Value))
df
#>   Value  out
#> 1  I240  I24
#> 2  G460  G46
#> 3  1560 1560
#> 4   S50  S50


Upvotes: 2

Matt

Reputation: 7385

Here's a hack solution - there is likely a better way:

library(dplyr)

df <- data.frame(Value = c("I240", "G460", "1560", "S50"))

df %>% 
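  mutate(temp = substr(Value, 1, 1),  # first character of each value
         # TRUE when the first character is not a digit
         # (as.numeric() gives NA with a coercion warning)
         temp = is.na(as.numeric(temp)),
         Value = ifelse(temp, substr(Value, 1, 3), Value)) %>% 
  select(-temp)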

This gives us:

  Value
1   I24
2   G46
3  1560
4   S50
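The as.numeric step emits a "NAs introduced by coercion" warning for the rows that start with a letter. A base R sketch of the same logic on a plain vector, with the warning suppressed, plus a warning-free check that gives the same result for this data (the variable names are illustrative):

```r
Value <- c("I240", "G460", "1560", "S50")
first_char <- substr(Value, 1, 1)

# as.numeric() on a non-numeric string gives NA (with a coercion warning),
# which is what the is.na() test above relies on
starts_non_digit <- suppressWarnings(is.na(as.numeric(first_char)))

# Alternative without coercion: test the first character with a regex
starts_non_digit_2 <- !grepl("^[0-9]", Value)

# Keep the first three characters only where the value starts with a non-digit
result <- ifelse(starts_non_digit, substr(Value, 1, 3), Value)
print(result)
```

Note that both checks treat any non-digit first character as a "letter", which is broader than the [A-Z] patterns in the other answers.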

Upvotes: 1
