Fiona

Reputation: 477

How to test if the first three characters in a string are letters or numbers in R?

An example of my dataset is given below. Note that the full dataset has more than two columns.

ID   X
1   MJF34
2   GA249D
3   DEW235R
4   4SDFR3
5   DAS3

I want to test whether the first three characters in X are letters. If they are, I want to replace the value with just those three letters. If they are not, I want to replace the value with "FR". The result would be as follows:

ID    X
1    MJF
2    FR
3    DEW
4    FR
5    DAS

Currently X is a character data type.

Thanks in advance for any help.

Upvotes: 2

Views: 3642

Answers (2)

csgillespie

Reputation: 60452

You can do this with standard base R functions:

# Your data, dt$X in your case
x = c("MJF34", "GA249D", "DEW235R", "4SDFR3", "DAS3")

First use substr to extract characters 1 to 3

sub_str = substr(x, 1, 3)

Then find which of those substrings contain a digit (grep returns the matching indices)

has_numbers = grep("[0-9]", sub_str)

Then replace

sub_str[has_numbers] = "FR"
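
The steps above can be combined and written back to the data frame column. This is only a sketch: the data frame name dt is an assumption taken from the comment in the first snippet, and grepl is swapped in for grep so that the result is a logical vector with one entry per row.

```r
# Hypothetical data frame matching the question's example
dt <- data.frame(
  ID = 1:5,
  X = c("MJF34", "GA249D", "DEW235R", "4SDFR3", "DAS3"),
  stringsAsFactors = FALSE
)

# Extract characters 1 to 3 of every value
sub_str <- substr(dt$X, 1, 3)

# TRUE where the three-character prefix contains at least one digit
has_numbers <- grepl("[0-9]", sub_str)

# Replace those prefixes with "FR" and write the result back
sub_str[has_numbers] <- "FR"
dt$X <- sub_str
```

Using a logical vector rather than indices makes it easy to reuse the test elsewhere, for example with ifelse.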

Upvotes: 2

mt1022

Reputation: 17289

I would try:

x <- substr(dt$X, 1, 3)
dt$X <- ifelse(grepl('[0-9]', x), 'FR', x)
dt
#   ID   X
# 1  1 MJF
# 2  2  FR
# 3  3 DEW
# 4  4  FR
# 5  5 DAS

The data:

dt <- structure(list(ID = 1:5, X = c("MJF34", "GA249D", "DEW235R",
"4SDFR3", "DAS3")), .Names = c("ID", "X"), class = "data.frame",
row.names = c(NA, -5L))
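
Both answers test for digits, which is enough for the example data. If the real column may also contain punctuation or values shorter than three characters, a variant that tests for letters directly might be safer. The sketch below is an assumption about what such a check could look like; the last two input values are invented for illustration and are not in the original data.

```r
# Example values; "AB" and "A-B9" are hypothetical edge cases
x <- c("MJF34", "GA249D", "DEW235R", "4SDFR3", "DAS3", "AB", "A-B9")

# TRUE only when the string starts with three ASCII letters
starts_with_letters <- grepl("^[A-Za-z]{3}", x)

# Keep the three-letter prefix where the test passes, otherwise use "FR"
result <- ifelse(starts_with_letters, substr(x, 1, 3), "FR")
result
# [1] "MJF" "FR"  "DEW" "FR"  "DAS" "FR"  "FR"
```

With the digit-based test, "AB" and "A-B9" would be kept as "AB" and "A-B" because neither prefix contains a digit. The class [[:alpha:]] can replace [A-Za-z] if non-ASCII letters should count as letters.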

Upvotes: 5
