Ryan

Reputation: 1068

Removing parts of a string from data frame rows in R

I have a data frame column where each value is a word followed by a period and a number, e.g., data <- data.frame(ID = c("alpha.1", "alpha.2", "alpha.3", "beta.1", "beta.2", "beta.3")). How can I remove just the period and the number, and leave the word?

Upvotes: 1

Views: 38

Answers (1)

akrun

Reputation: 887891

We can use sub to match a literal dot (\\.) followed by one or more digits (\\d+) at the end ($) of the string and replace the match with an empty string ("")

data$ID <- sub("\\.\\d+$", "", data$ID)
data$ID
#[1] "alpha" "alpha" "alpha" "beta"  "beta"  "beta"
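For reference, here is a minimal self-contained sketch of the sub approach that rebuilds the example data from the question. The vector extra is hypothetical (it is not in the question) and is only there to show how the end anchor behaves on values with several dots or no trailing number.

```r
# Rebuild the example data from the question
data <- data.frame(ID = c("alpha.1", "alpha.2", "alpha.3",
                          "beta.1", "beta.2", "beta.3"))

# Remove a literal dot followed by one or more digits at the end of the string
data$ID <- sub("\\.\\d+$", "", data$ID)
data$ID
#[1] "alpha" "alpha" "alpha" "beta"  "beta"  "beta"

# Hypothetical extra values, not from the question
extra <- c("alpha.beta.12", "gamma", "delta.x")
cleaned <- sub("\\.\\d+$", "", extra)
cleaned
#[1] "alpha.beta" "gamma"      "delta.x"
```

Only a trailing dot-plus-digits suffix is removed, so values without one are left unchanged.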

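Another option is trimws with a custom whitespace pattern, which removes everything from the first dot onward (the whitespace argument requires R 3.6.0 or later)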

data$ID <- trimws(data$ID, whitespace = "\\..*")
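The trimws pattern is not anchored to digits, so it can give a different result from sub when a value contains more than one dot. A small sketch of the difference, using a hypothetical value x that is not in the question:

```r
# Values shaped like the ones in the question
ids <- c("alpha.1", "beta.2")
trimmed <- trimws(ids, whitespace = "\\..*")
trimmed
#[1] "alpha" "beta"

# Hypothetical value with more than one dot
x <- "alpha.beta.12"
by_trimws <- trimws(x, whitespace = "\\..*")  # drops everything from the first dot
by_sub    <- sub("\\.\\d+$", "", x)           # drops only the trailing ".12"
by_trimws
#[1] "alpha"
by_sub
#[1] "alpha.beta"
```

For the data in the question both give the same answer; which one is preferable depends on whether dots inside the word should be kept.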

Or use word from stringr to extract the first piece before the dot

library(stringr)
word(data$ID, 1, sep=fixed("."))
#[1] "alpha" "alpha" "alpha" "beta"  "beta"  "beta" 

Upvotes: 2
