Dswede43

Reputation: 351

How do I remove characters from items in a column?

I want to be able to remove specific characters from items in my data frame.

Sample <- c("A1.1","B1.1","C1.1","A1.2","B1.2","C1.2")
X <- c(1,1,2,4,3,5)
df <- data.frame(Sample, X)

  Sample X
1   A1.1 1
2   B1.1 1
3   C1.1 2
4   A1.2 4
5   B1.2 3
6   C1.2 5

I want to remove the ".1" and ".2" (or the 3rd and 4th character) from each item in the Sample column.

df$Sample <- gsub(".1","",as.character(df$Sample))
df
  Sample X
1        1
2        1
3        2
4     .2 4
5     .2 3
6     .2 5

This is what I've tried so far, but it doesn't do what I want. Is there a way to remove just the 3rd and 4th characters from each item in the Sample column?

Upvotes: 3

Views: 142

Answers (2)

Anoushiravan R

Reputation: 21908

You can also use the following solution:

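library(dplyr)    # provides %>% and mutate()
library(stringr)  # provides str_remove()

# Remove the first literal dot followed by one or more digits
df %>%
  mutate(Sample = str_remove(Sample, "\\.\\d+"))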

  Sample X
1     A1 1
2     B1 1
3     C1 2
4     A1 4
5     B1 3
6     C1 5

Upvotes: 2

akrun

Reputation: 886938

Instead of .1, escape the . as \\. because . is a regex metacharacter that matches any character. Only one replacement per string is needed here, so sub is enough (gsub would replace every match). The pattern below matches a literal . followed by one or more digits (\\d+) at the end ($) of the string:

df$Sample <- sub("\\.\\d+$", "", df$Sample)
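To make the difference concrete, here is a minimal base R sketch that compares the unescaped pattern from the question with the escaped one, using the same sample values. The variable names (x, unescaped, escaped, positional) are only illustrative. It also shows one possible way to drop the 3rd and 4th characters by position, which assumes every value has the same layout as the sample data.

```r
x <- c("A1.1", "B1.1", "C1.1", "A1.2", "B1.2", "C1.2")

# Unescaped: "." matches any character, so ".1" also matches "A1" and "B1"
unescaped <- gsub(".1", "", x)

# Escaped: "\\." matches only a literal dot, anchored at the end of the string
escaped <- sub("\\.\\d+$", "", x)

# Positional (assumes a fixed layout): keep characters 1-2, drop 3-4,
# and keep whatever follows from position 5 onward
positional <- paste0(substr(x, 1, 2), substr(x, 5, nchar(x)))

print(unescaped)
print(escaped)
print(positional)
```

The positional version gives the same result as the regex here, but it would break if the prefix length varied (for example "AB1.1"), so the regex is likely the safer choice for less regular data.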

Upvotes: 3
