LucasSeveryn

Reputation: 6262

How to remove the last n characters from every element of an R vector

I am very new to R, and I could not find a simple example online of how to remove the last n characters from every element of a vector (array?).

I come from a Java background, so what I would like to do is to iterate over every element of a$data and remove the last 3 characters from every element.

How would you go about it?

Upvotes: 157

Views: 286977

Answers (6)

gagolews

Reputation: 13046

The same may be achieved with the stringi package:

library('stringi')
char_array <- c("foo_bar","bar_foo","apple","beer")
a <- data.frame("data"=char_array, "data2"=1:4)
(a$data <- stri_sub(a$data, 1, -4))  # from the first character to the 4th from the end, i.e. drop the last 3
## [1] "foo_" "bar_" "ap"   "b"

Upvotes: 15

ExploreR

Reputation: 343

A friendly hint when cutting off or replacing n characters of a string:

--> Be aware of whitespace in your strings!

Use base::gsub(' ', '', x, fixed = TRUE) to get rid of unwanted spaces in your strings (note that this removes every space, not only trailing ones). I spent quite some time finding out why the great solutions in the other answers did not work for me. Thought it might be useful for others as well ;)
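
To illustrate the hint, here is a small base R sketch. The vector x is made up for this example. A trailing space counts as a character, so it is cut before the letters you actually meant to remove. If spaces inside the strings should be kept, trimws() is an alternative that strips only leading and trailing whitespace.

```r
# Hypothetical input: the second element has two trailing spaces
x <- c("foo_bar", "bar_foo  ")

# Without cleaning, the trailing spaces count towards the 3 removed characters
substr(x, 1, nchar(x) - 3)
# [1] "foo_"   "bar_fo"

# trimws() removes leading and trailing whitespace only
x_clean <- trimws(x)
substr(x_clean, 1, nchar(x_clean) - 3)
# [1] "foo_" "bar_"
```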

Upvotes: 1

Blaszard

Reputation: 31963

Although this is mostly the same as the answer by @nfmcclure, I prefer the stringr package because it provides a set of functions whose names are more consistent and descriptive than those in base R (in fact, I always have to google "how to get the number of characters in R" because I can't remember the name nchar()).

library(stringr)
str_sub(iris$Species, end = -4)  # a negative end counts from the end of the string
# or
str_sub(iris$Species, 1, str_length(iris$Species) - 3)

This removes the last 3 characters from each value in the Species column.

Upvotes: 62

krads

Reputation: 1369

Similar to @Matthew_Plourde's answer using gsub, but with a pattern that trims to zero characters, i.e. returns "" if the original string is shorter than the number of characters to cut:

cs <- c("foo_bar","bar_foo","apple","beer","so","a")
gsub('.{0,3}$', '', cs)
# [1] "foo_" "bar_" "ap"   "b"    ""    ""

The difference is that the {0,3} quantifier matches between 0 and 3 characters, whereas {3} requires exactly 3. If a string has fewer than 3 characters, {3} finds no match and gsub returns the original string unmodified.

N.B. using {,3} would be equivalent to {0,3} here, although that shorthand is not supported by every regex flavor; I simply prefer the latter notation.

See here for more information on regex quantifiers: https://www.regular-expressions.info/refrepeat.html
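
A minimal sketch contrasting the two quantifiers on short strings. The vector short is made up for this illustration:

```r
# Hypothetical input: two elements are shorter than 3 characters
short <- c("beer", "so", "a")

# {3} needs exactly three characters before the end of the string,
# so the shorter elements are returned unchanged
res_exact <- gsub('.{3}$', '', short)
res_exact
# [1] "b"  "so" "a"

# {0,3} matches up to three characters, so the shorter elements become ""
res_upto <- gsub('.{0,3}$', '', short)
res_upto
# [1] "b" ""  ""
```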

Upvotes: 5

Matthew Plourde

Reputation: 44614

Here's a way with gsub:

cs <- c("foo_bar","bar_foo","apple","beer")
gsub('.{3}$', '', cs)
# [1] "foo_" "bar_" "ap"   "b"
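
Since the question asks about an arbitrary n, the pattern can also be built from a variable. A minimal sketch, where n and pattern are names introduced here for illustration:

```r
cs <- c("foo_bar", "bar_foo", "apple", "beer")
n <- 2  # number of trailing characters to remove (hypothetical value)

# sprintf() fills n into the quantifier, giving the pattern '.{2}$'
pattern <- sprintf('.{%d}$', n)
res_n <- gsub(pattern, '', cs)
res_n
# [1] "foo_b" "bar_f" "app"   "be"
```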

Upvotes: 112

nfmcclure

Reputation: 3141

Here is an example of what I would do. I hope it's what you're looking for.

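char_array <- c("foo_bar", "bar_foo", "apple", "beer")
# stringsAsFactors = FALSE keeps the column as character on older R versions too
a <- data.frame("data" = char_array, "data2" = 1:4, stringsAsFactors = FALSE)
# keep characters 1 through (length - 3) of each element
a$data <- substr(a$data, 1, nchar(a$data) - 3)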

a should now contain:

  data data2
1 foo_     1
2 bar_     2
3   ap     3
4    b     4
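
The same substr() approach can be wrapped in a small helper for any n. This is only a sketch: drop_last is a hypothetical name, not a base R function.

```r
# Hypothetical helper: drop the last n characters from each element of x
drop_last <- function(x, n) {
  x <- as.character(x)        # also accepts factor columns
  substr(x, 1, nchar(x) - n)  # a stop position below 1 yields ""
}

drop_last(c("foo_bar", "bar_foo", "apple", "beer", "so"), 3)
# [1] "foo_" "bar_" "ap"   "b"    ""
```

Because substr() returns an empty string when the stop position is smaller than the start, elements shorter than n come back as "" rather than raising an error.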

Upvotes: 180
