Reputation: 93

extract last 2 chars from a column in a data.frame

I am new to R programming and have searched SO for many hours. I would appreciate your help.

I have a data frame with 3 columns (Date, Description, Debit):

      Date         Description   Debit
2014-01-01      "abcdef    VA"      15
2014-01-01      "ghijkl    NY"      56

I am trying to extract the last 2 chars of the second (Description) column (i.e. the 2 letter state abbreviation). I am not very comfortable with apply-type functions.

I have tried using

 l <- lapply(a$Description, function(x) {substr(x, nchar(x)-2+1, nchar(x))})

but get the following error message

Error in nchar(x) : invalid multibyte string, element 1

I have tried multiple other approaches, but with the same error.

I am quite sure that I am missing something very basic, so I would appreciate your help.

Thanks

Upvotes: 5

Answers (4)

akrun

Reputation: 887971

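We can use sub to remove everything up to and including the last whitespace character, which leaves only the state code.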

df$State <- sub(".*\\s+", "", df[,2])
df$State
#[1] "VA" "FL" "GA"

Upvotes: 0

Andrew Lavers

Reputation: 4378

Here's a regex version, using Brandon S's sample data. The regex captures everything after the last whitespace character to the end of the string.

df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
                 jumble = c("12345 VA", "123 FL", "12354567732 GA"),
                 debit = c(15, 36, 20))

df$state <- gsub(".+\\s(.+)$", "\\1", df$jumble)

df

        date         jumble debit state
1 2015-01-01       12345 VA    15    VA
2 2015-02-01         123 FL    36    FL
3 2015-01-15 12354567732 GA    20    GA
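
As a rough sketch of how that pattern behaves, the snippet below runs it on a few made-up strings (they are illustrative assumptions, not the asker's data), including one with several spaces and one with no whitespace at all. Since there is at most one match per string, sub gives the same result as gsub here.

```r
# Hypothetical sample strings, not taken from the question's data
x <- c("12345 VA", "123 FL", "abcdef    NY")

# ".+\\s" greedily consumes everything up to the last whitespace character;
# "(.+)$" captures what follows, and "\\1" keeps only that capture.
state <- sub(".+\\s(.+)$", "\\1", x)
state

# A string with no whitespace does not match the pattern,
# so it is returned unchanged.
no_space <- sub(".+\\s(.+)$", "\\1", "VA")
no_space
```

Note that the pattern returns whatever follows the last whitespace character, so a trailing token longer than two characters would come back whole rather than being cut to two.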

Upvotes: 0

bshelt141

Reputation: 1223

df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
             jumble = c("12345 VA", "123 FL", "12354567732 GA"),
             debit = c(15, 36, 20))

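# make sure the column is character rather than factor before calling nchar()
df$jumble <- as.character(df$jumble)

# take the substring from the second-to-last character through the last one
df$state <- substr(df$jumble, nchar(df$jumble)-1, nchar(df$jumble))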

df
        date         jumble debit state
1 2015-01-01       12345 VA    15    VA
2 2015-02-01         123 FL    36    FL
3 2015-01-15 12354567732 GA    20    GA
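
This uses the same substr/nchar idea as the question, so if the "invalid multibyte string" error still appears, the likely cause is the encoding of the data rather than the extraction code. A minimal sketch of one way to check and repair that, assuming the text is actually Latin-1 (the sample strings and the choice of "latin1" are assumptions for illustration; the real source encoding may differ):

```r
# Hypothetical data: the second element contains the Latin-1 byte 0xE9,
# which is not valid on its own in UTF-8.
x <- c("abcdef    VA", "caf\xe9    NY")

# validUTF8() flags elements whose bytes are not valid UTF-8
ok <- validUTF8(x)
ok

# Assumption: the source really is Latin-1; substitute the real encoding here
x_utf8 <- iconv(x, from = "latin1", to = "UTF-8")

# With valid input, the substr()/nchar() approach runs without the error
state <- substr(x_utf8, nchar(x_utf8) - 1, nchar(x_utf8))
state
```

If the data comes from a file, declaring the encoding when reading it (for example through the fileEncoding argument of read.csv) may avoid the conversion step.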

Upvotes: 1

Sotos

Reputation: 51612

library(stringr)
# negative positions count from the end: -2 to -1 is the last two characters
str_sub(a$Description, -2, -1)

Upvotes: 10
