Night

Reputation: 85

R get index of letters

I'm a noob with R. I want the alphabet index of each letter in a word. I don't understand what I'm doing wrong, since the individual command works perfectly...

word <- "helloworld"
l <- numeric(nchar(word))
for (i in 0:nchar(word)) {
  l[i] <- match(substr(word,i,i+1), letters)
}
l

returns a weird [1] NA NA NA NA NA NA NA NA NA 4

when match(substr(word,0,1), letters) returns the appropriate [1] 8

Upvotes: 1

Views: 3150

Answers (3)

MarkusN

Reputation: 3223

You tested the only combination that happens to work: substr treats a start of 0 like 1, so substr(word, 0, 1) returns just "h".

  • Vectors are indexed from 1 in R, so your loop should run from 1 to nchar(word).
  • Have a look at ?substr. The arguments are start and stop positions (not a start and a length), so to get a single character you have to use substr(word, i, i).

But you don't need a loop here. As Richard Telford suggested, you can split your string into a character vector and then match every element of that vector against the letters vector:
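Putting both fixes together, the loop from the question could be sketched as follows. The variable names n and idx are my own; seq_len() is used instead of 1:n only so that an empty string does not produce a 1:0 loop.

```r
word <- "helloworld"
n <- nchar(word)
idx <- integer(n)              # preallocate one slot per character

for (i in seq_len(n)) {        # seq_len(n) is 1..n, and empty when n is 0
  # start == stop, so substr() returns exactly one character
  idx[i] <- match(substr(word, i, i), letters)
}

idx
# [1]  8  5 12 12 15 23 15 18 12  4
```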

lapply(strsplit(word, ""), match, letters)
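Note that strsplit returns a list with one element per input string, so the lapply call above also returns a list. A small sketch of both cases (the names chars, idx, words and idx_list are only for illustration):

```r
word <- "helloworld"

# Single string: take the first list element to get a plain vector
chars <- strsplit(word, "")[[1]]
idx <- match(chars, letters)
idx
# [1]  8  5 12 12 15 23 15 18 12  4

# Several strings: lapply keeps one result vector per word
words <- c("hello", "world")
idx_list <- lapply(strsplit(words, ""), match, letters)
idx_list[[2]]
# [1] 23 15 18 12  4
```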

Upvotes: 0

nicola

Reputation: 24520

The error lies in the i+1: you are extracting two-character strings, so no match is found. Use substring, which is vectorized:

match(substring(word,1:nchar(word),1:nchar(word)),letters)
#[1]  8  5 12 12 15 23 15 18 12  4

Another (nerdy) way is to take the offset of each character's ASCII code from the code of the character "a":

as.integer(charToRaw(word))-as.integer(charToRaw("a"))+1
#[1]  8  5 12 12 15 23 15 18 12  4
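One caveat with the ASCII offset: it only gives sensible values for lowercase a-z, whereas match() returns NA for anything else. A hedged sketch of a guarded version, assuming plain ASCII input; ascii_index is a hypothetical helper name, not an existing function:

```r
# Hypothetical helper: alphabet index via ASCII offset, NA outside a-z.
# Assumes ASCII input, since charToRaw() works on bytes, not characters.
ascii_index <- function(x) {
  codes <- as.integer(charToRaw(tolower(x)))
  idx <- codes - as.integer(charToRaw("a")) + 1L
  idx[idx < 1L | idx > 26L] <- NA_integer_   # spaces, digits, punctuation
  idx
}

ascii_index("Hello World")
# [1]  8  5 12 12 15 NA 23 15 18 12  4
```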

Upvotes: 1

Richard Telford

Reputation: 9933

The problem with your code is twofold.

First, R indices start at 1, so when i = 0 the assignment to l[i] is silently ignored.

Second, substr(word, i, i+1) doesn't pull off a single letter at a time; it extracts two:

i = 1
substr(word,i,i+1)
[1] "he"

A different approach:

setNames(1:26, letters)[ strsplit("hello", NULL )[[1]] ]
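For reference, a short sketch of what this lookup produces. The result is a named vector, so unname() can be used if only the numbers are wanted (the names lookup, chars and idx are illustrative):

```r
lookup <- setNames(1:26, letters)        # named vector: a = 1, ..., z = 26
chars <- strsplit("hello", NULL)[[1]]    # split into single characters

idx <- lookup[chars]                     # index the lookup table by name
idx
#  h  e  l  l  o
#  8  5 12 12 15

unname(idx)                              # drop the names, keep the indices
# [1]  8  5 12 12 15
```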

Upvotes: 3
