ulima2_

Reputation: 1336

In R, remove all dots from string apart from the last

I have a character vector of strings like this:

mystr <- c("16.142.8",          
       "52.135.1",         
       "40.114.4",          
       "83.068.8",         
       "83.456.3",         
       "55.181.5",         
       "76.870.2",         
       "96.910.2",         
       "17.171.9",         
       "49.617.4",         
       "38.176.1",         
       "50.717.7",         
       "19.919.6")

I know that the first dot . is just a thousands separator, while the second one is the decimal separator.

I want to convert the strings to numbers, so the first one should become 16142.8, the second 52135.1, and so on.

I suspect that it might be done with regular expressions, but I'm not sure how. Any ideas?

Upvotes: 6

Views: 14162

Answers (2)

Sagar

Reputation: 2914

A simple sub() call is enough here, because it replaces only the first match of the pattern. For example:

sub("\\.", "", mystr)
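To get from the cleaned strings to the numbers the question asks for, the result can be passed to as.numeric(). This is a minimal sketch, assuming every string contains exactly one thousands separator before the decimal dot; the shortened vector mystr_demo is a hypothetical stand-in for the question's data.

```r
# Illustrative subset of the question's data (the name mystr_demo is hypothetical)
mystr_demo <- c("16.142.8", "52.135.1", "83.068.8")

# sub() replaces only the first match, so only the first dot is removed
cleaned <- sub("\\.", "", mystr_demo)

# Convert the cleaned strings to numeric values
nums <- as.numeric(cleaned)
print(nums)
```

If a value could contain more than one thousands separator (for example "1.234.567.8"), this approach would leave the extra dots in place and as.numeric() would return NA for that element.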

Upvotes: 7

Wiktor Stribiżew

Reputation: 626758

You need a lookahead-based PCRE regex with gsub:

gsub("\\.(?=[^.]*\\.)", "", mystr, perl=TRUE)


Details

  • \\. - a literal dot
  • (?=[^.]*\\.) - a positive lookahead requiring that the dot be followed by zero or more characters other than . (matched with [^.]*) and then another literal dot. The (?=...) lookahead requires its pattern to match immediately to the right of the current position, but what it matches is not added to the match value, and the regex index does not advance.
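The effect of the lookahead is easiest to see on inputs with differing numbers of dots. The following is a small sketch; the vector x and its values are made up for illustration and are not from the question.

```r
# Hypothetical inputs: two thousands separators, one, and none
x <- c("1.234.567.8", "16.142.8", "42.5")

# Remove every dot that still has another dot somewhere to its right;
# the last dot in each string never satisfies the lookahead, so it is kept
cleaned <- gsub("\\.(?=[^.]*\\.)", "", x, perl = TRUE)

# Convert the cleaned strings to numeric values
nums <- as.numeric(cleaned)
print(cleaned)
```

Because only the final dot survives in each string, the result can be converted with as.numeric() no matter how many thousands separators a value had.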

Upvotes: 11
