Jaap

Reputation: 83245

How to replace only the spaces between numbers with dots

I have a factor variable with several levels indicating the wealth of people. Unfortunately the thousands in the numbers are indicated by spaces:

> levels(bron$vermogen)
 [1] "negatief"                   "0 tot 5 000 euro"           "5 000 tot 10 000 euro"     
 [4] "10 000 tot 20 000 euro"     "20 000 tot 50 000 euro"     "50 000 tot 100 000 euro"   
 [7] "100 000 tot 200 000 euro"   "200 000 tot 500 000 euro"   "500 000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"    

I want to replace those spaces with dots while keeping the spaces between the numbers and the words. I can do this one level at a time, for example:

bron$vermogen <- gsub("5 000 tot 10 000 euro", "5.000 tot 10.000 euro", bron$vermogen)

With this method I have to repeat the call 8 times, once for each affected level. How can I do this more efficiently?

A dput of the levels:

c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro", "10 000 tot 20 000 euro", "20 000 tot 50 000 euro", "50 000 tot 100 000 euro", "100 000 tot 200 000 euro", "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro", "1 miljoen euro en meer")

Upvotes: 3

Views: 810

Answers (3)

Sven Hohenstein

Reputation: 81693

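You can match only a space that sits between two digits and replace it with a dot. Here `\\K` discards the digit matched so far, so only the space is replaced, and `(?=\\d)` is a lookahead that requires a digit after the space: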

gsub("\\d\\K (?=\\d)", ".", bron$vermogen, perl = TRUE)

 # [1] "negatief"                   "0 tot 5.000 euro"          
 # [3] "5.000 tot 10.000 euro"      "10.000 tot 20.000 euro"   
 # [5] "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"  
 # [7] "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro" 
 # [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"
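Note that `gsub()` returns a character vector, so assigning its result back to the column would turn the factor into a character column. If the factor should be kept, one option is to rename the levels instead. This is a minimal sketch, assuming a toy data frame that stands in for `bron` (its rows are made up for illustration; only the levels come from the question):

```r
# Levels copied from the dput output in the question.
lv <- c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro",
        "10 000 tot 20 000 euro", "20 000 tot 50 000 euro",
        "50 000 tot 100 000 euro", "100 000 tot 200 000 euro",
        "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro",
        "1 miljoen euro en meer")

# Hypothetical toy data standing in for the real `bron`.
bron <- data.frame(vermogen = factor(c(lv[3], lv[1], lv[3], lv[10]), levels = lv))

# Rename the levels in place; the column stays a factor and
# the level order is unchanged.
levels(bron$vermogen) <- gsub("\\d\\K (?=\\d)", ".", levels(bron$vermogen), perl = TRUE)

is.factor(bron$vermogen)   # TRUE
levels(bron$vermogen)[3]   # "5.000 tot 10.000 euro"
```

Because only the level labels are rewritten, the substitution runs once per level rather than once per row.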

Upvotes: 4

David Arenburg

Reputation: 92292

A similar option is to use a lookbehind and a lookahead, so that only a whitespace character between two digits is matched:

gsub("(?<=\\d)\\s(?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief"                   "0 tot 5.000 euro"           "5.000 tot 10.000 euro"      "10.000 tot 20.000 euro"    
# [5] "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"    "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro"  
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"    
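One difference from a literal space is that `\\s` matches any whitespace character, including tabs. The sketch below illustrates this; the second input string, which contains a tab, is a made-up example and does not occur in the question's data:

```r
# First element is from the question; the second is hypothetical
# and uses a tab instead of a space inside "5 000".
x <- c("5 000 tot 10 000 euro", "5\t000 tot 10 000 euro")

# \\s matches the tab as well as the space.
with_s <- gsub("(?<=\\d)\\s(?=\\d)", ".", x, perl = TRUE)

# A literal space in the pattern leaves the tab untouched.
with_space <- gsub("(?<=\\d) (?=\\d)", ".", x, perl = TRUE)

with_s[2]       # "5.000 tot 10.000 euro"
with_space[2]   # "5\t000 tot 10.000 euro"
```

For the levels shown in the question both patterns give the same result, so the choice only matters if the data may contain other whitespace.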

Upvotes: 3

agstudy

Reputation: 121578

For example, capture the digit on either side of the space and put both back around a dot using backreferences:

gsub('([0-9]) ([0-9])','\\1.\\2',bron$vermogen)

 [1] "negatief"                   "0 tot 5.000 euro"           "5.000 tot 10.000 euro"     
 [4] "10.000 tot 20.000 euro"     "20.000 tot 50.000 euro"     "50.000 tot 100.000 euro"   
 [7] "100.000 tot 200.000 euro"   "200.000 tot 500.000 euro"   "500.000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"   
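One caveat with the capture-group pattern is that it consumes the digit to the right of the space, so that digit cannot start the next match. This does not affect the levels above, where every group after a space has three digits, but it would matter for single digits separated by spaces. A small sketch with made-up inputs (not from the question), compared against a lookaround pattern that consumes only the space:

```r
# Hypothetical inputs chosen to show the difference.
x <- c("1 000 000", "1 2 3")

# Capture groups: both neighbouring digits are part of the match.
capture <- gsub("([0-9]) ([0-9])", "\\1.\\2", x)

# Lookarounds: only the whitespace itself is part of the match.
lookaround <- gsub("(?<=\\d)\\s(?=\\d)", ".", x, perl = TRUE)

capture      # "1.000.000" "1.2 3"
lookaround   # "1.000.000" "1.2.3"
```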

Upvotes: 8
