Reputation: 83245
I have a factor variable with several levels indicating the wealth of people. Unfortunately, the thousands separators in the numbers are spaces:
> levels(bron$vermogen)
[1] "negatief" "0 tot 5 000 euro" "5 000 tot 10 000 euro"
[4] "10 000 tot 20 000 euro" "20 000 tot 50 000 euro" "50 000 tot 100 000 euro"
[7] "100 000 tot 200 000 euro" "200 000 tot 500 000 euro" "500 000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"
I want to replace those spaces with dots while keeping the spaces between the numbers and the words. I can do this one level at a time, for example:
bron$vermogen <- gsub("5 000 tot 10 000 euro", "5.000 tot 10.000 euro", bron$vermogen)
With this method I have to repeat the call 8 times, once for each level that contains such a space. How can I do this more efficiently?
A dput of the levels:
c("negatief", "0 tot 5 000 euro", "5 000 tot 10 000 euro", "10 000 tot 20 000 euro", "20 000 tot 50 000 euro", "50 000 tot 100 000 euro", "100 000 tot 200 000 euro", "200 000 tot 500 000 euro", "500 000 tot 1 miljoen euro", "1 miljoen euro en meer")
Upvotes: 3
Views: 810
Reputation: 81693
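You can replace every space that sits between two digits with a dot. Here \\K drops the digit matched so far from the match, and the lookahead (?=\\d) checks for a following digit without consuming it, so only the space is replaced: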
gsub("\\d\\K (?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief" "0 tot 5.000 euro"
# [3] "5.000 tot 10.000 euro" "10.000 tot 20.000 euro"
# [5] "20.000 tot 50.000 euro" "50.000 tot 100.000 euro"
# [7] "100.000 tot 200.000 euro" "200.000 tot 500.000 euro"
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"
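Note that gsub() returns a character vector, so assigning its result back to the column drops the factor class. A minimal sketch, assuming you want to keep the variable as a factor, is to rewrite only the levels. The small `vermogen` vector below is a made-up stand-in for bron$vermogen, not the real data:

```r
# Hypothetical stand-in for bron$vermogen (not the real data)
vermogen <- factor(c("0 tot 5 000 euro", "negatief",
                     "5 000 tot 10 000 euro", "0 tot 5 000 euro"))

# Rewrite only the levels; the factor codes and class stay intact
levels(vermogen) <- gsub("\\d\\K (?=\\d)", ".", levels(vermogen), perl = TRUE)

is.factor(vermogen)
levels(vermogen)
```

With the real data the same idea would be applied to levels(bron$vermogen) instead of the column itself.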
Upvotes: 4
Reputation: 92292
Another, similar option is to use a lookbehind and a lookahead, so that only a whitespace character with a digit on both sides is matched:
gsub("(?<=\\d)\\s(?=\\d)", ".", bron$vermogen, perl = TRUE)
# [1] "negatief" "0 tot 5.000 euro" "5.000 tot 10.000 euro" "10.000 tot 20.000 euro"
# [5] "20.000 tot 50.000 euro" "50.000 tot 100.000 euro" "100.000 tot 200.000 euro" "200.000 tot 500.000 euro"
# [9] "500.000 tot 1 miljoen euro" "1 miljoen euro en meer"
Upvotes: 3
Reputation: 121578
For example, capture the digit on each side of the space and put both back with a dot in between:
gsub('([0-9]) ([0-9])','\\1.\\2',bron$vermogen)
[1] "negatief" "0 tot 5.000 euro" "5.000 tot 10.000 euro"
[4] "10.000 tot 20.000 euro" "20.000 tot 50.000 euro" "50.000 tot 100.000 euro"
[7] "100.000 tot 200.000 euro" "200.000 tot 500.000 euro" "500.000 tot 1 miljoen euro"
[10] "1 miljoen euro en meer"
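One difference from the lookaround patterns is worth knowing. This pattern consumes the digit after the space, and matches cannot overlap, so a single digit sitting between two spaces gets only one of them replaced. It makes no difference for the levels here, where the digits come in groups of three. A small sketch of the difference, where the second element of `x` is a made-up edge case and `by_groups` / `by_lookaround` are just illustrative names:

```r
x <- c("5 000 tot 10 000 euro", "1 2 3")  # second element is a made-up edge case

# Capture groups: each match consumes both digits,
# so the "2" in "1 2 3" cannot also start a second match
by_groups <- gsub("([0-9]) ([0-9])", "\\1.\\2", x)

# Lookarounds: only the space itself is consumed,
# so every space between two digits is replaced
by_lookaround <- gsub("(?<=\\d) (?=\\d)", ".", x, perl = TRUE)

by_groups
# [1] "5.000 tot 10.000 euro" "1.2 3"
by_lookaround
# [1] "5.000 tot 10.000 euro" "1.2.3"
```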
Upvotes: 8