Reputation: 1131
I am looking for a regex for gsub to remove all the unwanted commas:
Data:
,,,,,,,12345
12345,1345,1354
123,,,,,,
12345,
,12354
Desired result:
12345
12345,1345,1354
123
12345
12354
This is the progress I have made so far:
(,(?!\d+))
Upvotes: 3
Views: 138
Reputation: 18681
You can also use str_extract from stringr. Because matching is greedy, \\d.+\\d grabs the longest stretch that starts and ends with a digit, so you don't have to specify how many digits occur:
library(dplyr)
library(stringr)
df %>%
mutate(V1 = str_extract(V1, "\\d.+\\d"))
or, if you prefer base R:
# regexpr() gives one match per row, so V1 stays a character vector, not a list
df$V1 = regmatches(df$V1, regexpr("\\d.+\\d", df$V1))
Result:
V1
1 12345
2 12345,1345,1354
3 123
4 12345
5 12354
Data:
df = read.table(text = ",,,,,,,12345
12345,1345,1354
123,,,,,,
12345,
,12354")
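One caveat about the pattern, shown as a small base R sketch: \\d.+\\d needs a digit, at least one more character, and a closing digit, so a value with only one or two digits would not match. The helper extract_first and the relaxed pattern \\d(.*\\d)? are my own assumptions for illustration, not part of the answer above, and the extra inputs ",7," and ",42" are made up to exercise the edge case.

```r
# Hypothetical helper (not from the answer): first match per string, NA if none.
extract_first <- function(pattern, x) {
  m <- regexpr(pattern, x)
  found <- m != -1                      # regexpr() returns -1 where nothing matched
  out <- rep(NA_character_, length(x))
  out[found] <- regmatches(x, m)        # regmatches() returns only the matched elements
  out
}

# First three values come from the question; the last two are invented short values.
x <- c(",,,,,,,12345", "12345,1345,1354", "123,,,,,,", ",7,", ",42")

# Pattern from the answer: digit, one or more characters, digit.
original <- extract_first("\\d.+\\d", x)
# "12345" "12345,1345,1354" "123" NA NA

# Assumed variant: a digit, optionally followed by a greedy run ending in a digit.
relaxed <- extract_first("\\d(.*\\d)?", x)
# "12345" "12345,1345,1354" "123" "7" "42"
```

If every value in your data has at least three characters between its first and last digit, the two patterns should behave the same.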
Upvotes: 2
Reputation: 627087
You seem to want to remove all leading and trailing commas.
You may do it with
gsub("^,+|,+$", "", x)
See the regex demo
The regex contains two alternatives: ^,+ matches one or more commas at the start of the string, and ,+$ matches one or more commas at the end. gsub replaces each of these matches with an empty string.
See R demo
x <- c(",,,,,,,12345","12345,1345,1354","123,,,,,,","12345,",",12354")
gsub("^,+|,+$", "", x)
## [1] "12345" "12345,1345,1354" "123" "12345"
## [5] "12354"
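To make the two alternatives easier to see, here is a rough sketch that applies them one at a time with sub and compares the result with the single gsub call. The variable names and the extra input ",1," are only illustrative assumptions.

```r
x <- c(",,,,,,,12345", "12345,1345,1354", "123,,,,,,", "12345,", ",12354")

# Step 1: ^,+ is anchored at the start, so only leading commas are dropped.
no_leading <- sub("^,+", "", x)

# Step 2: ,+$ is anchored at the end, so only trailing commas are dropped.
trimmed <- sub(",+$", "", no_leading)

# The combined alternation should give the same result in one pass.
combined <- gsub("^,+|,+$", "", x)
same <- identical(trimmed, combined)   # TRUE

# gsub rather than sub matters when both ends have commas:
# sub() stops after the first match and leaves the trailing comma behind.
first_only <- sub("^,+|,+$", "", ",1,")    # "1,"
both_ends  <- gsub("^,+|,+$", "", ",1,")   # "1"
```

Commas between numbers are never touched, because neither alternative can match in the middle of the string.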
Upvotes: 3