Reputation: 57
I have this string:
151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café
I want to use a regex to delete the numbers "0124" and "20140718" inside the string, but leave the number "151228" at the beginning untouched. I have tried many times, but I still can't find a way to do it with a single expression. The best I could get was this:
151228-▶ Guido's Lounge Cafe Broadcast It Will Be Alright ) by Guido's Lounge Café
using the expression: [^\d+]\d+
That almost works, but the opening parenthesis before "20140718" is deleted as well.
I am not very good at regex, and that string is just a test for myself. I want to know whether a single expression can handle it, or whether I need several. Can anybody also recommend some articles about regex? I have read some, but many are not very detailed.
I use PHP, and do the replacing with preg_replace(regex, "", "$str"). The string shown here was chosen at random, so there are no special constraints. I just wanted to delete the numbers inside the string to test my understanding of regex, and I failed.
Upvotes: 0
Views: 92
Reputation: 59671
It seems that you always want the first number (left of the "-▶") to remain, and all other numbers to be removed. Assuming Python, you should be able to use a negative look-ahead as follows:
import re
print(re.sub(r'\d+(?!.*-▶)', '', "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"))
# output (a doubled space remains where 0124 was removed)
# 151228-▶ Guido's Lounge Cafe Broadcast  It Will Be Alright () by Guido's Lounge Café
How it works: it replaces every run of digits with an empty string unless the character sequence -▶ appears somewhere after that run. Only the first run sits to the left of -▶, so it is the only one that is kept.
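To make the look-ahead's decision visible, here is a minimal sketch, assuming Python 3 and the sample string from the question. The helper name strip_later_numbers is hypothetical, and the extra step that merges leftover spaces is an addition, not part of the expression above.

```python
import re

def strip_later_numbers(text):
    # Delete every run of digits that has no "-▶" anywhere after it.
    cleaned = re.sub(r'\d+(?!.*-▶)', '', text)
    # Each deleted run leaves its neighbouring spaces behind; merge them.
    return re.sub(r' {2,}', ' ', cleaned)

s = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"

# Show, for each run of digits, whether the marker still lies ahead of it.
for m in re.finditer(r'\d+', s):
    verdict = 'kept' if '-▶' in s[m.end():] else 'removed'
    print(m.group(), verdict)

print(strip_later_numbers(s))
```

The loop reports 151228 as kept and the other two runs as removed. Note that the empty parentheses are still left behind, since the expression only targets digits.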
EDIT (in PHP):
$output = preg_replace("/\d+(?!.*-)/", "", "151228- Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Caf");
which returns (with a doubled space left where 0124 was removed):
151228- Guido's Lounge Cafe Broadcast  It Will Be Alright () by Guido's Lounge Caf
Upvotes: 0
Reputation: 33631
This is really better done with multiple regexes but here's a single one:
s/(\d+)([^0-9]+)\s+\d+([^(]+)[(]\d+[)]\s+(.+)$/$1$2$3$4/;
And the output is:
151228-▶ Guido's Lounge Cafe Broadcast It Will Be Alright by Guido's Lounge Café
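For anyone following along in Python rather than Perl, the same substitution could be sketched roughly as below. This is only a port of the expression above under the assumption that the string has exactly this shape (leading number, one bare number, one parenthesised number); the variable names are illustrative.

```python
import re

s = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"

# Groups 1-4 capture the parts to keep; the bare number and the
# parenthesised number sit between the groups and are dropped.
pattern = r'(\d+)([^0-9]+)\s+\d+([^(]+)[(]\d+[)]\s+(.+)$'
result = re.sub(pattern, r'\1\2\3\4', s)

print(result)
```

Because the pattern spells out the whole layout, it will simply not match (and leave the string unchanged) if a line has a different number of numeric fields.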
Upvotes: 0
Reputation: 135
If the string always contains the words Broadcast and Alright, you can anchor on them:
toDelete = re.findall('Broadcast ([0-9]+)', line)
toDelete2 = re.findall('Alright ([(0-9)]+)', line)
That should pull those numbers out (re.findall returns each result as a list of strings, and the second pattern also captures the surrounding parentheses). With that data you could then write a function that deletes whatever is in toDelete and toDelete2 from the line, where 'line' is the string you want to delete things from. I'd write it, but I don't know what language you're using.
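One way the deletion function could look, as a rough sketch in Python 3. It swaps re.findall for re.search, because findall returns only the captured text while search also gives the position of the capture, which avoids deleting an identical number elsewhere in the line. The name delete_captured is hypothetical.

```python
import re

def delete_captured(line, patterns):
    """Remove the text captured by group 1 of each pattern (first match only)."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m is None:
            continue  # pattern not present; leave the line as it is
        start, end = m.span(1)
        line = line[:start] + line[end:]
    return line

line = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"
result = delete_captured(line, [r'Broadcast ([0-9]+)', r'Alright ([(0-9)]+)'])
print(result)
```

The spaces on either side of each deleted piece stay in place, so the result contains doubled spaces that may need collapsing afterwards.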
Upvotes: 1