Jerry Zhang

Reputation: 57

Delete a number in a string with regex

I have this string:

151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café

I want to use a regex to delete the numbers "0124" and "20140718" inside the string, but leave the number "151228" at the beginning untouched. I tried many times, but still couldn't find a way to do that using only one expression. The best I could do was this:

151228-▶ Guido's Lounge Cafe Broadcast It Will Be Alright ) by Guido's Lounge Café

by the expression: [^\d+]\d+

That is almost a success, but the open parenthesis of "20140718" is also deleted.

I am not very good at regex, and that string is just a test for myself. I want to know whether there is a single expression that can deal with it, or if I have to do multiple. Can anybody recommend some articles for me about regex as well? I read some, but many are not very detailed.

I use PHP and do the replacing with preg_replace(regex, "", "$str"). The string shown here was chosen at random, so there are no special constraints. I just wanted to delete the numbers inside the string to test my understanding of regex, and I failed.

Upvotes: 0

Views: 92

Answers (3)

Martin Konecny

Reputation: 59671

It seems that you always want the first number (left of the "-▶") to remain, and all other numbers to be removed. Assuming you are using Python, you should be able to use a negative look-ahead as follows:

import re
print(re.sub(r'\d+(?!.*-▶)', '', "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"))

# output
# 151228-▶ Guido's Lounge Cafe Broadcast  It Will Be Alright () by Guido's Lounge Café

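How it works: \d+ matches a run of digits, and the negative look-ahead (?!.*-▶) rejects the match if the -▶ character sequence appears anywhere later in the string. Only the first number has -▶ after it, so it is kept, and every other run of digits is replaced with an empty string.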
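For comparison, here is a small sketch that runs the look-ahead pattern next to the pattern from the question. It also tries a look-behind variant, (?<=\D)\d+, which is my own assumption rather than anything the question asked for: it only works if the number to keep sits at the very start of the string. The variable names are made up for illustration.

```python
import re

title = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"

# Look-ahead: remove digit runs that have no "-▶" anywhere after them.
by_lookahead = re.sub(r'\d+(?!.*-▶)', '', title)

# Look-behind variant (assumes the number to keep starts the string):
# remove digit runs that are preceded by a non-digit character.
by_lookbehind = re.sub(r'(?<=\D)\d+', '', title)

# Pattern from the question: [^\d+] consumes one character that is
# neither a digit nor "+", so the space or "(" before each number
# is deleted along with the number itself.
from_question = re.sub(r'[^\d+]\d+', '', title)

print(by_lookahead)
print(by_lookbehind)
print(from_question)
```

The two look-around versions only assert what surrounds the digits without consuming it, which is why the open parenthesis survives.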

EDIT (in PHP):

$output = preg_replace("/\d+(?!.*-)/", "", "151228- Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Caf");

which returns:

151228- Guido's Lounge Cafe Broadcast  It Will Be Alright () by Guido's Lounge Caf

Upvotes: 0

Craig Estey

Reputation: 33631

This is really better done with multiple regexes, but here's a single one (Perl substitution syntax):

s/(\d+)([^0-9]+)\s+\d+([^(]+)[(]\d+[)]\s+(.+)$/$1$2$3$4/;

And the output is:

151228-▶ Guido's Lounge Cafe Broadcast It Will Be Alright by Guido's Lounge Café
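The multiple-regex route could be sketched roughly like this in Python. The two patterns are only one possible split, and they assume the unwanted numbers are either wrapped in parentheses or surrounded by whitespace, as in the sample string.

```python
import re

title = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"

# Pass 1: remove a parenthesised number together with the whitespace before it.
step1 = re.sub(r'\s*\(\d+\)', '', title)

# Pass 2: remove a number that follows whitespace, plus its trailing whitespace.
# The leading "151228" is not preceded by whitespace, so it is left alone.
step2 = re.sub(r'(?<=\s)\d+\s+', '', step1)

print(step2)
```

Each pass stays short and readable, which is the main argument for not forcing everything into one expression.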

Upvotes: 0

DrSocket

Reputation: 135

If the string always contains the words "Broadcast" and "Alright", you can anchor on them:

import re
# 'line' is the string you want to delete things from
toDelete = re.findall(r'Broadcast ([0-9]+)', line)
toDelete2 = re.findall(r'Alright ([(0-9)]+)', line)

That should pull those numbers out. With that data you could then write a function that deletes whatever is in toDelete from the line (by 'line' I mean the string you want to delete things from). I'd write it, but I don't know what language you're using.
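For what it's worth, a minimal Python sketch of such a function might look like the following. delete_all is a hypothetical helper name, and the final whitespace clean-up is an extra step I am assuming you would want.

```python
import re

def delete_all(line, pieces):
    """Remove the first occurrence of each literal piece from line."""
    for piece in pieces:
        line = line.replace(piece, '', 1)
    return line

line = "151228-▶ Guido's Lounge Cafe Broadcast 0124 It Will Be Alright (20140718) by Guido's Lounge Café"

to_delete = re.findall(r'Broadcast ([0-9]+)', line)    # number after "Broadcast"
to_delete2 = re.findall(r'Alright ([(0-9)]+)', line)   # "(number)" after "Alright"

cleaned = delete_all(line, to_delete + to_delete2)

# Optional: collapse the double spaces left behind by the deletions.
tidy = re.sub(r' {2,}', ' ', cleaned)

print(tidy)
```

Because the pieces are removed as literal text, this relies on the captured numbers not also appearing earlier in the string, which holds for the sample line.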

Upvotes: 1
