trejder

Reputation: 17495

Find everything in parentheses not containing a number

I need to do the exact opposite of what is asked in this question. I need to find everything in parentheses (including the parentheses themselves and, optionally, any leading or trailing space) that does not contain a four-digit number starting with 19 or 20 (a year).

I have a large collection of audio and video files whose names contain some "garbage" in parentheses, e.g.:

Abel the Kid - a Piece of My Love (feat. Snoop Dogg)
Alex Gaudino feat. Maxine Ashley - I'm in Love (I-039-M)

And I want to get rid of it:

Abel the Kid - a Piece of My Love
Alex Gaudino feat. Maxine Ashley - I'm in Love

At the same time, I want to keep all entries that contain a release date (a four-digit number starting with 19 or 20):

56K feat. Bejay - Save a Prayer (2003)
Mariah Carey feat. Trey Lorenz - I'll Be There (2012)
Savage Garden - To the Moon and Back (1997)

I am a complete noob at regular expressions, so I have only got as far as this pattern:

([(][^)]*[)])

However, it "catches" both the years and the "garbage", and it doesn't "catch" any leading or trailing space.

Upvotes: 1

Views: 52

Answers (3)

Jan

Reputation: 43169

Yet another option, for engines that support (*SKIP)(*FAIL):

\s*\((?:19|20)\d{2}\)(*SKIP)(*FAIL)|\s*\([^()]+\)

See a demo on regex101.com.
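Python's built-in re module does not support (*SKIP)(*FAIL), so here is a minimal sketch of the same idea under that assumption: the year alternative is captured and put back, and every other parenthesised chunk is dropped. The names PATTERN and strip_garbage are just illustrative, not part of any library.

```python
import re

# Group 1 captures "(19xx)" / "(20xx)" together with any leading whitespace.
# The second alternative matches every other parenthesised chunk.
PATTERN = re.compile(r"(\s*\((?:19|20)\d{2}\))|\s*\([^()]+\)")


def strip_garbage(title: str) -> str:
    # Keep the text when group 1 (a year) matched, otherwise replace with "".
    return PATTERN.sub(lambda m: m.group(1) or "", title)


if __name__ == "__main__":
    for name in [
        "Abel the Kid - a Piece of My Love (feat. Snoop Dogg)",
        "Alex Gaudino feat. Maxine Ashley - I'm in Love (I-039-M)",
        "56K feat. Bejay - Save a Prayer (2003)",
    ]:
        print(strip_garbage(name))
```

Because the year alternative comes first, a year in parentheses is consumed before the general alternative gets a chance to match it, which is roughly what (*SKIP)(*FAIL) achieves in PCRE.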

Upvotes: 2

izlin

Reputation: 2138

\h*\((?!\h?\d{2,4}\h?\)).*\)\h*
\h*                               - match any horizontal whitespace (spaces or tabs)
   \(                             - match the opening (
     (?!               )          - negative lookahead: do not match if the ( is followed by:
        \h?                           - an optional space or tab
           \d{2,4}                    - 2 to 4 digits, for years like 2003 or 97
                  \h?                 - an optional space or tab
                     \)               - the closing )
                        .*\)      - match everything up to the last ) on the line
                            \h*   - match any horizontal whitespace (spaces or tabs)

The \h* at the beginning and end removes any unwanted spaces and tabs before and after the brackets.

Try it here.
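If you want to try this from Python, note that the re module has no \h, so the sketch below swaps in [ \t] as a stand-in (that substitution is my assumption, and strip_parens is just an illustrative name).

```python
import re

# [ \t] replaces \h, which Python's re module does not support.
PATTERN = re.compile(r"[ \t]*\((?![ \t]?\d{2,4}[ \t]?\)).*\)[ \t]*")


def strip_parens(title: str) -> str:
    # Remove every match; bracketed runs of 2 to 4 digits are left alone.
    return PATTERN.sub("", title)


if __name__ == "__main__":
    print(strip_parens("Abel the Kid - a Piece of My Love (feat. Snoop Dogg)"))
    print(strip_parens("Savage Garden - To the Moon and Back (1997)"))
    # Caveat: .* is greedy, so this strips through the LAST ")" in the name.
    print(strip_parens("A (live) - B (2003)"))
```

One thing to watch for: because .* is greedy, a name with several bracketed parts is cut from the first non-year bracket through the final ), even if the last bracket holds a year. For names with a single bracketed part, as in the question, this does not matter.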

Upvotes: 2

Wiktor Stribiżew

Reputation: 626903

You can use

\s*\((?!\d+\))[^()]*\)

See the regex demo.

Details:

  • \s* - zero or more whitespaces
  • \( - a ( char
  • (?!\d+\)) - fail the match if there are one or more digits and a ) immediately to the right of the current location
  • [^()]* - zero or more chars other than ( and )
  • \) - a ) char.
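This pattern works as-is in Python's re module. Here is a minimal sketch, assuming you process the names as plain strings; LOOSE is the pattern above, while STRICT is a hypothetical variant that only protects four-digit numbers starting with 19 or 20, as the question asks. The function name clean is illustrative too.

```python
import re

# Pattern from the answer: keeps any all-digit parenthesised group.
LOOSE = re.compile(r"\s*\((?!\d+\))[^()]*\)")
# Assumed variant: keeps only four-digit groups starting with 19 or 20.
STRICT = re.compile(r"\s*\((?!(?:19|20)\d{2}\))[^()]*\)")


def clean(title: str, pattern: re.Pattern = LOOSE) -> str:
    # Delete every parenthesised chunk the lookahead does not protect.
    return pattern.sub("", title)


if __name__ == "__main__":
    print(clean("Alex Gaudino feat. Maxine Ashley - I'm in Love (I-039-M)"))
    print(clean("Mariah Carey feat. Trey Lorenz - I'll Be There (2012)"))
    print(clean("X (123)"))          # LOOSE keeps any digit-only group
    print(clean("X (123)", STRICT))  # STRICT removes it
```

Since [^()]* cannot cross a parenthesis, each bracketed chunk is judged on its own, so a name with both "garbage" and a year loses only the "garbage".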

Upvotes: 0
