gigaman

Reputation: 27

How to exclude brackets at the end of a URL

I am new to regex, so any help is really appreciated. I have an expression to identify a URL: (http[^'\"]+)

Unfortunately, for some URLs I get extra square brackets at the end, for instance "http://example.com]]".

The result I want is "http://example.com".

How do I get rid of those brackets with the help of the regex I wrote above?

Upvotes: 2

Views: 306

Answers (2)

Ryszard Czech

Reputation: 18611

End the match at a word boundary, i.e. between a word character and a non-word character:

(http[^'"]+)\b

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    http                     'http'
--------------------------------------------------------------------------------
    [^'"]+                   any character except: ''', '"' (1 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
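As a rough sketch of how this pattern behaves, here it is applied with Python's `re` module. The choice of Python and the helper name `extract_url` are assumptions for illustration; the question does not say which language or regex engine is in use.

```python
import re

# Pattern from the answer above: a greedy negated class, after which the
# engine backtracks until the match ends at a word boundary.
URL_WITH_BOUNDARY = re.compile(r"""(http[^'"]+)\b""")


def extract_url(text):
    """Return the first URL-like match in text, or None (hypothetical helper)."""
    match = URL_WITH_BOUNDARY.search(text)
    return match.group(1) if match else None


print(extract_url('"http://example.com]]"'))      # http://example.com
print(extract_url('"http://example.com/path/"'))  # http://example.com/path
```

Note the second call: because the match must end right after a word character, any trailing non-word character is dropped, including a final slash. If trailing slashes matter, this approach may be too aggressive.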

Upvotes: 2

Jan

Reputation: 43169

What you actually have is called a negated character class, so just add characters that should not be matched. In addition, there's not really a need for a capturing group. That said, you could use

http[^'"\]\[]+
#       ^^^^

Note that this excludes square brackets anywhere in the URL, not just at the end, so the match stops at the first bracket it meets. See a demo on regex101.com.
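To make that trade-off concrete, here is a small sketch in Python (an assumed language, since the question names none). The first helper uses the pattern from this answer. The second is not part of either answer: it keeps the original pattern and strips trailing brackets afterwards with `str.rstrip`, which is one possible way to remove brackets only at the end. Both function names are hypothetical.

```python
import re

# Pattern from this answer: brackets are added to the negated class.
NO_BRACKETS = re.compile(r"""http[^'"\]\[]+""")

# Original pattern from the question, without the capturing group.
ORIGINAL = re.compile(r"""http[^'"]+""")


def match_no_brackets(text):
    """Match a URL, stopping at the first quote or square bracket."""
    match = NO_BRACKETS.search(text)
    return match.group(0) if match else None


def strip_trailing_brackets(text):
    """Alternative sketch: match with the original pattern, then trim
    square brackets from the end of the match only."""
    match = ORIGINAL.search(text)
    return match.group(0).rstrip("[]") if match else None


print(match_no_brackets('"http://example.com]]"'))              # http://example.com
print(match_no_brackets('"http://example.com/a[1]/b"'))         # http://example.com/a
print(strip_trailing_brackets('"http://example.com/a[1]/b]]"'))  # http://example.com/a[1]/b
```

The second call shows the match being cut short at an inner bracket, while the post-processing variant leaves inner brackets intact.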

Upvotes: 2
