hello_its_me

Reputation: 793

Logstash Filter to Extract URL from Text Field into a New Field Called URL

I'm inputting a field called text. This field may at times contain a URL. What I would like to do is extract the URLs from text and put them in a new field called URL.

I tried grok, but it seems that grok patterns need a specific log format in order to work. For example, the following will work:

5546 hello www.google.com
{id} {text} {URL}

But the following wouldn't:

4324 hello my name is Ryan www.yahoo.com
{id} {text} {URL}

Instead, it would take hello as the text and not take www.yahoo.com as the URL. Is there a way around this? Please note that sometimes the text might look like the following:

www.gmail.com hello everyone

What filter can I use in order to extract the URL from the text coming into Logstash?

Thank you.

Upvotes: 0

Views: 836

Answers (1)

Alain Collins

Reputation: 16362

grok{} is the correct filter to take an input string and parse it into fields. The trick is to make one or more patterns that meet your requirements.

Please check out the grok debugger, which is a very useful tool for building your own patterns. Start slowly, working your way from left to right, making sure things are matched the way you want before moving on to the next piece of input.

The debugger also has a link to the standard grok patterns, with which you should familiarize yourself. Your example doesn't contain a URL per se, but does contain a host, which is often matched with %{HOSTNAME}.

To match an unknown amount of stuff before the host, try %{DATA}.
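
As a rough illustration of the "build your own pattern" idea, here is a minimal sketch. It swaps the stock patterns for an inline named capture, because %{HOSTNAME} may also match ordinary words such as "hello". It assumes the input field is named text, that URLs in it always start with "www.", and that the new field should be called URL; adjust the regex if your URLs look different.

```
filter {
  grok {
    # Assumption: a URL is any run of non-whitespace starting with "www.".
    # The pattern is unanchored, so it can match at the start, middle,
    # or end of "text"; the first match is stored in the new field "URL".
    match => { "text" => "(?<URL>www\.\S+)" }

    # Events without a URL are expected here, so do not tag them
    # with _grokparsefailure.
    tag_on_failure => []
  }
}
```

With this sketch, both "4324 hello my name is Ryan www.yahoo.com" and "www.gmail.com hello everyone" should yield a URL field, while the original text field is left untouched. Test it in the grok debugger against your real data before relying on it.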

Upvotes: 1
