valera5505

Reputation: 3487

Get links in list from string

I need some help with a regex in Python. I have a string such as:

17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19 
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19

How can I extract the URLs into a list like this?

['http://example1.com/viewtopic.php?f=8&t=189', 'http://example2.com', 'http://example3.com/threads/example-text-in-url.27304/']

Upvotes: 0

Views: 100

Answers (3)

Alex Tape

Reputation: 2291

Just try this; maybe it fits your needs :)

Regex

/^(.*;)/gm

String

17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19 
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19

Matches

1.  [0-66]    `17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;`
2.  [87-129]  `17:22:32;http://example2.com;example2.com;`
3.  [151-228] `20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;`
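
As the matches show, the greedy `.*` runs through the last semicolon on each line, so each match holds more than the URL. A minimal Python sketch of a narrower variant follows; it is an adaptation, not the pattern above, and it assumes the URL is always the second semicolon-separated field. The names `text` and `second_field` are made up for illustration.

```python
import re

# Sample input copied from the question.
text = """17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19"""

# re.MULTILINE plays the role of the /m flag: ^ matches at the start of each line.
# [^;]* cannot cross a semicolon, so the group holds only the second field.
second_field = re.compile(r"^[^;]*;([^;]*);", re.MULTILINE)

urls = second_field.findall(text)
print(urls)
```

`findall` already returns every match, so no equivalent of the `/g` flag is needed.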

Upvotes: 1

user2555451


I'm going to give a regex solution since that is what you asked for. All you need to do is capture the text that starts at http:// and runs up to the next ;. Below is a demonstration:

from re import findall

mystr = """
17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19
"""

# Lazily match from "http://" up to (but not including) the next semicolon.
print(findall(r"(http://.+?);", mystr))

Output:

['http://example1.com/viewtopic.php?f=8&t=189', 'http://example2.com', 'http://example3.com/threads/example-text-in-url.27304/']
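
If some lines might use https, or the URL might be the last field on a line with no trailing semicolon, a slightly more tolerant pattern could look like the sketch below. Both cases are assumptions; neither appears in the sample data. The helper name `extract_urls` is hypothetical.

```python
import re

# Accept http or https, then consume everything up to the next
# semicolon or whitespace character.
URL_PATTERN = re.compile(r"https?://[^;\s]+")


def extract_urls(text):
    """Return every http(s) URL found in semicolon-delimited text."""
    return URL_PATTERN.findall(text)


sample = """
17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19
"""

print(extract_urls(sample))
```

The negated character class stops at the first semicolon on its own, so no lazy quantifier or capture group is needed.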

Upvotes: 1

Thomas Orozco

Reputation: 55215

You don't need a regex here; use a CSV parser.

Assuming your data is in a file called data.csv:

import csv

# Fields are semicolon-delimited; the URL is the second field (index 1).
with open("data.csv", newline="") as f:
    referers = [row[1] for row in csv.reader(f, delimiter=";")]
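
Since the question starts from a string rather than a file, the same idea can be sketched with `io.StringIO`, which wraps a string in a file-like object that `csv.reader` accepts. The variable name `data` and the guard against short rows are additions for illustration, not part of the snippet above.

```python
import csv
import io

# Sample input copied from the question.
data = """17:25:31;http://example1.com/viewtopic.php?f=8&t=189;example1.com;127.0.0.1 2013-10-19
17:22:32;http://example2.com;example2.com;127.0.0.1 2013-10-19
20:18:28;http://example3.com/threads/example-text-in-url.27304/;example3.com;127.0.0.1 2013-10-19
"""

# io.StringIO makes the string iterable line by line, like an open file.
reader = csv.reader(io.StringIO(data), delimiter=";")

# Skip blank or malformed rows that have no second field.
referers = [row[1] for row in reader if len(row) > 1]
print(referers)
```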

Upvotes: 3
