Reputation: 167
I have a list of data and noticed that every item appears twice. Is there a way to remove the duplicates and simplify the list? Below is the Python code I wrote:
import requests
import re
import bs4
r = requests.get("http://as.com/tag/moto_gp/a/")
r.raise_for_status()
html = r.text
matches = re.findall(r"http://motor\.as\.com/motor/\d+/\d+/\d+/motociclismo/\d+_\d+\.html", html)
print(matches)
Upvotes: 1
Views: 138
Reputation: 16081
Assuming matches is a list (re.findall returns one), you can remove the duplicates by converting it to a set and back to a list.
In [1]: a = [1,1,2,2,3,3,4,4,5]
In [2]: list(set(a))
Out[2]: [1, 2, 3, 4, 5]
Your code needs only one modification:
matches = list(set(re.findall(r"http://motor\.as\.com/motor/\d+/\d+/\d+/motociclismo/\d+_\d+\.html", html)))
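One caveat: a set does not keep the original order, so the links may come back shuffled. If the order on the page matters, a minimal sketch along these lines might help. It assumes Python 3.7 or later, where dicts keep insertion order. The helper name unique_in_order and the sample html string (with made-up URLs) are hypothetical, used only so the example runs without a network request.

```python
import re

# Hypothetical sample standing in for r.text; these URLs are made up.
html = (
    '<a href="http://motor.as.com/motor/2016/05/08/motociclismo/1462700000_000002.html">b</a>'
    '<a href="http://motor.as.com/motor/2016/05/08/motociclismo/1462700000_000002.html">b</a>'
    '<a href="http://motor.as.com/motor/2016/05/07/motociclismo/1462600000_000001.html">a</a>'
    '<a href="http://motor.as.com/motor/2016/05/07/motociclismo/1462600000_000001.html">a</a>'
)

PATTERN = r"http://motor\.as\.com/motor/\d+/\d+/\d+/motociclismo/\d+_\d+\.html"


def unique_in_order(items):
    """Return the items without duplicates, keeping first-seen order."""
    # dict keys are unique and (in Python 3.7+) remember insertion order.
    return list(dict.fromkeys(items))


matches = unique_in_order(re.findall(PATTERN, html))
print(matches)
```

If the order does not matter, list(set(...)) as shown above is enough.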
Upvotes: 7