Pekafu

Reputation: 55

Storing and accessing URLs

I have a list of URLs that I want to access with Python. What is the best way to store them? There are ~40 URLs and the list is fairly constant, but I would like to be able to update the complete list (I have a scraper that collects those URLs from a certain website). Currently the URLs are stored like this:

class urlList(object):     
 A = 'url1'
 B = 'url2'
 C = 'url3'

Accessing one is quite simple when you know its name, e.g. 'A'. But I have this script for looking them up:

wanted_urls = ['A','C']
def urlScrapper(wanted_urls):
    listing = urlList
    # Names of all non-callable, non-dunder attributes of the class
    temp = [attr for attr in dir(listing) if not callable(getattr(listing, attr)) and not attr.startswith("__")]
    for wanted_url in wanted_urls:
        for url in temp:
            if url == wanted_url:
                url_to_be_used = listing.url  # fails: looks for an attribute literally named "url"

listing.url doesn't work, since listing has no attribute named url. Is there any way of handling this other than spelling out every case like this:

if wanted_url == 'A':
    url_to_be_used = listing.A

Also, if there is a better way to store those URLs, suggestions are welcome.

Upvotes: 1

Views: 1924

Answers (3)

Pekafu

Reputation: 55

Using a dictionary, as MedAli suggested, seems like a nice solution:

urllist = {'A': 'url1', 'B': 'url2', 'C': 'url3'}
wanted_urls = ['A', 'C']

for wanted_url in wanted_urls:
    # Look the URL up directly by its name; no inner loop is needed
    url_to_be_used = urllist[wanted_url]
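
If some of the wanted names might not be in the dictionary (for example after the scraper has replaced the list), the lookup can be wrapped in a small helper. This is only a sketch: the function name pick_urls and the name 'Z' are made up for illustration and do not come from the question.

```python
def pick_urls(urls, wanted_names):
    """Return (found, missing): the URLs for known names and the unknown names."""
    found = {}
    missing = []
    for name in wanted_names:
        if name in urls:
            found[name] = urls[name]
        else:
            # Collect unknown names instead of raising KeyError
            missing.append(name)
    return found, missing


urllist = {'A': 'url1', 'B': 'url2', 'C': 'url3'}
found, missing = pick_urls(urllist, ['A', 'C', 'Z'])  # 'Z' is a hypothetical unknown name
print(found)
print(missing)
```

Collecting the missing names rather than failing on the first one makes it easier to see which entries the scraper dropped.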

Upvotes: 0

Burak Özdemir

Reputation: 560

@MedAli's solution is really nice if you don't need your URLs to be persistent. If you do, I'd suggest using TinyDB to keep them organized and stored on disk.

First, install TinyDB with pip (pip install tinydb).

Then:

from tinydb import TinyDB, Query

url_db = TinyDB('/path/to/URLs.json')
URL = Query()


url_db.insert({'URLName': 'Foo', 'Address': 'http://foo.bar'})
url_db.search(URL.URLName == 'Foo')

Output:

[{u'URLName': u'Foo', u'Address': u'http://foo.bar'}]

Upvotes: 1

Mohamed Ali JAMAOUI

Reputation: 14689

You can use a dictionary for that:

urllist = {'A':'url1', 'B':'url2', 'C':'url3'}

If you want to access the URL for A, you do:

urllist["A"]

To get the list of all URL names:

>>> urllist = {'A':'url1', 'B':'url2', 'C':'url3'} 
>>> urllist.keys() 
['A', 'C', 'B']

To get the list of all available URLs:

>>> urllist.values() 
['url1', 'url3', 'url2']

To add a new URL:

urllist["D"] = "url4"
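
Since the question also asks about updating the complete list, one option is to keep the dictionary in a JSON file using the standard json module. The following is a rough sketch, not a definitive implementation: the helper names save_urls and load_urls and the file name urls_example.json are assumptions made for illustration.

```python
import json
import os
import tempfile


def save_urls(urls, path):
    """Write the whole name -> URL dictionary to a JSON file, replacing its contents."""
    with open(path, 'w', encoding='utf-8') as f:
        json.dump(urls, f, indent=2, sort_keys=True)


def load_urls(path):
    """Read the name -> URL dictionary back from the JSON file."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


# Hypothetical location; any writable path would do
path = os.path.join(tempfile.gettempdir(), 'urls_example.json')

save_urls({'A': 'url1', 'B': 'url2', 'C': 'url3'}, path)

urllist = load_urls(path)
urllist['D'] = 'url4'      # add a new entry
save_urls(urllist, path)   # persist the updated list

reloaded = load_urls(path)
print(reloaded['D'])
```

When the scraper produces a fresh list, calling save_urls with the new dictionary overwrites the old file in one step.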

Upvotes: 2
