
python performance - list of tuples or dictionary for relationships?

I want to define a simple replacement mapping that will be iterated over to clean up a string. For example, to clean up an address, which of the following is better practice (performance, style, etc.)?

a)

dictionary = {'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}
address = 'North South East West'
for key in dictionary:
    address = address.replace(key, dictionary[key])

or b)

tuple_list = [('North', 'N'), ('South', 'S'), ('East', 'E'), ('West', 'W')]
address = 'North South East West'
for tuple in tuple_list:
    address = address.replace(tuple[0], tuple[1])

Thanks!

Upvotes: 1

Answers (3)

Martijn Pieters

Reputation: 1123770

There is not going to be much of a speed difference between the two; you are iterating over two sequences, and only the exact datatype of those structures differs.

Your dictionary loop could be ever so slightly more efficient by using the .iteritems() method (on Python 3, where .iteritems() no longer exists, use .items() instead):

dictionary = {'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}
address = 'North South East West'
for key, value in dictionary.iteritems():
    address = address.replace(key, value)

Since .iteritems() gives you an iterable of (key, value) pairs, this method is exactly the same as iterating over a list of tuples.

Using the timeit module, you can see there is no real difference between the two methods:

>>> import timeit
>>> def dictionary(address, d={'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}):
...     for s, repl in d.iteritems():
...         address = address.replace(s, repl)
... 
>>> def tuples(address, t=[('North', 'N'), ('South', 'S'), ('East', 'E'), ('West', 'W')]):
...     for s, repl in t:
...         address = address.replace(s, repl)
... 
>>> timeit.timeit("test('North South East West')", 'from __main__ import dictionary as test')
2.5873939990997314
>>> timeit.timeit("test('North South East West')", 'from __main__ import tuples as test')
2.5879111289978027
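
For Python 3, the same comparison could be sketched roughly as follows. This is an illustrative rewrite rather than the benchmark that produced the numbers above: the function names, the module-level constants and the `number` value are all assumptions made for the example, and absolute timings will vary by machine.

```python
import timeit

REPLACEMENTS_DICT = {'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}
REPLACEMENTS_LIST = [('North', 'N'), ('South', 'S'), ('East', 'E'), ('West', 'W')]


def replace_with_dict(address, d=REPLACEMENTS_DICT):
    # .items() yields (key, value) pairs, just like the list of tuples
    for s, repl in d.items():
        address = address.replace(s, repl)
    return address


def replace_with_tuples(address, t=REPLACEMENTS_LIST):
    # Each element unpacks directly into (search, replacement)
    for s, repl in t:
        address = address.replace(s, repl)
    return address


if __name__ == '__main__':
    # Time each variant on the same input; number=100000 is an arbitrary choice
    for func in (replace_with_dict, replace_with_tuples):
        seconds = timeit.timeit(lambda: func('North South East West'), number=100000)
        print(func.__name__, seconds)
```

Both functions do the same work per iteration, so any gap between the two printed timings should be within measurement noise.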

Upvotes: 4

Iguananaut

Reputation: 23346

%%timeit
dictionary = {'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}
address = 'North South East West'
for key in dictionary:
    address = address.replace(key, dictionary[key])
1000000 loops, best of 3: 1.84 us per loop

%%timeit
tuple_list = [('North', 'N'), ('South', 'S'), ('East', 'E'), ('West', 'W')]
address = 'North South East West'
for tuple in tuple_list:
    address = address.replace(tuple[0], tuple[1])
100000 loops, best of 3: 1.9 us per loop

Like Martijn said, there's effectively no difference.

Upvotes: 0

CashCow

Reputation: 31445

For just iterating over it, you would use a list.

For looking up keys, use a dict.

Iterating over a dict is not necessarily slower; it is just not what a dict is intended for.

Looking up by key will be considerably faster (than trying to linearly find an element) if you use a dict, and therefore if you are going to use the collection for that purpose, use one. Otherwise do not.
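
To make the distinction concrete, here is a minimal sketch of the two kinds of lookup. The helper name `find_in_pairs` and the variable names are made up for this example.

```python
pairs = [('North', 'N'), ('South', 'S'), ('East', 'E'), ('West', 'W')]
mapping = dict(pairs)


def find_in_pairs(pairs, key):
    # Linear search: checks each pair in turn, so cost grows with the list
    for k, v in pairs:
        if k == key:
            return v
    return None


# Dict lookup: a hash lookup, constant time on average regardless of size
abbrev_from_dict = mapping.get('West')
# List lookup: has to walk past 'North', 'South' and 'East' first
abbrev_from_list = find_in_pairs(pairs, 'West')
```

With only four entries the difference is negligible; it starts to matter as the collection grows.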

In your case you are not looking up "North", "South", "East" and "West" in your dictionary; you are doing the converse: finding them in your "address" string.

Your fastest algorithm might be to tokenise (split) your address string, run through each element and look up in the dict to see if it should be replaced, and then rejoin.

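Not only will it be more efficient, but it will also avoid "clbuttic" replacements, where a substring inside a longer word gets replaced (for example, 'Northampton' becoming 'Nampton'), unless you want those of course.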
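
The split, look up and rejoin idea could be sketched like this. The function name `abbreviate` is an assumption for the example, and splitting on whitespace is the simplest possible tokeniser: it collapses runs of spaces and will not match words with punctuation attached (such as "North,").

```python
REPLACEMENTS = {'North': 'N', 'South': 'S', 'East': 'E', 'West': 'W'}


def abbreviate(address, replacements=REPLACEMENTS):
    # Split on whitespace so only whole words are considered
    words = address.split()
    # Look each word up in the dict; keep it unchanged if there is no entry
    return ' '.join(replacements.get(word, word) for word in words)
```

For example, `abbreviate('12 North Northampton Road')` gives `'12 N Northampton Road'`, whereas the chained `.replace()` loop would produce `'12 N Nampton Road'`.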

Upvotes: 2
