Reputation: 123
I have a file that I'm reading in and turning into nested lists, which I then want to sort on the element at index 4 (the ZIP code):
jk43:23 Marfield Lane:Plainview:NY:10023
axe99:315 W. 115th Street, Apt. 11B:New York:NY:10027
jab44:23 Rivington Street, Apt. 3R:New York:NY:10002
ap172:19 Boxer Rd.:New York:NY:10005
jb23:115 Karas Dr.:Jersey City:NJ:07127
jb29:119 Xylon Dr.:Jersey City:NJ:07127
ak9:234 Main Street:Philadelphia:PA:08990
Here is my code:
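import operator

ex3_3 = open('ex1.txt')
exw = open('ex2_sorted.txt', 'w')
data = []
for line in ex3_3:
    # Split each record into [id, street, city, state, zip]
    items = line.rstrip().split(':')
    data.append(items)
# Sort the records on the ZIP code at index 4
print sorted(data, key=operator.itemgetter(4))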
Output:
[['jb23', '115 Karas Dr.', 'Jersey City', 'NJ', '07127'], ['jb29', '119 Xylon Dr.', 'Jersey City', 'NJ', '07127'], ['ak9', '234 Main Street', 'Philadelphia', 'PA', '08990'], ['jab44', '23 Rivington Street, Apt. 3R', 'New York', 'NY', '10002'], ['ap172', '19 Boxer Rd.', 'New York', 'NY', '10005'], ['jk43', '23 Marfield Lane', 'Plainview', 'NY', '10023'], ['axe99', '315 W. 115th Street, Apt. 11B', 'New York', 'NY', '10027']]
This all works fine; I just wonder if there is a way to do this without using "import operator"?
Upvotes: 1
Views: 3330
Reputation: 30943
A rough workalike would be:
print sorted(data, key=lambda items: items[4])
but operator.itemgetter is a bit faster. I'm using this program to benchmark both approaches:
#!/usr/bin/env python
import timeit
withlambda = 'lst.sort(key=lambda items: items[4])'
withgetter = 'lst.sort(key=operator.itemgetter(4))'
setup = """\
import random
import operator
random.seed(0)
lst = [(random.randrange(100000), random.randrange(100000),
        random.randrange(100000), random.randrange(100000),
        random.randrange(100000))
       for _ in range(10000)]
"""
n = 10000
print "With lambda:"
print timeit.timeit(withlambda, setup, number=n)
print "With getter:"
print timeit.timeit(withgetter, setup, number=n)
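It creates a random list of 5-item tuples and then runs sort() on that list repeatedly. With a list of 100,000 tuples sorted 1,000 times (the listing above uses 10,000 for both; adjust range() and n to match), on my MacBook Pro with Python 2.7.2 the withlambda version runs in about 55.4s and withgetter runs in about 46.1s.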
Note that as the lists grow large, the time spent in the sorting algorithm itself grows faster than the time spent fetching keys, so key extraction becomes a smaller share of the total. The difference is therefore much greater if you're sorting lots of little lists. Running the same test with a 1,000-item list repeated 100,000 times yields 22.4s for withlambda vs. 12.5s for withgetter.
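As a quick check that the lambda really is a workalike, here is a minimal sketch using a few of the records from the question, already split on ':'. The names by_lambda and by_getter are made up for this illustration, and print() is called with a single argument so the snippet should run on both Python 2 and Python 3.

```python
import operator

# A few records from the question's sample file, already split on ':'.
data = [
    ['jk43', '23 Marfield Lane', 'Plainview', 'NY', '10023'],
    ['jb23', '115 Karas Dr.', 'Jersey City', 'NJ', '07127'],
    ['jab44', '23 Rivington Street, Apt. 3R', 'New York', 'NY', '10002'],
    ['ak9', '234 Main Street', 'Philadelphia', 'PA', '08990'],
]

# No import needed for this one: the lambda returns the element at index 4.
by_lambda = sorted(data, key=lambda items: items[4])

# Same key, fetched by operator.itemgetter instead.
by_getter = sorted(data, key=operator.itemgetter(4))

# Both keys pull out index 4, so the two orderings are identical.
print(by_lambda == by_getter)
print([row[4] for row in by_lambda])
```

The ZIP codes here are compared as strings. That works because they all have five characters, so the leading zeros sort the way you'd expect.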
Upvotes: 4
Reputation: 184141
Construct or reorganize your sublists so that the thing you want to sort on comes first. In your case, the ZIP code should be element 0 instead of element 4. Then you can just sort them, with no key function at all.
Of course, the suitability of this ordering for other uses of the data must also be considered.
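A minimal sketch of that reorganization, assuming every line has exactly five ':'-separated fields as in the question's sample. The lines list stands in for the file, and the field names (user, street, and so on) are made up for this illustration.

```python
# Stand-in for the lines read from the file in the question.
lines = [
    'jk43:23 Marfield Lane:Plainview:NY:10023',
    'jb23:115 Karas Dr.:Jersey City:NJ:07127',
    'ap172:19 Boxer Rd.:New York:NY:10005',
]

data = []
for line in lines:
    user, street, city, state, zipcode = line.rstrip().split(':')
    # ZIP code goes first so that plain list comparison sorts on it.
    data.append([zipcode, user, street, city, state])

data.sort()  # no key function and no import needed
print([row[0] for row in data])
```

One difference from the key-based versions: when two records share a ZIP code, a plain sort() breaks the tie on the next element (here the user id) rather than keeping the original file order.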
Upvotes: 0