Reputation: 752
I'm using a Python object with a couple of attributes to organize a data model, but I'm wondering whether this is any less efficient than using a key-based dictionary. My model stores MP3 tag data and looks like this:
class Mp3Model:
    def __init__(self, path, filename):
        self.path = path
        self.filename = filename
        self.artist = ''
        self.title = ''
        self.album = ''
        self.tracknumber = ''
        self.genre = ''
        self.date = ''
The model is used as such:
mp3s = []
for file in files:
    if os.path.splitext(file)[1] == '.mp3':
        # Append a new Mp3Model to the mp3s list for each file found
        mp3s.append(Mp3Model(os.path.join(self.dir, file), file))
Would using a key-based dictionary, or even a simple list, provide much of a performance improvement? The length of the mp3s list is highly variable depending on how many files are found in a given directory, and the program can slow to a crawl (I haven't implemented any threading yet) when I scan directories with tons of files.
Upvotes: 2
Views: 178
Reputation: 13
I don't know whether another method is better, but you can do it as follows:
from collections import namedtuple
Mp3Model = namedtuple("Mp3Model", "path filename artist title")
This creates a simple Mp3Model class.
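To illustrate how the namedtuple approach could cover the full model from the question, here is a minimal sketch. The extra fields and the empty-string defaults are assumptions taken from the question's __init__, and the path and tag values are made up for the example.

```python
from collections import namedtuple

# Field names mirror the attributes from the question; the empty-string
# defaults are an assumption that matches the original __init__.
Mp3Model = namedtuple(
    "Mp3Model",
    "path filename artist title album tracknumber genre date",
    defaults=("",) * 6,  # applies to the last six fields (Python 3.7+)
)

# Hypothetical file, used only for illustration.
track = Mp3Model("/music/song.mp3", "song.mp3")

# namedtuples are immutable, so _replace returns an updated copy
# instead of modifying the original.
tagged = track._replace(artist="Some Artist", title="Some Title")
```

Because instances are immutable, filling in tags after construction means creating a new tuple each time, which may or may not suit the way the tags are read.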
Upvotes: 0
Reputation: 2909
Before you conclude that any particular part of your code is the slow part, profile the program. It's probably your innermost loop, but test that assumption, don't just leap to it.
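As a rough sketch of what profiling could look like with the standard library's cProfile module: scan_directory below is a hypothetical stand-in for the real scanning loop, and the file names are synthetic.

```python
import cProfile
import io
import pstats


def scan_directory(names):
    # Hypothetical stand-in for the real scanning loop.
    return [name for name in names if name.endswith(".mp3")]


profiler = cProfile.Profile()
profiler.enable()
result = scan_directory(["a.mp3", "b.txt", "c.mp3"] * 1000)
profiler.disable()

# Sort by cumulative time and show the ten most expensive calls.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(10)
report = stream.getvalue()
print(report)
```

For a whole script, running python -m cProfile -s cumulative yourscript.py gives a similar report without changing the code.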
For CPU-bound loads, try PyPy.
For I/O-bound loads, try caching, or aggregating lots of small files into a smaller number of large files somehow. Opening a file tends to be slow compared to reading a bit of sequential data.
HTH
Upvotes: 0
Reputation: 86844
Unless you declare __slots__ for your object, object attributes are stored in an underlying dict anyway, so using a dict would be marginally faster than an object. However, the difference would be negligible compared to the rest of your code.
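A minimal sketch of the difference, using a trimmed-down version of the model from the question (only two attributes are kept for brevity, and SlottedMp3Model is a name invented for this example):

```python
class Mp3Model:
    def __init__(self, path, filename):
        self.path = path
        self.filename = filename


class SlottedMp3Model:
    # __slots__ replaces the per-instance __dict__ with fixed storage.
    __slots__ = ("path", "filename")

    def __init__(self, path, filename):
        self.path = path
        self.filename = filename


plain = Mp3Model("/music/a.mp3", "a.mp3")
slotted = SlottedMp3Model("/music/a.mp3", "a.mp3")

# Attributes of a regular instance live in an ordinary dict.
print(plain.__dict__)

# A slotted instance has no per-instance dict at all.
print(hasattr(slotted, "__dict__"))
```

The slotted variant mainly saves memory per instance; new attributes that are not listed in __slots__ can no longer be assigned.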
The selection of a data structure should depend on various other factors:
Optimising for your use case would lead to much higher returns.
Upvotes: 5
Reputation: 9711
Using a dict will be more efficient than using a class. You avoid dealing with all the overhead of classes, attribute access, etc. Not to mention, accessing items in a dictionary via a key is one of the most efficient and optimized pieces of code in Python.
A few caveats, though:
Upvotes: 2
Reputation: 15990
This is just a guess, but I believe the bottleneck is probably in reading the files from the OS, not in building the list.
That being said, you can test it simply by just creating a list with all the filenames and comparing the performance to building a list of objects with the file names.
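One way that comparison could be set up is with timeit. This is only a sketch: the file names are synthetic rather than read from disk, and the directory name is a placeholder, so it isolates the cost of building the list and says nothing about I/O.

```python
import os
import timeit


class Mp3Model:
    def __init__(self, path, filename):
        self.path = path
        self.filename = filename
        self.artist = ''
        self.title = ''
        self.album = ''
        self.tracknumber = ''
        self.genre = ''
        self.date = ''


# Synthetic file names stand in for a real directory listing (assumption).
files = ["track%05d.mp3" % i for i in range(10000)]
directory = "/music"  # placeholder directory


def build_names():
    # Baseline: keep only the matching file names.
    return [f for f in files if os.path.splitext(f)[1] == '.mp3']


def build_models():
    # Same filter, but wrap each name in an Mp3Model.
    return [Mp3Model(os.path.join(directory, f), f)
            for f in files if os.path.splitext(f)[1] == '.mp3']


names_time = timeit.timeit(build_names, number=10)
models_time = timeit.timeit(build_models, number=10)
print("names only: %.4f s" % names_time)
print("models:     %.4f s" % models_time)
```

If both numbers are tiny next to the time a real scan takes, the slowdown is more likely in reading the files than in the choice of data structure.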
Upvotes: 0