physlexic

Reputation: 858

Python: Other options versus using '__contains__'? I was told I should not use it

I have a working script (below), but I would like to know if there is a better solution to the first three lines.

I have several files in a folder, and a script that processes them based on a particular and conserved <string> in each file's name. However, I was told I should not use __contains__ (I am not a CS major, and don't fully understand why). Is there a better option? I could not find any other concise solution.

Thanks.

files = os.listdir(work_folder)
for i in files:
    if i.__contains__('FOO'):
        for i in range(number_of_files):
            old_file = 'C:/path/to/file'
            with open(merged_file, 'a+') as outfile:
                with open(old_file) as infile:
                    for line in infile:
                        outfile.write(line)

Upvotes: 2

Views: 183

Answers (3)

abarnert

Reputation: 365707

As Daniel Roseman explains, the double-underscore methods aren't there to be called by you, they're there to be called by the Python interpreter or standard library.

So, that's the main reason you shouldn't call them: It's not idiomatic, so it will confuse readers.


But suppose all you know is that there must be some operation you are intended to use, which Python implements by calling the __contains__ method, and you have no idea what that operation is. How do you find it?

Well, you could just go to Stack Overflow, and someone helpful like Daniel Roseman will tell you, of course. But you can also search for __contains__ in the Python documentation. What you'll find is this:

object.__contains__(self, item)

Called to implement membership test operators. Should return true if item is in self, false otherwise.

So, self.__contains__(item) is there for Python to implement item in self.

And now you know what to write: 'FOO' in i.
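As a minimal sketch of that operator applied to file names (the list `names` is made up for illustration and stands in for whatever os.listdir(work_folder) returns):

```python
# Hypothetical file names standing in for os.listdir(work_folder).
names = ["a_FOO_1.txt", "b_BAR_1.txt", "c_FOO_2.txt"]

# 'FOO' in name is the idiomatic spelling of name.__contains__('FOO').
foo_names = [name for name in names if "FOO" in name]
print(foo_names)
```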


And if you read on in those linked docs, you'll see that it isn't quite true that i.__contains__('FOO') always does the same thing as 'FOO' in i. It is true in the most common cases (including where i is a string, as it is here), but if i has no __contains__ method and is instead an iterable (it defines __iter__) or an old-style sequence (it defines __getitem__), in will fall back to those protocols instead.

So, that's another reason not to directly call __contains__. If you later add some abstraction on top of strings, maybe a virtual iterable of grapheme clusters or something, it may not implement __contains__, but in will still work.
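To make the fallback concrete, here is a small sketch. The class `Letters` is hypothetical, invented only to show an object that is iterable but defines no __contains__:

```python
class Letters:
    """Hypothetical wrapper that defines __iter__ but not __contains__."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        return iter(self.text)


letters = Letters("FOO")

# The in operator falls back to iterating and comparing each element.
found = "F" in letters

# The class has no __contains__ attribute, so a direct call cannot work.
has_dunder = hasattr(letters, "__contains__")
print(found, has_dunder)
```

Note that the fallback tests element membership, so with this wrapper 'FOO' in letters would be false even though 'F' in letters is true; a real abstraction would need to decide what membership should mean.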

Upvotes: 2

wim

Reputation: 362687

It would be more usual to write

if 'FOO' in i:

instead of

if i.__contains__('FOO'):

However, I would go one further than that and suggest your use case is better suited to the glob module:

import glob
import os
foo_files = glob.glob(os.path.join(work_folder, '*FOO*'))
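One way the glob result might feed the merge loop from the question is sketched below. The function name `merge_matching` and its parameters are hypothetical, and the sketch assumes the merged file's own name does not match the pattern (otherwise it would be read while being written):

```python
import glob
import os


def merge_matching(work_folder, pattern, merged_file):
    """Append every file in work_folder whose name matches pattern to merged_file."""
    # Sort so the merge order is predictable; glob itself gives no ordering guarantee.
    paths = sorted(glob.glob(os.path.join(work_folder, pattern)))
    with open(merged_file, "a") as outfile:
        for path in paths:
            with open(path) as infile:
                for line in infile:
                    outfile.write(line)
    return paths
```

A call such as merge_matching(work_folder, '*FOO*', merged_file) would then replace both the listdir loop and the name check.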

Upvotes: 5

Daniel Roseman

Reputation: 599600

Generally in Python, double-underscore methods should not be called directly; you should use the global functions or operators that correspond to them. In this case, you would write if 'FOO' in i:
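A few such correspondences for a plain string, as an illustrative sketch (the value of `name` is made up):

```python
name = "sample_FOO_01.txt"

# Each pair is equivalent for a str; the first form is the idiomatic one.
assert ("FOO" in name) == name.__contains__("FOO")  # in operator
assert len(name) == name.__len__()                  # len() function
assert name[0] == name.__getitem__(0)               # indexing
assert (name == "other") == name.__eq__("other")    # == operator
```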

Upvotes: 8
