Simon A. Eugster

Reputation: 4264

Does the “in” operator have side effects?

I have code which writes bytes to a serial port (pySerial 3.5) and then receives bytes.

The code should work for MicroPython too, which uses UART instead of pySerial. With UART, I have to add a small delay before reading.

The user should not have to pass an additional flag saying whether to add that delay, because the serial_port object is already platform specific: for example, the UART implementation provides a .any() method, which the pySerial implementation does not have.

So my first attempt is to check for this method, and only delay when it exists.

def __init__(self, serial_port):
    self.serial_port = serial_port
    self.serial_port.write(my_bytes)

    # When checking for any on serial_port, I receive no bytes.
    if "any" in self.serial_port:
        print("UART specific delay will happen here")
    # When instead checking with getattr(self.serial_port, "any", None), bytes come in

    raw_config = self.serial_port.read(128)

As soon as I add this "any" in self.serial_port check, the read() method returns an empty byte array.

When I remove the check, I get bytes again. When I replace the check by getattr(self.serial_port, "any", None), I get bytes too. When I just run time.sleep() or anything else, I get bytes. When I add the in check, bytes are gone.

Why on earth? Isn't an in check supposed to be side effect free?

(All runs were executed with pySerial ports only.)

Upvotes: 3

Views: 102

Answers (4)

kaya3

Reputation: 51037

This is not mentioned in the PySerial documentation, but it is a consequence of the fact that the class inherits from io.IOBase in the standard library. The documentation for IOBase states:

IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream.

So although the PySerial class does not actually implement the __contains__ method, and in fact neither do any of the other classes in its hierarchy including IOBase, the iterator protocol provides a different mechanism for the in operator to work:

For user-defined classes which do not define __contains__() but do define __iter__(), x in y is True if some value z, for which the expression x is z or x == z is true, is produced while iterating over y.

So, the in operator iterates over the stream in order to test membership, and that operation has the side effect of consuming from the stream (because that's how iterating over an I/O stream works in Python).
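
The effect can be reproduced without any hardware. The following is only a sketch: io.BytesIO stands in for the serial port (an assumption, it is not pySerial), chosen because it also derives from io.IOBase and does not define __contains__. The variable names are illustrative.

```python
import io

# io.BytesIO stands in for the serial port here (assumption): like the
# pySerial class it derives from io.IOBase and defines no __contains__.
stream = io.BytesIO(b"first line\nsecond line\n")

# hasattr() only looks up an attribute; nothing is read from the stream.
has_any = hasattr(stream, "any")
position_after_hasattr = stream.tell()

# `in` falls back to iteration: every line is read and compared with "any".
found = "any" in stream
position_after_in = stream.tell()

# The stream is now exhausted, so read() returns no bytes.
remaining = stream.read()

print(has_any, position_after_hasattr)
print(found, position_after_in)
print(remaining)
```

An attribute check such as hasattr(self.serial_port, "any"), or the getattr variant from the question, detects the platform without touching the pending bytes.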

Upvotes: 6

Kostas Mouratidis

Reputation: 1255

As people in the comments said, in usually calls __contains__. In this particular case, it is probably consuming some sort of iterator/generator. Here's a simple example with a file handle:

file.txt:

hello
hello

>>> f = open("file.txt")
>>> f.read()
'hello\nhello\n'

>>> f = open("file.txt")
>>> "hello\n" in f  # iterates over all lines
True
>>> f.read()
''
>>> f.seek(0)
0
>>> f.read()
'hello\nhello\n'

Upvotes: 2

Robin Gugel

Reputation: 1196

If self.serial_port is lazily evaluated or can only be iterated once, this makes total sense. Try this and you'll see:

a = (x for x in (1, 2, 3, 4, 5))  # generator expression, evaluated lazily
print(list(a))  # [1, 2, 3, 4, 5]
a = (x for x in (1, 2, 3, 4, 5))
5 in a  # consumes every element up to and including 5
print(list(a))  # []
a = (x for x in (1, 2, 3, 4, 5))
2 in a  # stops at the first match, consuming 1 and 2
print(list(a))  # [3, 4, 5]

Hope you see how in can mess up your evaluation here :)

Note that if you replace the generator expressions (x for x in ...) with list comprehensions [x for x in ...], in no longer has side effects like that, because a list is built eagerly and can be iterated multiple times.

Upvotes: 1

wovano

Reputation: 5073

As mentioned in a comment by deceze, in just calls the __contains__() method. Depending on how this is implemented, it can have side effects.

A simple example to demonstrate this:

class Example:

    def __contains__(self, x):
        print('Side effect!')


x = Example()
if 'something' in x:
    print('Found')
else:
    print('Not found')

Output:

Side effect!
Not found

In the case of your serial_port example, I guess the __contains__() method is implemented in such a way that it reads bytes until "any" is found, or until there are no more bytes. Consequently, all bytes are already consumed and your function returns an empty array.

NB: According to the documentation:

For objects that don’t define __contains__(), the membership test first tries iteration via __iter__(), then the old sequence iteration protocol via __getitem__().
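
Both fallbacks from that quote can be sketched with two small classes. OnlyIter and OnlyGetitem are hypothetical names made up for this illustration; neither defines __contains__, so in has to go through __iter__ or __getitem__ instead.

```python
class OnlyIter:
    """Hypothetical class: defines __iter__ but not __contains__."""

    def __init__(self, items):
        self.pending = list(items)

    def __iter__(self):
        # Each yielded item is removed, like bytes read from a port.
        while self.pending:
            yield self.pending.pop(0)


class OnlyGetitem:
    """Hypothetical class: defines only __getitem__."""

    def __init__(self, items):
        self.items = list(items)
        self.lookups = 0

    def __getitem__(self, index):
        self.lookups += 1
        return self.items[index]  # IndexError ends the membership test


a = OnlyIter(["x", "y", "z"])
found_a = "y" in a  # iteration stops at the first match
b = OnlyGetitem(["x", "y", "z"])
found_b = "missing" in b  # tries indices 0, 1, 2, ... until IndexError

print(found_a, a.pending)
print(found_b, b.lookups)
```

In the first case the membership test removes "x" and "y" from the object as a side effect; in the second it triggers one __getitem__ call per index tried.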

Upvotes: 3
