Suresh Kota

Reputation: 315

use poll instead of select in pexpect spawn

I have the following test code:

import pexpect
import time

session = {}
try:
        for i in range(1030):
                print(i)
                child = pexpect.spawn(cmd,encoding='utf-8')
                child.expect("mgmt",200)
                session[i]=child
                print(child)
                with open("command.txt","w") as fobj:
                        child.logfile_read=fobj
                        child.sendline ("server 0")
                        child.expect ("server0", 200)
                with open("command.txt","r") as temp:
                        command_output=temp.read()
                        print(command_output)
        time.sleep(5000)
except Exception as e:
        print("mgmt launch failed")
        print(e)

This code opens more than 1024 file descriptors and produces the following traceback:

server0>
1018
<pexpect.pty_spawn.spawn object at 0x7f5f8ddf6f28>
buffer (last 100 chars): '>'
after: 'mgmt'
match: <_sre.SRE_Match object; span=(452, 461), match='mgmt'>
match_index: 0
exitstatus: None
flag_eof: False
pid: 11126
child_fd: 1023
closed: False
timeout: 30
delimiter: <class 'pexpect.exceptions.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1
server 0
server0>

1019
mgmt launch failed
filedescriptor out of range in select()
Traceback (most recent call last):
  File "test1.py", line 9, in <module>
    child.expect("mgmt",200)
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 321, in expect
    timeout, searchwindowsize, async)
  File "/usr/lib/python3/dist-packages/pexpect/spawnbase.py", line 345, in expect_list
    return exp.expect_loop(timeout)
  File "/usr/lib/python3/dist-packages/pexpect/expect.py", line 99, in expect_loop
    incoming = spawn.read_nonblocking(spawn.maxread, timeout)
  File "/usr/lib/python3/dist-packages/pexpect/pty_spawn.py", line 452, in read_nonblocking
    r, w, e = select_ignore_interrupts([self.child_fd], [], [], timeout)
  File "/usr/lib/python3/dist-packages/pexpect/utils.py", line 138, in select_ignore_interrupts
    return select.select(iwtd, owtd, ewtd, timeout)
ValueError: filedescriptor out of range in select()

I have read that, to overcome this, poll() should be used instead of select(), but I could not find an example of how to use poll() with pexpect.spawn(). How can I explicitly tell Python to use poll() instead of select()?

Upvotes: 0

Views: 1235

Answers (2)

Havok

Reputation: 5882

Adding to the other answer: since pexpect version 4.5, the use_poll option can be specified when creating the spawn object:

https://pexpect.readthedocs.io/en/stable/api/pexpect.html#pexpect.spawn.init

Quoting:

The use_poll attribute enables using select.poll() over select.select() for socket handling. This is handy if your system could have > 1024 fds

In your case:

child = pexpect.spawn(cmd, encoding='utf-8', use_poll=True)
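To see why the switch matters, here is a minimal stdlib-only sketch (no pexpect involved; the helper names and the descriptor number 5000 are illustrative assumptions, and it assumes Linux). CPython's select.select() rejects any descriptor number at or above FD_SETSIZE (typically 1024) before it even makes the system call, whereas a poll object places no such bound on the number it registers:

```python
import select

# Arbitrary number above the usual FD_SETSIZE of 1024.
# It is NOT an open descriptor; only its numeric value matters here.
HIGH_FD = 5000

def select_rejects(fd):
    """Return True if select.select() refuses fd because of its numeric value."""
    try:
        select.select([fd], [], [], 0)
    except ValueError:
        # "filedescriptor out of range in select()"
        return True
    except OSError:
        # The number was in range, but the descriptor is not open (EBADF).
        return False
    return False

def poll_accepts(fd):
    """Return True if a poll object registers fd regardless of its value."""
    poller = select.poll()
    try:
        poller.register(fd, select.POLLIN)
    except ValueError:
        return False
    return True

print(select_rejects(HIGH_FD))
print(poll_accepts(HIGH_FD))
```

This is the same ValueError shown in the traceback in the question, which is why it first appears once child_fd reaches 1024.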

Upvotes: 2

Gil Hamilton

Reputation: 12357

The pexpect module doesn't support doing that out-of-the-box. However, it wouldn't be terribly difficult to monkey-patch the spawn object's __select method, which is where the system select is actually called.

Monkey-patching means replacing an object's method at run time with your own version of it. That's easy to do in Python if the method to be replaced has a clean interface. In this case, it's very straightforward because pexpect has isolated the select functionality in this one method, which has a logical and clean interface.
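One subtlety applies when the target is a double-underscore method such as __select: Python name-mangles such names inside the class body, so the patch has to use the mangled name, and it has to be set on the class for self to be bound. The sketch below uses a hypothetical stand-in class (Spawnish is not part of pexpect) to show the difference:

```python
class Spawnish:
    """Hypothetical stand-in for a class with a private double-underscore method."""

    def __select(self):
        return "select"

    def wait(self):
        # Inside the class body this call is compiled as self._Spawnish__select()
        return self.__select()

def my_select(self):
    return "poll"

obj = Spawnish()

# Outside a class body no mangling happens, so this only creates a new,
# unrelated instance attribute literally named "__select".
obj.__select = my_select
before = obj.wait()

# Patching the mangled name on the class replaces the method for every
# instance, and self is bound as usual.
Spawnish._Spawnish__select = my_select
after = obj.wait()

print(before, after)
```

Here before is still the original method's result, and only the class-level patch changes what wait() calls.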

An implementation of the poll-based replacement would look something like the code below. Note that the majority of the my_select function duplicates the current __select's treatment of EINTR. A more general solution would also handle the owtd and ewtd arguments correctly; that isn't necessary here because those arguments are always passed as empty lists in the pexpect module I'm looking at. Final caveat: no warranty provided :). None of this has been tested.

import errno
import select
import sys
import time

def my_select(self, iwtd, owtd, ewtd, timeout=None):
    # Replacement for spawn.__select that waits with poll() instead of
    # select(). owtd and ewtd are ignored (pexpect passes them empty).

    if timeout is not None:
        end_time = time.time() + timeout

    poll_obj = select.poll()
    for fd in iwtd:
        poll_obj.register(fd, select.POLLIN | select.POLLPRI | select.POLLERR | select.POLLHUP)

    while True:
        try:
            # poll() takes its timeout in milliseconds; select() uses seconds.
            poll_ms = None if timeout is None else int(timeout * 1000)
            poll_fds = poll_obj.poll(poll_ms)
            return ([fd for fd, _status in poll_fds], [], [])
        except select.error:
            err = sys.exc_info()[1]
            if err.args[0] == errno.EINTR:
                # if we loop back we have to subtract the
                # amount of time we already waited.
                if timeout is not None:
                    timeout = end_time - time.time()
                    if timeout < 0:
                        return([], [], [])
            else:
                # something else caused the select.error, so
                # this actually is an exception.
                raise

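# Your main code...
import pexpect

# Monkey-patch my_select into place. __select is name-mangled, so the class
# stores it as _spawn__select; set it on the class so that self is bound.
# (Assumes a pexpect version whose spawn class defines __select.)
pexpect.spawn._spawn__select = my_select
child = pexpect.spawn(cmd,encoding='utf-8')
child.expect("mgmt",200)
...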

There are downsides to monkey-patching. If the system version of the module gets upgraded and reorganized, the monkey-patch may no longer make sense. So if you're not comfortable with that risk, you could simply copy the module into your own source hierarchy (possibly renaming it to avoid confusion) and then make the same changes directly to its __select method.

Upvotes: 1
