auntyellow
auntyellow

Reputation: 2573

Selector.select(timeout) returns 0 before timeout

According to Javadoc,

It returns only after at least one channel is selected, this selector's wakeup method is invoked, the current thread is interrupted, or the given timeout period expires, whichever comes first.

But occasionally it returns without any of these 4 cases:

  1. at least one channel is selected: it returns 0
  2. wakeup method is invoked: wakeup is not called
  3. the current thread is interrupted: Thread.interrupted() returns false
  4. given timeout period expires: not expired according to logs

UPDATED 2016-03-15

In my source at line 392 and line 402 I added some logs: https://github.com/xqbase/tuna/blob/debug/core/src/main/java/com/xqbase/tuna/ConnectorImpl.java

public boolean doEvents(long timeout) {
    Log.v("Before Select: " + timeout);
    int keySize;
    try {
        keySize = timeout == 0 ? selector.selectNow() :
                timeout < 0 ? selector.select() : selector.select(timeout);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    Set<SelectionKey> selectedKeys = selector.selectedKeys();
    if (keySize == 0) {
        Log.v("After Select(0): selectedKeys=" + selectedKeys.size() + ", " +
                "interrupt=" + Thread.interrupted());
        invokeQueue();
        return false;
    }

    for (SelectionKey key : selectedKeys) {
        ...

Here is the log:

...
2016-03-15 23:07:49.695 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8120
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8119
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.700 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8115
2016-03-15 23:07:49.701 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.701 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8114
2016-03-15 23:07:49.702 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
...

That is very strange: no selected keys, no interruption, no timeout and no wakeup, but it returned.

Is there a bug in Java? My Java version is 1.8.0_51-b16 (64-Bit Server VM), and run on a CentOS 6.5 x64 linode.

Upvotes: 6

Views: 3140

Answers (2)

auntyellow
auntyellow

Reputation: 2573

This may really be a bug in JDK. It seems that Netty and Mina also encounter such a problem and they rebuild the selector as a workaround.

See latest Netty code https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/nio/NioEventLoop.java L641-681:

            if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                // - Selected something,
                // - waken up by user, or
                // - the task queue has a pending task.
                // - a scheduled task is ready for processing
                break;
            }
            ...
            } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                    selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                // The selector returned prematurely many times in a row.
                // Rebuild the selector to work around the problem.
                logger.warn(
                        "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                        selectCnt);

                rebuildSelector();
                selector = this.selector;

                // Select again to populate selectedKeys.
                selector.selectNow();
                selectCnt = 1;
                break;
            }

See Mina 2.0 code https://github.com/apache/mina/blob/2.0/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java L1070-1092:

                if (!wakeupCalled.getAndSet(false) && (selected == 0) && (delta < 100)) {
                    // Last chance : the select() may have been
                    // interrupted because we have had an closed channel.
                    if (isBrokenConnection()) {
                        LOG.warn("Broken connection");
                    } else {
                        LOG.warn("Create a new selector. Selected is 0, delta = " + (t1 - t0));
                        // Ok, we are hit by the nasty epoll
                        // spinning.
                        // Basically, there is a race condition
                        // which causes a closing file descriptor not to be
                        // considered as available as a selected channel,
                        // but
                        // it stopped the select. The next time we will
                        // call select(), it will exit immediately for the
                        // same
                        // reason, and do so forever, consuming 100%
                        // CPU.
                        // We have to destroy the selector, and
                        // register all the socket on a new one.
                        registerNewSelector();
                    }
                }

Therefore, registering a new selector may be the best practice if select() returns an unexpected zero.

Upvotes: 3

Jim Garrison
Jim Garrison

Reputation: 86774

The Javadoc is pretty clear.

During each selection operation, keys may be added to and removed from a selector's selected-key set ... . Selection is performed by the select(), select(long), and selectNow() methods, and involves three steps:

  1. ...

  2. The underlying operating system is queried for an update as to the readiness of each remaining channel to perform any of the operations identified by its key's interest set as of the moment that the selection operation began. For a channel that is ready for at least one such operation, one of the following two actions is performed:

    1. If the channel's key is not already in the selected-key set then it is added to that set and its ready-operation set is modified to identify exactly those operations for which the channel is now reported to be ready. Any readiness information previously recorded in the ready set is discarded.

    2. Otherwise the channel's key is already in the selected-key set, so its ready-operation set is modified to identify any new operations for which the channel is reported to be ready. Any readiness information previously recorded in the ready set is preserved; in other words, the ready set returned by the underlying system is bitwise-disjoined into the key's current ready set.

What is happening is that on the select that returns zero, the selection key wass already in the selected-key set, so no change in the number of ready keys occurred.

Note also in the section for the select(int timeout) method (my emphasis):

Returns:

  • The number of keys, possibly zero, whose ready-operation sets were updated

Upvotes: 2

Related Questions