Manish Goel

Reputation: 893

Cython giving 'unnecessary' warning

The following Cython script results in a "comparison between signed and unsigned integer expressions" warning.

%%cython
# distutils: language = c++

from libcpp.vector cimport vector as cpp_vec
import numpy as np

cdef cpp_vec[long] a = np.array([1, 2, 3], np.int64)
cdef Py_ssize_t i

for i in range(a.size()):        # Using a.size() as parameter of range() causes the warning
    pass

Is this warning necessary? If so, why? Also, is it possible to silence these unsigned vs signed comparison warnings?

Also, why does only the first for loop result in the warning?

%%cython
cdef:
    ssize_t i
    unsigned long m = 10
    unsigned int n = 10
    long o = 10

for i in range(m):
    pass
for i in range(n):
    pass
for i in range(o):
    pass

Upvotes: 1

Views: 1154

Answers (1)

ead

Reputation: 34347

This is a warning from your g++ compiler, and like every compiler warning it should be taken seriously.

The for loop in your Cython code is translated into C++ code similar to the following:

 std::vector<long>::size_type __pyx_t_7;
 Py_ssize_t __pyx_t_8;
 __pyx_t_7 = __pyx_v_xxxx_a.size();
 for (__pyx_t_8 = 0; __pyx_t_8 < __pyx_t_7; __pyx_t_8+=1) {
     ....
 }

The problem is that std::vector<long>::size_type is a 64-bit unsigned integer on a 64-bit system, while Py_ssize_t is a 64-bit signed integer.

In order to evaluate __pyx_t_8 < __pyx_t_7, C++ applies implicit conversions for built-in types; the rules can be found, for example, in this SO post.

There are multiple reasons why this warning makes sense - the rules are quite complicated, and experience has shown that programmers often handle them wrong. For example, -1 < 1U evaluates to false (see live), but

signed char a =-1;
unsigned char b = 1;
std::cout<<(a<b)<<"\n";

prints 1, i.e. (a < b) evaluates to true (see live). Such quirks can easily lead to hard-to-find bugs.

Moreover, historically, before the standard was established, different compilers handled those conversions differently - when switching to another compiler, it was nice to see all the places where the behavior could change - but this is not really important nowadays.

There are different strategies to avoid this warning.

1. Go with the flow:

Isn't it easier to make i a size_t instead of Py_ssize_t, i.e. cdef size_t i?
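Applied to the code from the question, this would look roughly as follows (a minimal sketch of this option, reusing the question's variable names):

%%cython
# distutils: language = c++

from libcpp.vector cimport vector as cpp_vec
import numpy as np

cdef cpp_vec[long] a = np.array([1, 2, 3], np.int64)
cdef size_t i                       # unsigned, like std::vector's size_type

for i in range(a.size()):           # unsigned vs unsigned - no warning
    pass

Now the generated comparison is between two unsigned 64-bit values, so g++ has nothing to complain about.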

2. Cast and check:

You can explicitly cast and then check whether the assumptions are OK, e.g.

...
cdef Py_ssize_t i

cdef Py_ssize_t n = <Py_ssize_t>(a.size())
if n < 0:
    raise ValueError("too big")

for i in range(n):
...

The above quickly becomes quite cumbersome, so one usually extracts it into a utility function, as sketched below.
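Such a helper could look roughly like this (a minimal sketch; the name checked_size is invented for this illustration and not part of any library):

cdef inline Py_ssize_t checked_size(size_t s) except -1:
    # cast first, then verify the value still fits into the signed type
    cdef Py_ssize_t n = <Py_ssize_t>s
    if n < 0:
        raise ValueError("too big")
    return n

which can then be used as for i in range(checked_size(a.size())): without cluttering the loop.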

3. Just cast:

One could ask "how probable is it that a vector has more than 2^63 entries?" and just skip the check:

...
for i in range(<Py_ssize_t>(a.size())):        
    print(i) 

and then... see the code fail 30 years later:)

4. Disable compiler warning:

One could also pass the -Wno-sign-compare option to the compiler via extra_compile_args in the setup file, or via -c=-Wno-sign-compare in IPython, which turns off the warning for the whole module. It is probably safer to use the "just cast" solution, which silences the warning only at this one place and not everywhere.
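A setup file passing this flag could look roughly like this (a sketch; the module and file names are placeholders, not taken from the question):

from setuptools import setup, Extension
from Cython.Build import cythonize

ext = Extension(
    "example",
    sources=["example.pyx"],
    language="c++",
    extra_compile_args=["-Wno-sign-compare"],  # disables signed/unsigned-comparison warnings for the whole module
)

setup(ext_modules=cythonize([ext]))

In an IPython notebook the equivalent is to start the cell with %%cython -c=-Wno-sign-compare (keeping the # distutils: language = c++ line from the question).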


The warning is not issued if the signed integer type is larger than the unsigned integer type: in this case the unsigned value is converted to the larger signed type and fits into its positive range.

For example, ssize_t has 64 bits and unsigned int has 32 bits, so the 32-bit unsigned value is converted to the 64-bit ssize_t before the comparison - there can be no overflow, because every unsigned 32-bit value fits into the positive range (up to 2^63 - 1) of ssize_t.

Upvotes: 1
