Reputation: 3876
I was wondering why pandas does not provide .whl files for pip install on Linux, even though wheels are available for Mac and Windows. See: https://pypi.python.org/pypi/pandas/0.18.1
I could run
pip install pandas
but that involves a time-consuming build from source. I have a continuous-integration system that includes pandas as a build dependency, so I'd like the benefit of a fast install from a binary .whl file, without building from source.
Upvotes: 4
Views: 3031
Reputation: 2163
Armin Ronacher discusses this at some length. Fundamentally, Linux distributions are not sufficiently uniform, so you can't depend on particular libraries being present to link against; even the Python library may be inconsistent.
You can build your own wheels for the environments you are using and install them as many times as you like, which should work fine for a continuous-integration system:
$ pip wheel pandas
Collecting pandas
Downloading pandas-0.18.1.tar.gz (7.3MB)
100% |████████████████████████████████| 7.3MB 131kB/s
...
Successfully built pandas
$ ls pandas*
pandas-0.18.1-cp35-cp35m-linux_x86_64.whl
$ pip install pandas-0.18.1-cp35-cp35m-linux_x86_64.whl
Processing ./pandas-0.18.1-cp35-cp35m-linux_x86_64.whl
...
Successfully installed pandas-0.18.1 python-dateutil-2.5.3 pytz-2016.4 six-1.10.0
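For a CI system, the build-once, install-many idea above might be scripted roughly as follows. This is only a sketch under some assumptions: the helper names `wheelhouse_commands` and `run_commands` and the wheel directory path are hypothetical, and it relies on pip's `--wheel-dir`, `--no-index` and `--find-links` options to build into, and then install exclusively from, a local directory of wheels.

```python
import subprocess
import sys


def wheelhouse_commands(requirement, wheel_dir):
    """Return two pip command lines: build wheels into wheel_dir,
    then install the requirement using only the wheels found there."""
    pip = [sys.executable, "-m", "pip"]
    # Builds the requirement and its dependencies as wheels in wheel_dir.
    build = pip + ["wheel", "--wheel-dir", wheel_dir, requirement]
    # --no-index stops pip from contacting PyPI; --find-links points it
    # at the local directory instead.
    install = pip + ["install", "--no-index", "--find-links", wheel_dir, requirement]
    return build, install


def run_commands(commands):
    """Run each command in order, raising CalledProcessError on failure."""
    for command in commands:
        subprocess.check_call(command)


# Hypothetical usage (not executed here):
# run_commands(wheelhouse_commands("pandas", "/tmp/wheelhouse"))
```

The build step would typically run once per environment, with the wheel directory cached between CI runs, so that later runs only need the install step.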
Note that numpy does provide Linux wheel files on PyPI. Looking inside (they are simple zip files) we can see that they bundle the usual numpy libraries such as lapack_lite plus their dependencies (gfortran and openblas in this case):
$ unzip -l numpy-1.11.0-cp35-cp35m-manylinux1_x86_64.whl | grep '[.]so'
...
38407360 2016-04-12 21:01 numpy/.libs/libopenblasp-r0-39a31c03.2.18.so
1017104 2016-04-12 21:01 numpy/.libs/libgfortran-ed201abd.so.3.0.0
108200 2016-04-12 21:01 numpy/linalg/lapack_lite.cpython-35m-x86_64-linux-gnu.so
...
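Since a wheel is a plain zip archive, the same inspection can also be done from Python with the standard `zipfile` module. A minimal sketch, assuming the wheel has already been downloaded locally; the helper name `list_shared_libs` is hypothetical:

```python
import zipfile


def list_shared_libs(wheel_path):
    """Return (size, name) pairs for entries that look like shared libraries.

    Matches names ending in ".so" as well as versioned names such as
    "libgfortran-ed201abd.so.3.0.0".
    """
    with zipfile.ZipFile(wheel_path) as wheel:
        return [
            (info.file_size, info.filename)
            for info in wheel.infolist()
            if info.filename.endswith(".so") or ".so." in info.filename
        ]
```

Calling `list_shared_libs` on the numpy wheel above should report the same bundled libraries as the `unzip -l` listing.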
By contrast, the numpy installed by the OS has lapack_lite linked to /usr/lib/libblas, which via the Debian alternatives system links against the optimized libatlas, so it should have better performance (untested).
Looking at what the bundled lapack_lite links against, these are pretty standard libraries, so the wheel should work on many 64-bit Linux systems:
$ ldd lapack_lite.cpython-35m-x86_64-linux-gnu.so
linux-vdso.so.1 => (0x00007fffadf7a000)
libopenblasp-r0-39a31c03.2.18.so => /tmp/numpy/linalg/./../.libs/libopenblasp-r0-39a31c03.2.18.so (0x00007f5efb8bc000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f5efb69e000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5efb2d9000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5efafd3000)
libgfortran-ed201abd.so.3.0.0 => /tmp/numpy/linalg/./../.libs/libgfortran-ed201abd.so.3.0.0 (0x00007f5efacda000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5efe0d4000)
Presumably someone could volunteer to create wheels for pandas as well, and keep them up to date across the different versions of Python. Note that the pandas .so files link against libpython, whereas numpy somehow avoids doing so despite making Python calls.
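Part of the reason wheels have to be kept up to date per Python version is that each wheel filename encodes the interpreter, ABI and platform it was built for. A small sketch of pulling those tags apart, assuming the common five-part wheel name with no optional build tag; the helper name `parse_wheel_name` is hypothetical:

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version python abi platform")


def parse_wheel_name(filename):
    """Split a wheel filename into its components.

    Assumes the name has exactly five dash-separated parts, i.e. no
    optional build tag is present.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError("unexpected wheel name layout: %r" % filename)
    return WheelTags(*parts)


tags = parse_wheel_name("pandas-0.18.1-cp35-cp35m-linux_x86_64.whl")
print(tags.python, tags.abi, tags.platform)
```

For the locally built pandas wheel this yields the `cp35` and `cp35m` tags with platform `linux_x86_64`, whereas the numpy wheel from PyPI carries `manylinux1_x86_64`.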
Edit 2016-05-25: added instructions for building and installing wheels.
Upvotes: 6