Reputation: 600
I'm running a Python script that submits a PySpark job to a cluster. However, the job fails with GLIBC errors.
Log Contents:
dev-env/bin/python: /lib64/libc.so.6: version 'GLIBC_2.14' not found (required by dev-env/bin/python)
dev-env/bin/python: /lib64/libc.so.6: version 'GLIBC_2.17' not found (required by dev-env/bin/python)
The problem, I think, is that the GLIBC version on my machine is 2.17, while the GLIBC version on the PySpark cluster is 2.10. I confirmed my local version by opening Python and running:
>>> import platform
>>> platform.libc_ver()
('glibc', '2.17') # This is my machine
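As an aside, if you compare glibc versions programmatically, compare them as integer tuples rather than strings, since string comparison mis-orders multi-digit components. A minimal sketch (the version numbers are just the ones from the question):

```python
import platform

# platform.libc_ver() reports the libc the running interpreter is linked
# against, e.g. ('glibc', '2.17') on this machine.
libname, version = platform.libc_ver()

def parse(v):
    """Turn '2.17' into (2, 17) so versions compare numerically."""
    return tuple(int(part) for part in v.split("."))

# String comparison gets multi-digit components wrong: '2.9' > '2.10' as strings,
# but 2.9 is of course the older release.
assert "2.9" > "2.10"
assert parse("2.9") < parse("2.10")

# Local glibc (2.17 per the question) is newer than the cluster's (2.10):
assert parse("2.10") < parse("2.17")
```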
One way to resolve this, I think, is to make sure the Python on my machine uses GLIBC 2.10, but I don't know how to do that. I'm using Anaconda to create the Python virtual environment. How should I approach this?
Upvotes: 2
Views: 1889
Reputation: 213754
How to use an older version of GLIBC in python (anaconda)?
Your problem is not that your pyspark is using GLIBC-2.17 on your machine. Your problem is that your pyspark was built against GLIBC-2.17 (or later).
You need to download a different version of pyspark, one appropriate for running on GLIBC-2.10 machines. Such a version will run fine on GLIBC-2.10 and all later versions.
Upvotes: 2