Reputation: 726
I have been following questions like this and blog posts like this one, but I cannot make the connection work because of what I think is a library version conflict.
I want to use the simple-salesforce library, which afaik is the most used and referenced one, to connect from an AWS Glue job to Salesforce. The code works on my local machine, but on Glue I get the following message when I use the "python shell" configuration: ERROR: botocore 1.12.232 has requirement urllib3<1.26,>=1.20; python_version >= "3.4", but you'll have urllib3 1.26.2 which is incompatible.
Or, if I use the "spark" option:
Traceback (most recent call last):
  File "/tmp/bp-etl-crm-sparkV2", line 1, in <module>
    from simple_salesforce import Salesforce
  File "/tmp/simple_salesforce-1.10.1-py2.py3-none-any.whl/simple_salesforce/__init__.py", line 4, in <module>
    from .api import Salesforce, SFType
  File "/tmp/simple_salesforce-1.10.1-py2.py3-none-any.whl/simple_salesforce/api.py", line 18, in <module>
    from .login import SalesforceLogin
  File "/tmp/simple_salesforce-1.10.1-py2.py3-none-any.whl/simple_salesforce/login.py", line 16, in <module>
    from authlib.jose import jwt
ModuleNotFoundError: No module named 'authlib'
The code is just a connection and a query. Again, I have tested it, and both the credentials and the connection work when I run it from a local console rather than from AWS Glue:
from simple_salesforce import Salesforce

def main():
    print("INIT")
    sf = Salesforce(username='username', password='pw', security_token='securitytoken', domain='test')
    res_bulk = sf.bulk.Account.query('SELECT Id, Name FROM Table')
    print(res_bulk)

if __name__ == "__main__":
    main()
What I have tried so far:
As I said, I have tried configuring the job both as python shell with Glue 1.0 and as Spark with Glue 2.0. Both fail because of dependency problems.
I have tried downgrading the simple-salesforce version. So far none has worked; it keeps throwing ERROR: botocore 1.12.232 has requirement urllib3<1.26,>=1.20; python_version >= "3.4", but you'll have urllib3 1.26.2 which is incompatible.
I have tried getting a urllib3 version lower than 1.26.2, uploading it to S3, and adding it to the list of libraries used by my code. This has not worked so far, and I am not sure why, since I do not know what Glue does when told to use a specific version of a library that it already ships with, like urllib3.
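One way to see which versions the job actually ends up with would be to print them at the top of the script. This is only a minimal sketch: `installed_versions` is a hypothetical helper name, and the package list is an assumption based on the errors above.

```python
def installed_versions(names):
    """Return {package name: installed version string, or None if not installed}."""
    try:
        from importlib import metadata  # standard library on Python 3.8+

        def lookup(name):
            try:
                return metadata.version(name)
            except metadata.PackageNotFoundError:
                return None
    except ImportError:
        import pkg_resources  # fallback for older interpreters (ships with setuptools)

        def lookup(name):
            try:
                return pkg_resources.get_distribution(name).version
            except pkg_resources.DistributionNotFound:
                return None

    return {name: lookup(name) for name in names}


if __name__ == "__main__":
    # Packages named in the error messages above; adjust as needed.
    packages = ["urllib3", "botocore", "simple-salesforce", "Authlib"]
    for name, version in installed_versions(packages).items():
        print(name, version)
```

A value of None for a package means it is not installed in the environment the script runs in.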
Any ideas as to what I could be doing wrong, or what else I could try to make it work?
Upvotes: 0
Views: 1634
Reputation: 101
The simple_salesforce module depends on authlib, cryptography, etc. On Glue version 2.0 (Spark 2.4, Python 3), you need to add the following parameter to the Glue job:
--additional-python-modules : cryptography==3.0,simple-salesforce==1.11.1
To set it, edit the job; the "Job Parameters" option is under "Security configuration, script libraries, and job parameters (optional)".
(These versions are compatible and have worked for me.)
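If you start the job from code rather than from the console, the same parameter can be passed as a run argument. The following is a rough sketch, assuming boto3 is available and AWS credentials are configured; `build_job_arguments` and `start_job` are hypothetical helper names, and the job name is whatever your job is called.

```python
def build_job_arguments(modules):
    """Build the Glue arguments dict from a {package: version} mapping."""
    pinned = ",".join(f"{name}=={version}" for name, version in modules.items())
    return {"--additional-python-modules": pinned}


def start_job(job_name, modules):
    """Start a run of an existing Glue job with the extra Python modules."""
    import boto3  # imported lazily so the helper above works without boto3

    glue = boto3.client("glue")
    return glue.start_job_run(
        JobName=job_name,
        Arguments=build_job_arguments(modules),
    )


if __name__ == "__main__":
    # Versions taken from the parameter above.
    print(build_job_arguments({"cryptography": "3.0", "simple-salesforce": "1.11.1"}))
```

Note that arguments passed to `start_job_run` apply to that run only; setting the parameter on the job itself, as described above, makes it apply to every run.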
Upvotes: 5