Reputation: 273
How to make connection from mongo-spark connector to mongodb when only TLS/ssl enabled for mongo DB ?
How to pass the uri and collection name in read config to make connection with TLS/ssl enabled mongodb instance?
Thanks in advance ?
Upvotes: 1
Views: 4949
Reputation: 663
To make the ssl connection from Spark to the Mongo server you will need to trust the Mongo certificate, or the CA (certificate authority) that has signed that certificate. This is the most important part, and the trickiest one for me to figure it out.
Spark is a Java application, so it get the certificates from a jks trustStore. you will need to import the Mongo certificate (only the public part) into a trustStore to make it available for spark. To do so:
Get the Mongo certificate: Ask the DBA or the sysadmin who has setup the mongo to provide the certificate to you. Other aproach is to get it with openssl:
$ openssl s_client -connect mongodb:27017
CONNECTED(00000003)
depth=0 C = ES, ST = Madrid, L = Madrid, O = HOME, OU = HOME, CN=mongodb mongo.hostname.local
verify error:num=19:self signed certificate in certificate chain
verify return:0
---
Certificate chain
0 s:/C=ES/ST=Madrid/L=Madrid/O=COMPANY/OU=AREA/CN=mongo.hostname.local
i:/C=ES/ST=Madrid/L=Madrid/O=COMPANY/OU=AREA/CN=mongo.hostname.localIssuing CA
---
Server certificate
-----BEGIN CERTIFICATE-----
[..... A bunch of base64 text....]
-----END CERTIFICATE-----
Get the part from the -----BEGIN CERTIFICATE-----
to -----BEGIN CERTIFICATE-----
and save it in a .cert
file
Now, you have your server certificate imported. If you need mutual TLS you will need to provide a valid client certificate. This certificate, and the certificate private key, should be in a jks keyStore (it could be in the same trustStore file you have stored the Mongo server certificate because it uses the same format). If are not going to use mutual TLS you don't need to do this, but you have to check that the MongoDB instance is able to accept connections without client certificates. This is with the flag sslAllowConnectionsWithoutCertificates
The next step is specifying in the connection URI that you want to use TLS. This is fairly simple, just add the ?ssl=true
to your connection string. So the connection URI will be something like this
mongodb://user:pw@host:port/db.collection?ssl=true
Now you can summit your job. When summiting the job we also need to specify the location of our trustStore, and the libraries for the mongo connector:
/spark/bin/spark-submit \
--master spark://spark-master:7077 \
--packages org.mongodb.spark:mongo-spark-connector_2.11:2.2.0 \
--conf spark.executor.extraJavaOptions="-Djavax.net.ssl.trustStore=/path/to/your/trustStore.jks -Djavax.net.ssl.trustStorePassword=yourPassword" \
--conf spark.driver.extraJavaOptions="-Djavax.net.ssl.trustStore=/path/to/your/trustStore.jks -Djavax.net.ssl.trustStorePassword=yourPassword" \
/yourJob.jar
We use the extraJavaOptions for the driver and the executor to pass these parameters. If you are using mutual TLS, include the following extra java options:
-Djavax.net.ssl.keyStore=/path/to/your/keyStore.jks
-Djavax.net.ssl.keyStorePassword=yourPassword
The /path/to/your/keyStore.jks
is where you have stored your client certificates.
If the spark connector library is not already installed, you may run into trouble. The spark process will go to maven to download the library, but it will not be able to verify the maven certificates because we have specified another keyStore with just our certificate. One workaround is to import our certificate directly into the default keystore located at $JAVA_HOME/jre/lib/security/cacerts
. The default password is changeit
. Remember to do this in every worker node too.
I hope it helps!
Sources: https://github.com/brunocfnba/spark-mongo-ssl https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.5/bk_spark-component-guide/content/spark-encryption.html https://community.hortonworks.com/articles/147113/how-to-configure-your-spark-application-to-use-mon.html https://mapr.com/support/s/article/Unable-to-find-valid-certification-path-to-requested-target-error-while-accessing?language=en_US
Upvotes: 2