Reputation: 163
I am getting
INFO AmazonHttpClient: Unable to execute HTTP request: Connection refused (Connection refused)
java.net.ConnectException: Connection refused (Connection refused)
when trying to read data from MinIO via Spark. I am running my Spark jar via the Spark operator on Kubernetes on WSL2 + Docker Desktop. MinIO also runs on Kubernetes, in a separate namespace.
My Spark context settings:
val s3endPointLoc = "http://127.0.0.1:9000"
spark.sparkContext.hadoopConfiguration.set("fs.s3a.endpoint", s3endPointLoc)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.access.key", s3accessKeyAws)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.secret.key", s3secretKeyAws)
spark.sparkContext.hadoopConfiguration.set("fs.s3a.connection.timeout", connectionTimeOut)
spark.sparkContext.hadoopConfiguration.set("spark.sql.debug.maxToStringFields", "100")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.path.style.access", "true")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
spark.sparkContext.hadoopConfiguration.set("fs.s3a.connection.ssl.enabled", "true")
What might be the reason for the refused connection? Thanks.
Upvotes: 0
Views: 1001
Reputation: 1063
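You cannot simply connect via http://127.0.0.1:9000. Inside the Spark driver and executor pods, 127.0.0.1 refers to the pod itself, not to MinIO, so nothing is listening on port 9000 there and the connection is refused.
You should provide the MinIO service DNS name to Spark as the endpoint instead: http://<MinIO-ServiceName>.<MinIO-Namespace>.svc.cluster.local:9000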
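As a rough sketch of how that endpoint could be built and applied, the helper below assembles the service DNS name and the related S3A settings. The object name `MinioS3aSettings`, the service name `minio`, and the namespace `minio-ns` are hypothetical placeholders, not values from the question. It also assumes the default cluster domain `cluster.local` and a plain HTTP MinIO listener on port 9000, which is why SSL is set to `false` here.

```scala
object MinioS3aSettings {
  // Builds the in-cluster URL of a Kubernetes Service:
  // http://<service>.<namespace>.svc.cluster.local:<port>
  // Assumes the default cluster domain "cluster.local".
  def endpoint(service: String, namespace: String, port: Int = 9000): String =
    s"http://$service.$namespace.svc.cluster.local:$port"

  // S3A options that depend on where MinIO lives.
  // SSL is disabled on the assumption that MinIO serves plain HTTP.
  def settings(service: String, namespace: String): Map[String, String] = Map(
    "fs.s3a.endpoint" -> endpoint(service, namespace),
    "fs.s3a.path.style.access" -> "true",
    "fs.s3a.connection.ssl.enabled" -> "false"
  )
}

// Usage inside the Spark job (assumes the `spark` session from the question;
// "minio" and "minio-ns" are placeholder names):
// MinioS3aSettings.settings("minio", "minio-ns").foreach { case (k, v) =>
//   spark.sparkContext.hadoopConfiguration.set(k, v)
// }
```

The actual service name and namespace can be looked up with `kubectl get svc --all-namespaces`.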
Upvotes: 3