Reputation: 3544
I have deployed an MSK cluster and an MSK S3 sink connector in us-west-2. The sink bucket is in us-east-1.
When the MSK connector starts up, it fails with a timeout, waits 25-30 minutes, retries, and fails again, over and over.
Here are the logs (my-sink-stg is the connector name):
[Worker-0830298caeb23471c] (io.confluent.connect.storage.partitioner.PartitionerConfig:361)
"[Worker-0830298caeb23471c] [2023-07-17 05:49:33,870] INFO [my-sink-stg|task-0] Returning new credentials provider based on the configured credentials provider class (io.confluent.connect.s3.storage.S3Storage:173)"
"[Worker-0830298caeb23471c] [2023-07-17 05:54:55,056] ERROR [my-sink-stg|task-0] WorkerSinkTask{id=my-sink-stg-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:191)"
"[Worker-0830298caeb23471c] org.apache.kafka.connect.errors.ConnectException: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to s3.amazonaws.com:443 [s3.amazonaws.com/52.216.209.88, s3.amazonaws.com/54.231.199.40, s3.amazonaws.com/54.231.170.48, s3.amazonaws.com/52.217.40.110, s3.amazonaws.com/52.217.228.224, s3.amazonaws.com/52.216.40.168, s3.amazonaws.com/52.217.201.152, s3.amazonaws.com/52.216.89.29] failed: connect timed out"
"[Worker-0830298caeb23471c] at io.confluent.connect.s3.S3SinkTask.start(S3SinkTask.java:140)"
"[Worker-0830298caeb23471c] at org.apache.kafka.connect.runtime.WorkerSinkTask.initializeAndStart(WorkerSinkTask.java:308)"
"[Worker-0830298caeb23471c] at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:196)"
"[Worker-0830298caeb23471c] at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:189)"
"[Worker-0830298caeb23471c] at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:238)"
"[Worker-0830298caeb23471c] at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)"
"[Worker-0830298caeb23471c] at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)"
"[Worker-0830298caeb23471c] at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)"
"[Worker-0830298caeb23471c] at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)"
"[Worker-0830298caeb23471c] at java.base/java.lang.Thread.run(Thread.java:829)"
"[Worker-0830298caeb23471c] Caused by: com.amazonaws.SdkClientException: Unable to execute HTTP request: Connect to s3.amazonaws.com:443 [s3.amazonaws.com/52.216.209.88, s3.amazonaws.com/54.231.199.40, s3.amazonaws.com/54.231.170.48, s3.amazonaws.com/52.217.40.110, s3.amazonaws.com/52.217.228.224, s3.amazonaws.com/52.216.40.168, s3.amazonaws.com/52.217.201.152, s3.amazonaws.com/52.216.89.29] failed: connect timed out"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1219)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1165)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:814)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:781)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:755)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:715)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:697)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:561)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:541)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5456)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5403)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.getAcl(AmazonS3Client.java:4062)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1278)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1268)"
"[Worker-0830298caeb23471c] at com.amazonaws.services.s3.AmazonS3Client.doesBucketExistV2(AmazonS3Client.java:1406)"
"[Worker-0830298caeb23471c] at io.confluent.connect.s3.storage.S3Storage.bucketExists(S3Storage.java:184)"
"[Worker-0830298caeb23471c] at io.confluent.connect.s3.S3SinkTask.start(S3SinkTask.java:114)"
"[Worker-0830298caeb23471c] ... 9 more"
"[Worker-0830298caeb23471c] Caused by: org.apache.http.conn.ConnectTimeoutException: Connect to s3.amazonaws.com:443 [s3.amazonaws.com/52.216.209.88, s3.amazonaws.com/54.231.199.40, s3.amazonaws.com/54.231.170.48, s3.amazonaws.com/52.217.40.110, s3.amazonaws.com/52.217.228.224, s3.amazonaws.com/52.216.40.168, s3.amazonaws.com/52.217.201.152, s3.amazonaws.com/52.216.89.29] failed: connect timed out"
"[Worker-0830298caeb23471c] at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)"
"[Worker-0830298caeb23471c] at jdk.internal.reflect.GeneratedMethodAccessor137.invoke(Unknown Source)"
"[Worker-0830298caeb23471c] at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)"
"[Worker-0830298caeb23471c] at java.base/java.lang.reflect.Method.invoke(Method.java:566)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.conn.$Proxy47.connect(Unknown Source)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1346)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1157)"
"[Worker-0830298caeb23471c] ... 24 more"
[Worker-0830298caeb23471c] Caused by: java.net.SocketTimeoutException: connect timed out
"[Worker-0830298caeb23471c] at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)"
"[Worker-0830298caeb23471c] at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:412)"
"[Worker-0830298caeb23471c] at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:255)"
"[Worker-0830298caeb23471c] at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:237)"
"[Worker-0830298caeb23471c] at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)"
"[Worker-0830298caeb23471c] at java.base/java.net.Socket.connect(Socket.java:609)"
"[Worker-0830298caeb23471c] at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:368)"
"[Worker-0830298caeb23471c] at com.amazonaws.http.conn.ssl.SdkTLSSocketFactory.connectSocket(SdkTLSSocketFactory.java:142)"
"[Worker-0830298caeb23471c] at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)"
"[Worker-0830298caeb23471c] ... 39 more"
"[Worker-0830298caeb23471c] [2023-07-17 05:54:55,058] INFO [my-sink-stg|task-0] Metrics scheduler closed (org.apache.kafka.common.metrics.Metrics:668)"
"[Worker-0830298caeb23471c] [2023-07-17 05:54:55,058] INFO [my-sink-stg|task-0] Closing reporter org.apache.kafka.common.metrics.JmxReporter (org.apache.kafka.common.metrics.Metrics:672)"
"[Worker-0830298caeb23471c] [2023-07-17 05:54:55,059] INFO [my-sink-stg|task-0] Metrics reporters closed (org.apache.kafka.common.metrics.Metrics:678)"
"[Worker-0830298caeb23471c] [2023-07-17 05:54:55,060] INFO [my-sink-stg|task-0] App info kafka.consumer for connector-consumer-my-sink-stg-0 unregistered (org.apache.kafka.common.utils.AppInfoParser:83)"
The connector's role already has a permission policy for accessing the sink bucket:
{
"Statement": [
{
"Action": "s3:ListAllMyBuckets",
"Effect": "Allow",
"Resource": "arn:aws:s3:::*",
"Sid": ""
},
{
"Action": [
"s3:PutObject",
"s3:ListMultipartUploadParts",
"s3:ListBucketMultipartUploads",
"s3:ListBucket",
"s3:GetObject",
"s3:GetBucketLocation",
"s3:DeleteObject",
"s3:AbortMultipartUpload"
],
"Effect": "Allow",
"Resource": [
"arn:aws:s3:::my-bucket-stg/*",
"arn:aws:s3:::my-bucket-stg"
],
"Sid": ""
}
],
"Version": "2012-10-17"
}
Interestingly, the connector runs fine if the MSK cluster and the MSK connector are in the same region as the sink bucket, i.e. us-east-1.
The connector's security group has "allow all" for both inbound and outbound rules.
Does anyone know what could be causing this timeout?
UPD. The connector uses IAM authentication and has the following configs set:
"sasl.mechanism": "AWS_MSK_IAM",
"sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
"sasl.client.callback.handler.class": "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
UPD2. I tried attaching an internet gateway to the connector's subnet and updating the route table to forward packets destined for the S3-related IPs to that gateway. From an EC2 instance launched in the connector's VPC, I am able to ping all the IP addresses shown in the error logs. I'm completely clueless now.
Upvotes: 0
Views: 429
Reputation: 191743
You need to configure the region in the connector config; see s3.region. You may also need to modify s3.credentials.provider.class, in case the IAM policy isn't taking effect.
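As a rough sketch of what that could look like, the fragment below sets s3.region to the bucket's region from the question (us-east-1). The bucket name is taken from the IAM policy in the question; the connector.class value and the s3.bucket.name key are assumed from the Confluent S3 sink connector, and the rest of your existing connector config would stay as it is. s3.credentials.provider.class is left out here, so it stays at whatever it is currently set to.

```json
{
  "connector.class": "io.confluent.connect.s3.S3SinkConnector",
  "s3.bucket.name": "my-bucket-stg",
  "s3.region": "us-east-1"
}
```

If the timeout persists with the region set, the problem is more likely on the network path than in the connector config.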
Upvotes: 0