bneelon
bneelon

Reputation: 107

Kafka Connect with MySQL Source

Before I begin, I'd like to start by saying I am completely new to Kafka and am fairly new to Linux, so if this ends up being a ridiculously simple answer, please be kind! :)

The high level idea of what I'm trying to do is use Confluent's Kafka Connect to read from a MySQL database that is having sensor data streamed to it on a minute or sub-minute basis and then use Kafka as an "ETL pipeline" to instantly route that data to a Data Warehouse and/or MongoDB for reporting or even tie in directly to Kafka from our web-app.

I am using Robin Moffatt's series as well as Confluent's JDBC Source Connector Quickstart as my initial guide. As far as where these are hosted, I am using an Amazon RDS MySQL database and a separate AWS EC2 t2.large instance with Ubuntu 16.04.2 to run Kafka Connect.

Using Robin's workflow, I am to the point where I have created the configuration file, but I am not using the json format he uses. I am using the format from the quickstart article.

name=jdbc_source_mysql_4427_Data       
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
key.converter=io.confluent.connect.avro.AvroConverter
key.converter.schema.registry.url=http://localhost:8081
value.converter=io.confluent.connect.avro.AvroConverter
value.converter.schema.registry.url=http://localhost:8081
connection.url=jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*****                
table.whitelist=4427_Data              
mode=timestamp                
timestamp.column.name=TmStamp               
validate.non.null=false                
topic.prefix=mysql-

And that is saved at:

/etc/kafka-connect-jdbc/kafka-connect-jdbc-source.properties

I then run:

/usr/bin/confluent load jdbc_source_mysql_4427_Data  -d /etc/kafka-connect-jdbc/kafka-connect-jdbc-source.properties

and get this error:

{
  "error_code": 400,
  "message": "Connector configuration is invalid and contains the following 2 error(s):\nInvalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*** for configuration Couldn't open connection to jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=***\nInvalid value java.sql.SQLException: No suitable driver found for jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=*** for configuration Couldn't open connection to jdbc:mysql://lndbtest.cdveaddpnevv.us-east-2.rds.amazonaws.com:3306/LNDBv1?user=adminRDS&password=***\nYou can also find the above list of errors at the endpoint `/{connectorType}/config/validate`"
}

It seems to be a driver issue. My question at this point is, "Do I need to download the MySQL JDBC driver to my EC2 instance, or should that have been included in the Confluent Platform package?"

Also, does my overall idea sound like a good fit for Kafka Connect?

As I mentioned earlier, I am new to these technologies, but have found the best way to learn something is to jump right in and try to solve a problem. Any ideas and suggestions would be more than welcome. Thank you!

Upvotes: 0

Views: 2301

Answers (3)

Xin
Xin

Reputation: 133

It's because there's no mysql driver in the distribution of confluent. I think you can solve the problem by downloading a mysql driver jar file, then putting it in confluent/share/java/kafka-connect-jdbc folder and re-run the program.

Upvotes: 1

Robin Moffatt
Robin Moffatt

Reputation: 32090

As @dawsaw says, you do need to make the MySQL JDBC driver available to the connector.

My observation here would be–given a free hand in all the application and architecture you describe– it would be best to stream from the sensor into Kafka, and then from there Kafka into MySQL, Mongo, webapp, etc.

Streaming into a DB to then stream out of the DB is not a perfect choice, if you have the option.

Upvotes: 1

dawsaw
dawsaw

Reputation: 2313

The overall concept makes sense to me. You do need to download the driver and add it to your worker classpath. It isn't packaged for licensing reasons I assume.

Upvotes: 2

Related Questions