Reputation: 197
I am making a JDBC connection to a MySQL database from an Azure Databricks environment. I am trying to retrieve count(id) over a 24-hour date range, with an IN filter on specific countries, but it is taking a very long time. How can I improve the performance?
Query:
pushdown_query = """(select count(id) from Mysql_database where time > "{}" and time < "{}" and country IN ('FRA','AUT','DEU','CZE')) alias""".format(From_date,To_date)
df = spark.read.jdbc(url=jdbcUrl, table=pushdown_query, properties=connectionProperties)
display(df)
Upvotes: 0
Views: 163
Reputation: 345
Is the Mysql_database table really large? If so, you may need to add indexes on the columns used in the filter (time and country), if they are not there already.
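As a rough sketch of how to check this, the helpers below only build the SQL text; you would run the statements yourself in a MySQL client. The function names, the index name `idx_country_time`, and the sample dates are made up for illustration, and the (country, time) column order is an assumption that may not suit your data. If the EXPLAIN output shows a full table scan, the composite index is worth trying.

```python
def explain_statement(table, start, end, countries):
    """Build an EXPLAIN statement for the count query from the question."""
    # Quote each country code for the IN (...) list.
    country_list = ", ".join("'{}'".format(c) for c in countries)
    return (
        "EXPLAIN SELECT COUNT(id) FROM {t} "
        "WHERE time > '{s}' AND time < '{e}' "
        "AND country IN ({c})"
    ).format(t=table, s=start, e=end, c=country_list)


def create_index_statement(table, index_name="idx_country_time"):
    """Build a CREATE INDEX statement covering both filtered columns.

    The index name and the (country, time) column order are assumptions.
    """
    return "CREATE INDEX {i} ON {t} (country, time)".format(i=index_name, t=table)


# Hypothetical 24-hour window; substitute your own From_date / To_date.
explain_sql = explain_statement(
    "Mysql_database",
    "2020-01-01 00:00:00",
    "2020-01-02 00:00:00",
    ["FRA", "AUT", "DEU", "CZE"],
)
index_sql = create_index_statement("Mysql_database")
print(explain_sql)
print(index_sql)
```

Run the EXPLAIN statement before and after creating the index to compare the access plan.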
Upvotes: 1