Reputation: 1
Is there any way to read data into a PySpark DataFrame from a SQL Server table based on a condition, e.g. read only rows where the column 'time_stamp' has the current date?
Alternatively, I want to translate:
select * from table_name where time_stamp=cast(getdate() as date)
into a PySpark DataFrame.
I am using:
remote_table = (spark.read.format("sqlserver")
    .option("host", "host_name")
    .option("user", "user_name")
    .option("password", "password")
    .option("database", "database_name")
    .option("dbtable", "dbo.table_name")
    .load())
which reads the entire table 'table_name'. I just need to read the rows that satisfy a condition, like a WHERE clause in SQL.
Upvotes: 0
Views: 901
Reputation: 1
from pyspark.sql import SparkSession
# Create a Spark session
spark = SparkSession.builder.appName("DateRangeQuery").getOrCreate()
# Define your JDBC connection properties
connection_properties = {
    "user": "<username>",
    "password": "<password>",
    "url": "jdbc:sqlserver://<server>:<port>;databaseName=<database>"
}
# Define the date range
t_start = "2023-01-01"
t_end = "2023-12-31"
# Define your SQL query with the date range filter
sql_query = f"(SELECT * FROM table_name WHERE t_date BETWEEN '{t_start}' AND '{t_end}') AS custom_query"
# Read the filtered data using the SQL query
df = spark.read.jdbc(url=connection_properties["url"], table=sql_query, properties=connection_properties)
# Show the filtered DataFrame
df.show()
# Stop the Spark session
spark.stop()
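To match the original requirement (only rows where time_stamp equals today's date), the same subquery-pushdown trick can be reused. A minimal sketch, assuming the same placeholder connection details as above and that the table lives in the dbo schema:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("CurrentDateQuery").getOrCreate()

# Same placeholder connection details as above -- replace with real values
url = "jdbc:sqlserver://<server>:<port>;databaseName=<database>"
connection_properties = {"user": "<username>", "password": "<password>"}

# Wrap the original WHERE clause in a subquery so the filter runs on
# SQL Server and only today's rows are transferred to Spark
sql_query = "(SELECT * FROM dbo.table_name WHERE time_stamp = CAST(GETDATE() AS date)) AS filtered"

df = spark.read.jdbc(url=url, table=sql_query, properties=connection_properties)
df.show()
Alternatively, a simple .filter() on the DataFrame read with the "sqlserver" format is usually pushed down to the source as well, but the subquery form makes the SQL sent to the server explicit.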
Upvotes: 0