Reputation: 247
I have created a DataFrame using PySpark and am trying to filter it with a dynamic variable, but the query returns no rows. Can anyone help me pass a dynamic variable in the query below?
start_dt = '2022-1-15'
df.printSchema()
# |-- state
# |-- county
# |-- population
# |-- pdate: string
df = df.filter((df.state == 'CA') & (df.pdate == start_dt))
df.show()
Upvotes: 1
Views: 1080
Reputation: 26676
Pass the value explicitly with PySpark's lit function, which wraps a Python value as a literal column. Code below:
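from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

spark = SparkSession.builder.getOrCreate()

# Sample data
df = spark.createDataFrame([
    ('https:john', 'john', 1.1, 'httpsasd'),
    ('https:john', 'john', 1.1, 'kafka'),
    ('https:john', 'john', 1.2, 'httpsasd')
], ['website', 'name', 'value', 'other'])
df.show(truncate=False)

# Value to filter on, held in a Python variable
selection = 'httpsasd'
# lit() wraps the variable as a literal column for the comparison
df = df.filter((df.value == 1.1) & (df.other == lit(selection)))
df.show()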
Outcome
+----------+----+-----+--------+
|   website|name|value|   other|
+----------+----+-----+--------+
|https:john|john|  1.1|httpsasd|
+----------+----+-----+--------+
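One more thing worth checking in the question: pdate is a string column, so the filter is an exact string comparison. If the column holds zero-padded dates such as 2022-01-15 (an assumption, since the post does not show the data), then '2022-1-15' would match nothing regardless of how the variable is passed. A minimal sketch of normalizing the variable first, where normalize_date is a hypothetical helper name and not part of PySpark:

```python
from datetime import datetime


def normalize_date(value: str) -> str:
    """Return the date as a zero-padded yyyy-MM-dd string."""
    # strptime accepts both '2022-1-15' and '2022-01-15' with this pattern
    return datetime.strptime(value, "%Y-%m-%d").date().isoformat()


start_dt = normalize_date("2022-1-15")
print(start_dt)

# Hypothetical use with the DataFrame from the question (assumes pdate is zero-padded):
# df = df.filter((df.state == "CA") & (df.pdate == start_dt))
```

If pdate turns out to be stored in some other format, the pattern would need to change to match it.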
Upvotes: 1