N9909

Reputation: 247

dynamic variable in pyspark dataframe

I created a DataFrame using PySpark and am trying to filter it based on a dynamic variable, but the query returns no rows. Can anyone help me pass a dynamic variable in the query below?

start_dt = '2022-1-15'
df.printSchema()
# |-- state
# |-- county
# |-- population
# |-- pdate: string

df = df.filter((df.state == 'CA') & (df.pdate == start_dt))
df.show()

Upvotes: 1

Views: 1080

Answers (1)

wwnde

Reputation: 26676

Pass the value explicitly using PySpark's lit function (pyspark.sql.functions.lit). Code below:

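from pyspark.sql import SparkSession
from pyspark.sql.functions import lit

# Reuse the active session or create one
spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame([
    ('https:john', 'john', 1.1, 'httpsasd'),
    ('https:john', 'john', 1.1, 'kafka'),
    ('https:john', 'john', 1.2, 'httpsasd')
], ['website', 'name', 'value', 'other'])

df.show(truncate=False)

# Value held in a Python variable, wrapped in lit() for the comparison
selection = 'httpsasd'
df = df.filter((df.value == 1.1) & (df.other == lit(selection)))
df.show()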

Outcome

+----------+----+-----+--------+
|   website|name|value|   other|
+----------+----+-----+--------+
|https:john|john|  1.1|httpsasd|
+----------+----+-----+--------+
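
If the filter on pdate still returns no rows, one possible cause is a format mismatch rather than the variable itself. This is an assumption, since the question does not show the data: if pdate is stored zero-padded (for example '2022-01-15'), the string '2022-1-15' will never compare equal to it. A minimal sketch of normalizing the variable in plain Python before filtering (normalize_date is a hypothetical helper name, not part of PySpark):

```python
from datetime import datetime

def normalize_date(value: str) -> str:
    """Parse a year-month-day string and return it zero-padded as YYYY-MM-DD."""
    # strptime accepts one- or two-digit months and days for %m and %d
    parsed = datetime.strptime(value, "%Y-%m-%d")
    # strftime always writes month and day with two digits
    return parsed.strftime("%Y-%m-%d")

start_dt = normalize_date("2022-1-15")
print(start_dt)  # 2022-01-15
```

The normalized start_dt can then be used in the question's filter unchanged. If pdate uses some other layout, the strftime pattern would need to match that layout instead.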

Upvotes: 1
