Jim

Reputation: 619

Make a DataFrame filter a variable so I can set it once and reuse it

Hi, I've got a script that reuses the same filter a few times, something like this:

df_filtered = df.filter(col("columnname") == valueparam)

How can I set it once at the top of my script so I can easily change it everywhere it occurs? Something like this:

filter_val = 'col("columnname") == valueparam'
df_filtered = df.filter(filter_val)

To be honest, Python/PySpark is all very new to me, so if I'm going about this the wrong way I'm open to other options.

Upvotes: 1

Views: 596

Answers (1)

fskj

Reputation: 964

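You can build the condition as a Column expression, store it in a variable, and pass that variable to filter wherever you need it: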

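from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(1,), (2,), (3,)], ["Test"])
df.show()
valueparam = 3
# Build the condition once as a Column expression, then reuse it wherever needed
col_filter = F.col("Test") == valueparam
df.filter(col_filter).show()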

Output:

+----+
|Test|
+----+
|   1|
|   2|
|   3|
+----+

+----+
|Test|
+----+
|   3|
+----+
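
If you would rather keep the filter as a plain string, closer to what the question sketched, filter also accepts a SQL expression string. The following is only a minimal sketch: the names FILTER_COLUMN, FILTER_VALUE, FILTER_EXPR and apply_filter are made up for illustration, and it assumes the value is numeric (a string value would need to be quoted inside the expression).

```python
# Hypothetical constants, defined once at the top of the script
FILTER_COLUMN = "Test"
FILTER_VALUE = 3

# SQL expression string; DataFrame.filter accepts a string as well as a Column.
# Assumes a numeric value: a string value would need quotes inside the expression.
FILTER_EXPR = f"{FILTER_COLUMN} = {FILTER_VALUE}"


def apply_filter(df):
    """Return df restricted to the rows matching the shared filter."""
    return df.filter(FILTER_EXPR)
```

Every place that needs the filter can then call apply_filter(df), so changing the two constants at the top changes it everywhere.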

Upvotes: 2
