Simon Lindgren

Reputation: 2041

Filter Scala dataframe by column of arrays

My Scala DataFrame has a column with the datatype array (element: string). I want to display the rows of the DataFrame that have the word "hello" in that column.

I have this:

display(df.filter($"my_column".contains("hello")))

I get an error because of a data type mismatch. It says that argument 1 requires string type, but 'my_column' is of array<string> type.

Upvotes: 0

Views: 368

Answers (1)

Manoj Kumar Dhakad

Reputation: 1892

You can use the array_contains function, which checks whether an array column contains a given value:

import org.apache.spark.sql.functions._

df.filter(array_contains(df.col("my_column"), "hello")).show
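For anyone who wants to try this end to end, here is a minimal sketch under a few assumptions: the sample rows and the `id` column are made up for illustration, and the local SparkSession is only needed outside a notebook (in Databricks, `spark` already exists and you can call `display(matches)` instead of `show()`).

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{array_contains, col}

// Local session for illustration only; skip this in a notebook where `spark` is predefined.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("array-filter-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: an id plus an array<string> column named my_column.
val df = Seq(
  (1, Seq("hello", "world")),
  (2, Seq("goodbye")),
  (3, Seq.empty[String])
).toDF("id", "my_column")

// Keep only rows whose array has an element equal to "hello".
val matches = df.filter(array_contains(col("my_column"), "hello"))
matches.show()
```

Note that array_contains compares whole elements, so an element like "hello world" would not match "hello". The original attempt fails because Column.contains is a string operation and does not apply to an array<string> column.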

Upvotes: 1
