Reputation: 29
I need to create a new dataframe in Synapse Analytics using the column names from another dataframe. The new dataframe will have just one column (column header: col_name), and the column names from the other dataframe are the cell values. Here's my code:
df1= df.columns
colName =[]
for e in df1:
    list1 = [e]
    colName.append(list1)
col=['col_name']
df2=spark.createDataFrame(colName,col)
display(df2)
The output table is created as expected. With the output dataframe, I can run count, display, or withColumn commands such as the following:
df2.count()
df2=df2.withColumn('index',lit(1))
But when I run the filter command below, I get a 'list' object is not callable error message.
display(df2.filter(col('col_name')=='dob'))
Does anyone know what I am missing and how I can solve this? In the end, I'd like to add a conditional column based on the value in the col_name column.
Upvotes: 0
Views: 77
Reputation: 15258
The problem is that you have two objects called col.
You did this:
col=['col_name']
Therefore, when you do this:
display(df2.filter(col('col_name')=='dob'))
you no longer call pyspark.sql.functions.col but the list ['col_name'], hence the TypeError: 'list' object is not callable.
Simply replace the call with the module-qualified function:
# display(df2.filter(col('col_name')=='dob'))
from pyspark.sql import functions as F
display(df2.filter(F.col('col_name')=='dob'))
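To see the shadowing in isolation, here is a minimal sketch in plain Python. It assumes nothing about Spark: the col function below is a hypothetical stand-in for pyspark.sql.functions.col, and schema_cols is an illustrative name that does not appear in the question.

```python
def col(name):
    """Hypothetical stand-in for pyspark.sql.functions.col (no Spark needed)."""
    return f"Column<{name}>"

original_col = col  # keep a reference so the name can be restored later

# Same assignment as in the question: the name `col` now points to a list.
col = ['col_name']

try:
    col('col_name')  # attempts to call the list, not the function
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Restore the function and keep the list under a different name instead.
col = original_col
schema_cols = ['col_name']
result = col(schema_cols[0])
print(result)
```

In a notebook session the name stays rebound after the assignment has run, so renaming the list alone is not enough: either re-run the import of col or use the F.col form shown above.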
Upvotes: 1