How do I stop a pyspark dataframe from changing to a list?

Question

I start with a PySpark DataFrame, but it gets converted to a list after I call .take() on it. How can I keep it as a PySpark DataFrame?

    df1 = Ce_clean
    print(type(df1))
    df1 = df1.take(1000)
    print(type(df1))

Equinox · Accepted Answer

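take(n) is an action: it collects the first n rows to the driver and returns them as a Python list of Row objects, not a DataFrame. You can either convert that list back into a DataFrame with spark.createDataFrame(), or use limit(n), which returns a DataFrame directly: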

    # df1 must still be the original DataFrame here (not the list returned by take)
    df2 = spark.createDataFrame(df1.take(100))
    type(df2)

or

    # limit() is a transformation, so the result stays a DataFrame
    df3 = df1.limit(100)
    type(df3)
