Gonza

Reputation: 177

My split() function returns only NaN or null data

First, thank you for your time.

I need to cross-reference two Pandas DataFrames using the values that I have in a field. The values come in the following form within the field: [A,B,C,N]

I am trying to apply the split() function to the DataFrame field as follows:

df_test = df_temp["NAME"].str.split(expand=True)

The "Name" field is of type object.

My problem is that, for some reason, split() returns only Null (NaN) values instead of the pieces of my NAME field. I don't understand what I'm doing wrong.
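
A minimal example that reproduces the symptom (assuming the NAME field actually holds Python lists rather than plain strings; the sample data here is made up):

import pandas as pd

# Hypothetical sample mirroring the described field. If NAME holds actual
# Python lists, .str.split() returns NaN for every row: the .str accessor
# applies string methods only to str values and yields NaN for anything else.
df_temp = pd.DataFrame({"NAME": [["A", "B", "C", "N"]]})
print(df_temp["NAME"].str.split(expand=True))  # -> a single column of NaN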

Thanks in advance.

Upvotes: 0

Views: 461

Answers (2)

Dipanjan Mallick

Reputation: 1739

Based upon the input, here's a PySpark approach.

Input data -

from pyspark.sql.functions import explode, col

# `spark` is the active SparkSession (predefined in spark-shell/notebooks)
df = spark.createDataFrame([(['A', 'B', 'C', 'D'],)], schema=['Name'])
df.show()

+------------+
|        Name|
+------------+
|[A, B, C, D]|
+------------+

Required Output -

df.select(explode(col("Name")).alias("exploded_Name")).show()

+-------------+
|exploded_Name|
+-------------+
|            A|
|            B|
|            C|
|            D|
+-------------+
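
If the data lives in Pandas rather than Spark (as in the question), the analogous operation is DataFrame.explode; a minimal sketch with made-up data:

import pandas as pd

# DataFrame.explode flattens a list-valued column into one row per element
pdf = pd.DataFrame({"Name": [["A", "B", "C", "D"]]})
print(pdf.explode("Name"))
#   Name
# 0    A
# 0    B
# 0    C
# 0    D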

Upvotes: 1

user7375116

Reputation: 213

You should provide a separator to split the column on; by default, str.split() splits on whitespace.

df['Name'].str.split(',', expand=True)
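
Since the values in the question look like [A,B,C,N], you may also need to strip the brackets first; a minimal sketch, assuming the field holds such strings:

import pandas as pd

# Hypothetical string data shaped like the question's field
df = pd.DataFrame({"Name": ["[A,B,C,N]"]})

# Strip the surrounding brackets, then split on the comma
print(df["Name"].str.strip("[]").str.split(",", expand=True))
#    0  1  2  3
# 0  A  B  C  N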

Upvotes: 0
