Reputation: 613
I have a dataset like below:
campaign_name,campaign_team
edbol97,other
abc_de_dg,other
de_air,other
Out of this, I have to pick the campaign_name records that contain "_". Now, I want to split those underscore-containing records into words. I was able to do that, but I want the output as below:
Below is my code:
splittedString = list()
campaign_name = list()
df = pandas.DataFrame()
for i in underscoreList:
    campaign_name.append(i)
    splittedString.append(i.split("_"))
    #a_dict= {"campaign_name":campaign_name,"campaign_name1":splittedString}
    df["campaign_name"]=campaign_name
    df["campaign_name1"]=splittedString
print(splittedString)
print(campaign_name)
print(df)
However, it gives me the below error:
  File "/home/siddhesh/Downloads/pyspark/src/sample/exact_match.py", line 38, in <module>
    df["campaign_name"]=campaign_name
  File "/home/siddhesh/Downloads/pyspark/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 3607, in __setitem__
    self._set_item(key, value)
  File "/home/siddhesh/Downloads/pyspark/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 3779, in _set_item
    value = self._sanitize_column(value)
  File "/home/siddhesh/Downloads/pyspark/venv/lib/python3.8/site-packages/pandas/core/frame.py", line 4504, in _sanitize_column
    com.require_length_match(value, self.index)
  File "/home/siddhesh/Downloads/pyspark/venv/lib/python3.8/site-packages/pandas/core/common.py", line 531, in require_length_match
    raise ValueError(
ValueError: Length of values (2) does not match the length of index (1)
I am new to Python and pandas. How can I resolve this error?
Upvotes: 1
Views: 67
Reputation: 9197
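You can use .explode() for this. Split each name on "_" into a list, then explode the lists so that each word gets its own row. Note that explode returns a new DataFrame, so assign the result:
df['campaign_name1'] = df['campaign_name'].str.split('_')
df = df.explode('campaign_name1')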
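As a fuller sketch, here is the same idea applied to the sample data from the question. The ValueError appears to come from assigning the columns on each pass of the loop: the first pass creates a one-row DataFrame, and on the second pass the list already holds two items, so the lengths no longer match. Building the column with vectorized string methods avoids the loop entirely. The names `mask` and `result` are only illustrative, and since the desired output is not shown in the question, one row per word is an assumption.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "campaign_name": ["edbol97", "abc_de_dg", "de_air"],
    "campaign_team": ["other", "other", "other"],
})

# Keep only the names that contain an underscore.
# regex=False treats "_" as a literal character.
mask = df["campaign_name"].str.contains("_", regex=False)
result = df.loc[mask, ["campaign_name"]].copy()

# Split each name into a list of words, then give each word its own row.
result["campaign_name1"] = result["campaign_name"].str.split("_")
result = result.explode("campaign_name1").reset_index(drop=True)

print(result)
```

Each original name is repeated once per word, so "abc_de_dg" yields three rows and "de_air" yields two. If you would rather keep the loop, build both lists first and create the DataFrame once after the loop has finished.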
Upvotes: 1