Reputation: 15202
In a pandas DataFrame I have a column that looks like this:
+----------------------------------------------+
| carContactTel |
+----------------------------------------------+
| [] |
| ['tel 432424'] |
| ['tel 84958358'] |
| ['tel 5434645', 'tel 534535', 'tel 3242342'] |
+----------------------------------------------+
So some of the lists are empty.
I'm trying to split this into new columns: tel1, tel2, tel3, tel4, tel5.
If a list is too short, then the values in the corresponding columns should stay empty.
My last try based on solutions I've found:
carContactDF = pd.DataFrame(carContactDF["carContactTel"].to_list(), columns=["carContactTel1", "carContactTel2", "carContactTel3", "carContactTel4", "carContactTel5"])
The errors are always about the shape of the lists. I tried replacing the empty lists with 'NaN', but that didn't work either.
The lists are generated correctly by another Python script, so there is no mistake in them; I checked.
Error:
ValueError: 5 columns passed, passed data had 3 columns
Currently the longest list has 3 items, but the script will run over a larger dataset whose lists can have up to 5 elements.
Upvotes: 1
Views: 77
Reputation: 71687
Create a new DataFrame from the carContactTel column, then use DataFrame.set_axis + DataFrame.add_prefix to rename the columns as required, and finally use DataFrame.fillna to replace the NaN values with empty strings (the trailing replace strips the leading "tel " from each value):
df1 = pd.DataFrame(carContactDF['carContactTel'].tolist())
df1 = (
    # axis must be passed by keyword in pandas 2.x
    df1.set_axis(df1.columns + 1, axis=1).add_prefix('carContactTel')
    .fillna('').replace(r'^tel\s*', '', regex=True)
)
Result:
print(df1)
  carContactTel1 carContactTel2 carContactTel3
0
1         432424
2       84958358
3        5434645         534535        3242342
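Note that this produces only as many columns as the longest list in the data (three here). If you always need five columns, as the question asks, one option is to reindex the columns before renaming. This is a minimal sketch, not the only way to do it: the sample frame is rebuilt from the question, and N_COLS and wide are names I made up for illustration.

```python
import pandas as pd

# Sample data copied from the question (not the real dataset).
carContactDF = pd.DataFrame({
    "carContactTel": [
        [],
        ["tel 432424"],
        ["tel 84958358"],
        ["tel 5434645", "tel 534535", "tel 3242342"],
    ]
})

N_COLS = 5  # hypothetical constant: number of tel columns wanted

# One column per list position; shorter lists are padded with missing values.
wide = pd.DataFrame(carContactDF["carContactTel"].tolist(), index=carContactDF.index)

# Force exactly N_COLS columns, creating any missing ones as empty strings.
wide = wide.reindex(columns=range(N_COLS), fill_value="")
wide.columns = [f"carContactTel{i + 1}" for i in range(N_COLS)]

# Blank out the remaining missing values and strip the leading "tel ".
wide = wide.fillna("").replace(r"^tel\s*", "", regex=True)
print(wide)
```

If a list ever has more than N_COLS items, the reindex silently drops the extra positions, so check the maximum length first if that matters.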
Upvotes: 3
Reputation: 152
Filter the rows where len(carContactTel) < 5 and append NA values to those lists. Repeat until every list has 5 elements, then split.
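A rough sketch of that idea, padding every list in one pass instead of looping. The sample frame is copied from the question; pad, N_COLS, cols and tel_df are hypothetical names, and None is used as the NA filler (pass fill="" if you want empty strings instead).

```python
import pandas as pd

# Sample data copied from the question (not the real dataset).
carContactDF = pd.DataFrame({
    "carContactTel": [
        [],
        ["tel 432424"],
        ["tel 84958358"],
        ["tel 5434645", "tel 534535", "tel 3242342"],
    ]
})

N_COLS = 5  # hypothetical constant: number of tel columns wanted


def pad(tels, width=N_COLS, fill=None):
    """Return a copy of tels with exactly `width` items.

    Short lists are padded with `fill`; longer lists are truncated.
    """
    return (list(tels) + [fill] * width)[:width]


cols = [f"carContactTel{i}" for i in range(1, N_COLS + 1)]
padded = carContactDF["carContactTel"].map(pad)

# Every row now has N_COLS items, so the column count always matches.
tel_df = pd.DataFrame(padded.tolist(), columns=cols, index=carContactDF.index)
print(tel_df)
```

Because every row has the same length after padding, the "5 columns passed, passed data had 3 columns" error from the question should not occur.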
Upvotes: 0