Hrvoje

Reputation: 15202

Splitting a variable-length list in a pandas column into columns

In a pandas DataFrame I have a column that looks like this:

+----------------------------------------------+
|                carContactTel                 |
+----------------------------------------------+
| []                                           |
| ['tel 432424']                               |
| ['tel 84958358']                             |
| ['tel 5434645', 'tel 534535', 'tel 3242342'] |
+----------------------------------------------+

So some of the lists are empty. I'm trying to split this into new columns: tel1, tel2, tel3, tel4, tel5. If a list is too short, then the values in the corresponding columns should stay empty.

My last attempt, based on solutions I've found:

carContactDF = pd.DataFrame(carContactDF["carContactTel"].to_list(), columns=["carContactTel1", "carContactTel2", "carContactTel3", "carContactTel4", "carContactTel5"])

The errors are always about the shape of the list. I tried replacing the empty lists with 'NaN', but that didn't work either.

The lists are generated by another Python script and I have checked that there is no mistake in them.

Error:

ValueError: 5 columns passed, passed data had 3 columns

Currently the longest list has 3 items, but the script will run over a larger dataset that will have lists with 5 elements.

Upvotes: 1

Views: 77

Answers (2)

Shubham Sharma

Reputation: 71687

Create a new dataframe from the carContactTel column, then use DataFrame.set_axis and DataFrame.add_prefix to rename the columns as required. Finally, use DataFrame.fillna to replace missing values with an empty string and DataFrame.replace to strip the leading "tel" prefix:

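import pandas as pd

# One column per list position; shorter lists are padded with missing values
df1 = pd.DataFrame(carContactDF['carContactTel'].tolist())
df1 = (
    # Renumber columns from 1 (axis must be passed by keyword in pandas >= 2.0)
    df1.set_axis(df1.columns + 1, axis=1).add_prefix('carContactTel')
    .fillna('').replace(r'^tel\s*', '', regex=True)
)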

Result:

print(df1)
  carContactTel1 carContactTel2 carContactTel3
0                                             
1         432424                              
2       84958358                              
3        5434645         534535        3242342
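
The result above has only as many columns as the longest list in the data (three here), while the question asks for five. One possible way to always get five columns is to reindex the columns before renaming them. This is only a sketch: the sample data is copied from the question, and the names n_cols and wide are made up for illustration.

```python
import pandas as pd

# Sample data copied from the question
carContactDF = pd.DataFrame({
    "carContactTel": [
        [],
        ["tel 432424"],
        ["tel 84958358"],
        ["tel 5434645", "tel 534535", "tel 3242342"],
    ]
})

n_cols = 5  # assumed target width (tel1..tel5 in the question)

# Expand the lists; the frame is as wide as the longest list
wide = pd.DataFrame(carContactDF["carContactTel"].tolist(), index=carContactDF.index)

# Force exactly n_cols columns; positions that do not exist yet become missing values
wide = wide.reindex(columns=range(n_cols)).astype(object)
wide.columns = [f"carContactTel{i + 1}" for i in range(n_cols)]

# Blank out missing values and strip the leading "tel " prefix
wide = wide.fillna("").replace(r"^tel\s*", "", regex=True)
print(wide)
```

Note that reindex would also silently drop anything beyond the fifth position if a list ever had more than five entries.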

Upvotes: 3

volante

Reputation: 152

Filter the rows where len(carContactTel) < 5 and append NA values to those lists. Repeat until every list has 5 elements, then split.
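
A rough sketch of this padding idea follows. Instead of filtering and appending repeatedly, it pads every list in a single pass, which should give the same end result. The sample data is copied from the question; n_cols, padded and tel_df are hypothetical names, and empty strings are used as the filler rather than NA so the cells stay blank.

```python
import pandas as pd

# Sample data copied from the question
carContactDF = pd.DataFrame({
    "carContactTel": [
        [],
        ["tel 432424"],
        ["tel 84958358"],
        ["tel 5434645", "tel 534535", "tel 3242342"],
    ]
})

n_cols = 5  # assumed target width (tel1..tel5 in the question)

# Pad each list with empty strings up to n_cols entries
padded = [
    list(tels) + [""] * (n_cols - len(tels))
    for tels in carContactDF["carContactTel"]
]

# Every row now has the same length, so the column names fit
tel_df = pd.DataFrame(
    padded,
    columns=[f"carContactTel{i + 1}" for i in range(n_cols)],
    index=carContactDF.index,
)
print(tel_df)
```

This assumes no list is longer than n_cols; a longer list would bring back the shape error from the question.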

Upvotes: 0
