Matt D

Reputation: 11

Text wrapping a dataframe column to multiple columns with differing length constraints

I have a dataframe with multiple columns, the most important of which is a Headline. I need to split this headline into multiple columns (let's say 4 columns), and each of the resulting 4 columns has its own length constraint (column1 = 10 chars, column2 = 15 chars, column3 = 15 chars, column4 = 25 chars). I have researched ways of using textwrap to do this, but can't determine how to apply textwrap to a dataframe. An iterative process that splits the full string into words and reassembles them, checking the reassembled length against each constraint, could be an option as well.

Example Headline: Act fast. Limited space available.
Result

Column1: Act fast.
Column2: Limited space
Column3: available.
Column4: (blank)

To make this really fun, I'm a neophyte at Python - so please be gentle.

Upvotes: 1

Views: 444

Answers (1)

Jan

Reputation: 43169

Here is a complete solution using str.extract with named capture groups:

import pandas as pd

# One headline per row; 'raw_text' holds the text to split.
df = pd.DataFrame({'raw_text': ['Act fast. Limited space available.']})

# Each named group captures up to N characters and must end on a word boundary (\b).
pattern = r'^(?P<column1>.{1,10}\b)(?P<column2>.{1,15}\b)(?P<column3>.{1,15}\b)(?P<column4>.{0,25}\b)'
df = df['raw_text'].str.extract(pattern, expand=True)
print(df)

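This yields the following. Note that the captured fields keep their trailing spaces, and the final period is not captured because \b does not match after it: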

      column1         column2    column3 column4
0  Act fast.   Limited space   available        
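
The iterative approach mentioned in the question can also be sketched directly. The following is only a minimal illustration, not a definitive implementation: the helper name wrap_to_widths, the WIDTHS list, and the column names are assumptions made for this example, and the widths are the ones given in the question. It greedily packs whole words into each field in turn; words that do not fit in any remaining field are silently dropped.

```python
import pandas as pd

# Assumed field widths, taken from the question (column1..column4).
WIDTHS = [10, 15, 15, 25]
COLUMNS = ["column1", "column2", "column3", "column4"]


def wrap_to_widths(text, widths=WIDTHS):
    """Greedily pack whole words into successive fields of the given widths."""
    words = text.split()
    fields = []
    for width in widths:
        line = ""
        while words:
            # Try appending the next word (with a separating space if needed).
            candidate = words[0] if not line else line + " " + words[0]
            if len(candidate) > width:
                break  # next word does not fit; move on to the next field
            line = candidate
            words.pop(0)
        fields.append(line)
    # Any words still left in `words` did not fit and are dropped.
    return fields


df = pd.DataFrame({"Headline": ["Act fast. Limited space available."]})

# Build the new columns from the per-row lists and attach them to the frame.
wrapped = pd.DataFrame(
    df["Headline"].map(wrap_to_widths).tolist(),
    columns=COLUMNS,
    index=df.index,
)
df = df.join(wrapped)
print(df)
```

For the example headline this gives "Act fast.", "Limited space", "available." and an empty fourth column. A single word longer than its field's width leaves that field empty and is retried against the next field, so very long words may need separate handling.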

Upvotes: 1