Reputation: 105
I am trying to insert a line break in Python. If I encounter more than one space in my text, I want to replace them with one space and a line break. My data is in an Excel cell. This is what my code looks like:
import pandas as pd
import re
def excelcleaner(textstring):
    # Replaces every run of exactly two spaces with a newline
    return textstring.replace("  ", "\n")
df = pd.read_excel("lbook.xlsx")
df["clean_content"] = df["uncleaned_content"].apply(excelcleaner)
df.to_excel("lbook.xlsx")
Right now, it replaces the specified number of spaces (currently 2) with a line break. How can I modify it so that it detects a run of any number of spaces and replaces it with a single line break?
Upvotes: 0
Views: 1411
Reputation: 13106
You can use re.sub from the re module in the standard library:
import re
def excelcleaner(textstring):
    # This will find any run of 2 or more whitespace characters and replace it with a newline char
    return re.sub(r'\s{2,}', '\n', textstring)
mystr = "abc 123  efg     111"
print(excelcleaner(mystr))
abc 123
efg
111
In case you aren't familiar with regex syntax, \s matches any whitespace character (spaces, but also tabs and newlines) and {<min>,<max>} is a range indicator. {2,} says to find two or more occurrences.
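Since \s also matches tabs and existing newlines, here is a minimal sketch of a variant that only collapses literal spaces. It also passes non-string values through unchanged, on the assumption that some Excel cells may be empty and arrive as NaN rather than text. The name collapse_spaces_to_newline is hypothetical and not part of the answer above.

```python
import re

def collapse_spaces_to_newline(value):
    # Hypothetical helper for illustration only.
    # Non-string cells (for example NaN from an empty Excel cell) are
    # returned unchanged instead of raising a TypeError in re.sub.
    if not isinstance(value, str):
        return value
    # ' {2,}' matches two or more literal spaces; tabs and existing
    # newlines are left alone, unlike with \s.
    return re.sub(r" {2,}", "\n", value)

print(repr(collapse_spaces_to_newline("abc 123  efg     111")))  # 'abc 123\nefg\n111'
print(repr(collapse_spaces_to_newline("tab\t\there")))           # 'tab\t\there'
print(repr(collapse_spaces_to_newline(None)))                    # None
```

It can be passed to apply in the same way as excelcleaner, for example df["uncleaned_content"].apply(collapse_spaces_to_newline).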
Upvotes: 4