Reputation: 496
The code below illustrates the trouble I am having reading from an Excel file and removing unnecessary newline characters. How can I remove the newline character in the second book's title?
Data from Books.xlsx
BookID,Title,Author
1,"I, Robot",Rob
2,Robots
and me,Rob
3,I made a robot,Rob
4,Robotics for dummies,Rob
# pandas version is 1.2.0
import pandas
def fromxl():
master_tracker = "data_in\\books.xlsx"
df = pandas.read_excel(master_tracker, "Sheet1", header=0)
# doesn't work
# df.replace('\n', ' ')
# doesn't work
# for r in df.iterrows():
# r[1][1] = r[1][1].replace('\n', ' ')
print(df)
Edit Just for clarification, here is what I get when I print the dataframe
BookID Title Author
0 1 I, Robot Rob
1 2 Robots\nand me Rob
2 3 I made a robot Rob
3 4 Robotics for dummies Rob
Upvotes: 0
Views: 3898
Reputation: 1381
Try this:
df['Title'] = df['Title'].str.replace("\n"," ")
This will replace every line breaker, with a simple space, in every row.
If you want for all columns:
df = df.replace(r'\n',' ', regex=True)
Upvotes: 3
Reputation: 1413
Try this for multiple columns:
for i in df.select_dtypes(include='object').columns:
df[i]=df[i].str.replace("/n",'')
Upvotes: 0