Robert S
Robert S

Reputation: 496

Why can't I replace a newline in my pandas dataframe?

The code below illustrates the trouble I am having reading from an Excel file and removing unnecessary newline characters. How can I remove the newline character in the second book's title?

Data from Books.xlsx

BookID,Title,Author
1,"I, Robot",Rob
2,Robots
and me,Rob
3,I made a robot,Rob
4,Robotics for dummies,Rob
# pandas version is 1.2.0
import pandas

def fromxl():
    master_tracker = "data_in\\books.xlsx"
    
    df = pandas.read_excel(master_tracker, "Sheet1", header=0)

    # doesn't work
    # df.replace('\n', ' ')

    # doesn't work
    # for r in df.iterrows():
    #     r[1][1] = r[1][1].replace('\n', ' ')

    print(df)

Edit Just for clarification, here is what I get when I print the dataframe

   BookID                 Title Author
0       1              I, Robot    Rob
1       2        Robots\nand me    Rob
2       3        I made a robot    Rob
3       4  Robotics for dummies    Rob

Upvotes: 0

Views: 3898

Answers (2)

LoukasPap
LoukasPap

Reputation: 1381

Try this: df['Title'] = df['Title'].str.replace("\n"," ")

This will replace every line breaker, with a simple space, in every row.

If you want for all columns: df = df.replace(r'\n',' ', regex=True)

Upvotes: 3

Suhas Mucherla
Suhas Mucherla

Reputation: 1413

Try this for multiple columns:

for i in df.select_dtypes(include='object').columns:    
    df[i]=df[i].str.replace("/n",'')

Upvotes: 0

Related Questions