PhilMSDS

Reputation: 1

Best way to convert free text column to all lowercase in pandas dataframe?

If I have a pandas dataframe like:

import pandas as pd

columns = ['machine_num', 'comments']
data = [(232323, 'BOOMER COMMENT HERE'), 
        (123456, 'GenX comment here'),
        (121212, 'ZoOmEr CoMmEnT hErE')]
df = pd.DataFrame(data, columns=columns)

df
machine_num comments
232323 BOOMER COMMENT HERE
123456 GenX comment here
121212 ZoOmEr CoMmEnT hErE

What's the most performant way to convert the free text column to all lowercase strings in Python? I'm prepping the data for NLP later.

i.e., so it looks something like this:

machine_num comments
232323 boomer comment here
123456 genx comment here
121212 zoomer comment here

I tried this, which runs fine with a test df...

df['comments'] = df['comments'].str.lower()  # convert to all lowercase
df

...but the ultimate goal is to run on millions of rows.
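
To make "performant" measurable, here is a rough sketch of how the .str.lower() call could be timed against a plain list comprehension on a larger frame. The synthetic data, the n_repeats size, and the two helper function names are assumptions made up for illustration, not part of my real dataset, and the timings will vary with pandas version, dtype, and hardware.

```python
import timeit

import pandas as pd

# Synthetic data (an assumption for illustration): repeat the three sample
# comments to build a larger column.
n_repeats = 100_000  # hypothetical size; raise it to approach millions of rows
comments = ['BOOMER COMMENT HERE',
            'GenX comment here',
            'ZoOmEr CoMmEnT hErE'] * n_repeats
df = pd.DataFrame({'machine_num': range(len(comments)), 'comments': comments})


def lower_with_str_accessor(frame):
    # Same call as in the question: the pandas string accessor.
    return frame['comments'].str.lower()


def lower_with_list_comprehension(frame):
    # Plain Python loop over the values; assumes the column has no missing
    # values (a NaN has no .lower() method and would raise here).
    return pd.Series([s.lower() for s in frame['comments']],
                     index=frame.index, name='comments')


# Both approaches should give the same lowercase column.
assert lower_with_str_accessor(df).equals(lower_with_list_comprehension(df))

# Average wall-clock time over a few runs of each approach.
for func in (lower_with_str_accessor, lower_with_list_comprehension):
    seconds = timeit.timeit(lambda: func(df), number=3) / 3
    print(f'{func.__name__}: {seconds:.3f} s per run')
```

Neither function modifies df; assigning the result back with df['comments'] = ... works the same way as in my snippet above.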

Upvotes: 0

Views: 52

Answers (0)
