If I have a pandas dataframe like:
import pandas as pd
columns = ['machine_num', 'comments']
data = [(232323, 'BOOMER COMMENT HERE'),
(123456, 'GenX comment here'),
(121212, 'ZoOmEr CoMmEnT hErE')]
df = pd.DataFrame(data, columns=columns)
df
machine_num | comments |
---|---|
232323 | BOOMER COMMENT HERE |
123456 | GenX comment here |
121212 | ZoOmEr CoMmEnT hErE |
What's the most performant way to convert the free text column to all lowercase strings in Python? I'm prepping the data for NLP later.
i.e., so it looks something like this:
machine_num | comments |
---|---|
232323 | boomer comment here |
123456 | genx comment here |
121212 | zoomer comment here |
I tried this, which runs fine with a test df...
df['comments'] = df['comments'].str.lower()  # convert to all lowercase
df
...but the ultimate goal is to run on millions of rows.
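For reference, here is a rough sketch of how the approach above could be timed against a plain list comprehension on a larger frame. The 300,000-row frame `big` and the helper names `lower_vectorized` and `lower_listcomp` are made up for illustration, and the row count is an assumption rather than the real data size. Timings will vary by machine and pandas version, so this is only a way to measure, not a claim about which one wins.

```python
import timeit

import pandas as pd

# Hypothetical larger frame: repeat the three sample rows to simulate scale.
data = [(232323, 'BOOMER COMMENT HERE'),
        (123456, 'GenX comment here'),
        (121212, 'ZoOmEr CoMmEnT hErE')]
big = pd.DataFrame(data * 100_000, columns=['machine_num', 'comments'])


def lower_vectorized(frame):
    # pandas string accessor; missing values (NaN) pass through unchanged
    return frame['comments'].str.lower()


def lower_listcomp(frame):
    # plain Python loop; assumes every value is a str (would fail on NaN)
    return pd.Series([s.lower() for s in frame['comments']],
                     index=frame.index, name='comments')


# Both approaches should produce the same values on this sample data.
assert lower_vectorized(big).equals(lower_listcomp(big))

for fn in (lower_vectorized, lower_listcomp):
    elapsed = timeit.timeit(lambda: fn(big), number=3)
    print(f"{fn.__name__}: {elapsed / 3:.3f} s per run")
```

Note that the list comprehension only works if the column has no missing values, whereas `.str.lower()` leaves NaN untouched.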