saremisona

Reputation: 369

Working with texts: change all letters to lowercase in a CSV file

I'm working with a tab-separated .txt dataset, which I read in with pd.read_csv:

data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str)
print(data.head())

It prints:

0  MUSIC  Today at the recording studio, John...
1  POLITICS  The tensions inside the government have...
2  NEWS  The new pictures of NASA show...

What I want to do is change all the letters in the text column to lowercase. For example, "The new pictures of NASA show..." would become "the new pictures of nasa show...", while the category "NEWS" stays uppercase.

Any words of advice?

Upvotes: 4

Views: 5344

Answers (1)

erip

Reputation: 16935

You can apply a lambda that lowercases every column of the DataFrame:

data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str).apply(lambda x: x.astype(str).str.lower())

Using your example data, you'll see this:

>>> import pandas as pd
>>> data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str).apply(lambda x: x.astype(str).str.lower())
>>> data.head()
   category                                        text
0     music      today at the recording studio, john...
1  politics  the tensions inside the government have...
2      news            the new pictures of nasa show...
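
Since the question asks for category to stay uppercase, it may be simpler to call .str.lower() on the text column alone instead of applying it to the whole DataFrame. Here is a minimal sketch of that; the in-memory sample is only a stand-in for train.txt (an assumption, built from the three rows shown in the question) so that the example runs on its own:

```python
import io

import pandas as pd

# Hypothetical stand-in for train.txt: tab-separated, no header row.
sample = io.StringIO(
    "MUSIC\tToday at the recording studio, John...\n"
    "POLITICS\tThe tensions inside the government have...\n"
    "NEWS\tThe new pictures of NASA show...\n"
)

data = pd.read_csv(sample, delimiter='\t', header=None,
                   names=['category', 'text'], dtype=str)

# Lowercase only the 'text' column; 'category' is left untouched.
data['text'] = data['text'].str.lower()

print(data.head())
```

With the real file, you would pass 'train.txt' to pd.read_csv in place of the sample object. Missing values in text stay NaN with this approach, whereas .astype(str) would turn them into the string 'nan'.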

Upvotes: 4
