Reputation: 369
I'm working with a .txt dataset which I read in as a csv file.
import pandas as pd
data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str)
print(data.head())
It prints:
0 MUSIC Today at the recording studio, John...
1 POLITICS The tensions inside the government have...
2 NEWS The new pictures of NASA show...
What I want to do is change all the letters in the text column to lowercase. For example, "The new pictures of NASA show..." would become "the new pictures of nasa show...", while the category "NEWS" stays uppercase.
Any words of advice?
Upvotes: 4
Views: 5344
Reputation: 16935
You can apply a lambda that lowercases every column. Note that this lowercases category as well as text:
data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str).apply(lambda x: x.astype(str).str.lower())
Using your example data, you'll see this:
>>> import pandas as pd
>>> data = pd.read_csv('train.txt', delimiter='\t', header=None, names=['category', 'text'], dtype=str).apply(lambda x: x.astype(str).str.lower())
>>> data.head()
category text
0 music today at the recording studio, john...
1 politics the tensions inside the government have...
2 news the new pictures of nasa show...
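The output above lowercases category too, whereas the question asks for it to stay uppercase. A narrower option is to lowercase only the text column with the `.str.lower()` accessor. This is a minimal sketch, assuming the file is tab-separated as in the question; the in-memory `sample` string is a hypothetical stand-in for train.txt, built from the truncated rows shown in the question:

```python
import io

import pandas as pd

# Hypothetical stand-in for train.txt: tab-separated, no header row.
sample = (
    "MUSIC\tToday at the recording studio, John...\n"
    "POLITICS\tThe tensions inside the government have...\n"
    "NEWS\tThe new pictures of NASA show...\n"
)

data = pd.read_csv(io.StringIO(sample), delimiter='\t', header=None,
                   names=['category', 'text'], dtype=str)

# Lowercase only the 'text' column; 'category' is left untouched.
data['text'] = data['text'].str.lower()

print(data.head())
```

With the real file, replace `io.StringIO(sample)` with 'train.txt'. Here category keeps its original casing and only text is lowercased.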
Upvotes: 4