Kabilesh

Reputation: 1012

Lowercase sentences in lists in pandas dataframe

I have a pandas DataFrame like the one below, where each row holds a list of sentences. I want to convert all the text to lowercase. How can I do this in Python?

Sample of data frame

[Nah I don't think he goes to usf, he lives around here though]

[Even my brother is not like to speak with me., They treat me like aids patent.]

[I HAVE A DATE ON SUNDAY WITH WILL!, !]

[As per your request 'Melle Melle (Oru Minnaminunginte Nurungu Vettam)' has been set as your callertune for all Callers., Press *9 to copy your friends Callertune]

[WINNER!!, As a valued network customer you have been selected to receivea £900 prize reward!, To claim call 09061701461., Claim code KL341., Valid 12 hours only.]

What I tried

def toLowercase(fullCorpus):
   lowerCased = [sentences.lower()for sentences in fullCorpus['sentTokenized']]
   return lowerCased

I get this error

lowerCased = [sentences.lower()for sentences in fullCorpus['sentTokenized']]
AttributeError: 'list' object has no attribute 'lower'

Upvotes: 1

Views: 2651

Answers (4)

Rob

Reputation: 5481

There is also a nice way to do it with numpy, which lowercases every string in a list in one call:

import numpy as np
fullCorpus['sentTokenized'] = [np.char.lower(x) for x in fullCorpus['sentTokenized']]
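
One thing to keep in mind: np.char.lower returns a NumPy array, not a Python list, so each cell changes type. If you need plain lists back, a minimal sketch (the two-row frame is made up for illustration, only the column name comes from the question) could append .tolist():

```python
import numpy as np
import pandas as pd

# Toy stand-in for the question's frame; the rows are illustrative only.
fullCorpus = pd.DataFrame(
    {"sentTokenized": [["WINNER!!", "Valid 12 hours only."], ["Press *9"]]}
)

# np.char.lower works element-wise on the list and returns an ndarray;
# .tolist() turns it back into a list of Python strings.
fullCorpus["sentTokenized"] = [
    np.char.lower(x).tolist() for x in fullCorpus["sentTokenized"]
]

print(fullCorpus["sentTokenized"].tolist())
```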

Upvotes: 0

Maksim Terpilovskii

Reputation: 851

If every cell holds a plain string, it is easy:

df.applymap(str.lower)  # in pandas >= 2.1 applymap is deprecated; use df.map(str.lower)

or

df['col'].apply(str.lower)
df['col'].map(str.lower)

Okay, you have lists in rows, so str.lower has to be applied to each element of every list. Then:

df['col'].map(lambda x: list(map(str.lower, x)))
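
As a quick check, here is a minimal self-contained sketch of that last line. The frame is made up from pieces of the question's sample rows, and I am assuming the column is called sentTokenized as in the question:

```python
import pandas as pd

# Toy frame built from fragments of the question's sample rows.
df = pd.DataFrame(
    {
        "sentTokenized": [
            ["I HAVE A DATE ON SUNDAY WITH WILL!", "!"],
            ["WINNER!!", "Claim code KL341."],
        ]
    }
)

# Each cell is a list, so call str.lower on every string inside it.
df["sentTokenized"] = df["sentTokenized"].map(lambda x: list(map(str.lower, x)))

print(df["sentTokenized"].tolist())
```

The original AttributeError came from calling .lower() on the list itself; here it is only ever called on the strings inside the list.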

Upvotes: 5

rafaelc

Reputation: 59274

You can also convert each list to its string representation, lowercase it with str.lower, and parse the result back into a list.

import ast
df.sentTokenized.astype(str).str.lower().transform(ast.literal_eval)
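
A small sketch of the round trip, on a made-up two-row frame (only the column name is taken from the question). It relies on the lowercased string still being a valid Python literal, which holds for ordinary text like this but is an assumption for arbitrary data:

```python
import ast

import pandas as pd

# Illustrative rows only; the second one contains an apostrophe on purpose.
df = pd.DataFrame(
    {
        "sentTokenized": [
            ["WINNER!!", "Claim code KL341."],
            ["Nah I don't think he goes to usf"],
        ]
    }
)

# list -> "['WINNER!!', ...]" -> lowercase text -> back to a real list
lowered = (
    df["sentTokenized"].astype(str).str.lower().transform(ast.literal_eval)
)

print(lowered.tolist())
```

The result is a new Series, so assign it back to the column if you want to overwrite the original.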

Upvotes: 1

niraj

Reputation: 18208

You can try using apply and map:

def toLowercase(fullCorpus):
    lowerCased = fullCorpus['sentTokenized'].apply(lambda row: list(map(str.lower, row)))
    return lowerCased

Upvotes: 1
