divy10
divy10

Reputation: 21

Can we filter data on the basis of specific words?

I am making a web application which reads data from an Excel file. The data set I have has a columns which has data of categories and sub-categories of books such as 'Fiction.Romantic', 'Fiction.Thriller', 'Sports.Imaginative', 'Sports.AutoBiographic' etc.

I want the pandas to filter out data and print the total/combined values as 'Sports', 'Fiction' etc.

Is there a way I could remove the subcategories ('.Imaginative' , '.Thriller') and print/display as desired?

There are a lot of categories and subcategories so doing them manually will take a lot of time.

Actual result:

Name     Category                    Amount
ABC   Sports.Autobiographic.         7485
BCD.  Sports.Imaginative.             20

Expected result:

Name     Category      Amount
ABC      Sports         7485
BCD.     Sports          20

Upvotes: 2

Views: 936

Answers (1)

jezrael
jezrael

Reputation: 862601

Use Series.str.split by first . only by n=1 and select firts lists by indexing:

df['Category'] = df['Category'].str.split('.', n=1).str[0]
print (df)
   Name Category  Amount
0   ABC   Sports    7485
1  BCD.   Sports      20

Or use Series.str.extract:

df['Category'] = df['Category'].str.extract(r'([a-zA-Z]+)\.') 
print (df)
   Name Category  Amount
0   ABC   Sports    7485
1  BCD.   Sports      20

Upvotes: 2

Related Questions