Atom Store

Reputation: 1006

How to divide the dataframes into new dataframes according to the specified condition?

I have a dataframe, for example:

df =

Questions                  Answers
Where is Amazon?           Brazil
Is he a scientist?         No
Did he stole my money?     Yes
What does your father do?  Business
He is a great player.      I don't think so.
She is my girlfriend.      I too agree.
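
For anyone who wants to reproduce this, the sample dataframe above can be built roughly as follows. This is only a sketch of the setup: the column names `Questions` and `Answers` are taken from the table, and the default integer index is an assumption.

```python
import pandas as pd

# Sample data copied from the table above; the default 0..5 index is assumed.
df = pd.DataFrame(
    {
        "Questions": [
            "Where is Amazon?",
            "Is he a scientist?",
            "Did he stole my money?",
            "What does your father do?",
            "He is a great player.",
            "She is my girlfriend.",
        ],
        "Answers": [
            "Brazil",
            "No",
            "Yes",
            "Business",
            "I don't think so.",
            "I too agree.",
        ],
    }
)

print(df)
```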

I want to create three dataframes from the dataframe above, according to the following conditions:

Condition for df1:

If the first word of df['Questions'] is in either of these lists:

# list of Yes/No verbs
yn_list = ['Do','Does','Did','do','does','did','Am','Are','Is','Was','Were','am','are','is','was','were',
           'Have','Has','Had','have','has','had','Will','Would','Shall','Should','Can','Could','May',
           'Might','will','would','shall','should','can','could','may','might']

# list of negative Yes/No verbs
yn_negative_list = ["Don't","Doesn't","Didn't","don't","doesn't","didn't","Aren't","Isn't","aren't","isn't",
                    "Wasn't","Weren't","wasn't","weren't","Haven't","Hasn't","Hadn't","haven't","hasn't",
                    "hadn't","Won't","Wouldn't","won't","wouldn't","Shan't","shan't","Shouldn't","Can't",
                    "Couldn't","shouldn't","can't","couldn't","may not","May not","Mightn't","mightn't"]

Condition for df2:

If the first word of df['Questions'] is in this list:

wh_list = ['who','where','what','when','why','whom','which','whose','how']

Condition for df3:

If the sentence in df['Questions'] ends with a '.'

Upvotes: 1

Views: 76

Answers (2)

ThePyGuy

Reputation: 18426

The 3rd condition:

df[df['Questions'].str.endswith('.')]

               Questions            Answers
4  He is a great player.  I don't think so.
5  She is my girlfriend.       I too agree.

The 2nd condition:

df[df['Questions'].str.lower().str.startswith(tuple(wh_list))]

                   Questions   Answers
0           Where is Amazon?    Brazil
3  What does your father do?  Business

And the 1st condition:

df[df['Questions'].str.lower().str.startswith(tuple(yn_list+yn_negative_list))]

                Questions Answers
1      Is he a scientist?      No
2  Did he stole my money?     Yes
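
One caveat worth keeping in mind: `str.startswith` matches prefixes, not whole words, so a statement that merely begins with the letters "is" or "do" would also pass the yes/no filter. Below is a minimal sketch of the difference, assuming the intent is to compare only the first whitespace-delimited word. The sample rows and the shortened word set `yn_words` are made up for illustration and are not from the question.

```python
import pandas as pd

# Shortened, lowercase-only word set for illustration (the full lists are in the question).
yn_words = {"do", "does", "did", "is", "are", "was", "were"}

# Hypothetical rows chosen to show prefix matching versus whole-word matching.
df = pd.DataFrame(
    {"Questions": ["Is he a scientist?", "Israel is a country.", "Did he stole my money?"]}
)

lowered = df["Questions"].str.lower()

# Prefix match: also catches "Israel ..." because the string starts with "is".
prefix_mask = lowered.str.startswith(tuple(yn_words))

# Whole-word match: compare only the first whitespace-delimited token.
first_word = lowered.str.split(n=1).str[0]
word_mask = first_word.isin(yn_words)

print(df[prefix_mask])
print(df[word_mask])
```

Also note that the two-word entries "may not" and "May not" in yn_negative_list can never equal a single first word, so a whole-word check would need to handle them separately.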

Upvotes: 5

Andrej Kesely

Reputation: 195553

df1 = df[df["Questions"].str.split(n=1).str[0].isin(yn_list + yn_negative_list)]
print(df1)
print()


df2 = df[df["Questions"].str.lower().str.split(n=1).str[0].isin(wh_list)]
print(df2)
print()

df3 = df[df["Questions"].str.endswith(".")]
print(df3)
print()

Prints:

                Questions Answers
1      Is he a scientist?      No
2  Did he stole my money?     Yes

                   Questions   Answers
0           Where is Amazon?    Brazil
3  What does your father do?  Business

               Questions            Answers
4  He is a great player.  I don't think so.
5  She is my girlfriend.       I too agree.
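
If the three conditions could ever overlap (for example, a statement ending in '.' whose first word is also in one of the lists), it may help to label each row once and then split on the label. The following is only a sketch under assumptions that are not in the question: the column name `kind`, the label strings, the shortened word lists, and the priority order (yes/no first, then wh, then statement) are all hypothetical choices.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "Questions": [
            "Where is Amazon?",
            "Is he a scientist?",
            "Did he stole my money?",
            "What does your father do?",
            "He is a great player.",
            "She is my girlfriend.",
        ],
        "Answers": ["Brazil", "No", "Yes", "Business", "I don't think so.", "I too agree."],
    }
)

# Shortened lowercase lists for illustration; use the full lists from the question in practice.
yn_words = ["do", "does", "did", "is", "are", "was", "were", "don't", "isn't"]
wh_list = ["who", "where", "what", "when", "why", "whom", "which", "whose", "how"]

# First whitespace-delimited word of each question, lowercased.
first_word = df["Questions"].str.lower().str.split(n=1).str[0]

# np.select picks the first condition that is True for each row,
# so the order of this list is the (assumed) priority.
conditions = [
    first_word.isin(yn_words).to_numpy(),
    first_word.isin(wh_list).to_numpy(),
    df["Questions"].str.endswith(".").to_numpy(),
]
labels = ["yes_no", "wh", "statement"]
df["kind"] = np.select(conditions, labels, default="other")

# One dataframe per label, without the helper column.
parts = {kind: group.drop(columns="kind") for kind, group in df.groupby("kind")}

print(parts["yes_no"])
print(parts["wh"])
print(parts["statement"])
```

Rows that match none of the conditions end up under the "other" label instead of being silently dropped.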

Upvotes: 4
