Reputation: 1006
I have a dataframe, for example:
df =
Questions                  Answers
Where is Amazon?           Brazil
Is he a scientist?         No
Did he stole my money?     Yes
What does your father do?  Business
He is a great player.      I don't think so.
She is my girlfriend.      I too agree.
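For anyone who wants to reproduce this, the frame above can be built roughly as follows. This is a minimal sketch that assumes pandas is installed and that the columns are named exactly Questions and Answers, as shown.

```python
import pandas as pd

# Sample data copied from the table above; the column names are assumed
# to be exactly "Questions" and "Answers".
df = pd.DataFrame(
    {
        "Questions": [
            "Where is Amazon?",
            "Is he a scientist?",
            "Did he stole my money?",
            "What does your father do?",
            "He is a great player.",
            "She is my girlfriend.",
        ],
        "Answers": [
            "Brazil",
            "No",
            "Yes",
            "Business",
            "I don't think so.",
            "I too agree.",
        ],
    }
)
print(df)
```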
I want to create three dataframes from the one above, based on the following conditions:
Condition for df1:
If the first word of df['Questions'] is in either of these lists:
# list of Yes/No verbs
yn_list = ['Do','Does','Did','do','does','did','Am','Are','Is','Was','Were','am','are','is','was','were',
'Have','Has','Had','have','has','had','Will','Would','Shall','Should','Can','Could','May',
'Might','will','would','shall','should','can','could','may','might']
# list of negative Yes/No verbs
yn_negative_list = ["Don't","Doesn't","Didn't","don't","doesn't","didn't","Aren't","Isn't","aren't","isn't",
"Wasn't","Weren't","wasn't","weren't","Haven't","Hasn't","Hadn't","haven't","hasn't",
"hadn't","Won't","Wouldn't","won't","wouldn't","Shan't","shan't","Shouldn't","Can't",
"Couldn't","shouldn't","can't","couldn't","may not","May not","Mightn't","mightn't"]
Condition for df2:
If the first word of df['Questions'] is in this list:
wh_list = ['who','where','what','when','why','whom','which','whose','how']
Condition for df3:
If the sentence ends with a period ('.')
Upvotes: 1
Views: 76
Reputation: 18426
Your 3rd condition:
df[df['Questions'].str.endswith('.')]
               Questions            Answers
4  He is a great player.  I don't think so.
5  She is my girlfriend.       I too agree.
2nd condition:
df[df['Questions'].str.lower().str.startswith(tuple(wh_list))]
                   Questions   Answers
0           Where is Amazon?    Brazil
3  What does your father do?  Business
And the 1st condition:
df[df['Questions'].str.lower().str.startswith(tuple(yn_list+yn_negative_list))]
                Questions  Answers
1      Is he a scientist?       No
2  Did he stole my money?      Yes
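One caveat with startswith: it tests a character prefix, not the first word, so a sentence such as "Island life is slow." would pass the yes/no filter because it begins with "is". Below is a minimal sketch of comparing the first word instead, assuming words are separated by whitespace. The sample sentences and the shortened yn_words set are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical sample sentences, not taken from the question
s = pd.Series(["Is he a scientist?", "Island life is slow.", "Doesn't he know?"])

# Shortened, lowercased stand-in for yn_list + yn_negative_list
yn_words = {"is", "did", "doesn't"}

# Prefix test: "island ..." also starts with the characters "is"
prefix_mask = s.str.lower().str.startswith(tuple(yn_words))

# First-word test: split once on whitespace and compare the whole first token
first_word = s.str.lower().str.split(n=1).str[0]
word_mask = first_word.isin(yn_words)

print(prefix_mask.tolist())  # [True, True, True]
print(word_mask.tolist())    # [True, False, True]
```

Passing a tuple to str.startswith also depends on the pandas version (it is accepted from pandas 1.5 on), so the isin variant may be the more portable choice.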
Upvotes: 5
Reputation: 195553
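# df1: the first word (case-sensitive) is a yes/no verb or its negative form
df1 = df[df["Questions"].str.split(n=1).str[0].isin(yn_list + yn_negative_list)]
print(df1)
print()
# df2: the first word, lowercased, is a wh-word
df2 = df[df["Questions"].str.lower().str.split(n=1).str[0].isin(wh_list)]
print(df2)
print()
# df3: the sentence ends with a period
df3 = df[df["Questions"].str.endswith(".")]
print(df3)
print()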
Prints:
                Questions  Answers
1      Is he a scientist?       No
2  Did he stole my money?      Yes

                   Questions   Answers
0           Where is Amazon?    Brazil
3  What does your father do?  Business

               Questions            Answers
4  He is a great player.  I don't think so.
5  She is my girlfriend.       I too agree.
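If the three filters are needed in more than one place, they could be wrapped in a small helper. The sketch below is only one possible arrangement: split_by_type and its parameters are hypothetical names, and the word lists are shortened for brevity. It lowercases both the first word and the word lists, so each verb only has to be listed once, although unlike the snippet above it then also matches all-caps input such as "IS". Two-word entries such as "may not" can never equal a single first word, so they would need separate handling.

```python
import pandas as pd


def split_by_type(df, yn_words, wh_words, col="Questions"):
    """Hypothetical helper: return (yes/no, wh-, statement) subsets of df."""
    # First whitespace-separated token of each sentence, lowercased
    first = df[col].str.split(n=1).str[0].str.lower()
    yn = df[first.isin({w.lower() for w in yn_words})]
    wh = df[first.isin({w.lower() for w in wh_words})]
    # Statements are identified only by a trailing period
    statements = df[df[col].str.endswith(".")]
    return yn, wh, statements


df = pd.DataFrame(
    {
        "Questions": [
            "Where is Amazon?",
            "Is he a scientist?",
            "Did he stole my money?",
            "What does your father do?",
            "He is a great player.",
            "She is my girlfriend.",
        ],
        "Answers": ["Brazil", "No", "Yes", "Business", "I don't think so.", "I too agree."],
    }
)

# Shortened word lists, for illustration only
df1, df2, df3 = split_by_type(df, ["Is", "Did", "Don't"], ["where", "what"])
print(df1.index.tolist(), df2.index.tolist(), df3.index.tolist())
```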
Upvotes: 4