Reputation: 188
I've tried to figure this one out, but all methods end up with errors.
Quick background info: I have a dataframe with US states and info when a state has adopted a new law. In addition, the dataframe tells which are the neighbouring states. Here's a quick mockup:
(df: treated_states)
| year | state |BC_law |n_ipo |Neighbor1 |Neighbor2 |Neighbor3 |Treated|
| -------- | ----- |-------|------|----------|----------|----------|-------|
| 1980 | AZ |1999 |100 |CA |AK |WV |1 |
| 1999 | AZ |1999 |50 |CA |AK |WV |1 |
(df: not_treated)
| year | state |BC_law |n_ipo |Neighbor1 |Neighbor2 |Neighbor3 |Treated|
| -------- | ----- |-------|------|----------|----------|----------|-------|
| 1980 | WV |0 |100 |CA |AK |WV |0 |
| 1999 | AK |0 |80 |CA |AK |WV |0 |
In reality the dataframe includes all US states and all their neighbouring states in separate columns. There are data points before and and after the change in legislation.
I have filtered states that have adopted the new BC_law (df: treated_states) and those that haven't (df: not_treated). Now I should find out if the adoption of the new law affects the neighbouring states.
My problem: I should filter and store into a new dataframe the treated_states dataframes' Neighbor1, Neighbor2, Neighbor3 that exist in the dataframe not_treated
In case you wonder why I need to filter those: I'll do a difference-in-difference regression to determine if adopting the new law affects the neighboring states that don't adopt the BC_law.
Upvotes: 2
Views: 986
Reputation: 71689
Here is a more concise approach:
Neighbour
like columnsDataFrame.isin
to check whether each element in dataframe is contained in state
column of non_treated
any
cols = treated_states.filter(like='Neigh')
mask = cols.isin(not_treated.state).any(axis=1)
>>> treated_states[mask]
year state BC_law n_ipo Neighbor1 Neighbor2 Neighbor3 Treated
0 1980 AZ 1999 100 CA AK WV 1
1 1999 AZ 1999 50 CA AK WV 1
Upvotes: 1
Reputation: 188
I managed to write this and it seems to work... Maybe it helps someone and a better coder can suggest a more efficient and elegant way to solve this
#creating a list of states that have not adopted the new legislation
not_treated_list = not_treated.STATE
#df with states that have not adopted the new legislation, that are also neighbouring states that have adopted the new legislation
neighbour = treated_states[(treated_states.Neigh1.isin(not_treated_list)) | (treated_states.Neigh2.isin(not_treated_list)) |(treated_states.Neigh3.isin(not_treated_list)) | (treated_states.Neigh4.isin(not_treated_list)) | (treated_states.Neigh5.isin(not_treated_list)) | (treated_states.Neigh6.isin(not_treated_list)) | (treated_states.Neigh7.isin(not_treated_list)) | (treated_states.Neigh8.isin(not_treated_list)) ]
neighbour
Upvotes: 0