Reputation: 1879
I'm trying to access filtered versions of a dataframe, using a list with the filter values.
I'm using a while loop that I thought would plug the appropriate list values into a dataframe filter one by one. This code prints the first one fine but then prints 4 empty dataframes afterwards.
I'm sure this is a quick fix but I haven't been able to find it.
boatID = [342, 343, 344, 345, 346]
i = 0
while i < len(boatID):
    df = df[(df['boat_id']==boatID[i])]
    #run some code, i'm printing DF.head to test it works
    print(df.head())
    i = i + 1
Example dataframe:
   boat_id  activity  speed  heading
0      342         1   3.34   270.00
1      343         1   0.02     0.00
2      344         1   0.01   270.00
3      345         1   8.41   293.36
4      346         1   0.03    90.00
Upvotes: 1
Views: 29799
Reputation: 862611
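I think you overwrite df with its own filtered result in df = df[(df['boat_id']==boatID[i])]. After the first iteration df contains only the rows for boat 342, so every later filter matches nothing and returns an empty dataframe.
Assign the output to a new dataframe instead, e.g. df1: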
boatID = [342, 343, 344, 345, 346]
i = 0
while i < len(boatID):
    df1 = df[(df['boat_id']==boatID[i])]
    #run some code, i'm printing DF.head to test it works
    print(df1.head())
    i = i + 1
#   boat_id  activity  speed  heading
#0      342         1   3.34      270
#   boat_id  activity  speed  heading
#1      343         1   0.02        0
#   boat_id  activity  speed  heading
#2      344         1   0.01      270
#   boat_id  activity  speed  heading
#3      345         1   8.41   293.36
#   boat_id  activity  speed  heading
#4      346         1   0.03       90
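To make the cause of the empty dataframes concrete, here is a minimal sketch that first reproduces the problem in two steps and then applies the same fix with a for loop over the list. The sample data is copied from the question's example; the names filtered_once, filtered_twice and per_boat are illustrative only and not part of the original code.

```python
import pandas as pd

# Sample data copied from the question's example dataframe.
df = pd.DataFrame({
    'boat_id': [342, 343, 344, 345, 346],
    'activity': [1, 1, 1, 1, 1],
    'speed': [3.34, 0.02, 0.01, 8.41, 0.03],
    'heading': [270.00, 0.00, 270.00, 293.36, 90.00],
})
boatID = [342, 343, 344, 345, 346]

# Why the original loop fails: after the first filter only boat 342 is left,
# so filtering that result for boat 343 matches no rows.
filtered_once = df[df['boat_id'] == 342]
filtered_twice = filtered_once[filtered_once['boat_id'] == 343]
print(len(filtered_once), len(filtered_twice))

# Fix: always filter the untouched original df and keep each result separately.
per_boat = []  # illustrative container for the filtered frames
for boat in boatID:
    df1 = df[df['boat_id'] == boat]
    per_boat.append(df1)
    print(df1.head())
```

Iterating over boatID directly also removes the manual counter i, which is one less thing to get wrong.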
If you only need to filter dataframe df so that column boat_id contains values from the list boatID, use isin:
df1 = df[(df['boat_id'].isin(boatID))]
print(df1)
#   boat_id  activity  speed  heading
#0      342         1   3.34   270.00
#1      343         1   0.02     0.00
#2      344         1   0.01   270.00
#3      345         1   8.41   293.36
#4      346         1   0.03    90.00
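Because the list in the question covers every boat in the sample, the isin result above equals the whole dataframe. A small sketch with a shorter list shows the filtering more clearly, along with the ~ operator for the complementary selection. The list wanted and the variable names selected and others are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Sample data copied from the question's example dataframe.
df = pd.DataFrame({
    'boat_id': [342, 343, 344, 345, 346],
    'activity': [1, 1, 1, 1, 1],
    'speed': [3.34, 0.02, 0.01, 8.41, 0.03],
    'heading': [270.00, 0.00, 270.00, 293.36, 90.00],
})

wanted = [342, 345]  # hypothetical subset of boat ids

# Rows whose boat_id appears in the list.
selected = df[df['boat_id'].isin(wanted)]

# Rows whose boat_id does NOT appear in the list (~ negates the boolean mask).
others = df[~df['boat_id'].isin(wanted)]

print(selected)
print(others)
```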
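EDIT:
If you want to keep each filtered dataframe for later use, I think you can store them in a dictionary of dataframes, keyed by a name built from each boat id: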
print(df)
#   boat_id  activity  speed  heading
#0      342         1   3.34   270.00
#1      343         1   0.02     0.00
#2      344         1   0.01   270.00
#3      345         1   8.41   293.36
#4      346         1   0.03    90.00
boatID = [342, 343, 344, 345, 346]
# build one dictionary key per boat id
dfs = ['df' + str(x) for x in boatID]
dicdf = dict()
print(dfs)
#['df342', 'df343', 'df344', 'df345', 'df346']
i = 0
while i < len(boatID):
    print(dfs[i])
    # filter the original df and store the result under its key
    dicdf[dfs[i]] = df[(df['boat_id']==boatID[i])]
    #run some code, i'm printing DF.head to test it works
    # print(df1.head())
    i = i + 1
print(dicdf)
#{'df344':    boat_id  activity  speed  heading
#2      344         1   0.01      270, 'df345':    boat_id  activity  speed  heading
#3      345         1   8.41   293.36, 'df346':    boat_id  activity  speed  heading
#4      346         1   0.03       90, 'df342':    boat_id  activity  speed  heading
#0      342         1   3.34      270, 'df343':    boat_id  activity  speed  heading
#1      343         1   0.02        0}
print(dicdf['df342'])
#   boat_id  activity  speed  heading
#0      342         1   3.34      270
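As an alternative to the while loop, the same kind of dictionary can probably be built with groupby, which splits the dataframe by boat_id in one pass. This is a different technique from the loop above, shown only as a sketch. The sample data is copied from the question, and the 'df' + id key format follows the answer's naming.

```python
import pandas as pd

# Sample data copied from the question's example dataframe.
df = pd.DataFrame({
    'boat_id': [342, 343, 344, 345, 346],
    'activity': [1, 1, 1, 1, 1],
    'speed': [3.34, 0.02, 0.01, 8.41, 0.03],
    'heading': [270.00, 0.00, 270.00, 293.36, 90.00],
})
boatID = [342, 343, 344, 345, 346]

# Iterating over df.groupby('boat_id') yields (boat id, sub-dataframe) pairs.
# Only ids listed in boatID are kept.
dicdf = {
    'df' + str(boat): group
    for boat, group in df.groupby('boat_id')
    if boat in boatID
}

print(sorted(dicdf))
print(dicdf['df342'])
```

With many boats this avoids scanning the full dataframe once per id, although for a handful of ids the loop is just as reasonable.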
Upvotes: 1