Reputation: 5
I have built a dataframe representing strings of flights and the flights that belong to each string.
This is the code that produces the dataframe (for information only):
string = 0
d = []
for i in data_file.index:
    for j in data_file.index:
        list_strings = find_all_paths(graph, i, j)
        for k in range(len(list_strings)):
            string = string + 1
            for m in range(len(list_strings[k])):
                d.append({'path': list_strings[k][m], 'string': string})
The problem I want to solve: this code produces the following output (a sample, since the full result is large):
path    string
--------------
0       1
1       1
2       1
0       2
2       3
4       3
...     ...
The outcome means: in string 1, flight 0 is operated first, followed by flight 1 and finally flight 2. In this sample, string 2 contains only flight 0, and string 3 is flight 2 followed by flight 4.
I would like to get a dataframe that contains the extremes of each string, that is, the first and last flight of the string.
Expected result:
string    first    last
-----------------------
1         0        2
2         0        0
3         2        4
...       ...      ...
Upvotes: 0
Views: 60
Reputation: 75080
Try with:
print(df.groupby('string')['path'].agg(['first','last']))
        first  last
string
1.0         0     2
2.0         0     0
3.0         2     4
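For a self-contained check, here is a minimal sketch that rebuilds the sample rows from the question and applies the same aggregation. The dataframe is typed in by hand, which is an assumption for illustration; the real one comes from the path-finding loop in the question. The reset_index() call is optional and only turns string back into a regular column, to match the layout asked for.

```python
import pandas as pd

# Sample rows copied from the question (hand-built here for illustration).
df = pd.DataFrame({
    "path":   [0, 1, 2, 0, 2, 4],
    "string": [1, 1, 1, 2, 3, 3],
})

# First and last flight of each string, in the order the rows appear.
extremes = (
    df.groupby("string")["path"]
      .agg(["first", "last"])
      .reset_index()  # make 'string' a regular column again
)
print(extremes)
```

Note that 'first' and 'last' follow the row order of the dataframe, so this relies on each string's flights being appended in operating order, as the loop in the question does.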
Upvotes: 1
Reputation: 9019
You can use pd.concat() with groupby():
pd.concat([df.groupby('string').first(), df.groupby('string').last()], axis=1)
Yields:
        path  path
string
1          0     2
2          0     0
3          2     4
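Both columns in that result are named path, which makes them awkward to select afterwards. One possible variation, sketched here on a hand-built copy of the sample data (an assumption for illustration), is to select the path column first and rename each piece before concatenating:

```python
import pandas as pd

# Sample rows copied from the question (hand-built here for illustration).
df = pd.DataFrame({
    "path":   [0, 1, 2, 0, 2, 4],
    "string": [1, 1, 1, 2, 3, 3],
})

grouped = df.groupby("string")["path"]

# Each piece is a Series indexed by 'string'; rename gives it a distinct column name.
result = pd.concat(
    [grouped.first().rename("first"), grouped.last().rename("last")],
    axis=1,
)
print(result)
```

The result is indexed by string and has the columns first and last, so result["last"] is unambiguous.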
Upvotes: 1