L Lewis

Reputation: 5

Creating a dataframe with condition

I have built a dataframe that lists strings of flights (sequences of consecutive flights) together with the flights that belong to each string.

This is the code (for information only) that builds the dataframe:

import pandas as pd

string = 0  # running ID of the current string
d = []
for i in data_file.index:
    for j in data_file.index:
        # every path from flight i to flight j is one string
        for flights in find_all_paths(graph, i, j):
            string += 1
            for flight in flights:
                d.append({'path': flight, 'string': string})

df = pd.DataFrame(d)

The code produces the following output (only a sample, since the full result is large):

path  string
-------------
0       1
1       1
2       1
0       2
2       3
4       3
...    ...

This means: in string 1, flight 0 is operated first, followed by flight 1 and finally flight 2. String 2 consists of flight 0 only, and string 3 is flight 2 and then flight 4.

I would like to get a dataframe that contains the endpoints of each string, that is, the first and last flight of the string.

Expected result:

string  first   last
---------------------
1        0       2
2        0       0
3        2       4
...     ...     ...    

Upvotes: 0

Views: 60

Answers (2)

anky

Reputation: 75080

Try with:

print(df.groupby('string')['path'].agg(['first','last']))

        first last
string           
1.0        0    2
2.0        0    0
3.0        2    4
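
To try this without the original data, here is a minimal sketch that rebuilds the sample from the question by hand. The hard-coded df and the name extremes are assumptions for illustration only; the real dataframe comes from the find_all_paths loop in the question. It relies on the rows of each string already being in flight order.

```python
import pandas as pd

# Sample rows copied from the question: one row per flight in a string.
df = pd.DataFrame({
    'path':   [0, 1, 2, 0, 2, 4],
    'string': [1, 1, 1, 2, 3, 3],
})

# 'first' and 'last' return the first and last non-null 'path' value
# of each group, taken in row order.
extremes = df.groupby('string')['path'].agg(['first', 'last'])
print(extremes)
```

With integer string IDs as above, the index stays integer, and the three rows match the expected result in the question.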

Upvotes: 1

rahlf23

Reputation: 9019

You can use pd.concat() with groupby():

pd.concat([df.groupby('string').first(), df.groupby('string').last()], axis=1)

Yields:

        path  path
string            
1          0     2
2          0     0
3          2     4
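
Both columns come out named path, which makes them awkward to select afterwards. One possible variation, sketched here on a hand-built copy of the question's sample (the hard-coded df and the names grouped and result are assumptions for illustration), is to rename each piece before concatenating and then turn the index back into a column:

```python
import pandas as pd

# Sample rows copied from the question: one row per flight in a string.
df = pd.DataFrame({
    'path':   [0, 1, 2, 0, 2, 4],
    'string': [1, 1, 1, 2, 3, 3],
})

# Group once, then take the first and last 'path' value per string.
grouped = df.groupby('string')['path']

# Rename each Series so the concatenated columns get distinct names,
# then move 'string' from the index back into a regular column.
result = pd.concat(
    [grouped.first().rename('first'), grouped.last().rename('last')],
    axis=1,
).reset_index()
print(result)
```

This gives the columns string, first and last, in the same layout as the expected result in the question.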

Upvotes: 1
