xyhuang

Reputation: 424

How to sort a big dataframe by two columns?

I have a large DataFrame that records all price information for the stock market.

In this DataFrame there are two columns that I want to use as indexes: the time (named 'tme' in the example below) and 'con'.

Here is an example:

In [15]: df = pd.DataFrame(np.reshape(range(20), (5,4)))

In [16]: df
Out[16]: 
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
4  16  17  18  19

In [17]: df.columns = ['open', 'high', 'low', 'close']

In [18]: df['tme'] = ['9:00','9:00', '9:01', '9:01', '9:02']

In [19]: df['con'] = ['a', 'b', 'a', 'b', 'a']

In [20]: df
Out[20]: 
   open  high  low  close   tme con
0     0     1    2      3  9:00   a
1     4     5    6      7  9:00   b
2     8     9   10     11  9:01   a
3    12    13   14     15  9:01   b
4    16    17   18     19  9:02   a

What I want is a set of DataFrames like this:

## Here is the close DataFrame, which contains only the close values, indexed by 'tme' and 'con'
Out[31]: 
       a     b
9:00   3   7.0
9:01  11  15.0
9:02  19   NaN

How can I get this DataFrame?

Upvotes: 1

Views: 60

Answers (2)

Steffi Keran Rani J

Reputation: 4093

One solution is to use pivot_table. Try this:

 df.pivot_table(index='tme', columns='con', values='close')

The result (posted as an image in the original answer) is a table of 'close' values with one row per 'tme' and one column per 'con'.
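As a minimal, self-contained sketch of this approach, the snippet below rebuilds the frame from the question and pivots it. The `tables` dictionary is a hypothetical addition, not part of the original answer, showing one way to get a table per price column, since the question asks for several such DataFrames. `aggfunc='mean'` is the default and is written out only to make the behaviour explicit: it matters when a ('tme', 'con') pair occurs more than once.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (column names as posted, including 'tme').
df = pd.DataFrame(np.arange(20).reshape(5, 4),
                  columns=['open', 'high', 'low', 'close'])
df['tme'] = ['9:00', '9:00', '9:01', '9:01', '9:02']
df['con'] = ['a', 'b', 'a', 'b', 'a']

# One row per 'tme', one column per 'con', cells filled with 'close'.
# Missing combinations (here 9:02 / 'b') come out as NaN.
close = df.pivot_table(index='tme', columns='con', values='close',
                       aggfunc='mean')
print(close)

# Hypothetical extension: one pivoted table per price column.
tables = {col: df.pivot_table(index='tme', columns='con', values=col)
          for col in ['open', 'high', 'low', 'close']}
```

Each entry such as `tables['open']` then has the same shape as `close`.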

Upvotes: 1

Mayank Porwal

Reputation: 34066

Use df.pivot:

In [117]: df.pivot(index='tme', columns='con', values='close')
Out[117]: 
con      a     b
tme             
9:00   3.0   7.0
9:01  11.0  15.0
9:02  19.0   NaN
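For anyone who wants to run this end to end, here is a hedged sketch that rebuilds the question's frame and calls `pivot` with keyword arguments, which recent pandas releases require. The `set_index`/`unstack` variant and the duplicate-row check are illustrative additions, not part of the original answer. They show that `pivot` is a pure reshape and, unlike `pivot_table`, raises an error rather than aggregating when a ('tme', 'con') pair repeats.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (column names as posted, including 'tme').
df = pd.DataFrame(np.arange(20).reshape(5, 4),
                  columns=['open', 'high', 'low', 'close'])
df['tme'] = ['9:00', '9:00', '9:01', '9:01', '9:02']
df['con'] = ['a', 'b', 'a', 'b', 'a']

# Pure reshape: rows keyed by 'tme', columns keyed by 'con'.
close = df.pivot(index='tme', columns='con', values='close')

# The same reshape spelled out: index by both keys, then move 'con' to columns.
close_alt = df.set_index(['tme', 'con'])['close'].unstack('con')

# Illustration only: append a copy of the first row so ('9:00', 'a') repeats.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)
try:
    dup.pivot(index='tme', columns='con', values='close')
    raised = False
except ValueError:
    # pivot cannot decide which of the duplicate values to keep.
    raised = True
print(close, close_alt, raised, sep='\n\n')
```

If the real data can contain repeated ('tme', 'con') pairs, `pivot_table` with an explicit `aggfunc` is probably the safer choice.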

Upvotes: 2
