Reputation: 875
If I make a DataFrame with multi-indexed columns like this:
import numpy as np
import pandas as pd
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
index = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=index)
first        bar                 baz                 foo                 qux
second       one       two       one       two       one       two       one       two
A      -0.119687 -0.518318  0.113920 -1.028505  1.106375 -1.020139 -0.039300 -0.938209
B       0.123480 -2.091120  0.464597 -0.147211 -0.489895 -1.090659 -0.592679 -0.851483
C      -1.174376  0.282011 -0.197658 -0.030751  0.117374  1.591109  0.796908  0.442621
and I want to select columns by their first-level label only, using a list,
select_cols=['bar', 'qux']
such that the result would be:
first        bar                 qux
second       one       two       one       two
A      -0.119687 -0.518318 -0.039300 -0.938209
B       0.123480 -2.091120 -0.592679 -0.851483
C      -1.174376  0.282011  0.796908  0.442621
How would I go about doing that? (Thanks ahead of time)
Upvotes: 1
Views: 1517
Reputation: 908
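When I found this Q/A I expected to see a solution that prints the column names. Having figured it out, I thought I would add it here. get_level_values returns the label at a given column level, one entry per column, so after selecting with select_cols each first-level label appears once for every second-level column. Appending .unique() drops the repeats.
df[select_cols].columns.get_level_values(0).unique().tolist()
=> ['bar', 'qux']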
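A minimal sketch of what get_level_values returns, rebuilding the frame from the question with random data. The names columns, selected, level0_all and level0_unique are illustrative and do not come from the thread.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (the values are random and do not matter here).
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
columns = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=columns)

selected = df[['bar', 'qux']]

# One entry per column, so each first-level label repeats for 'one' and 'two'.
level0_all = selected.columns.get_level_values(0).tolist()

# Distinct labels only, in order of first appearance.
level0_unique = selected.columns.get_level_values(0).unique().tolist()

print(level0_all)
print(level0_unique)
```

Passing 'first' instead of 0 should address the same level by name, since the levels were named when the MultiIndex was built.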
Upvotes: 2
Reputation: 13913
Simple column selection works as well:
df[['bar', 'qux']]
# first        bar                 qux
# second       one       two       one       two
# A       0.651522  0.480115 -2.924574  0.616674
# B      -0.395988  0.001643  0.358048  0.022727
# C      -0.317829  1.400970 -0.773148  1.549135
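As a self-contained sketch of the selection above: indexing with a list of first-level labels keeps both column levels, while indexing with a single string drops the first level. The names columns, result and single are illustrative and do not come from the thread.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (random values).
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
columns = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=columns)

select_cols = ['bar', 'qux']

# A list of labels is matched against the first column level;
# every second-level column under each label is kept.
result = df[select_cols]
print(result.columns.tolist())

# A single string selects one first-level group and drops that level.
single = df['bar']
print(single.columns.tolist())
```

If the two-level header should survive even for a single label, wrapping it in a list (df[['bar']]) appears to be the simplest way.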
Upvotes: 5
Reputation: 214957
You can use loc to select columns:
df.loc[:, ["bar", "qux"]]
# first        bar                 qux
# second       one       two       one       two
# A       1.245525 -1.469999 -0.399174  0.017094
# B      -0.242284  0.835131 -0.400847 -0.344612
# C      -1.067006 -1.880113 -0.516234 -0.410847
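A short sketch, assuming the frame from the question, of how the loc form compares with plain bracket selection. The practical difference is that loc can also restrict rows in the same call. The names columns, by_loc, same and rows_and_cols are illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (random values).
iterables = [['bar', 'baz', 'foo', 'qux'], ['one', 'two']]
columns = pd.MultiIndex.from_product(iterables, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 8), index=['A', 'B', 'C'], columns=columns)

select_cols = ['bar', 'qux']

# All rows, first-level labels 'bar' and 'qux'.
by_loc = df.loc[:, select_cols]

# For this case the result matches plain bracket selection.
same = by_loc.equals(df[select_cols])

# loc also accepts a row selector in the same call.
rows_and_cols = df.loc[['A', 'C'], select_cols]

print(same)
print(rows_and_cols.shape)
```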
Upvotes: 4