Reputation: 311
I am looking to create a new DataFrame that contains only the Silicon results for Devices A and B.
The following is my code for creating the DataFrame:
import numpy as np
import pandas as pd
x = np.array(
[
[0.26, 0.92, 0.05, 0.43],
[1.00, 0.62, 1.00, 1.00],
[1.00, 0.97, 0.04, 1.00],
[0.00, 1.00, 1.00, 0.88],
[1.00, 1.00, 1.00, 0.79],
[0.98, 1.00, 0.79, 0.99],
[0.99, 1.00, 1.00, 1.00],
[0.18, 1.00, 0.26, 1.00],
[0.22, 0.00, 0.34, 0.82],
]
)
rowIndx = pd.MultiIndex.from_product(
[["Slurm", "Zoidberg", "Wernstrom"], ["A", "B", "C"]],
names=["Laboratory", "Device"],
)
colIndex = pd.MultiIndex.from_product(
[["Replicant 1 ", "Replicant 2 "], ["Silicon", "Carbon"]]
)
robot = pd.DataFrame(data=x, index=rowIndx, columns=colIndex)
robot
Here is an image of the table.
This is the code that I thought would work, but it only gives me errors, and I don't know what to try next:
robot[(robot.Device=="A") & (robot.Device=="B")][["Silicon"]]
Upvotes: 2
Views: 1287
Reputation: 4215
Use slicers like this:
robot.loc[(slice(None), ['A', 'B']), (slice(None), 'Silicon')]
                   Replicant 1  Replicant 2
                       Silicon      Silicon
Laboratory Device
Slurm      A              0.26         0.05
           B              1.00         1.00
Zoidberg   A              0.00         1.00
           B              1.00         1.00
Wernstrom  A              0.99         1.00
           B              0.18         0.26
or:
idx = pd.IndexSlice
robot.loc[idx[:, ['A', 'B']], idx[:, 'Silicon']]
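To check this end to end, here is a minimal sketch that rebuilds the frame from the question and confirms that both spellings pick out the same block. The names `row_index`, `col_index`, `by_slice` and `by_idx` are mine and not from the question.

```python
import numpy as np
import pandas as pd

# Same values and labels as in the question.
x = np.array([
    [0.26, 0.92, 0.05, 0.43],
    [1.00, 0.62, 1.00, 1.00],
    [1.00, 0.97, 0.04, 1.00],
    [0.00, 1.00, 1.00, 0.88],
    [1.00, 1.00, 1.00, 0.79],
    [0.98, 1.00, 0.79, 0.99],
    [0.99, 1.00, 1.00, 1.00],
    [0.18, 1.00, 0.26, 1.00],
    [0.22, 0.00, 0.34, 0.82],
])
row_index = pd.MultiIndex.from_product(
    [["Slurm", "Zoidberg", "Wernstrom"], ["A", "B", "C"]],
    names=["Laboratory", "Device"],
)
col_index = pd.MultiIndex.from_product(
    [["Replicant 1 ", "Replicant 2 "], ["Silicon", "Carbon"]]
)
robot = pd.DataFrame(data=x, index=row_index, columns=col_index)

# Spelling 1: explicit slice(None) for "every label on this level".
by_slice = robot.loc[(slice(None), ["A", "B"]), (slice(None), "Silicon")]

# Spelling 2: pd.IndexSlice lets you write ":" instead of slice(None).
idx = pd.IndexSlice
by_idx = robot.loc[idx[:, ["A", "B"]], idx[:, "Silicon"]]

print(by_slice.equals(by_idx))
print(by_slice.shape)  # 6 rows (3 labs x 2 devices), 2 Silicon columns
```

Each tuple has one entry per index level, so the first tuple addresses (Laboratory, Device) and the second addresses the two column levels.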
Upvotes: 1
Reputation: 4821
I think you want something like this:
In [6]: robot.loc[:, (robot.columns.get_level_values(level=1)=='Silicon')]
Out[6]:
                   Replicant 1  Replicant 2
                       Silicon      Silicon
Laboratory Device
Slurm      A              0.26         0.05
           B              1.00         1.00
           C              1.00         0.04
Zoidberg   A              0.00         1.00
           B              1.00         1.00
           C              0.98         0.79
Wernstrom  A              0.99         1.00
           B              0.18         0.26
           C              0.22         0.34
Two key things here: The first is using robot.loc[ _ , _ ] (specifying two arguments, one for the index and one for the columns); each argument has to be something your MultiIndex-type index and your MultiIndex-type columns can understand.
The second is robot.columns.get_level_values(level=1), which gets the level-1 labels (Silicon/Carbon) of the 4 columns displayed in the image of the DataFrame:
In [7]: robot.columns.get_level_values(level=1)
Out[7]: Index(['Silicon', 'Carbon', 'Silicon', 'Carbon'], dtype='object')
and it then filters which columns to show based on the given condition:
In [8]: robot.columns.get_level_values(level=1)=='Silicon'
Out[8]: array([ True, False, True, False])
If you wanted to keep more labels besides Silicon, you could combine the conditions with the | operator (not the & operator, because no single label can equal two different values at once) like this:
robot.loc[:, (robot.columns.get_level_values(level=1)=='Silicon')|(robot.columns.get_level_values(level=1)=='Carbon')]
or a bit shorter:
lv = robot.columns.get_level_values(level=1)
robot.loc[:, (lv=='Silicon')|(lv=='Carbon')]
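When the list of labels grows, chaining `|` gets long. One alternative (my suggestion, not part of the answer above) is `Index.isin`, which builds the same kind of boolean mask from a list. The sketch below uses stand-in numeric values, since only the labels matter; `wanted` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Same labels as the question; the values are placeholders.
rows = pd.MultiIndex.from_product(
    [["Slurm", "Zoidberg", "Wernstrom"], ["A", "B", "C"]],
    names=["Laboratory", "Device"],
)
cols = pd.MultiIndex.from_product(
    [["Replicant 1 ", "Replicant 2 "], ["Silicon", "Carbon"]]
)
robot = pd.DataFrame(np.arange(36.0).reshape(9, 4), index=rows, columns=cols)

lv = robot.columns.get_level_values(1)

wanted = ["Silicon", "Carbon"]  # hypothetical list of labels to keep
mask = lv.isin(wanted)          # one boolean per column
subset = robot.loc[:, mask]

# The same call with a shorter list keeps only the Silicon columns.
silicon_only = robot.loc[:, lv.isin(["Silicon"])]
print(subset.shape, silicon_only.shape)
```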
UPDATE: If you also want to filter on values in the index, you can build a row mask with robot.index.get_level_values(), the same way robot.columns.get_level_values() builds the column mask. Here's an example:
lv = robot.columns.get_level_values(level=1)
ilv = robot.index.get_level_values(level=1)
robot.loc[(ilv=='A')|(ilv=='B'), (lv=='Silicon')]
We've replaced the : (which means all values of all levels of the MultiIndex) with a logical mask to filter indices, the same way we did to filter columns.
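This also shows why the attempt in the question, which joins the two Device conditions with `&`, cannot return anything: a row's Device label is never "A" and "B" at once. A small sketch with stand-in values (the names `either` and `both` are mine):

```python
import numpy as np
import pandas as pd

# Same row labels as the question; the values are placeholders.
rows = pd.MultiIndex.from_product(
    [["Slurm", "Zoidberg", "Wernstrom"], ["A", "B", "C"]],
    names=["Laboratory", "Device"],
)
cols = pd.MultiIndex.from_product(
    [["Replicant 1 ", "Replicant 2 "], ["Silicon", "Carbon"]]
)
robot = pd.DataFrame(np.arange(36.0).reshape(9, 4), index=rows, columns=cols)

ilv = robot.index.get_level_values("Device")

either = (ilv == "A") | (ilv == "B")  # True where the device is A or B
both = (ilv == "A") & (ilv == "B")    # never True for a single label

print(either.sum())  # 6 matching rows
print(both.sum())    # 0 matching rows
print(robot.loc[either].shape)
```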
Upvotes: 3
Reputation: 3025
Your DataFrame has a MultiIndex, so you need to use the following code to select rows:
result = robot.iloc[(robot.index.get_level_values('Device') == 'A')|(robot.index.get_level_values('Device') == 'B')]
Now, if you only want the Silicon columns, use the following code:
result.iloc[:, result.columns.get_level_values(1)== "Silicon"]
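The two steps can also be folded into one indexing call. The sketch below, on stand-in values, compares the two-step `.iloc` version with a single `.loc` call that takes both masks; `row_mask`, `col_mask`, `one_step` and `two_step` are names I made up for illustration.

```python
import numpy as np
import pandas as pd

# Same labels as the question; the values are placeholders.
rows = pd.MultiIndex.from_product(
    [["Slurm", "Zoidberg", "Wernstrom"], ["A", "B", "C"]],
    names=["Laboratory", "Device"],
)
cols = pd.MultiIndex.from_product(
    [["Replicant 1 ", "Replicant 2 "], ["Silicon", "Carbon"]]
)
robot = pd.DataFrame(np.arange(36.0).reshape(9, 4), index=rows, columns=cols)

# Boolean masks built from the level values.
row_mask = robot.index.get_level_values("Device").isin(["A", "B"])
col_mask = robot.columns.get_level_values(1) == "Silicon"

# Two steps, as in the answer: filter rows first, then columns.
two_step = robot.iloc[row_mask].iloc[:, col_mask]

# One step: pass both masks to a single .loc call.
one_step = robot.loc[row_mask, col_mask]

print(one_step.equals(two_step))
```

The column levels in the question have no names, which is why the column mask uses the position `1`, while the row mask can use the name "Device".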
Upvotes: 1