user294110

Reputation: 169

pandas: select columns from a list of names when some may be missing from the DataFrame

I am trying to select columns from a pandas DataFrame using a list of column names. Sometimes the DataFrame contains every name in the list (matchObj), but at other times it contains only a few of them, and in that case the selection fails.

Below are my example lists, matchObj and matchObj1:

>>> matchObj = ['equity01',  'equity02',  'equity1'  'equity2']
>>> matchObj1 = ['equity01',  'equity02']

Below is the DataFrame:

>>> df
   equity01  equity02  equity03  equity04  equity05
0         1         4         7         2         5
1         2         5         8         3         6
2         3         6         9         4         7

When I use the list matchObj1 against df, it works because all of its column names are found.

>>> print(df[matchObj1])
   equity01  equity02
0         1         4
1         2         5
2         3         6

However, it fails with matchObj because df does not contain equity1 and equity2, so it throws KeyError: "['equity1equity2'] not in index".

>>> print(df[matchObj])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/grid/common/pkgs/python/v3.6.1/lib/python3.6/site-packages/pandas/core/frame.py", line 2133, in __getitem__
    return self._getitem_array(key)
  File "/grid/common/pkgs/python/v3.6.1/lib/python3.6/site-packages/pandas/core/frame.py", line 2177, in _getitem_array
    indexer = self.loc._convert_to_indexer(key, axis=1)
  File "/grid/common/pkgs/python/v3.6.1/lib/python3.6/site-packages/pandas/core/indexing.py", line 1269, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: "['equity1equity2'] not in index"

Upvotes: 2

Views: 1580

Answers (1)

Vivek Kalyanarangan

Reputation: 9081

Use:

print(df[[i for i in matchObj if i in df.columns]])

Output

   equity01  equity02
0         1         4
1         2         5
2         3         6

Explanation

[i for i in matchObj if i in df.columns] keeps only the names that are present in df and ignores the rest. Hope that helps.
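
For reference, here is a self-contained sketch of the same idea, rebuilding the DataFrame from the question. The helper name select_existing is hypothetical and only there for illustration. One thing worth noting: matchObj as written in the question has no comma between 'equity1' and 'equity2', so Python joins the two adjacent string literals into 'equity1equity2', which is why the KeyError names that single string. The sketch also shows df.filter(items=...), a built-in alternative that likewise skips labels missing from the frame.

```python
import pandas as pd

# DataFrame rebuilt from the question
df = pd.DataFrame({
    "equity01": [1, 2, 3],
    "equity02": [4, 5, 6],
    "equity03": [7, 8, 9],
    "equity04": [2, 3, 4],
    "equity05": [5, 6, 7],
})

# Copied from the question as-is: the missing comma makes Python
# concatenate 'equity1' 'equity2' into the single name 'equity1equity2'
matchObj = ['equity01', 'equity02', 'equity1' 'equity2']


def select_existing(frame, names):
    """Hypothetical helper: keep only the requested names that exist in frame."""
    present = [name for name in names if name in frame.columns]
    return frame[present]


result = select_existing(df, matchObj)

# Alternative: filter(items=...) also ignores labels that are not in the frame
alt = df.filter(items=matchObj)

print(result)
```

Both variants keep the order given in the list, not the order of the columns in df. If missing names should be reported instead of silently dropped, comparing set(names) against set(frame.columns) before selecting would be one way to do it.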

Upvotes: 3
