Plutonium smuggler

Reputation: 339

Pandas select data from row where column names vary

I have multiple tables, which I built from txt files. They have different numbers of columns with different names. The columns I'm most interested in have names like "data", or "data7" ... "data0", or "data15" ... "data0".

I need to select rows from these tables where the "data" columns hold a specific value, say data == 1234, but I also want to display other column info, say colA and colB, along with these "data" columns.

What I have so far:

result = df.filter(regex='data|colA|colB|colC') to filter down to the columns I need to see.

What I need:

From the result dataframe, I need to select rows where a data column has a specific value. The problem is that internet tutorials give queries like result[(result.data == abc)], which errors out saying "data column not found", because my column is never literally named data.

Is there any way to run a query against the %data% columns of the result dataframe?

Upvotes: 1

Views: 139

Answers (1)

mVChr

Reputation: 50187

EDIT: To find the value in any column, you can use... any!

>>> df

  col1 col2  data1  data2
0    a    e   1234     12
1    b    f   5678     34
2    c    g   9101   1234
3    d    h   1121     78

>>> df.filter(regex='data1|data2|col1')[df.isin([1234]).any(axis=1)]

  col1  data1  data2
0    a   1234     12
2    c   9101   1234

Docs: any, isin
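One caveat with the snippet above: df.isin([1234]) tests every column, so a 1234 appearing in a non-data column (e.g. colA) would also select that row. A minimal sketch of restricting the check to just the data* columns first, using the same example frame (column names here are assumptions matching the example, not the asker's real tables):

```python
import pandas as pd

# Example frame mirroring the answer's data (assumed layout).
df = pd.DataFrame({
    "col1": list("abcd"),
    "col2": list("efgh"),
    "data1": [1234, 5678, 9101, 1121],
    "data2": [12, 34, 1234, 78],
})

# Build the row mask from the data* columns only, then display
# the columns of interest. This avoids false matches in col1/col2.
mask = df.filter(regex=r"^data").isin([1234]).any(axis=1)
result = df.filter(regex="data1|data2|col1")[mask]
print(result)
```

This keeps rows 0 and 2, the same result as the answer's version, but stays correct even if a non-data column happens to contain 1234.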


OLD ANSWER

Just match the column to your value:

>>> df

  col1 col2  data1  data2
0    a    e   1234     12
1    b    f   5678     34
2    c    g   9101     56
3    d    h   1234     78

>>> df.filter(regex='data1|data2|col1')[df['data1'] == 1234]

  col1  data1  data2
0    a   1234     12
3    d   1234     78
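Since the asker's column names vary from table to table, the explicit df['data1'] == 1234 match can be generalized by discovering the data* columns at runtime. A rough sketch under that assumption (the column names below are illustrative):

```python
import pandas as pd

# Example frame with the OLD ANSWER's values (assumed layout).
df = pd.DataFrame({
    "col1": list("abcd"),
    "col2": list("efgh"),
    "data1": [1234, 5678, 9101, 1234],
    "data2": [12, 34, 56, 78],
})

# Discover whichever data* columns this particular table has,
# then match the target value across all of them.
data_cols = [c for c in df.columns if c.startswith("data")]
mask = df[data_cols].eq(1234).any(axis=1)
result = df[["col1", *data_cols]][mask]
print(result)
```

The same list comprehension works unchanged whether a table has "data7" ... "data0" or "data15" ... "data0", which is the variability the question describes.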

Upvotes: 2
