Selecting columns based on their column name containing a substring and row of a column meets a condition

Question

my input:

mydata = (pd.DataFrame({'xyz':['Label','NG-ICV','NG-ICV','NG-ICV','ICV','ICV','ICV','NG-ICV','NG-ICV','Label',], 
                'foo':['Label','Label','Label','Label','Label','Label','Label','Label','Label','Label'],
                'bar':['Label','Label','Label','Label','NG-ICV','NG-ICV','ICV','ICV','NG-ICV','NG-ICV']}))

My necessary code to calculate ratio between two patterns and finding in which column is a biggest ratio:

patNG = "NG-ICV"
pat = "ICV"
x=mydata.apply(lambda x: x.str.contains(patNG, regex=True).sum(), axis=0)
y=mydata.apply(lambda x: x.str.contains(pat, regex=True).sum(), axis=0)
z=x/y  
z=z.dropna()
mask=z.index.values.tolist()

Now I want select row with index, within two condition column name from mask and select only rows where patNG and pat is True(where row value equal to 'ICV' or 'NG-ICV'). So I expect for example for column xyz like this:

  xyz 
4 NG-ICV
5 NG-ICCV
6 ICV
7 ICV
8 NG-ICV
9 NG-ICV

My code:

mydata.loc[  :,[mask[0] in i for i in mydata.columns] ]

And here I am stuck, how add one more condition where values of row also equal to my two patterns('pat' and 'patNG')?

P.S. I must to use regex and pattern, because my data is not one label such ICV its long string with other labels.

mozway · Accepted Answer

Following your logic, you should keep the initial detection of pattern before you sum. Then apply a mask on the detected columns to select your rows:

X = mydata.apply(lambda x: x.str.contains(patNG, regex=True), axis=0)
Y = mydata.apply(lambda x: x.str.contains(pat, regex=True), axis=0)

x = X.sum()
y = Y.sum()
z = (x/y).dropna()
mask = z.index.values.tolist()

mydata.loc[(X|Y)[mask].all(1)]

output:

      xyz    foo     bar
4     ICV  Label  NG-ICV
5     ICV  Label  NG-ICV
6     ICV  Label     ICV
7  NG-ICV  Label     ICV
8  NG-ICV  Label  NG-ICV

Selecting columns based on their column name containing a substring and row of a column meets a condition

Answers (1)

Related Questions