pymat

Reputation: 1192

Select rows in a dataframe based on number of columns equal to True

I'd like to select all rows where exactly 4 of the 5 columns are True. For example, given:

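    import pandas as pd

    df = pd.DataFrame(
        [
            [0, 0, 0, 0, 0],
            [1, 1, 1, 1, 1],
            [1, 1, 1, 1, 0],
            [0, 0, 0, 0, 0],
            [1, 1, 1, 0, 1],
        ],
        index=["abc", "def", "ghi", "jkl", "mnl"],
        columns=list("abcde")
    ).astype(bool)  # applymap is deprecated since pandas 2.1; astype(bool) gives the same result here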

I'd like to end up with:

    df = pd.DataFrame(
        [
            [1, 1, 1, 1, 0],
            [1, 1, 1, 0, 1],
        ],
        index=["ghi", "mnl"],
        columns=list("abcde")
    ).astype(bool)

How can I do this?

Upvotes: 1

Views: 44

Answers (1)

jezrael

Reputation: 862791

Sum each row across the columns (True counts as 1, False as 0), compare the result with the required count, here 4, using Series.eq, and filter with boolean indexing:

    print(df[df.sum(axis=1).eq(4)])
            a     b     c      d      e
    ghi  True  True  True   True  False
    mnl  True  True  True  False   True

Detail:

    print(df.sum(axis=1))
    abc    0
    def    5
    ghi    4
    jkl    0
    mnl    4
    dtype: int64

If you want rows with either 4 or 5 True values, use Series.isin:

    print(df[df.sum(axis=1).isin([4, 5])])
            a     b     c      d      e
    def  True  True  True   True   True
    ghi  True  True  True   True  False
    mnl  True  True  True  False   True

If you want rows with at least 4 True values, use Series.ge:

    print(df[df.sum(axis=1).ge(4)])
            a     b     c      d      e
    def  True  True  True   True   True
    ghi  True  True  True   True  False
    mnl  True  True  True  False   True
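
For reference, the variants above can be folded into one small helper. This is only a sketch: the function name rows_with_true_count and its at_least flag are made up for illustration, and it assumes every column in the frame is boolean, as in the question's sample data.

```python
import pandas as pd

# Same sample data as in the question; astype(bool) turns 0/1 into False/True.
df = pd.DataFrame(
    [
        [0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [1, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
        [1, 1, 1, 0, 1],
    ],
    index=["abc", "def", "ghi", "jkl", "mnl"],
    columns=list("abcde"),
).astype(bool)


def rows_with_true_count(frame, n, at_least=False):
    """Hypothetical helper: keep rows whose number of True values is n (or >= n)."""
    counts = frame.sum(axis=1)  # per-row count, since True sums as 1 and False as 0
    mask = counts.ge(n) if at_least else counts.eq(n)
    return frame[mask]  # boolean indexing keeps only the rows where mask is True


exactly_four = rows_with_true_count(df, 4)
four_or_more = rows_with_true_count(df, 4, at_least=True)

print(exactly_four.index.tolist())  # ['ghi', 'mnl']
print(four_or_more.index.tolist())  # ['def', 'ghi', 'mnl']
```

If the frame also held non-boolean columns, the row sums would no longer be counts of True, so the boolean columns would need to be selected first.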

Upvotes: 3
