bkcollection

Reputation: 924

Python: Pandas dataframe to eliminate some data in columns

I have a dataframe as below and would like to eliminate the rows where the string in column 'Stockcode' has length 4 and the column 'type' is NaN.

df

    Stockcode             Stockname type
0         ZFSW     2ndChance W230307  NaN
1          502               3Cnergy    L
2         1E0W       3Cnergy W200528  NaN
3          AZG             8Telecom^    J
4          BQC               A-Smart    C
5          BTJ         A-Sonic Aero^    G
6          5GZ                    AA    C
7          A35       ABF SG BOND ETF  NaN
8          533                   ABR    G
9          L5I              Abterra^    K
10         541        Abundance Intl    G
11        1C4W  AbundanceIn eW210130  NaN
12        ADQU      Accordia Golf Tr    R
13         QZG         Accrelist Ltd    G
14         A75     Ace Achieve Info^    J
15         5FW      Acesian Partners    F
16        K3HD           ACH ADR US$    C
17         AYV                 Acma^    C
18         43F               Acromec    Q

desired output is

    Stockcode             Stockname type
1          502               3Cnergy    L
3          AZG             8Telecom^    J
4          BQC               A-Smart    C
5          BTJ         A-Sonic Aero^    G
6          5GZ                    AA    C
7          A35       ABF SG BOND ETF  NaN
8          533                   ABR    G
9          L5I              Abterra^    K
10         541        Abundance Intl    G
12        ADQU      Accordia Golf Tr    R
13         QZG         Accrelist Ltd    G
14         A75     Ace Achieve Info^    J
15         5FW      Acesian Partners    F
16        K3HD           ACH ADR US$    C
17         AYV                 Acma^    C
18         43F               Acromec    Q

Rows 0, 2 and 11 are eliminated.

My code is as below

df[~(df['type'].isnull() & df['Stockcode'].str.len()==4)]

df['type'].isnull() and df['Stockcode'].str.len()==4 each return True for the expected rows when tested separately, but combined they do not give the desired result. Please advise.

Upvotes: 0

Views: 46

Answers (1)

jezrael

Reputation: 862451

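Your solution is missing parentheses around the second condition. The reason is operator precedence: & binds more tightly than ==, so without the parentheses the expression is evaluated as (df['type'].isnull() & df['Stockcode'].str.len()) == 4. Add them like this: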

df[~(df['type'].isnull() & (df['Stockcode'].str.len()==4))]
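To see the precedence issue in isolation, here is a minimal sketch. The scalar names `is_null` and `code_len` and the four-row frame are made up for illustration (a small subset in the spirit of the question's data), not the full dataframe.

```python
import numpy as np
import pandas as pd

# Plain Python scalars show how the expression is grouped:
# & binds more tightly than ==.
is_null, code_len = True, 4
unparenthesized = is_null & code_len == 4    # parsed as (is_null & code_len) == 4
parenthesized = is_null & (code_len == 4)    # the intended grouping

# Hypothetical four-row sample in the spirit of the question's data.
df = pd.DataFrame({
    "Stockcode": ["ZFSW", "502", "A35", "ADQU"],
    "type": [np.nan, "L", np.nan, "R"],
})

# With the comparison parenthesized, the mask marks rows where
# 'type' is NaN AND the code has exactly 4 characters.
mask = df["type"].isnull() & (df["Stockcode"].str.len() == 4)
kept = df[~mask]
print(kept)
```

Only the first row (NaN type and a 4-character code) is dropped; a row that satisfies just one of the two conditions is kept.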

Another solution is to invert both conditions and change & to | (De Morgan's law), which avoids the outer ~:

df1 = df[df['type'].notnull() | (df['Stockcode'].str.len()!=4)]
print (df1)
   Stockcode          Stockname type
1        502            3Cnergy    L
3        AZG          8Telecom^    J
4        BQC            A-Smart    C
5        BTJ      A-Sonic Aero^    G
6        5GZ                 AA    C
7        A35    ABF SG BOND ETF  NaN
8        533                ABR    G
9        L5I           Abterra^    K
10       541     Abundance Intl    G
12      ADQU   Accordia Golf Tr    R
13       QZG      Accrelist Ltd    G
14       A75  Ace Achieve Info^    J
15       5FW   Acesian Partners    F
16      K3HD        ACH ADR US$    C
17       AYV              Acma^    C
18       43F            Acromec    Q
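If you want to convince yourself that the two forms select the same rows, a quick check along these lines should do. It is only a sketch on a made-up five-row sample (an assumption, not the full data from the question); the names `drop_mask` and `keep_mask` are just for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample covering every combination of the two conditions.
df = pd.DataFrame({
    "Stockcode": ["ZFSW", "502", "1E0W", "A35", "ADQU"],
    "type": [np.nan, "L", np.nan, np.nan, "R"],
})

# Form 1: build the "rows to drop" mask, then negate it.
drop_mask = df["type"].isnull() & (df["Stockcode"].str.len() == 4)

# Form 2: invert each condition and swap & for | (De Morgan's law).
keep_mask = df["type"].notnull() | (df["Stockcode"].str.len() != 4)

# Both masks should be element-wise identical.
same = (~drop_mask).equals(keep_mask)
result = df[keep_mask]
print(same)
print(result)
```

Which form to use is mostly a matter of taste: the first reads as "drop rows matching X", the second as "keep rows matching not X".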

Upvotes: 1
