Jiaqi

Reputation: 51

How to compare two rows of two pandas series?

I have two Python pandas Series, df10 and df5, and I want to compare their values. For example, df10[-1:] < df5[-1:] returns True, and df10[-2:-1] > df5[-2:-1] returns False.

But if I combine them, as in df10[-1:] < df5[-1:] and df10[-2:-1] > df5[-2:-1], it raises:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I expected it to return False. How can I solve this problem?

Upvotes: 2

Views: 30587

Answers (2)

blackeneth

Reputation: 341

Suppose you have the two DataFrames built by this program:

# Python 3.5.2
import pandas as pd
import numpy as np

# column names for example  dataframe
cats = ['A', 'B', 'C', 'D', 'E']

df5 = pd.DataFrame(data = np.arange(25).reshape(5, 5), columns=cats)
print("Dataframe 5\n",df5,"\n")

df10=pd.DataFrame(data = np.transpose(np.arange(25).reshape(5, 5)), columns=cats)
print("Dataframe 10\n",df10)

The resulting data frames are:

Dataframe 5
     A   B   C   D   E
0   0   1   2   3   4
1   5   6   7   8   9
2  10  11  12  13  14
3  15  16  17  18  19
4  20  21  22  23  24 

Dataframe 10
    A  B   C   D   E
0  0  5  10  15  20
1  1  6  11  16  21
2  2  7  12  17  22
3  3  8  13  18  23
4  4  9  14  19  24

Now let's look at the result of your first comparison:

print(df5[-1:])
print(df10[-1:])

a=df10[-1:]< df5[-1:]

print("\n",a,"\n",type(a))

which results in:

    A   B   C   D   E
4  20  21  22  23  24
   A  B   C   D   E
4  4  9  14  19  24

       A     B     C     D      E
4  True  True  True  True  False 
 <class 'pandas.core.frame.DataFrame'>

Now the second comparison:

print(df5[-2:-1])
print(df10[-2:-1])

b=df10[-2:-1]>df5[-2:-1]
print("\n",b,"\n",type(b))

which has results:

    A   B   C   D   E
3  15  16  17  18  19
   A  B   C   D   E
3  3  8  13  18  23

        A      B      C      D     E
3  False  False  False  False  True 
 <class 'pandas.core.frame.DataFrame'>

The issue:

If we evaluate:

pd.Series([True, True, False, False]) and pd.Series([False, True, False, True])

What is the correct answer?

  1. pd.Series([False, True, False, False])
  2. False
  3. True
  4. All of the above
  5. Any of the above
  6. It depends

The answer is: 6 - It depends. It depends on what you want.
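The candidate answers can be checked directly. This is a minimal sketch using the two literal series from the expression above; the names left and right are made up here for illustration.

```python
import pandas as pd

# The two literal series from the expression above (names are illustrative).
left = pd.Series([True, True, False, False])
right = pd.Series([False, True, False, True])

# Plain `and` asks Python for bool(left), which pandas refuses to guess.
try:
    left and right
except ValueError as exc:
    error_message = str(exc)
    print("ValueError:", error_message)

# Option 1: element-wise AND, one result per position.
elementwise = left & right
print(elementwise.tolist())

# Option 3: True if each series contains at least one True.
any_both = bool(left.any() and right.any())
print(any_both)

# Option 2: True only if every item in both series is True.
all_both = bool(left.all() and right.all())
print(all_both)
```

Each interpretation is reasonable, which is why pandas raises an error instead of choosing one for you.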

First, we have to reduce each one-row comparison result to a boolean Series, with one value per column:

a_new = (df10[-1:] < df5[-1:]).any()
print(a_new,"\n",type(a_new))

b_new = (df10[-2:-1] > df5[-2:-1]).any()
print("\n",b_new,"\n",type(b_new))

The results are:

A     True
B     True
C     True
D     True
E    False
dtype: bool 
 <class 'pandas.core.series.Series'>

A    False
B    False
C    False
D    False
E     True
dtype: bool 
 <class 'pandas.core.series.Series'>

Now, we can compute 3 cases.

Case 1: a.any() and b.any()

# a_new.any() is True if any item in a_new is True
# b_new.any() is True if any item in b_new is True
print(a_new.any() and b_new.any())

The result is True.

Case 2: a.all() and b.all()

# a_new.all() is True if every item in a_new is True
# b_new.all() is True if every item in b_new is True
print(a_new.all() and b_new.all())

The result is False.

Case 3: Pairwise comparison

For this, you have to compare the two series element by element, position by position.

# use separate loop variable names so they do not shadow a_new and b_new
result_pairwise = [x and y for x, y in zip(a_new, b_new)]
print(result_pairwise,"\n",type(result_pairwise))

The result is:

[False, False, False, False, False] 
 <class 'list'>
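As an alternative to the list comprehension, the element-wise case can also be written with the & operator, which keeps the result as a Series with its column labels. This is a sketch that rebuilds the same example frames so it runs on its own; the name pairwise is made up for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the example frames from the start of this answer.
cats = ['A', 'B', 'C', 'D', 'E']
df5 = pd.DataFrame(np.arange(25).reshape(5, 5), columns=cats)
df10 = pd.DataFrame(np.arange(25).reshape(5, 5).T, columns=cats)

# One boolean per column for each comparison.
a_new = (df10[-1:] < df5[-1:]).any()
b_new = (df10[-2:-1] > df5[-2:-1]).any()

# & is the element-wise AND for boolean Series; the index A..E is preserved.
pairwise = a_new & b_new
print(pairwise)
```

Use pairwise.any() or pairwise.all() afterwards if a single True or False is needed.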


Upvotes: 11

rofls

Reputation: 5115

You can do this with the pandas Series values attribute:

# wrap the whole condition in parentheses instead of using a backslash continuation
if ((df10.values[-2:-1] > df5.values[-2:-1]) and
        (df10.values[-1:] < df5.values[-1:])):
    print("we met the conditions!")
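If df10 and df5 really are one-dimensional Series, as the question says, another option is to pull out single values with .iloc so that each comparison yields one boolean and plain `and` works. The data below is invented for illustration and is not the asker's data.

```python
import pandas as pd

# Hypothetical data chosen so that the first comparison is True
# and the second is False, as described in the question.
df10 = pd.Series([1, 3, 2])
df5 = pd.Series([0, 4, 3])

# .iloc[-1] and .iloc[-2] return single values, not one-element Series.
last_is_smaller = df10.iloc[-1] < df5.iloc[-1]   # 2 < 3
prev_is_larger = df10.iloc[-2] > df5.iloc[-2]    # 3 > 4

# Each side is now a single boolean, so `and` is unambiguous.
combined = bool(last_is_smaller and prev_is_larger)
print(combined)
```

With this data the combined condition is False, which is the result the question expected.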

Upvotes: 2
