Reputation: 51
I have two Python pandas Series, df10 and df5, and I want to compare their values. For example, df10[-1:] < df5[-1:] returns True, and df10[-2:-1] > df5[-2:-1] returns False.
But if I combine them with df10[-1:] < df5[-1:] and df10[-2:-1] > df5[-2:-1], it raises:
The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()
I expected it to return False. How can I solve this problem?
Upvotes: 2
Views: 30587
Reputation: 341
Suppose you have the two DataFrames created by this program:
# Python 3.5.2
import pandas as pd
import numpy as np
# column names for example dataframe
cats = ['A', 'B', 'C', 'D', 'E']
df5 = pd.DataFrame(data = np.arange(25).reshape(5, 5), columns=cats)
print("Dataframe 5\n",df5,"\n")
df10=pd.DataFrame(data = np.transpose(np.arange(25).reshape(5, 5)), columns=cats)
print("Dataframe 10\n",df10)
The resulting data frames are:
Dataframe 5
A B C D E
0 0 1 2 3 4
1 5 6 7 8 9
2 10 11 12 13 14
3 15 16 17 18 19
4 20 21 22 23 24
Dataframe 10
A B C D E
0 0 5 10 15 20
1 1 6 11 16 21
2 2 7 12 17 22
3 3 8 13 18 23
4 4 9 14 19 24
Now let's look at the result of your first comparison:
print(df5[-1:])
print(df10[-1:])
a=df10[-1:]< df5[-1:]
print("\n",a,"\n",type(a))
which results in:
A B C D E
4 20 21 22 23 24
A B C D E
4 4 9 14 19 24
A B C D E
4 True True True True False
<class 'pandas.core.frame.DataFrame'>
Now the second comparison:
print(df5[-2:-1])
print(df10[-2:-1])
b=df10[-2:-1]>df5[-2:-1]
print("\n",b,"\n",type(b))
which has results:
A B C D E
3 15 16 17 18 19
A B C D E
3 3 8 13 18 23
A B C D E
3 False False False False True
<class 'pandas.core.frame.DataFrame'>
The issue:
If we evaluate:
pd.Series([True, True, False, False]) and pd.Series([False, True, False, True])
What is the correct answer?:
pd.Series([False, True, False, False])
False
True
The answer is: it depends on what you want.
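To make the ambiguity concrete, here is a minimal sketch using the two example Series from above (the names `left` and `right` are made up for illustration). Python's `and` asks its left operand for a single truth value, which pandas refuses to give for a multi-element Series. The element-wise operator `&` and the reductions `.any()` / `.all()` each give one of the possible answers.

```python
import pandas as pd

# The two example Series from the text (names are illustrative only)
left = pd.Series([True, True, False, False])
right = pd.Series([False, True, False, True])

# `and` calls bool(left), which pandas rejects for a multi-element Series
try:
    result = left and right
except ValueError as err:
    message = str(err)
    print("ValueError:", message)

# Element-wise AND: one boolean per position
elementwise = left & right
print(elementwise.tolist())

# Reduce each Series to a single bool first, then `and` is well defined
any_both = bool(left.any() and right.any())
all_both = bool(left.all() and right.all())
print(any_both, all_both)
```

Which of the three you want depends on the question you are asking of the data, which is why pandas raises an error instead of guessing.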
First, we reduce each single-row comparison to a boolean Series with one value per column:
a_new = (df10[-1:] < df5[-1:]).any()
print(a_new,"\n",type(a_new))
b_new = (df10[-2:-1] > df5[-2:-1]).any()
print("\n",b_new,"\n",type(b_new))
The results are:
A True
B True
C True
D True
E False
dtype: bool
<class 'pandas.core.series.Series'>
A False
B False
C False
D False
E True
dtype: bool
<class 'pandas.core.series.Series'>
Now, we can compute 3 cases.
Case 1: a.any() and b.any()
a.any() = True if any item in a is True
b.any() = True if any item in b is True
print(a_new.any() and b_new.any())
The result is True.
Case 2: a.all() and b.all()
a.all() = True if every item in a is True
b.all() = True if every item in b is True
print(a_new.all() and b_new.all())
The result is False.
Case 3: Pairwise comparison
For this, you have to compare the two Series element by element. The loop variables get their own names so they do not shadow a_new and b_new:
result_pairwise = [x and y for x, y in zip(a_new, b_new)]
print(result_pairwise,"\n",type(result_pairwise))
The result is:
[False, False, False, False, False]
<class 'list'>
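As an alternative to the list comprehension, the pairwise case can probably be written with the element-wise `&` operator, which keeps the column labels. This is a sketch, not part of the original answer: it rebuilds the same example frames and uses `.iloc[-1]` and `.iloc[-2]` to take the last two rows as Series (the names `last_row_lt`, `prev_row_gt` and `pairwise` are made up here).

```python
import numpy as np
import pandas as pd

# Rebuild the example DataFrames from the answer
cats = ['A', 'B', 'C', 'D', 'E']
df5 = pd.DataFrame(np.arange(25).reshape(5, 5), columns=cats)
df10 = pd.DataFrame(np.arange(25).reshape(5, 5).T, columns=cats)

# Each row selected with .iloc is a Series indexed by column name
last_row_lt = df10.iloc[-1] < df5.iloc[-1]   # last row: df10 < df5
prev_row_gt = df10.iloc[-2] > df5.iloc[-2]   # second-to-last row: df10 > df5

# `&` combines the two boolean Series label by label
pairwise = last_row_lt & prev_row_gt
print(pairwise)
print(type(pairwise))
```

The result is a boolean Series rather than a plain list, so it can be reduced further with `.any()` or `.all()` if a single value is needed.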
Upvotes: 11
Reputation: 5115
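You can do this with the values attribute of a pandas Series, which returns a NumPy array. Each slice below holds a single element, and a one-element array has an unambiguous truth value, so and works:
# Each slice selects exactly one element, so each comparison yields a one-element array
if (df10.values[-2:-1] > df5.values[-2:-1]) and \
        (df10.values[-1:] < df5.values[-1:]):
    print("we met the conditions!")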
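If df10 and df5 really are Series, as in the question, another option is to compare scalars instead of one-element slices. The sketch below uses invented sample data and the positional indexer `.iloc`; the variable names are hypothetical. Because each comparison produces a single boolean, `and` behaves as the asker expected.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's two Series
df10 = pd.Series([1.0, 2.0, 3.0])
df5 = pd.Series([1.5, 2.5, 3.5])

# .iloc[-1] and .iloc[-2] return scalars, not Series
last_below = df10.iloc[-1] < df5.iloc[-1]   # 3.0 < 3.5
prev_above = df10.iloc[-2] > df5.iloc[-2]   # 2.0 > 2.5

# Both operands are single booleans, so `and` is unambiguous
signal = bool(last_below and prev_above)
print(signal)
```

With this data the first condition holds and the second does not, so the combined result is False.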
Upvotes: 2