Reputation: 986
I have two series and want to check whether they are equal element-wise, ignoring case, with one extra condition: any combination of 'a' and 'b' should also count as a match.
first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])
Expected output:
0 True
1 True
2 False
3 True
4 False
So far I know eq can compare the two series, but I am not sure how to include the condition:
def helper(s1, s2):
    return s1.str.lower().eq(s2.str.lower())
Upvotes: 1
Views: 184
Reputation: 68126
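You can combine boolean masks with the bitwise operators & and | to add the extra condition. The first mask is the case-insensitive equality, the second is true where both values are 'a' or 'b':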
condition_1 = first.str.casefold().eq(second.str.casefold())
condition_2 = first.str.casefold().isin(['a', 'b']) & second.str.casefold().isin(['a', 'b'])
result = condition_1 | condition_2
Or with numpy:
import numpy
condition_1 = first.str.casefold().eq(second.str.casefold())
condition_2 = numpy.bitwise_and(
    first.str.casefold().isin(['a', 'b']),
    second.str.casefold().isin(['a', 'b'])
)
result = numpy.bitwise_or(condition_1, condition_2)
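As a quick check against the data from the question, the mask logic above can be wrapped in a small function. This is only a sketch: the name equal_with_group and its group parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])

def equal_with_group(s1, s2, group=('a', 'b')):
    """Hypothetical helper: case-insensitive equality, plus a match
    whenever both values belong to `group`."""
    s1 = s1.str.casefold()
    s2 = s2.str.casefold()
    same = s1.eq(s2)                                       # plain equality
    both_in_group = s1.isin(list(group)) & s2.isin(list(group))
    return same | both_in_group

result = equal_with_group(first, second)
print(result.tolist())  # [True, True, False, True, False]
```

Normalizing the case once at the top avoids repeating str.casefold() in every mask.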
Upvotes: 2
Reputation: 150735
You can use replace to map all a to b, so that the two letters compare equal:
def transform(s):
    return s.str.lower().replace({'a': 'b'})
transform(first).eq(transform(second))
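Run against the series from the question, this approach should reproduce the expected output. A self-contained sketch, reusing the question's data:

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])

def transform(s):
    # Lowercase, then collapse 'a' into 'b' so both letters compare equal.
    return s.str.lower().replace({'a': 'b'})

result = transform(first).eq(transform(second))
print(result.tolist())  # [True, True, False, True, False]
```

Note that the dict passed to replace matches whole values, so only elements that are exactly 'a' are changed.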
Upvotes: 1
Reputation: 191
You can specify an "ascii_distance" as follows:
import pandas as pd
s1 = pd.Series(['a', 'a', 'b', 'c', 'd'])
s2 = pd.Series(['A', 'A', 'b', 'C', 'F'])
def helper(s1, s2, ascii_distance):
    s1_processed = [ord(c1) for c1 in s1.str.lower()]
    s2_processed = [ord(c2) for c2 in s2.str.lower()]
    print(f'ascii_distance = {ascii_distance}')
    print(f's1_processed = {s1_processed}')
    print(f's2_processed = {s2_processed}')
    result = []
    for i in range(len(s1)):
        result.append((abs(s1_processed[i] - s2_processed[i]) <= ascii_distance))
    return result
ascii_distance = 2
print(helper(s1, s2, ascii_distance))
Output:
ascii_distance = 2
s1_processed = [97, 97, 98, 99, 100]
s2_processed = [97, 97, 98, 99, 102]
[True, True, True, True, True]
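The same idea can be written without the explicit loop. The sketch below is an assumption-laden variant, not the code above: the name ascii_distance_match is invented here, and it assumes every element is a single character. Applied to the series from the question with a distance of 1, it also accepts 'b' against 'C' at index 2, which the question's expected output marks as False, so a distance threshold is broader than the "only 'a' and 'b'" rule.

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])

def ascii_distance_match(s1, s2, max_distance):
    """Hypothetical vectorized variant; assumes single-character strings."""
    codes1 = s1.str.lower().map(ord)   # character -> code point
    codes2 = s2.str.lower().map(ord)
    return (codes1 - codes2).abs().le(max_distance)

result = ascii_distance_match(first, second, 1)
print(result.tolist())  # [True, True, True, True, False]
```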
Upvotes: 0