Reputation: 129
I have two Series of different lengths, and I want to find their intersection based on the index, where the index labels are strings. The end result should be a Series containing only the elements whose string index labels are common to both.
Any ideas?
Upvotes: 10
Views: 14149
Reputation: 176770
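Pandas indexes have an intersection method which you can use. If you have two Series, s1 and s2, then
s1.index.intersection(s2.index)
or, in older versions of pandas only (using & as a set operation on indexes was later deprecated, and in pandas 2.0 and newer it no longer computes an intersection):
s1.index & s2.index
gives you the index values which are in both s1 and s2.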
You can then use this Index of common labels to select the corresponding elements of either Series. For example:
>>> ixs = s1.index.intersection(s2.index)
>>> s1.loc[ixs]
# subset of s1 with only the indexes also found in s2 appears here
Upvotes: 13
Reputation: 1275
Both of my data sets are sorted in increasing order, so I wrote a function that returns a boolean mask for each one marking the overlapping elements, then filtered the data with those masks.
np.shape(data1) # (1330, 8)
np.shape(data2) # (2490, 9)
index_1, index_2 = overlap(data1, data2)
data1 = data1[index_1]
data2 = data2[index_2]
np.shape(data1) # (540, 8)
np.shape(data2) # (540, 9)
def overlap(data1, data2):
    '''Return boolean masks marking the elements common to both inputs.

    Both inputs are assumed to be sorted in increasing order.'''
    mask1 = np.zeros(len(data1), dtype=bool)
    mask2 = np.zeros(len(data2), dtype=bool)
    idx_1 = 0
    idx_2 = 0
    while idx_1 < len(data1) and idx_2 < len(data2):
        if data1[idx_1] < data2[idx_2]:
            # data1 is behind: its current element has no match, skip it
            idx_1 += 1
        elif data1[idx_1] > data2[idx_2]:
            # data2 is behind: its current element has no match, skip it
            idx_2 += 1
        else:
            # equal values: mark both positions and advance both pointers
            mask1[idx_1] = True
            mask2[idx_2] = True
            idx_1 += 1
            idx_2 += 1
    return mask1, mask2
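If the values being compared are 1-D arrays of strictly increasing keys, NumPy's intersect1d with return_indices=True may give the same positions without a hand-written loop. This is only a sketch under that assumption: the answer's arrays are 2-D, so key1 and key2 below are hypothetical key columns, not names from the answer.

```python
import numpy as np

# Hypothetical 1-D, strictly increasing key arrays
key1 = np.array([1, 3, 5, 7, 9])
key2 = np.array([3, 4, 5, 6, 7, 8])

# common: sorted shared values; idx1/idx2: their positions in key1/key2
common, idx1, idx2 = np.intersect1d(key1, key2, return_indices=True)

# The positions can then select matching rows from the full data arrays,
# e.g. data1[idx1] and data2[idx2] (data1/data2 as in the answer above).
print(common, idx1, idx2)
```

Note that intersect1d returns the positions of first occurrences, so inputs with repeated values would need separate handling.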
Upvotes: 0