Reputation: 85
I want to use a mask from series x to filter a vaex dataframe y. I know how to do this in pandas and numpy. In pandas it looks like this:
import pandas as pd
a = [0,0,0,1,1,1,0,0,0]
b = [4,5,7,8,9,9,0,6,4]
x = pd.Series(a)
y = pd.Series(b)
print(y[x==1])
The result is:
3 8
4 9
5 9
dtype: int64
But in vaex, the following code doesn't work:
import vaex
import numpy as np
a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])
x = vaex.from_arrays(x=a)
y = vaex.from_arrays(x=b)
print(y[x.x == 1].values)
The result is empty:
[]
It seems that vaex doesn't have the same index concept as pandas and numpy. Although the two dataframes have the same shape, dataframe y can't use the mask x.x==1.
Is there any way to achieve the equivalent of the pandas result, please?
Thanks
Upvotes: 1
Views: 819
Reputation: 813
While Vaex has a similar API to that of Pandas (similarly named methods that do the same thing), the implementations of the two libraries are completely different, so it is not easy to "mix and match".
To filter one column by another, both columns need to be part of the same Vaex dataframe.
So to achieve what you want, something like this is possible:
import vaex
import numpy as np
a = np.array([0, 0, 0, 1, 1, 1, 0, 0, 0])
b = np.array([4, 5, 7, 8, 9, 9, 0, 6, 4])
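y = vaex.from_arrays(x1=b)
# Attach the mask array as a second column of the same dataframe
y.add_column(name='x2', f_or_array=a)
# Keep only the rows where the mask column equals 1
print(y[y.x2 == 1])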
Upvotes: 2