Reputation: 31918
Let's say I have the data below:
try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3
import pandas as pd
from numpy import uint8
vector = pd.Series([1, 0, 0, 1])
df = pd.read_table(StringIO("""a b c
1 0 0
1 1 1
0 1 1
1 1 0"""), sep=r"\s+", dtype=uint8, header=0)
How do I "or" the vector with each column in the df?
I know I can make a partial function with "or" and my vector and apply it to the df, but this is probably unidiomatic and needlessly time-consuming. What is the pandas way?
Come to think of it, the idiomatic way is probably a lambda... Is there no binary operator for this, like dataframe.div(series)? (Binary DF operations)
I'd like dataframe.or(vector)...
Upvotes: 1
Views: 67
Reputation: 176860
You could pass the DataFrame and the (column) vector directly to np.logical_or:
>>> np.logical_or(df, vector.values[:, None])
       a     b     c
0   True  True  True
1   True  True  True
2  False  True  True
3   True  True  True
Note that this returns a DataFrame of boolean values; you can cast back to a numeric datatype if you prefer.
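As a minimal sketch of that cast, assuming the question's df and vector (rebuilt inline here so the snippet runs on its own) and uint8 as the target dtype because that is what the question's frame uses:

```python
import numpy as np
import pandas as pd

# Rebuild the question's data so the snippet is self-contained.
df = pd.DataFrame(
    {"a": [1, 1, 0, 1], "b": [0, 1, 1, 1], "c": [0, 1, 1, 0]},
    dtype=np.uint8,
)
vector = pd.Series([1, 0, 0, 1])

# Reshape the vector to (4, 1) so it broadcasts across the columns.
column = vector.to_numpy()[:, None]

# logical_or yields booleans; astype turns them back into 0/1 integers.
result = np.logical_or(df, column).astype(np.uint8)
print(result)
```

The cast is optional; it only matters if later code expects integers rather than booleans.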
Upvotes: 2
Reputation: 64318
You can take advantage of numpy's broadcasting, bitwise-or'ing the underlying numpy array (df.values) against the vector:
import numpy as np
new_values = df.values.astype(bool) | vector.values[:,np.newaxis].astype(bool)
This results in a numpy array, not a dataframe, but you can easily reconstruct the dataframe:
new_df = pd.DataFrame(new_values, columns = df.columns)
Since this approach lets numpy do the computation directly, it is likely the fastest.
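If you want something closer to the dataframe.or(vector) call from the question, the broadcasting step can be wrapped in a small function. This is only a sketch: or_with_vector is a hypothetical name, not a pandas method, and the letter index is made up to show that row labels are kept. It pairs rows by position, not by index label.

```python
import numpy as np
import pandas as pd


def or_with_vector(frame, vec):
    """Hypothetical helper: OR every column of `frame` with `vec`, row by row."""
    # Shape (n_rows, 1) broadcasts against (n_rows, n_cols).
    column = vec.to_numpy()[:, np.newaxis].astype(bool)
    values = frame.to_numpy().astype(bool) | column
    # Pass index as well as columns so non-default row labels survive.
    return pd.DataFrame(values, index=frame.index, columns=frame.columns)


# The question's data, with an illustrative non-default index.
df = pd.DataFrame(
    {"a": [1, 1, 0, 1], "b": [0, 1, 1, 1], "c": [0, 1, 1, 0]},
    dtype=np.uint8,
    index=list("wxyz"),
)
vector = pd.Series([1, 0, 0, 1])

new_df = or_with_vector(df, vector)
print(new_df)
```

Passing index=frame.index matters only when the frame has something other than the default integer index; with the question's data the result matches the reconstruction above.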
Upvotes: 1