Reputation: 13
Say we have this dict as a dataframe with two columns:
import pandas as pd

data = {
    "slice_by" : [2, 2, 1],
    "string_to_slice" : ["one", "two", "three"]
}
df = pd.DataFrame(data)
First line works just fine, second one doesn't:
print(df["string_to_slice"].str[:1])
print(df["string_to_slice"].str[:df["slice_by"]])
Output:
0 ne
1 wo
2 hree
Name: string_to_slice, Length: 3, dtype: object
0 NaN
1 NaN
2 NaN
Name: string_to_slice, Length: 3, dtype: float64
What would be the appropriate way to do this? I'm sure I could put something together with df.iterrows(), but that's probably not the most efficient way.
Upvotes: 1
Views: 48
Reputation: 11650
Here is one way to do it, using apply:
df.apply(lambda x: x['string_to_slice'][x['slice_by']:], axis=1)
0 e
1 o
2 hree
dtype: object
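For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the frame from the question's dict and, for comparison, does the same slicing with a list comprehension over zip. The names `via_apply` and `via_zip` are only illustrative, and the sketch assumes you want `str[slice_by:]` as in the line above.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "slice_by": [2, 2, 1],
    "string_to_slice": ["one", "two", "three"],
})

# Row-wise apply: each row supplies both the string and its own start offset.
via_apply = df.apply(lambda row: row["string_to_slice"][row["slice_by"]:], axis=1)

# Same slicing with a plain list comprehension, pairing the two columns with zip.
via_zip = pd.Series(
    [s[n:] for s, n in zip(df["string_to_slice"], df["slice_by"])],
    index=df.index,
)

print(via_apply.tolist())
print(via_zip.tolist())
```

Both versions still loop in Python under the hood; the zip form just avoids building a Series object for every row, so it may be worth timing on a larger frame.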
Upvotes: 1
Reputation: 14238
I am assuming you want str[slice_by:] and not str[:slice_by]. With that assumption you can do:
import numpy as np

# Wrap the per-element slice so it can be called on two array-likes at once.
np_slice_string = np.vectorize(lambda x, y: x[y:])
out = np_slice_string(df['string_to_slice'], df['slice_by'])
print(out)
['e' 'o' 'hree']
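Note that the result is a NumPy array, not a Series. If you want it back in the frame, something like the sketch below should work. The column name `sliced` and the helper name `slice_from` are made up for illustration, and `otypes=[object]` is an optional choice here so the results stay as ordinary Python strings instead of a fixed-width NumPy string dtype.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "slice_by": [2, 2, 1],
    "string_to_slice": ["one", "two", "three"],
})

# Element-wise wrapper around the slice; otypes=[object] keeps Python str results.
slice_from = np.vectorize(lambda s, n: s[n:], otypes=[object])

out = slice_from(df["string_to_slice"], df["slice_by"])

# Assign the array back as a new (hypothetically named) column.
df["sliced"] = out
print(df)
```

Keep in mind that, per the NumPy docs, np.vectorize is provided mainly for convenience and is essentially a for loop, so don't expect a big speedup over apply from it alone.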
Upvotes: 0