Reputation: 1229
I'm playing with pandas and trying to apply string slicing to a Series of strings. Instead of each string being sliced, the Series itself gets sliced:
In [22]: s = p.Series(data=['abcdef']*20)
In [23]: s.apply(lambda x:x[:2])
Out[23]:
0 abcdef
1 abcdef
On the other hand:
In [25]: s.apply(lambda x:x+'qwerty')
Out[25]:
0 abcdefqwerty
1 abcdefqwerty
2 abcdefqwerty
...
I got it to work by using the map function instead, but I think I'm missing something about how it's supposed to work.
Would very much appreciate a clarification.
Upvotes: 8
Views: 5515
Reputation: 2564
Wes McKinney's answer is a bit out of date, but he made good on his wish--pandas now has efficient string processing methods, including slicing:
In [2]: s = Series(data=['abcdef']*20)
In [3]: s.str[:2]
Out[3]:
0 ab
1 ab
2 ab
...
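As a minimal sketch of the same accessor on data with a missing value (the sample values and the variable names first_two and same are made up for illustration), slicing through .str leaves the missing entry missing instead of raising:

```python
import pandas as pd

# Illustrative data: one entry is missing on purpose.
s = pd.Series(["abcdef", None, "xy"])

# Slice every string; the missing entry stays missing.
first_two = s.str[:2]

# Equivalent spelling with explicit start/stop arguments.
same = s.str.slice(0, 2)

print(first_two)
```

A plain s.map(lambda x: x[:2]) on the same data would fail with a TypeError on the None entry, which is the kind of case the NA-friendly string methods are meant to cover.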
Upvotes: 12
Reputation: 16327
apply first tries to call the function on the whole Series. Only if that fails does it map the function over each element. [:2] is a valid operation on a Series, while + 'qwerty' apparently isn't, which is why you get the implicit mapping only in the latter case. If you always want the element-wise mapping, use s.map.
apply's source code for reference:
    try:
        result = func(self)
        if not isinstance(result, Series):
            result = Series(result, index=self.index, name=self.name)
        return result
    except Exception:
        mapped = lib.map_infer(self.values, func)
        return Series(mapped, index=self.index, name=self.name)
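To see the effect of that try/except without depending on a particular pandas version, here is a rough stand-in that uses a plain list in place of a Series. The function name apply_with_fallback is hypothetical and this is not pandas code; it only mirrors the "whole container first, then element by element" logic quoted above. Current pandas releases may behave differently.

```python
def apply_with_fallback(values, func):
    """Call func on the whole container; fall back to per-element on error."""
    try:
        # First attempt: treat the container as a single argument.
        return func(values)
    except Exception:
        # Fallback: apply func to each element individually.
        return [func(v) for v in values]


data = ["abcdef"] * 4

# Slicing is valid on the list itself, so the list gets sliced, not the strings.
sliced = apply_with_fallback(data, lambda x: x[:2])

# list + str raises TypeError, so the fallback maps over the elements.
appended = apply_with_fallback(data, lambda x: x + "qwerty")

print(sliced)
print(appended)
```

The first call returns the first two untouched strings, matching the surprising output in the question, while the second call only reaches the strings because the whole-container attempt raised.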
Upvotes: 4
Reputation: 105571
You're on the right track:
In [3]: s = Series(data=['abcdef']*20)
In [4]: s
Out[4]:
0 abcdef
1 abcdef
2 abcdef
3 abcdef
4 abcdef
5 abcdef
6 abcdef
7 abcdef
8 abcdef
9 abcdef
10 abcdef
11 abcdef
12 abcdef
13 abcdef
14 abcdef
15 abcdef
16 abcdef
17 abcdef
18 abcdef
19 abcdef
In [5]: s.map(lambda x: x[:2])
Out[5]:
0 ab
1 ab
2 ab
3 ab
4 ab
5 ab
6 ab
7 ab
8 ab
9 ab
10 ab
11 ab
12 ab
13 ab
14 ab
15 ab
16 ab
17 ab
18 ab
19 ab
I would really like to add a bunch of vectorized, NA-friendly string processing tools to pandas (see here). Development help is always appreciated.
Upvotes: 7