LondonRob

Reputation: 79023

Pandas Series split n times

I'd like to split a pandas.Series by the first piece of whitespace only.

pd.Series.str.split offers an n parameter which, according to the inline help, sounds like it should specify how many splits to perform. (The notes say "Both 0 and -1 will be interpreted as return all splits", but the help never actually states what n does.)

In any case, it doesn't appear to work:

>>> x = pd.DataFrame(['Split Once', 'Split Once As Well!'])
>>> x[0].str.split(n=1)
0               [Split, Once]
1    [Split, Once, As, Well!]

Upvotes: 4

Views: 569

Answers (1)

behzad.nouri

Reputation: 78041

This seems to be a bug. You need to specify pat explicitly for the value of n to be respected:

x[0].str.split(pat=' ', n=1)
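For completeness, here is a minimal, self-contained sketch of that workaround applied to the data from the question. The second sample row and the regex variant are my own additions (assumptions, not something from the original post): a literal ' ' only matches a single space, so a pattern like r'\s+' may be a closer match for "the first piece of whitespace" when tabs or repeated spaces can occur.

```python
import pandas as pd

# Same data as in the question.
x = pd.DataFrame(['Split Once', 'Split Once As Well!'])

# Passing pat explicitly makes the n limit take effect.
by_space = x[0].str.split(pat=' ', n=1)
print(by_space.tolist())

# Assumed variant: a multi-character pattern is treated as a regex,
# so r'\s+' splits on the first run of any whitespace.
messy = pd.Series(['Split  Once\tAs Well!'])  # hypothetical extra sample
by_whitespace = messy.str.split(pat=r'\s+', n=1)
print(by_whitespace.tolist())
```

Note that the literal-space version leaves empty strings behind if the text contains consecutive spaces, which is the main reason to consider the regex form.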

These are the lines in the source code that show it ignores n if pat is None:

# pandas/core/strings.py
def str_split(arr, pat=None, n=None):
    if pat is None:
        if n is None or n == 0:
            n = -1
        f = lambda x: x.split()
...
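To illustrate what is going wrong in that branch: Python's own str.split accepts a maxsplit argument even when the separator is None, so the limit could simply be forwarded. The helper below is a hypothetical stand-in written for this answer, not the pandas code and not necessarily how the bug ends up being fixed.

```python
def split_whitespace(value, n=None):
    """Split on runs of whitespace, performing at most n splits.

    Hypothetical helper for illustration only. n of None, 0 or -1
    means "return all splits", mirroring the note in the pandas help.
    """
    if n is None or n == 0:
        n = -1
    # sep=None splits on any whitespace; the second argument is maxsplit.
    return value.split(None, n)


print(split_whitespace('Split Once As Well!', 1))
print(split_whitespace('Split Once As Well!'))
```

The snippet quoted above calls x.split() with no arguments, which is why n never reaches the underlying split.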

Edit: reported on GitHub.

Upvotes: 6
