Reputation: 490
Does anyone have a clear way to explain the rule regarding str.split(sep=None)? The docstring provides some explanation, but not enough to understand the following behavior:
>>> s = '\n Hello\t World\nOpps\t '
>>> print(s.split()) #by default sep = None
['Hello', 'World', 'Opps']
>>> print(s.split(maxsplit = 1))
['Hello', 'World\nOpps\t ']
'' (which should be dumped next) and ' Hello World\nOpps\t'. Thank you in advance for any consistent and logical explanation you may provide.
PS: I have included the docstring below. I am aware of this question, and an answer there offers some help, but a clear rule backed by an official source is still missing. Without an official explanation, a rule is just something for users to memorize without understanding.
sep
The delimiter according which to split the string. None (the default value) means split according to any whitespace, and discard empty strings from the result.
Upvotes: 1
Views: 555
Reputation: 490
Since the Stack Overflow FAQ and this Stack Overflow blog post state that answering one's own question is encouraged, I am posting a possible interpretation (edited after seeing the answer provided by @juanpa-arrivillaga and reading the official docs): when sep=None, the special rule for str.split is as follows.
Even with maxsplit=0, Python still conducts item 1, as illustrated by the following example.
>>> s = '\n Hello\t World\nOpps\t '
>>> print(s.split(maxsplit = 0))
['Hello\t World\nOpps\t ']
>>> print(s.split(maxsplit = 1))
['Hello', 'World\nOpps\t ']
>>> print(s.split(maxsplit = 2))
['Hello', 'World', 'Opps\t ']
>>> print(s.split(maxsplit = 3))
['Hello', 'World', 'Opps']
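One way to read the outputs above, offered as an interpretation rather than an official rule: whatever piece is left over when maxsplit runs out behaves as if only its leading whitespace had been stripped. The short check below uses the same string s and nothing beyond built-in str methods; the variable names are just for illustration.

```python
s = '\n Hello\t World\nOpps\t '

# maxsplit=0 performs no splits, yet the leading whitespace is gone
# while the trailing whitespace survives: the same result as lstrip().
no_split = s.split(maxsplit=0)
print(no_split == [s.lstrip()])  # True

# After one split, the remainder again loses only its leading whitespace.
head, rest = s.split(maxsplit=1)
expected_rest = s.lstrip()[len(head):].lstrip()
print(rest == expected_rest)  # True
```

Trailing whitespace only disappears once maxsplit is large enough for every word to be split off, as in the maxsplit=3 case.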
Upvotes: 1
Reputation: 95957
You should look at the official docs, where this is explained in detail:
If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result will contain no empty strings at the start or end if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting of just whitespace with a None separator returns [].
Bold emphasis added by me.
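To make the quoted paragraph concrete, here is a small comparison between the default algorithm and an explicit single-space separator. The sample string and variable names are made up for illustration.

```python
text = '  a  b '

# sep=None: runs of whitespace count as one separator, no empty strings.
default_split = text.split()
print(default_split)   # ['a', 'b']

# sep=' ': every single space is a separator, so empty strings appear.
explicit_split = text.split(' ')
print(explicit_split)  # ['', '', 'a', '', 'b', '']

# Empty or whitespace-only input gives [] with sep=None,
# but an explicit separator yields at least one (empty) piece.
empty_default = ''.split()
blank_default = '   '.split()
empty_explicit = ''.split(' ')
print(empty_default, blank_default, empty_explicit)  # [] [] ['']
```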
However, this does leave some ambiguity about what counts as a "split" in the case of sep=None with maxsplit set to some positive number. Apparently, the leading and trailing whitespace acts, at least conceptually, as if it simply weren't there. But notice:
>>> " a b c ".split(maxsplit=1)
['a', 'b c ']
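So the trailing whitespace isn't actually removed: once maxsplit is reached, the remainder is kept as-is apart from its leading whitespace.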
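The behaviour can be modelled with a short pure-Python function. This is only a sketch: split_whitespace is a hypothetical helper, not part of Python, and it assumes that str.isspace() agrees with what str.split() treats as whitespace (true for the ASCII cases used here). It is meant to reproduce the observed results, not to describe the actual implementation.

```python
def split_whitespace(s, maxsplit=-1):
    """Hypothetical model of s.split(None, maxsplit); not CPython's code."""
    result = []
    i, n = 0, len(s)
    # A negative maxsplit means "no limit"; n is a safe upper bound.
    remaining = maxsplit if maxsplit >= 0 else n
    while remaining > 0:
        # Skip the run of whitespace in front of the next word.
        while i < n and s[i].isspace():
            i += 1
        if i == n:
            break
        # Collect one word, up to the next whitespace character.
        start = i
        while i < n and not s[i].isspace():
            i += 1
        result.append(s[start:i])
        remaining -= 1
    # Leftover text: drop its leading whitespace, keep everything else.
    while i < n and s[i].isspace():
        i += 1
    if i < n:
        result.append(s[i:])
    return result


print(split_whitespace(' a b c ', maxsplit=1))  # ['a', 'b c ']
print(split_whitespace(' a b c '))              # ['a', 'b', 'c']
```

In this model a "split" is one pass through the loop: skip whitespace, take a word. Whatever remains afterwards is appended with only its leading whitespace dropped, which is why the trailing space shows up in 'b c ' above.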
Upvotes: 1