Reputation: 31
When I'm splitting a string "abac"
I'm getting undesired results.
Example
print("abac".split("a"))
Why does it print:
['', 'b', 'c']
instead of
['b', 'c']
Can anyone explain this behavior and guide me on how to get my desired output?
Thanks in advance.
Upvotes: 0
Views: 2203
Reputation: 27485
As @DeepSpace pointed out (referring to the docs)
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']).
Therefore I'd suggest using a better delimiter, such as a comma.
Or, if this is the formatting you're stuck with, you could use the built-in filter() function, as suggested in this answer. Passing None as the function removes any "empty" (falsy) strings.
sample = 'abac'
# In Python 3, filter() returns a lazy iterator, so wrap it in list() before printing
filtered_sample = list(filter(None, sample.split('a')))
print(filtered_sample)
# ['b', 'c']
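A list comprehension is another way to drop the empty pieces. This is only a minimal sketch; the helper name split_nonempty is made up for illustration and is not part of the standard library.

```python
def split_nonempty(text, sep):
    """Split text on sep and discard empty pieces (hypothetical helper)."""
    # An empty string is falsy, so "if piece" keeps only non-empty pieces
    return [piece for piece in text.split(sep) if piece]


print(split_nonempty("abac", "a"))  # ['b', 'c']
```

Unlike filter(), this returns a list directly, so no list() call is needed.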
Upvotes: 1
Reputation: 1414
split will return the substrings between the delimiters you specify (or between an end of the string and a delimiter), even when there is nothing between them, in which case it returns an empty string. (See the documentation for more information.)
In this case, if you don't want any empty strings in the output, you can use filter to remove them:
list(filter(lambda s: len(s) > 0, "abac".split("a")))
Upvotes: 0
Reputation: 437
In your example, "a" is what's called a delimiter. It acts as a boundary between the characters before it and the characters after it. So, when you call split, it takes the characters before and after each "a" and puts them into the list. Since there's nothing before the first "a" in the string "abac", the first piece is an empty string, and that goes into the list too.
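To make the boundary idea concrete, here is a small sketch using the string from the question. The variable names are just for illustration.

```python
text = "abac"
sep = "a"
pieces = text.split(sep)

print(pieces)  # ['', 'b', 'c']

# Two delimiters cut the string into three pieces, empty ones included
print(len(pieces) == text.count(sep) + 1)  # True

# Joining the pieces with the delimiter rebuilds the original string,
# which is why the leading empty string has to be there
print(sep.join(pieces) == text)  # True
```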
Upvotes: 0
Reputation: 1818
When you split a string in Python, you keep everything between your delimiters (even when it's an empty string!).
For example, if you had a list of letters separated by commas:
>>> "a,b,c,d".split(',')
['a', 'b', 'c', 'd']
If your list had some missing values you might leave the space in between the commas blank:
>>> "a,b,,d".split(',')
['a', 'b', '', 'd']
The start and end of the string act as delimiters themselves, so if you have a leading or trailing delimiter you will also get this "empty string" sliced out of your main string:
>>> "a,b,c,d,,".split(',')
['a', 'b', 'c', 'd', '', '']
>>> ",a,b,c,d".split(',')
['', 'a', 'b', 'c', 'd']
If you want to get rid of any empty strings in your output, you can use the filter function.
If instead you just want to get rid of this behavior near the edges of your main string, you can strip the delimiters off first:
>>> ",,a,b,c,d".strip(',')
'a,b,c,d'
>>> ",,a,b,c,d".strip(',').split(',')
['a', 'b', 'c', 'd']
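Note that strip only touches the ends of the string, so empty strings from doubled delimiters in the middle survive. A quick sketch of the difference, with an input string made up for illustration:

```python
raw = ",,a,b,,d,"

# strip(',') removes leading/trailing commas only; the inner ",," still
# produces an empty string
edge_only = raw.strip(",").split(",")
print(edge_only)  # ['a', 'b', '', 'd']

# Filtering after the split removes every empty string, wherever it is
all_removed = [piece for piece in raw.split(",") if piece]
print(all_removed)  # ['a', 'b', 'd']
```

Which one you want depends on whether an empty field in the middle means "missing value" (keep it) or is just noise (drop it).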
Upvotes: 1