arti8719

Reputation: 31

Python 3 split()

When I split the string "abac", I get an unexpected result.

Example

print("abac".split("a"))

Why does it print:

['', 'b', 'c']

instead of

['b', 'c']

Can anyone explain this behavior and guide me on how to get my desired output?

Thanks in advance.

Upvotes: 0

Views: 2203

Answers (4)

Jab

Reputation: 27485

As @DeepSpace pointed out, the docs say:

If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']).

Therefore I'd suggest using a different delimiter, such as a comma. If this is the format you're stuck with, you can use the built-in filter() function, as suggested in this answer. When None is passed as the function, filter() drops every "empty" (falsy) string.

sample = 'abac'
# In Python 3, filter() returns a lazy iterator, so wrap it in list() before printing.
filtered_sample = list(filter(None, sample.split('a')))
print(filtered_sample)
# ['b', 'c']
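
To see the documented rule in action, here is a small sketch contrasting an explicit separator with no separator at all. The variable names are only for illustration.

```python
# Explicit separator: each delimiter marks a boundary, so a leading
# or doubled delimiter leaves an empty string in the result.
with_sep = "1,,2".split(",")
leading = "abac".split("a")

# No separator: runs of whitespace count as a single delimiter, and
# leading/trailing whitespace produces no empty strings.
no_sep = "  1   2 ".split()

print(with_sep)  # ['1', '', '2']
print(leading)   # ['', 'b', 'c']
print(no_sep)    # ['1', '2']
```

The grouping behavior only applies when split() is called without a separator, which is why "abac".split("a") keeps the leading empty string.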

Upvotes: 1

Harry Cutts

Reputation: 1414

split returns the substrings between the delimiters you specify (or between either end of the string and a delimiter). If there are no characters between two such boundaries, that piece is an empty string. (See the documentation for more information.)

In this case, if you don't want any empty strings in the output, you can use filter to remove them:

list(filter(lambda s: len(s) > 0, "abac".split("a")))
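
If you prefer to avoid the lambda, a list comprehension does the same job. This is just an alternative sketch; text and pieces are illustrative names.

```python
text = "abac"

# Keep only the non-empty pieces; an empty string is falsy,
# so "if s" filters it out.
pieces = [s for s in text.split("a") if s]

print(pieces)  # ['b', 'c']
```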

Upvotes: 0

Yang K

Reputation: 437

In your example, "a" is what's called a delimiter. It acts as a boundary between the characters before it and the characters after it. So, when you call split, it takes the characters before and after each "a" and puts them into the list. Since there's nothing before the first "a" in the string "abac", the first piece is an empty string, and that is what ends up at the start of the list.
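
One way to check this boundary idea is to count: a string containing n delimiters always splits into n + 1 pieces, empty or not. A quick sketch (the variable names are mine):

```python
text = "abac"
delimiter = "a"

pieces = text.split(delimiter)
delimiter_count = text.count(delimiter)

print(pieces)           # ['', 'b', 'c']
print(delimiter_count)  # 2

# Two delimiters mark three regions: before the first "a",
# between the two, and after the second.
print(len(pieces) == delimiter_count + 1)  # True
```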

Upvotes: 0

jfbeltran

Reputation: 1818

When you split a string in Python, you keep everything between your delimiters (even when it's an empty string!).

For example, if you had a list of letters separated by commas:

>>> "a,b,c,d".split(',')
['a', 'b', 'c', 'd']

If your list had some missing values you might leave the space in between the commas blank:

>>> "a,b,,d".split(',')
['a', 'b', '', 'd']

The start and end of the string act as delimiters themselves, so if you have a leading or trailing delimiter you will also get this "empty string" sliced out of your main string:

>>> "a,b,c,d,,".split(',')
['a', 'b', 'c', 'd', '', '']

>>> ",a,b,c,d".split(',')
['', 'a', 'b', 'c', 'd']

If you want to get rid of any empty strings in your output, you can use the filter function.

If instead you just want to get rid of this behavior near the edges of your main string, you can strip the delimiters off first:

>>> ",,a,b,c,d".strip(',')
'a,b,c,d'

>>> ",,a,b,c,d".strip(',').split(',')
['a', 'b', 'c', 'd']
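
The two approaches give different results when there are empty fields in the middle of the string. Here is a rough sketch comparing them; split_trimmed is a hypothetical helper name, not something from the standard library, and it assumes a single-character separator because strip() treats its argument as a set of characters.

```python
def split_trimmed(text, sep):
    """Hypothetical helper: remove leading/trailing separators, then split.

    Assumes sep is a single character, since str.strip() strips any of
    the characters in its argument rather than a whole substring.
    """
    return text.strip(sep).split(sep)


sample = ",,a,,b,"

# Stripping first only removes the empty strings at the edges.
trimmed = split_trimmed(sample, ",")

# Filtering removes every empty string, including interior ones.
filtered = [s for s in sample.split(",") if s]

print(trimmed)   # ['a', '', 'b']
print(filtered)  # ['a', 'b']
```

Use the strip version when an empty field in the middle is meaningful (a missing value), and the filter version when you never want empty strings.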

Upvotes: 1
