Reputation: 8045
Receive a string from somewhere and the string is a sequence of params. Params are separated by whitespace. The task is parse the string to a param list, all params are type string.
For example:
input : "3 45 5.5 a bc"
output : ["3","45","5.5","a","bc"]
Things become little complicated if need to transfer a string include a whitespace, use "
to quote.
input: "3 45 5.5 \"This is a sentence.\" bc"
output: ["3","45","5.5","This is a sentence.","bc"]
But what if the sentence happened to include a quote mark? Use the escape character: \"
-> "
, \\
-> \
input: "3 45 5.5 \"\\\"Yes\\\\No?\\\" it said.\" bc"
output: ['3','45','5.5','"Yes\\NO?" it said.','bc']
Does python have an elegant way to do this job?
PS. I don't think regular expressions can work this out.
Upvotes: 2
Views: 316
Reputation: 1121554
Use the shlex.split()
function:
>>> import shlex
>>> shlex.split("3 45 5.5 a bc")
['3', '45', '5.5', 'a', 'bc']
>>> shlex.split("3 45 5.5 \"This is a sentence.\" bc")
['3', '45', '5.5', 'This is a sentence.', 'bc']
>>> shlex.split("3 45 5.5 \"\\\"Yes\\\\No?\\\" it said.\" bc")
['3', '45', '5.5', '"Yes\\No?" it said.', 'bc']
You can create a customizable parser with the shlex.shlex
function, then alter it's behaviour by setting its attributes. For example, you can set the .whitespace
attribute to ', \t\r\n'
to allow commas to delimit words as well. Then simply convert the shlex
instance back to a list to split the input.
Upvotes: 8