Reputation: 1384
I have some unpredictable log lines that I'm trying to split.
The one thing I can predict is that the first field always ends with either a "." or a ":".
Is there any way I can automatically split the string at whichever delimiter comes first?
Upvotes: 0
Views: 668
Reputation: 6098
Look at the index of the "." and ":" characters in the string using the index() function.
Here’s a simple implementation:
def index_default(line, char):
    """Returns the index of a character in a line, or the length of the string
    if the character does not appear.
    """
    try:
        retval = line.index(char)
    except ValueError:
        retval = len(line)
    return retval

def split_log_line(line):
    """Splits a line once at either a period or a colon, depending on which
    appears first in the line.
    """
    # A maxsplit of 1 keeps later periods or colons inside the remaining text.
    if index_default(line, ".") < index_default(line, ":"):
        return line.split(".", 1)
    else:
        return line.split(":", 1)
I wrapped the index() function in an index_default() function because index() raises a ValueError if the line doesn’t contain the character, and I wasn’t sure if every line in your log would contain both a period and a colon.
And then here’s a quick example:
mylines = [
    "line1.split at the dot",
    "line2:split at the colon",
    "line3:a colon preceded. by a dot",
    "line4-neither a colon nor a dot"
]
for line in mylines:
    print(split_log_line(line))
which prints
['line1', 'split at the dot']
['line2', 'split at the colon']
['line3', 'a colon preceded. by a dot']
['line4-neither a colon nor a dot']
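If only the first field matters, a different technique gets there in one call: re.split() from the standard library, with a character class and maxsplit=1. This is a sketch rather than a replacement for the code above, and the function name split_first_field is made up for this example. For the four example lines it produces the same lists.

```python
import re

def split_first_field(line):
    """Split line once at the first period or colon, whichever comes first."""
    # [.:] matches either delimiter; maxsplit=1 stops after the first match,
    # so later periods or colons stay in the second element.
    return re.split(r"[.:]", line, maxsplit=1)
```

A line with neither delimiter comes back as a one-element list, so callers may want to check the length of the result before indexing into it.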
Upvotes: 1
Reputation: 11
Check the indexes of both characters, then use the lowest index to split your string.
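A minimal sketch of that idea, assuming the string should be split only once and that a line with neither character should come back whole. The name split_at_first and its delimiters parameter are made up for illustration.

```python
def split_at_first(line, delimiters=".:"):
    """Split line once at whichever delimiter character appears first."""
    # str.find() returns -1 when the character is absent, so keep only real hits.
    positions = [line.find(d) for d in delimiters]
    positions = [p for p in positions if p != -1]
    if not positions:
        # Neither delimiter is present: return the whole line as a single field.
        return [line]
    cut = min(positions)
    # Everything before the cut is the first field; skip the delimiter itself.
    return [line[:cut], line[cut + 1:]]
```

Using find() instead of index() avoids the ValueError when a character is missing, at the cost of having to filter out the -1 results before taking the minimum.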
Upvotes: 1