mdandr

Reputation: 1384

Split by the delimiter that comes first, Python

I have some unpredictable log lines that I'm trying to split.

The one thing I can predict is that the first field always ends with either a . or a :.

Is there any way I can automatically split the string at whichever delimiter comes first?

Upvotes: 0

Views: 668

Answers (2)

alexwlchan

Reputation: 6098

Find the positions of the . and : characters in the string with the str.index() method, then split on whichever one appears first.

Here’s a simple implementation:

def index_default(line, char):
    """Returns the index of a character in a line, or the length of the string
    if the character does not appear.
    """
    try:
        retval = line.index(char)
    except ValueError:
        retval = len(line)
    return retval

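def split_log_line(line):
    """Splits a line once, at either a period or a colon, depending on which
    appears first in the line.
    """
    # maxsplit=1 cuts only at the first occurrence, so any later
    # delimiters stay inside the remainder of the line.
    if index_default(line, ".") < index_default(line, ":"):
        return line.split(".", 1)
    else:
        return line.split(":", 1)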

I wrapped the index() method in an index_default() function because index() raises a ValueError if the line doesn’t contain the character, and I wasn’t sure whether every line in your log would contain both a period and a colon.

And then here’s a quick example:

mylines = [
    "line1.split at the dot",
    "line2:split at the colon",
    "line3:a colon preceded. by a dot",
    "line4-neither a colon nor a dot"
]

for line in mylines:
    print(split_log_line(line))

which prints

['line1', 'split at the dot']
['line2', 'split at the colon']
['line3', 'a colon preceded. by a dot']
['line4-neither a colon nor a dot']
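
As an alternative sketch (a different technique from the index comparison above, not something taken from the question), the standard-library re.split() can do the same thing in one call: a character class matches whichever delimiter comes first, and maxsplit=1 stops after the first cut. The function name split_log_line_re is made up for illustration.

```python
import re

def split_log_line_re(line):
    """Split a line once, at the first period or colon it contains."""
    # [.:] matches either delimiter; maxsplit=1 leaves later ones untouched.
    return re.split(r"[.:]", line, maxsplit=1)
```

A line with neither delimiter comes back as a one-element list, which matches the behaviour of split_log_line() above.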

Upvotes: 1

Chris Sjoblom

Reputation: 11

Check the indexes of both characters, then use the lowest index to split your string.
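
A minimal sketch of that idea, assuming the line should be cut only once and returned whole when neither character is present. The name split_at_first and its delimiters parameter are hypothetical, chosen here for illustration.

```python
def split_at_first(line, delimiters=".:"):
    """Split a line once, at whichever delimiter has the lowest index."""
    # str.find() returns -1 when the character is absent, so drop those.
    positions = [line.find(d) for d in delimiters]
    positions = [p for p in positions if p != -1]
    if not positions:
        return [line]  # no delimiter found: nothing to split
    cut = min(positions)
    # Slice around the delimiter so it is not included in either part.
    return [line[:cut], line[cut + 1:]]
```

Slicing at the lowest index, rather than calling split() on the winning character, keeps any later delimiters in the second field.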

Upvotes: 1
