Learner

Reputation: 19

Should I use regex or something else to extract log id from log files?

I have a list with the following data:

["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]

I want to use regular expressions in Python to get a list of tuples like this:

[("asdf", "mkol ghth"),("dfcf", "5566 7766"),("7uy7", "jhjh ffvf")]

I tried using re.split, but I get an error saying "too many values to unpack". Here is my code:

logTuples = [()]
for log in logList:
    (logid, logcontent) = re.split(r"(\s)", log)
    logTuples.append((logid, logcontent))

Upvotes: -2

Views: 56

Answers (2)

Gigaflop

Reputation: 390

From the documentation:

https://docs.python.org/3/library/re.html

\s

For Unicode (str) patterns: Matches Unicode whitespace characters (which includes [ \t\n\r\f\v], and also many other characters, for example the non-breaking spaces mandated by typography rules in many languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched.

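Each of your strings contains 2 whitespace characters, so splitting on \s produces 3 items. Because your pattern wraps \s in a capturing group, re.split also returns the separators themselves, which makes 5 items in total. Either way, the result cannot be unpacked into two names.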
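To make the count concrete, here is a small sketch of what re.split returns for one of the sample strings: with a capturing group, without one, and with maxsplit=1. The variable names are illustrative only and do not come from the question.

```python
import re

log = "asdf mkol ghth"

# Capturing group: the matched separators are returned too, giving 5 items.
with_group = re.split(r"(\s)", log)

# No capturing group: only the 3 fields are returned.
without_group = re.split(r"\s", log)

# maxsplit=1 stops after the first whitespace character, giving exactly
# 2 items, which can be unpacked into two names.
logid, logcontent = re.split(r"\s", log, maxsplit=1)

print(with_group)           # ['asdf', ' ', 'mkol', ' ', 'ghth']
print(without_group)        # ['asdf', 'mkol', 'ghth']
print((logid, logcontent))  # ('asdf', 'mkol ghth')
```

So if a regex is really wanted, limiting the number of splits is probably the smallest change to the original code.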

If all of your log entries have 3 items separated by spaces, and you always organize them as (1, 2 + ' ' + 3), you don't need to use a regex to format them as such:

logtuples = []
for log in loglist:
    splitlog = log.split(" ")  # 3 total elements
    # append takes a single argument, so wrap the pair in a tuple
    logtuples.append((splitlog[0], splitlog[1] + " " + splitlog[2]))

Upvotes: 0

Andrej Kesely

Reputation: 195543

Regex is overkill here:

l = ["asdf mkol ghth", "dfcf 5566 7766", "7uy7 jhjh ffvf"]

lst = [tuple(i.split(maxsplit=1)) for i in l]

print(lst)

Prints:

[('asdf', 'mkol ghth'), ('dfcf', '5566 7766'), ('7uy7', 'jhjh ffvf')]
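One caveat worth keeping in mind: if a line contains no whitespace at all, split(maxsplit=1) returns a single element and the resulting tuple has length 1. Below is a minimal sketch of one way to always get pairs, assuming an empty string is an acceptable placeholder for missing content. The helper name to_pair is hypothetical, not part of the question or of any library.

```python
def to_pair(line):
    """Split a log line into (logid, logcontent), padding with ''."""
    parts = line.split(maxsplit=1)
    # parts has 0, 1, or 2 elements depending on the input line
    logid = parts[0] if parts else ""
    logcontent = parts[1] if len(parts) > 1 else ""
    return (logid, logcontent)


logs = ["asdf mkol ghth", "dfcf", ""]
pairs = [to_pair(line) for line in logs]
print(pairs)  # [('asdf', 'mkol ghth'), ('dfcf', ''), ('', '')]
```

Whether padding or skipping such lines is the right choice depends on the log format, which the question does not specify.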

Upvotes: 0
