Reputation: 167
I have a script that parses log files, and at one point I need to know whether a message was transmitted. By reading these lines I can get the message ID and tell whether the message was transmitted or not.
01:09:25.258 mta Messages I Doc O:NVS:SMTP/[email protected] R:NVS:SMS/+654811 mid:6261
01:09:41.965 mta Messages I Rep 6261 OK, Message received(ID: 26)
08:14:14.469 mta Messages I Doc O:NVS:SMTP/[email protected] R:NVS:SMS/+654646 mid:6262
08:14:30.630 mta Messages I Rep O:NVS:SMTP/[email protected] R:NVS:SMS/+304859 mid:6262
08:14:30.630 mta Messages I Rep 6262 Error while transmitting (ID: 28)
The lines I'm interested in are the second and the last. I'd like to extract the 6261 and the OK after it, and likewise the ID and status from the last line.
Upvotes: 0
Views: 87
Reputation: 172249
You don't need a regexp. Just split the lines on whitespace.
>>> line.split(None, 5)
['10:56:45.255', 'Message', 'I', 'Rep', '2559', 'OK, Message received']
Since you only want the ID and message:
>>> [line.split(None, 5)[-2:] for line in file.readlines()]
[['2548', 'OK'], ['2559', 'OK, Message received'], ['2560', 'Error'], ['2561', 'Transmission... ']]
Note that the spaces in the message are NOT a problem: the split limit of 5 leaves the rest of the line in one piece.
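The sample lines in the question carry one more leading field ("mta Messages" rather than "Message") and include "Rep" lines that have no numeric ID, so the split limit and a small check need adjusting. Here is a minimal sketch of the same split approach under those assumptions; `parse_reply` is a hypothetical helper name, and the field positions are inferred only from the five lines shown.

```python
def parse_reply(line):
    """Return (message_id, status) for a report line, or None otherwise."""
    # Assumed layout: time, 'mta', 'Messages', 'I', 'Rep', id, status text.
    # maxsplit=6 keeps the whole status text, spaces included, in one field.
    fields = line.split(None, 6)
    if len(fields) < 7:
        return None
    # Skip 'Doc' lines and 'Rep' lines whose sixth field is not a numeric id.
    if fields[4] != "Rep" or not fields[5].isdigit():
        return None
    return fields[5], fields[6].rstrip()


log_lines = [
    "01:09:41.965 mta Messages I Rep 6261 OK, Message received(ID: 26)",
    "08:14:30.630 mta Messages I Rep 6262 Error while transmitting (ID: 28)",
]
for line in log_lines:
    print(parse_reply(line))
```

Whether a status counts as "transmitted" could then be decided with something like `status.startswith("OK")`, assuming every successful report begins with OK as in the sample.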
Upvotes: 5
Reputation: 41958
/[0-9]{4} (.*)/
would fit the purpose, but I don't know if that's generic enough for you. Depending on whether the line id (2548 etc.) can also be shorter the regexp would have to be adapted slightly, but from the 4 shown lines this would work.
When writing regular expressions, the most important thing is not to work from 'samples' alone, but also from 'usable assumptions' about the strings you are trying to match. I cannot reliably say this solution perfectly solves your problem because I don't know the entire problem, and as such cannot supply a perfect pattern.
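To make the point about assumptions concrete, here is a rough sketch in Python that writes one such assumption into the pattern: a report line contains the word "Rep" followed directly by an all-digit ID of any length, and the rest of the line is the status. That assumption comes only from the sample lines in the question, and `REPORT` and `match_report` are hypothetical names.

```python
import re

# Assumption: 'Rep', whitespace, a numeric id of any length, then the status.
REPORT = re.compile(r"\bRep\s+(\d+)\s+(.*)")


def match_report(line):
    """Return (message_id, status) if the line is a report, else None."""
    m = REPORT.search(line)
    if m is None:
        return None
    return m.group(1), m.group(2).strip()


print(match_report("01:09:41.965 mta Messages I Rep 6261 OK, Message received(ID: 26)"))
```

Anchoring on "Rep" matters for this log format: the unanchored four-digit pattern also matches the "Doc" lines, because the last four digits of the phone number are followed by a space.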
Upvotes: -1