Reputation: 11
My script below is supposed to return a result in this format:
[ {'heure':xxxx,'mid': xxxx,'type message': "e.g SMS.Message ", "Origine":xxx,"Destination":xxxx}]
It works without the 'Type Message' field, which I have just added, so I think the regex isn't correct. It also fails when I add a line that doesn't contain anything matching the regex, so I think I need a try/except, but I don't know how to write it.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import re
################################_Function EXTRACT_###############################################
def extraire(data):
    ms = re.match(r'(\S+).*mid:(\d+).*(R:NVS:\w+)', data) # heure, mid & type message
    k = re.findall(r"/\S+", data ) # extract source & destination
    return {'Heure':ms.group(1), 'mid':ms.group(2),'Type Message':ms.group(3),"Origine":k[0],"Destination":k[1]}
#################################################################################################
tableau = []
data3 = "12:07:32.546 mta Messages I Doc O:TCARVAL (NVS:SMTP/[email protected]) R:NVS:VOICE/+45154245 mid:6500"
data4 = "12:07:41.391 mta Messages I Rep O:TCARVAL (NVS:SMTP/[email protected]) R:NVS:**SMS.Message**/+39872422 mid:6500"
data5 = "12:07:32.546 mta Messages I Doc O:TCARVAL (NVS:VOICE/[email protected]) R:NVS:SMS.Message/+34659879 mid:6500"
data6 = "12:07:32.545 mta Messages I Doc O:TCARVAL [email protected] R:NVS:VOICE/01020150405 mid:9797"
data_list = [ data3, data4,data5, data6]
tableau = [extraire(data) for data in data_list]
print tableau
Upvotes: 1
Views: 87
Reputation: 92976
In your data, "mid" comes after "R:NVS", but your pattern expects it before, so the pattern has them in the wrong order:
12:07:32.546 mta Messages I Doc O:TCARVAL (NVS:SMTP/[email protected]) R:NVS:VOICE/+45154245 mid:6500
(1 = R:NVS:VOICE comes first, 2 = mid:6500 comes second)
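So you need to change the order in your pattern to something like this:
(\S+).*(R:NVS:\w+).*mid:(\d+)
With this order, group 2 is the message type and group 3 is the mid, so the group numbers in your return statement need to be swapped as well.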
Btw, what do you expect \S+ to match? Here it will match the first run of non-whitespace characters in the string.
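To make the effect of the reordering concrete, here is a small sketch. The sample line is hypothetical (the e-mail address is a placeholder, since the original one is not visible), and the `[\w.]+` variant at the end is only a suggestion for capturing a type that contains a dot, such as SMS.Message.

```python
import re

# Hypothetical sample line; the e-mail address is a placeholder.
line = ("12:07:32.546 mta Messages I Doc O:TCARVAL "
        "(NVS:SMTP/user@example.com) R:NVS:VOICE/+45154245 mid:6500")

# Original order: expects "mid:" before "R:NVS:", so nothing matches.
old = re.match(r'(\S+).*mid:(\d+).*(R:NVS:\w+)', line)
# Reordered: time, then the R:NVS type, then mid.
new = re.match(r'(\S+).*(R:NVS:\w+).*mid:(\d+)', line)

print(old)           # None
print(new.groups())  # ('12:07:32.546', 'R:NVS:VOICE', '6500')

# \w does not match ".", so a type like SMS.Message is cut at the dot.
sms_line = line.replace("VOICE", "SMS.Message")
short = re.match(r'(\S+).*(R:NVS:\w+).*mid:(\d+)', sms_line).group(2)
full = re.match(r'(\S+).*(R:NVS:[\w.]+).*mid:(\d+)', sms_line).group(2)
print(short)         # R:NVS:SMS
print(full)          # R:NVS:SMS.Message
```

The first group, \S+, simply grabs the leading timestamp because re.match anchors at the start of the string.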
Upvotes: 1
Reputation: 43235
Change your extraire function to this. You are trying to access properties on ms even when there is no match, and when there is no match, ms is None:
def extraire(data):
    ms = re.match(r'(\S+).*mid:(\d+).*(R:NVS:\w+)', data) # heure & mid
    print(str(ms))
    if ms is not None:
        k = re.findall(r"/\S+", data ) # source & destination extracte
        return {'Heure':ms.group(1), 'mid':ms.group(2),'Type Message':ms.group(3),"Origine":k[0],"Destination":k[1]}
    else:
        return {}
Btw, your regex does not seem to match the text you intend to match.
You may also get a "list index out of range" error if k does not contain as many elements as you are indexing.
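As a rough sketch of both approaches (an explicit check, and the try/except the question asks about), something like the following might work. It assumes the pattern is reordered so that the R:NVS part comes before mid, as it does in the sample lines; the names LINE_RE and extraire_try and the test lines are made up for illustration.

```python
import re

# Assumption: time first, then the R:NVS type, then mid (order of the log line).
LINE_RE = re.compile(r'(\S+).*(R:NVS:\w+).*mid:(\d+)')

def extraire(data):
    """Explicit checks: return {} when the line cannot be parsed."""
    ms = LINE_RE.match(data)
    k = re.findall(r"/\S+", data)  # source & destination candidates
    if ms is None or len(k) < 2:
        return {}
    return {'Heure': ms.group(1), 'mid': ms.group(3),
            'Type Message': ms.group(2),
            'Origine': k[0], 'Destination': k[1]}

def extraire_try(data):
    """Same result using try/except (hypothetical variant)."""
    try:
        ms = LINE_RE.match(data)
        k = re.findall(r"/\S+", data)
        # ms.group raises AttributeError if ms is None;
        # k[1] raises IndexError if fewer than two "/..." parts exist.
        return {'Heure': ms.group(1), 'mid': ms.group(3),
                'Type Message': ms.group(2),
                'Origine': k[0], 'Destination': k[1]}
    except (AttributeError, IndexError):
        return {}

# Hypothetical test lines (placeholder e-mail address).
good = ("12:07:32.546 mta Messages I Doc O:TCARVAL "
        "(NVS:SMTP/user@example.com) R:NVS:VOICE/+45154245 mid:6500")
one_slash = "12:07:32.545 mta Messages I Doc O:TCARVAL R:NVS:VOICE/01020150405 mid:9797"

print(extraire(good))
print(extraire(one_slash))       # {}
print(extraire_try("no match"))  # {}
```

Catching only AttributeError and IndexError, rather than using a bare except, keeps unrelated bugs from being silently swallowed.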
Upvotes: 1