maxashtar

Reputation: 21

Match pattern excluding a character

I have the following situation. The character ";" is used as a separator, but there are some unexpected ";" in the values, like valu;2 or va;ue4 in this string:

...;01;value1;02;valu;2;03;value3;04;va;ue4;....

The pattern \d\d;.{6}; returns all the blocks, but I would like to loop over each block and get True/False depending on whether ; is in the value .{6}. This way I will obtain 2 lists:

1. those having ; in the value .{6}

2. those not having ; in the value .{6}

The value isn't only alphanumeric; it can contain extra characters (* $ | ), but ; is not allowed in this use case.

I tried to add [^;] but without success.

How can I do this?

Thank you

Upvotes: 0

Views: 91

Answers (3)

Wiktor Stribiżew

Reputation: 626690

You can match those that contain no ; into one capturing group and those that have a ; into another. Then, you can check the captured group values to see what you actually match.

\d\d;(?:([^;\s]{6});|(\S{6});)

See the regex demo. Here, value1 and value3 are in Group 1, so those values contain no ;. valu;2 and va;ue4 are in Group 2, so they contain a ;: the two group patterns are identical except that Group 2 also allows ;, so a match in which Group 1 did not participate must contain one.

See the Python demo (in this variant the trailing ; is inside each capturing group, so it appears in the captured values):

import re
rx = r'\d\d;(?:([^;\s]{6};)|(\S{6};))'
myString = ';01;value1;02;valu;2;03;value3;04;va;ue4;' 
matches = re.findall(rx, myString)
# => [('value1;', ''), ('', 'valu;2;'), ('value3;', ''), ('', 'va;ue4;')]

list1 = [x for x,y in matches if x]
# => ['value1;', 'value3;']

list2 = [y for x,y in matches if y]
# => ['valu;2;', 'va;ue4;']
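Since the question asks for a True/False result per block, the same two-group idea can be sketched with re.finditer. This is only an illustration built on the first pattern above; the names my_string and flags are made up for the example.

```python
import re

# Group 1 captures a value without ';', group 2 a value that contains ';'.
rx = re.compile(r'\d\d;(?:([^;\s]{6});|(\S{6});)')
my_string = ';01;value1;02;valu;2;03;value3;04;va;ue4;'

# Pair each matched block with True when group 2 (the ';' variant) matched.
flags = [(m.group(0), m.group(2) is not None) for m in rx.finditer(my_string)]

for block, has_semicolon in flags:
    print(block, has_semicolon)
# 01;value1; False
# 02;valu;2; True
# 03;value3; False
# 04;va;ue4; True
```

An unmatched group returns None from m.group(n), which is why the check is `is not None` rather than a comparison with the empty string that re.findall would give.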

Upvotes: 1

maxashtar

Reputation: 21

This is solved thanks to the response of the fourth bird. I will use the 2nd pattern and loop over the list of blocks found with the 1st pattern, like this:

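import re

myString = ';01;value1;02;valu;2;03;value3;04;va;ue4;'
pattern1 = re.compile(r'\d\d;.{6};')     # every block, whatever the value contains
pattern2 = re.compile(r'\d\d;[^;]{6};')  # only blocks whose value has no ';'
listOfBlocks = pattern1.findall(myString)
listeOK = []  # blocks without ';' in the value
listeKO = []  # blocks with ';' in the value
for block in listOfBlocks:
    if pattern2.search(block):
        listeOK.append(block)
    else:
        listeKO.append(block)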

Upvotes: 0

JGomez

Reputation: 139

Values without ; can be obtained with this expression: \d\d;[^;]{6}

Values with ; can be obtained with this expression: \d\d;(?=[^;]{0,5};).{6}
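As a quick check, the two expressions can be run against the sample string from the question with re.findall. This is a minimal sketch; the variable names are chosen for the example, and note that neither expression consumes the trailing ; of a block.

```python
import re

my_string = ';01;value1;02;valu;2;03;value3;04;va;ue4;'

# Values with no ';' among the six characters after the two-digit key.
without_semicolon = re.findall(r'\d\d;[^;]{6}', my_string)

# The lookahead requires a ';' somewhere within the next six characters.
with_semicolon = re.findall(r'\d\d;(?=[^;]{0,5};).{6}', my_string)

print(without_semicolon)  # ['01;value1', '03;value3']
print(with_semicolon)     # ['02;valu;2', '04;va;ue4']
```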

Upvotes: 1
