Reputation: 6110
I have regex list that check for multiple column values. Columns are separated by tab delimiter. I have 32 columns that are required and another 12 optional. So if my file has 44 columns in total I would need to run entire list but if I have 32 only then I would like to make rest of the list optional. Here is my regex:
<cfset myRegex = "^[A-Za-z '-]{1,20}\t[A-Za-z '-]{1,20}\t(0?[1-9]|1[0-2])\/(0?[1-9]|1\d|2\d|3[01])\/(19|20)\d{2}\t([M|F])\t(0?[0-9]|1[0-2]|PK)\t[A-Za-z0-9 ]{0,50}\t[A-Za-z0-9 ]{0,50}\t\d{10}(\t[^\t]*){22}\t([0-9]{4})\t([^\t]*)\t([A-Za-z '-]{1,20}|NULL)\t([A-Za-z '-]{1,20}|NULL)\t([A-Za-z0-9 '-.;]{1,100}|NULL)\t([A-Za-z '-]{1,50}|NULL)\t(\b([A-Z]{2})\b|NULL)\t(\b\d{5}(-\d{4})?\b|NULL)\t(\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t(\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t(\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t(\d+|NULL)\t([A-Za-z '-]{1,50}|NULL)\t((([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}|NULL)\r?$">
I'm wondering if I can make these extra columns optional in my list that way I can use them only if I have 44 columns. I tried to put ?
after each optional column but that did not work, my regex was outputting false in that case.
Here is updated regex with the optional columns in (\t):
<cfset fileRegex = "^[A-Za-z '-]{1,20}\t[A-Za-z '-]{1,20}\t(0?[1-9]|1[0-2])\/(0?[1-9]|1\d|2\d|3[01])\/(19|20)\d{2}\t([M|F])\t(0?[0-9]|1[0-2]|PK)\t[A-Za-z0-9 ]{0,50}\t[A-Za-z0-9 ]{0,50}\t\d{10}(\t[^\t]*){22}\t([0-9]{4})\t(([^\t]*)\t)?(([A-Za-z '-]{1,20}|NULL)\t)?(([A-Za-z '-]{1,20}|NULL)\t)?(([A-Za-z0-9 '-.;]{1,100}|NULL)\t)?(([A-Za-z '-]{1,50}|NULL)\t)?((\b([A-Z]{2})\b|NULL)\t(\b\d{5}(-\d{4})?\b|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\d+|NULL)\t)?(([A-Za-z '-]{1,50}|NULL)\t)?(((([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}|NULL))?\r?$">
This did not work as well.
Example with the data:
<cfset myData = "Ruiz John 09/01/1984 M 00 Red Star player 3345678879 0 0 0 0 0 1 0 0 0 0 0 0 0 0 19234 1011 0089 01">
<cfif REFind(fileRegex,myData,true) GT 0>
true<br>
<cfelse>
false<br>
</cfif>
Upvotes: 2
Views: 73
Reputation: 626927
Your fixed regex is
^[A-Za-z '-]{1,20}\t[A-Za-z '-]{1,20}\t(0?[1-9]|1[0-2])\/(0?[1-9]|1\d|2\d|3[01])\/(19|20)\d{2}\t([M|F])\t(0?[0-9]|1[0-2]|PK)\t[A-Za-z0-9 ]{0,50}\t[A-Za-z0-9 ]{0,50}\t\d{10}(\t[^\t]*){22}\t([0-9]{4})\t[^\t]*(\t([A-Za-z '-]{1,20}|NULL)\t)?(([A-Za-z '-]{1,20}|NULL)\t)?(([A-Za-z0-9 '-.;]{1,100}|NULL)\t)?(([A-Za-z '-]{1,50}|NULL)\t)?((\b([A-Z]{2})\b|NULL)\t(\b\d{5}(-\d{4})?\b|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\(?([0-9]{3})\)?[-. ]?([0-9]{3})[-. ]?([0-9]{4})|NULL)\t)?((\d+|NULL)\t)?(([A-Za-z '-]{1,50}|NULL)\t)?(((([A-Za-z0-9]+_+)|([A-Za-z0-9]+\-+)|([A-Za-z0-9]+\.+)|([A-Za-z0-9]+\++))*[A-Za-z0-9]+@((\w+\-+)|(\w+\.))*\w{1,63}\.[a-zA-Z]{2,6}|NULL))?\r?$
The regex demo is here.
There was one \t
that needed moving from (([^\t]*)\t)?
to the (([A-Za-z '-]{1,20}|NULL)\t)?
(=> (\t([A-Za-z '-]{1,20}|NULL)\t)?
) optional group.
Upvotes: 2
Reputation: 106
wrap the optional column's regex (and one \t) with parenthesis () and then add ? at the end
Upvotes: 0