Reputation: 155
I am working with data from a DB which produces information on transactions.
The problem is that a transaction can have any number of related attributes, and the transaction details are repeated on a new line for each attribute.
Each line has the format:
[Transaction ID] [tab] [Attribute name] [tab] [Attribute value] [tab] [date]
Example:
11111 Amount 12000
11111 Reference 101010
11111 Operator John
11111 Subject Credit
11111 Notes XXXXXXXX
11112 Amount 75000
11112 Reference 202020
11112 Operator Will
I am trying to find a regex for EACH attribute that matches according to the following logic:
"Amount" - followed by TAB - followed by variable length number - followed by TAB
"Reference" - followed by TAB - followed by variable length number - followed by TAB
"Operator" - followed by TAB - followed by variable length string - followed by TAB
"Subject" - followed by TAB - followed by variable length string - followed by TAB
"Notes" - followed by TAB - followed by variable length string - followed by TAB
Upvotes: 1
Views: 100
Reputation: 2882
This answer is aimed at reading all the attributes that belong to the same transaction id in one match. Take a look at regex101.com
(?s) // dot matches newline
(?<tid>\d+) // transactionid
\t
(?:Amount\t(?<amount>\d+)) // amount
.\1\t // newline, transactionid, tab
(?:Reference\t(?<ref>\d+)) // reference
.\1\t // newline, transactionid, tab
(?:Operator\t(?<ope>\w+)) // operator
(?:.\1\t(?:Subject\t(?<sub>\w+)))? // possible subject
(?:.\1\t(?:Notes\t(?<not>\w+)))? // possible notes
(?!.\1\t) // negative lookahead: the next line must not start with the same transactionid
Put simply, the pattern keeps reading attributes until the transaction id changes.
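As a rough illustration, here is how the same idea might look in Python. This is only a sketch under a few assumptions: Python's `re` module spells named groups `(?P<name>...)` and refers back to them with `(?P=name)`, the closing lookahead is left out for brevity, the sample rows are the ones from the question without the date column, and the names `TRANSACTION_RE`, `SAMPLE`, and `parse_transactions` are made up for this example.

```python
import re

# One match per transaction: the id is captured once and every following
# line must repeat it via the (?P=tid) backreference.
TRANSACTION_RE = re.compile(
    r"""
    (?P<tid>\d+)\t                          # transaction id
    Amount\t(?P<amount>\d+)                 # amount
    .(?P=tid)\tReference\t(?P<ref>\d+)      # newline, same id, reference
    .(?P=tid)\tOperator\t(?P<ope>\w+)       # newline, same id, operator
    (?:.(?P=tid)\tSubject\t(?P<sub>\w+))?   # optional subject
    (?:.(?P=tid)\tNotes\t(?P<notes>\w+))?   # optional notes
    """,
    re.VERBOSE | re.DOTALL,  # DOTALL lets "." match the newline between rows
)

# Sample rows from the question, tab-separated (date column omitted).
SAMPLE = (
    "11111\tAmount\t12000\n"
    "11111\tReference\t101010\n"
    "11111\tOperator\tJohn\n"
    "11111\tSubject\tCredit\n"
    "11111\tNotes\tXXXXXXXX\n"
    "11112\tAmount\t75000\n"
    "11112\tReference\t202020\n"
    "11112\tOperator\tWill\n"
)


def parse_transactions(text):
    """Return one dict of named groups per transaction found in text."""
    return [m.groupdict() for m in TRANSACTION_RE.finditer(text)]


transactions = parse_transactions(SAMPLE)
for transaction in transactions:
    print(transaction)
```

Optional attributes that are missing, such as Subject and Notes for transaction 11112, come back as `None` in the resulting dict.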
Upvotes: 1
Reputation: 2882
A regex like this
(?<transactionid>\d+)\t(?<attribute>Amount|Reference|Operator|Subject|Notes)\t(?<value>\w+)
will do.
Look at regex101.com
Explanation:
(?<transactionid>\d+) // transaction id
\t // followed by tab
(?<attribute>Amount|Reference|Operator|Subject|Notes) // attribute
\t // followed by tab
(?<value>\w+) // value
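A minimal sketch of applying this pattern line by line in Python might look like the following. It assumes Python's `re` module, which writes named groups as `(?P<name>...)` rather than `(?<name>...)`, and it uses the sample rows from the question without the date column. The names `LINE_RE`, `SAMPLE`, and `parse_rows` are hypothetical.

```python
import re

# Same pattern as above, written with Python's (?P<name>...) group syntax.
LINE_RE = re.compile(
    r"(?P<transactionid>\d+)\t"
    r"(?P<attribute>Amount|Reference|Operator|Subject|Notes)\t"
    r"(?P<value>\w+)"
)

# Sample rows from the question, tab-separated (date column omitted).
SAMPLE = (
    "11111\tAmount\t12000\n"
    "11111\tReference\t101010\n"
    "11111\tOperator\tJohn\n"
    "11112\tAmount\t75000\n"
)


def parse_rows(text):
    """Return a (transaction id, attribute, value) tuple for every matching line."""
    return [
        (m.group("transactionid"), m.group("attribute"), m.group("value"))
        for m in LINE_RE.finditer(text)
    ]


rows = parse_rows(SAMPLE)
for row in rows:
    print(row)
```

Note that `\w+` stops at the first character that is not a letter, digit, or underscore, so a value containing spaces would only be captured up to the first space.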
Upvotes: 0