Reputation: 638
I am trying to write a regular expression to decode a string that uses either a comma or a semicolon as the delimiter, but I can't get it to work for commas, or for a mix of both. Please tell me what I'm missing or doing wrong. Thanks!
^(?<FADECID>\d{6})?(?<MSG>([a-z A-Z 0-9 ()-:]*[;,]{1}+){8,}+)?(?<ANCH>\w*[;,])?(?<TIME>\d{4})?(?<FM>\d{2})?[;,]?(?<CON>.*)$.*
Inbound strings to decode (I need to treat the comma and the semicolon the same):
383154VSC X1;;;;;;;BOTH WASTE DRAIN VLV NOT CLSD (135MG;35MG);HARD;093502
282151FCMC1 1;;;;;;;FUEL MAIN PUMP1 (121QA1);HARD;093502
732112EEC2B 1;;;;;;;FMU(E2-4071KS)WRG:EEC J12 TO FMV LVDT POS,HARD;
383154VSC X1,,,,,,,BOTH WASTE DRAIN VLV NOT CLSD (135MG,35MG),HARD,093502
282151FCMC1 1,,,,,,,FUEL MAIN PUMP1 (121QA1);HARD;093502
732112EEC2B 1,,,,,,,FMU(E2-4071KS)WRG:EEC J12 TO FMV LVDT POS,HARD,
383154VSC X1,,,,,,,BOTH WASTE DRAIN VLV NOT CLSD (135MG;35MG);HARD;093502
282151FCMC1 1;;;;;;;FUEL MAIN PUMP1 (121QA1),HARD,093502
732112EEC2B 1,,,,,,,FMU(E2-4071KS)WRG:EEC J12 TO FMV LVDT POS;HARD;
This string can contain multiple text messages separated by [;,]:
ABC;DEF;;HIJ;NNN;JJJ;XXX;EEX;HARD;
This part handles that: (?<MSG>([a-z A-Z 0-9 ()-:]*[;,]{1}+){8,}+)? but it doesn't seem to handle commas.
It works for ; but not for a comma or a mix of both. My problem is that the delimiter can be either a semicolon or a comma. If I change the regex to use only a comma, it works for comma-delimited strings. I know I'm missing a quantifier or something like that.
if ( null != MORE && ! MORE.isEmpty() ) {
    // Keep consuming records until nothing is left or the EOR marker is reached.
    // (The EOR test in the loop condition was redundant and could throw on null.)
    while ( null != MORE && ! MORE.isEmpty() ) {
        LOG.info("MORE CONTINUE: " + MORE);
        if ( MORE.trim().equals("EOR") ) {
            break;
        }
        String patternMoreString = "^(?<FADECID>\\d{6})?(?<MSG>([a-z A-Z 0-9 ()-:()]*[;,]{1}+){8,}+)+?(?<ANCH>\\w*[;,])?(?<TIME>\\d{4})?(?<FM>\\d{2})?[;,]?(?<CON>.*)$.*";
        Pattern patternMore = Pattern.compile(patternMoreString, Pattern.DOTALL);
        Matcher matcherMore = patternMore.matcher(MORE);
        while ( matcherMore.find() ) {
            // The remainder (CON) becomes the input for the next pass.
            MORE = matcherMore.group("CON");
            summary.setReportId("FLR");
            summary.setAreg(Areg);
            summary.setShip(Ship);
            summary.setOrig(Orig);
            summary.setDest(Dest);
            summary.setTimestamp(Ts);
            summary.setAta(matcherMore.group("FADECID"));
            summary.setTime(matcherMore.group("TIME"));
            summary.setFm(matcherMore.group("FM"));
            summary.setMsg(matcherMore.group("MSG"));
            serviceRecords.add(summary);
            LOG.info("*** A330 MPF MORE Record ***");
            LOG.info(summary.getReportId());
            LOG.info(summary.getAreg());
            LOG.info(summary.getShip());
            LOG.info(summary.getOrig());
            LOG.info(summary.getDest());
            LOG.info(summary.getTimestamp());
            LOG.info(summary.getAta());
            LOG.info(summary.getTime());
            LOG.info(summary.getFm());
            LOG.info(summary.getMsg());
            // Start a fresh record for the next match.
            summary = new A330PostFlightReportRecord();
        }
    }
}
}
//---
In all cases I need group 2 (MSG), plus TIME and FM when they exist.
Upvotes: 0
Views: 70
Reputation: 163287
You could use a capturing group and a backreference to that group's number to get consistent delimiters.
In this case the capturing group is ([;,]), which is the fourth group (referenced as \4) and matches either a semicolon (;) or a comma (,).
If you only need group 2 and, when they exist, TIME and FM, you can omit the ANCH group:
^(?<FADECID>\d{6})(?<MSG>([a-zA-Z0-9() -]*([;,])){7,})(?<TIME>\d{4})?(?<FM>\d{2})?\4?(?<CON>.*)$
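A minimal sketch of how the pattern above could be tried from Java, assuming each record arrives as its own string. The class name DelimiterDemo and the helper parse are hypothetical; the sample strings are taken from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DelimiterDemo {
    // The pattern from this answer, with backslashes doubled for a Java string literal.
    static final Pattern RECORD = Pattern.compile(
        "^(?<FADECID>\\d{6})(?<MSG>([a-zA-Z0-9() -]*([;,])){7,})"
        + "(?<TIME>\\d{4})?(?<FM>\\d{2})?\\4?(?<CON>.*)$");

    /** Returns {FADECID, MSG, TIME, FM}, or null when the record does not match. */
    static String[] parse(String record) {
        Matcher m = RECORD.matcher(record);
        if (!m.find()) {
            return null;
        }
        // Optional groups (TIME, FM) come back as null when they did not participate.
        return new String[] {
            m.group("FADECID"), m.group("MSG"), m.group("TIME"), m.group("FM")
        };
    }

    public static void main(String[] args) {
        String[] samples = {
            "383154VSC X1;;;;;;;BOTH WASTE DRAIN VLV NOT CLSD (135MG;35MG);HARD;093502",
            "383154VSC X1,,,,,,,BOTH WASTE DRAIN VLV NOT CLSD (135MG,35MG),HARD,093502",
            "282151FCMC1 1,,,,,,,FUEL MAIN PUMP1 (121QA1);HARD;093502",
        };
        for (String s : samples) {
            String[] fields = parse(s);
            System.out.println(fields == null ? "no match" : String.join(" | ", fields));
        }
    }
}
```

Two things to keep in mind when adapting this. In the question's character class, )-: is read as a range from ) to :, which includes the comma; placing - last, as done here, makes it a literal hyphen. Also, this class contains no colon, so for records such as the WRG:EEC one, MSG stops before that field; adding : to the class is one possible adjustment, not verified against the full data set.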
Explanation
^ : Start of string
(?<FADECID>\d{6}) : Named group FADECID, match 6 digits
(?<MSG> : Named group MSG
( : Capture group 3
[a-zA-Z0-9() -]* : Match 0+ times any of the listed characters
([;,]) : Capture group 4, used as backreference to get consistent delimiters
){7,} : Close group and repeat 7+ times
) : Close group MSG
(?<TIME>\d{4})? : Optional named group TIME, match 4 digits
(?<FM>\d{2})? : Optional named group FM, match 2 digits
\4? : Optional backreference to capture group 4
(?<CON>.*) : Named group CON, match any char except a newline 0+ times
$ : End of string
Note that group 3, the capture group itself, is repeated, so it holds only the value of the last iteration, which will be HARD followed by its delimiter (for example HARD;).
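A small sketch of the point in the note above: a repeated capture group retains only its last iteration, so reading the individual fields needs a second step. The reduced pattern, the class name RepeatedGroupDemo, and both helpers are hypothetical and only illustrate the behaviour. Splitting on [;,] is one possible way to recover every field, with the caveat that it also splits inside parentheses such as (135MG;35MG).

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Hypothetical reduced pattern: one repeated group of "text + delimiter".
    static final Pattern FIELDS = Pattern.compile("^([A-Z]*[;,])+$");

    /** Returns what the repeated group holds after a full match, or null if there is no match. */
    static String lastIteration(String input) {
        Matcher m = FIELDS.matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    /** Splits on either delimiter; the limit of -1 keeps empty and trailing fields. */
    static String[] allFields(String input) {
        return input.split("[;,]", -1);
    }

    public static void main(String[] args) {
        String input = "ABC;DEF;;HIJ,HARD;";
        // Only the final iteration of the repeated group is available here.
        System.out.println(lastIteration(input));
        // Every field, including the empty ones between consecutive delimiters.
        System.out.println(Arrays.toString(allFields(input)));
    }
}
```

If the individual messages matter, capturing the whole MSG group first and then splitting it keeps the main pattern unchanged.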
Upvotes: 1