Reputation: 3
I'm using Swift 4 and have a problem with regex capture groups.
I have this regex:
let regex = try! NSRegularExpression(pattern:"Tambah saldo via (merchant|transfer (online|bank)).(\\w+ \\.?)($)?(((\\w+\\.)?\\w+? (\\w+ )?\\.) No Rek:([0-9]+(.[0-9])*).[^0-9]+([0-9]+))?", options: .caseInsensitive)
and several possible responses:
let response1 = "Tambah saldo via transfer bank BCA PT.BARRYPRIMA CORP . No Rek:427.123.444 . Jumlah transfer Rp.200864."
let response2 = "Tambah saldo via transfer online Bersama BARRYPRIMA CORP . No Rek:123456. Jumlah transfer Rp.200864."
let response3 = "Tambah saldo via transfer bank Mandiri BARRYPRIMA . No Rek:43512343598347. Jumlah transfer Rp.200864."
let response4 = "Tambah saldo via merchant Tito ."
let response5 = "Tambah saldo via merchant Bagaskara putra."
I want to extract multiple values from each response and store them in a [String]. The values I'm looking for are:
1: "transfer bank" , "BCA" , "PT.BARRYPRIMA CORP","427.123.444","200864"
2: "transfer online", "Bersama", "BARRYPRIMA CORP", "123456","200864"
3: "transfer bank" , "Mandiri" , "BARRYPRIMA","43512343598347","200864"
4: "merchant","tito"
5: "merchant","bagaskara putra"
I've tried the function from this one
and got this error:
Thread 1: Fatal error: subscript: subrange extends past String end
Upvotes: 0
Views: 214
Reputation: 47896
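First, you need to modify your regex pattern to get the captures you describe.
Suppress the captures you do not need by using non-capturing groups, (?:...).
Your pattern also has some faults. For example, the dot in ([0-9]+(.[0-9])*) is unescaped, so it matches any character rather than a literal dot, and the inner group takes only one digit at a time. ([0-9]+(?:\\.[0-9]+)*) matches 427.123.444 as intended.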
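To make the two points above concrete, here is a small sketch. The helper inspect(_:in:) is a hypothetical name used only for illustration; it reports how many capture groups a pattern declares and what the whole match is. The anchors ^ and $ are added here only so the difference between the loose and the strict dot is visible.

```swift
import Foundation

// Hypothetical helper for illustration: returns the number of capture groups
// declared by the pattern and the text of the whole first match (nil if none).
func inspect(_ pattern: String, in text: String) -> (groups: Int, whole: String?) {
    let regex = try! NSRegularExpression(pattern: pattern)
    let fullRange = NSRange(text.startIndex..., in: text)
    let whole = regex.firstMatch(in: text, options: [], range: fullRange)
        .flatMap { Range($0.range, in: text) }
        .map { String(text[$0]) }
    return (regex.numberOfCaptureGroups, whole)
}

// (online|bank) is a capture group; (?:online|bank) only groups, it does not capture.
let capturing = inspect("transfer (online|bank)", in: "transfer bank")
let nonCapturing = inspect("transfer (?:online|bank)", in: "transfer bank")
print(capturing.groups, nonCapturing.groups)

// An unescaped dot accepts any character, so "427x1" passes the loose pattern.
let loose = inspect("^[0-9]+(.[0-9])*$", in: "427x1")
// An escaped dot accepts only ".", so "427x1" is rejected and "427.123.444" passes.
let strictBad = inspect("^[0-9]+(?:\\.[0-9]+)*$", in: "427x1")
let strictGood = inspect("^[0-9]+(?:\\.[0-9]+)*$", in: "427.123.444")
print(loose.whole as Any, strictBad.whole as Any, strictGood.whole as Any)
```

Every group you turn into (?:...) removes one slot from the result, which keeps the returned array limited to the values you actually want.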
I got this:
let regex = try! NSRegularExpression(pattern:"Tambah saldo via (merchant|transfer (?:online|bank)).(?:(\\w+(?:\\s*\\w+)*)\\.$|(\\w+)\\s+)(?:(?:((?:\\w+\\.)?\\w+?(?:\\s+\\w+)?)\\s+\\.) No Rek:([0-9]+(?:\\.[0-9]+)*)\\s*\\.[^0-9]+([0-9]+))?", options: .caseInsensitive)
But you may need to modify some more parts, because I do not understand how .
or works.
And I do not understand why this input:
let response5 = "Tambah saldo via merchant Bagaskara putra."
generates:
5: "merchant","bagaskara putra"
Shouldn't it be "merchant", "Bagaskara", "putra"? (Three elements, I mean.)
But your question is about how to return multiple captures into a String Array, not about finding the right pattern for your purpose.
So, I use the regex above for testing.
The reason you get Fatal error: subscript: subrange extends past String end is that a capture group returns NSRange(location: NSNotFound, length: 0) when it did not take part in the match. (NSNotFound is a huge Int.) This happens when the capture group sits inside a part of the pattern that is made optional with ? or |.
To convert NSRange to Range<String.Index>, we can use Range(nsRange, in: str), which returns nil instead of crashing in such cases.
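A minimal sketch of that behaviour, using a deliberately tiny pattern (not the one from the question) in which only one of two groups can take part in a match:

```swift
import Foundation

// Two alternatives, each with its own capture group: only one can participate.
let demoRegex = try! NSRegularExpression(pattern: "(a)|(b)")
let demoText = "b"
let demoMatch = demoRegex.firstMatch(in: demoText, options: [],
                                     range: NSRange(demoText.startIndex..., in: demoText))!

let unmatched = demoMatch.range(at: 1)  // group 1 did not take part in the match
let matched = demoMatch.range(at: 2)    // group 2 matched "b"

// The unmatched group reports NSNotFound as its location...
print(unmatched.location == NSNotFound)
// ...and the failable Range initializer turns that into nil instead of trapping.
print(Range(unmatched, in: demoText) == nil)
// A participating group converts to a usable Range<String.Index>.
let matchedText = Range(matched, in: demoText).map { String(demoText[$0]) }
print(matchedText as Any)
```

Subscripting the string with offsets taken directly from the unmatched NSRange is what leads to the crash, so the conversion through Range(_:in:) is the step that matters.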
So, you can write something like this:
import Foundation

extension String {
    /// Returns the text of every capture group that took part in the first match.
    func capturedGroups(withRegex regex: NSRegularExpression) -> [String] {
        guard let match = regex.firstMatch(in: self, options: [], range: NSRange(0..<utf16.count)) else {
            return []
        }
        let lastRangeIndex = match.numberOfRanges - 1
        guard lastRangeIndex >= 1 else { return [] }
        // Range(_:in:) returns nil for groups that did not match, so compactMap skips them.
        return (1...lastRangeIndex)
            .compactMap { Range(match.range(at: $0), in: self) }
            .map { String(self[$0]) }
    }
}
With the extension above, you get these outputs:
let response1 = "Tambah saldo via transfer bank BCA PT.BARRYPRIMA CORP . No Rek:427.123.444 . Jumlah transfer Rp.200864."
let response2 = "Tambah saldo via transfer online Bersama BARRYPRIMA CORP . No Rek:123456. Jumlah transfer Rp.200864."
let response3 = "Tambah saldo via transfer bank Mandiri BARRYPRIMA . No Rek:43512343598347. Jumlah transfer Rp.200864."
let response4 = "Tambah saldo via merchant Tito ."
let response5 = "Tambah saldo via merchant Bagaskara putra."
[response1, response2, response3, response4, response5].forEach { str in
    print(str.capturedGroups(withRegex: regex))
}
Outputs:
["transfer bank", "BCA", "PT.BARRYPRIMA CORP", "427.123.444", "200864"]
["transfer online", "Bersama", "BARRYPRIMA CORP", "123456", "200864"]
["transfer bank", "Mandiri", "BARRYPRIMA", "43512343598347", "200864"]
["merchant", "Tito"]
["merchant", "Bagaskara putra"]
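Note that compactMap drops the groups that did not match, so the index of a value in the array does not always correspond to its group number in the pattern. If you need fixed positions, one option is to keep a nil slot for each unmatched group. The following is only a sketch of that alternative; capturedGroupsKeepingPositions is a hypothetical name, and the small pattern is for illustration only.

```swift
import Foundation

extension String {
    // Hypothetical variant: one slot per capture group, nil where the group
    // did not take part in the match, so array index + 1 == group number.
    func capturedGroupsKeepingPositions(withRegex regex: NSRegularExpression) -> [String?] {
        guard let match = regex.firstMatch(in: self, options: [],
                                           range: NSRange(startIndex..., in: self)) else {
            return []
        }
        // Range 0 is the whole match, so skip it and keep groups 1...n.
        return (0..<match.numberOfRanges).dropFirst().map { index in
            Range(match.range(at: index), in: self).map { String(self[$0]) }
        }
    }
}

// Illustration-only pattern with three groups; group 1 excludes groups 2 and 3.
let positionalRegex = try! NSRegularExpression(pattern: "(merchant)|(transfer) (online|bank)")
let transferGroups = "transfer bank".capturedGroupsKeepingPositions(withRegex: positionalRegex)
let merchantGroups = "merchant".capturedGroupsKeepingPositions(withRegex: positionalRegex)
print(transferGroups)
print(merchantGroups)
```

Whether you prefer [String] or [String?] depends on how you consume the result: the compact form is convenient for display, while the positional form is safer when you read values by group number.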
Upvotes: 1