Dzily

Reputation: 3

Regex with multiple capture groups in Swift

I'm using Swift 4 and have a problem with regex capture groups.

I have this regex:

let regex = try! NSRegularExpression(pattern:"Tambah saldo via (merchant|transfer (online|bank)).(\\w+ \\.?)($)?(((\\w+\\.)?\\w+? (\\w+ )?\\.) No Rek:([0-9]+(.[0-9])*).[^0-9]+([0-9]+))?", options: .caseInsensitive)
   

and several possible responses:

let response1 = "Tambah saldo via transfer bank BCA PT.BARRYPRIMA CORP . No Rek:427.123.444 . Jumlah transfer Rp.200864."
let response2 = "Tambah saldo via transfer online Bersama BARRYPRIMA CORP . No Rek:123456. Jumlah transfer Rp.200864."
let response3 = "Tambah saldo via transfer bank Mandiri BARRYPRIMA . No Rek:43512343598347. Jumlah transfer Rp.200864."
let response4 = "Tambah saldo via merchant Tito ."
let response5 = "Tambah saldo via merchant Bagaskara putra."

I want to extract multiple values from each response and store them in a [String]. The values I'm looking for are:

1: "transfer bank" , "BCA" , "PT.BARRYPRIMA CORP","427.123.444","200864"

2: "transfer online", "Bersama", "BARRYPRIMA CORP", "123456","200864"

3: "transfer bank" , "Mandiri" , "BARRYPRIMA","43512343598347","200864"

4: "merchant","tito"

5: "merchant","bagaskara putra"

I've tried the function from this one

and get this error:

Thread 1: Fatal error: subscript: subrange extends past String end

Upvotes: 0

Views: 214

Answers (1)

OOPer

Reputation: 47896

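First, you need to modify your regex pattern to get the captures you describe.

  • Suppress unwanted captures by using non-capturing groups, (?:...)

  • Your pattern has some faults. For example, in ([0-9]+(.[0-9])*) the unescaped . matches any character and [0-9] matches only a single digit, so it does not describe a number like 427.123.444 as intended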
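The two points above can be tried on a smaller pattern. This is only a sketch: the reduced pattern and the names `accountRegex`, `sample`, `groupCount` and `accountNumber` are made up for illustration and are not part of the question. It shows that a `(?:...)` group is not counted as a capture and that `\\.` matches a literal dot.

```swift
import Foundation

// Hypothetical mini-pattern (not the full one from the question).
// (?:...) groups without capturing; \\. matches a literal dot.
let accountRegex = try! NSRegularExpression(
    pattern: "No Rek:([0-9]+(?:\\.[0-9]+)*)",
    options: .caseInsensitive)

let sample = "No Rek:427.123.444 . Jumlah transfer Rp.200864."
let sampleRange = NSRange(sample.startIndex..., in: sample)

let accountMatch = accountRegex.firstMatch(in: sample, options: [], range: sampleRange)

// Range 0 is the whole match, so one capture group gives numberOfRanges == 2.
let groupCount = accountMatch?.numberOfRanges ?? 0

// Convert the NSRange of group 1 to a Swift range, then to a String.
let accountNumber = accountMatch
    .flatMap { Range($0.range(at: 1), in: sample) }
    .map { String(sample[$0]) }

print(groupCount)                     // 2
print(accountNumber ?? "no match")    // 427.123.444
```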

I got this:

let regex = try! NSRegularExpression(pattern:"Tambah saldo via (merchant|transfer (?:online|bank)).(?:(\\w+(?:\\s*\\w+)*)\\.$|(\\w+)\\s+)(?:(?:((?:\\w+\\.)?\\w+?(?:\\s+\\w+)?)\\s+\\.) No Rek:([0-9]+(?:\\.[0-9]+)*)\\s*\\.[^0-9]+([0-9]+))?", options: .caseInsensitive)

But you may need to modify some more parts, because I do not understand how the . and the spaces around it are supposed to work in your input.

And I do not understand why this input:

let response5 = "Tambah saldo via merchant Bagaskara putra."

should generate:

5: "merchant","bagaskara putra"

Shouldn't it be "merchant", "Bagaskara", "putra"? (Three elements, I mean.)


But your question is about how to return multiple captures as an array of String, not about finding the right pattern for your purpose.

So I will use the regex above for testing.


You get Fatal error: subscript: subrange extends past String end because a capture group that did not take part in the match returns NSRange(location: NSNotFound, length: 0). (NSNotFound is a huge Int, equal to Int.max.)

This happens when the capture group sits inside an optional part of the pattern, for example a part made optional with ? or one branch of an alternation with |.
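A small way to see this, assuming a made-up two-branch pattern (`branchRegex`, `text` and the other names here are hypothetical): each match can fill only one of the two groups, so the other one reports NSNotFound.

```swift
import Foundation

// Hypothetical pattern: only one of the two groups can take part in a match.
let branchRegex = try! NSRegularExpression(pattern: "(a)|(b)", options: [])
let text = "b"
let fullRange = NSRange(location: 0, length: text.utf16.count)

let branchMatch = branchRegex.firstMatch(in: text, options: [], range: fullRange)!

let unmatched = branchMatch.range(at: 1)   // group 1 did not take part
let matched = branchMatch.range(at: 2)     // group 2 matched "b"

print(unmatched.location == NSNotFound)    // true
print(unmatched.length)                    // 0
print(matched.location, matched.length)    // 0 1
```

Passing such a range to an NSString-style subscript or substring call is what triggers the crash.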

We can use Range(nsRange, in: str) to convert an NSRange to a Range<String.Index>. In such cases it returns nil instead of crashing.
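As a quick check of that behaviour, here is a sketch with two hand-built ranges (the names `valid`, `notFound`, `validText` and `missingRange` are only for illustration):

```swift
import Foundation

let str = "Tambah saldo via merchant Tito ."

// A range that lies inside the string, and the "not found" range
// that an unmatched capture group reports.
let valid = NSRange(location: 0, length: 6)
let notFound = NSRange(location: NSNotFound, length: 0)

// Range(_:in:) is a failable initializer, so the result is optional.
let validText = Range(valid, in: str).map { String(str[$0]) }
let missingRange = Range(notFound, in: str)

print(validText ?? "nil")      // Tambah
print(missingRange == nil)     // true
```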

So, you can write something like this:

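import Foundation

extension String {
    /// Returns the text of every capture group that took part in the first match.
    /// Groups that did not match are skipped, so the result can be shorter
    /// than the number of groups in the pattern.
    func capturedGroups(withRegex regex: NSRegularExpression) -> [String] {
        guard let match = regex.firstMatch(in: self, options: [], range: NSRange(0..<utf16.count)) else {
            return []
        }

        // Range 0 is the whole match; capture groups start at index 1.
        let lastRangeIndex = match.numberOfRanges - 1
        guard lastRangeIndex >= 1 else { return [] }

        // Range(_:in:) returns nil for unmatched groups, and compactMap drops those.
        return (1...lastRangeIndex).compactMap {Range(match.range(at: $0), in: self)}
            .map {String(self[$0])}
    }
}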

Using the extension above, you get these outputs:

let response1 = "Tambah saldo via transfer bank BCA PT.BARRYPRIMA CORP . No Rek:427.123.444 . Jumlah transfer Rp.200864."
let response2 = "Tambah saldo via transfer online Bersama BARRYPRIMA CORP . No Rek:123456. Jumlah transfer Rp.200864."
let response3 = "Tambah saldo via transfer bank Mandiri BARRYPRIMA . No Rek:43512343598347. Jumlah transfer Rp.200864."
let response4 = "Tambah saldo via merchant Tito ."
let response5 = "Tambah saldo via merchant Bagaskara putra."
[response1, response2, response3, response4, response5].forEach { str in
    print(str.capturedGroups(withRegex: regex))
}

Outputs:

["transfer bank", "BCA", "PT.BARRYPRIMA CORP", "427.123.444", "200864"]
["transfer online", "Bersama", "BARRYPRIMA CORP", "123456", "200864"]
["transfer bank", "Mandiri", "BARRYPRIMA", "43512343598347", "200864"]
["merchant", "Tito"]
["merchant", "Bagaskara putra"]
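Note that because unmatched groups are dropped, the position of a value in the result does not tell you which group produced it. If that matters, one possible variant keeps a slot for every group and uses nil for the ones that did not match. This is only a sketch: the method name `capturedGroupsKeepingPositions` and the small test pattern are hypothetical, not part of the code above.

```swift
import Foundation

extension String {
    // Hypothetical variant: one slot per capture group,
    // nil marks a group that did not take part in the match.
    func capturedGroupsKeepingPositions(withRegex regex: NSRegularExpression) -> [String?] {
        let whole = NSRange(startIndex..., in: self)
        guard let match = regex.firstMatch(in: self, options: [], range: whole),
              match.numberOfRanges > 1 else {
            return []
        }
        return (1..<match.numberOfRanges).map { index in
            // Range(_:in:) yields nil for {NSNotFound, 0}; keep that nil in place.
            Range(match.range(at: index), in: self).map { String(self[$0]) }
        }
    }
}

// Made-up pattern with three groups; only group 1 matches this input.
let positionRegex = try! NSRegularExpression(
    pattern: "(merchant)|(transfer) (\\w+)",
    options: .caseInsensitive)
let slots = "via merchant Tito".capturedGroupsKeepingPositions(withRegex: positionRegex)

print(slots)   // [Optional("merchant"), nil, nil]
```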

Upvotes: 1
