KANAYO AUGUSTIN UG

Reputation: 2198

Split String or Substring with Regex pattern in Swift

To start with: I want to split a String or Substring on any character that is not a letter, a digit, @ or #. In other words, I want to split on whitespace (spaces and line breaks) and on special characters or symbols, excluding @ and #.

In Android Java, I am able to achieve this with:

String[] textArr = text.split("[^\\w_#@]");

Now I want to do the same in Swift. I added extensions to the String and Substring types:

extension String {}
extension Substring {}

In both extensions, I added a method that returns an array of Substring:

func splitWithRegex(by regexStr: String) -> [Substring] {
    //let string = self (for String extension) | String(self) (for Substring extension)
    let regex = try! NSRegularExpression(pattern: regexStr)
    let range = NSRange(string.startIndex..., in: string)
    return regex.matches(in: string, options: .anchored, range: range)
        .map { match -> Substring in
            let range = Range(match.range(at: 1), in: string)!
            return string[range]
    }
}

When I tried to use it (I only tested with a Substring, but I expect String to give the same result):

let textArray = substring.splitWithRegex(by: "[^\\w_#@]")
print("substring: \(substring)")
print("textArray: \(textArray)")

This is the output:

substring: This,is a #random @text written for debugging
textArray: []

Can someone please help me? I don't know whether the problem is in my regex [^\\w_#@] or in the splitWithRegex method.

Upvotes: 0

Views: 652

Answers (1)

vadian

Reputation: 285290

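The main reason the code doesn't work is range(at: 1), which returns the range of the first capture group, but the pattern does not capture anything. The .anchored option is a second problem: it only allows a match at the very start of the search range, so no matches are found at all. The code below drops it.

With just range, the regex returns the ranges of the matches themselves (the separators), but I suppose you want the characters between them.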
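To see what the regex actually matches, here is a minimal sketch. The helper name separatorMatches is made up for illustration and is not part of the question or of Foundation; only the NSRegularExpression calls are real API. It collects the text of every match, first without options and then with .anchored as in the question.

```swift
import Foundation

// Hypothetical helper (name is an assumption): returns the text of every match,
// i.e. the separators themselves, not the pieces between them.
func separatorMatches(in text: String,
                      pattern: String,
                      options: NSRegularExpression.MatchingOptions = []) -> [String] {
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
    let fullRange = NSRange(text.startIndex..., in: text)
    return regex.matches(in: text, options: options, range: fullRange).compactMap { match in
        // match.range is the whole match; range(at: 1) would need a capture group.
        guard let range = Range(match.range, in: text) else { return nil }
        return String(text[range])
    }
}

let sample = "This,is a #random"
print(separatorMatches(in: sample, pattern: "[^\\w_#@]"))
// [",", " ", " "]
print(separatorMatches(in: sample, pattern: "[^\\w_#@]", options: .anchored))
// [] because the first character "T" is not a separator
```

The second call reproduces the empty array from the question: with .anchored, a match is only accepted at the start of the search range.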

To accomplish that you need a running index that starts at the first character. In the map closure, return the string from the current index to the lowerBound of the match range, then set the index to its upperBound. Finally, manually append the string from the upperBound of the last match to the end.

The Substring type is a helper type for slicing strings. It should not be used beyond a temporary scope, which is why the method returns [String].

extension String {
    func splitWithRegex(by regexStr: String) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: regexStr) else { return [] }
        let range = NSRange(startIndex..., in: self)
        var index = startIndex
        var array = regex.matches(in: self, range: range)
            .map { match -> String in
                let range = Range(match.range, in: self)!
                let result = self[index..<range.lowerBound]
                index = range.upperBound
                return String(result)
            }
        array.append(String(self[index...]))
        return array
    }
}

let text = "This,is a #random @text written for debugging"
let textArray = text.splitWithRegex(by: "[^\\w_#@]")
print(textArray) // ["This", "is", "a", "#random", "@text", "written", "for", "debugging"]
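The question asked for the method on both String and Substring. One way to get that without writing it twice is to extend StringProtocol, which both types conform to. The following is only a sketch under that assumption: the name regexSplit and the omittingEmpty parameter are invented for illustration and are not part of the answer above or of any library.

```swift
import Foundation

extension StringProtocol {
    // Hypothetical helper (name and parameter are assumptions).
    // Works for String and Substring alike and always returns [String].
    func regexSplit(by pattern: String, omittingEmpty: Bool = false) -> [String] {
        let string = String(self) // copy once so the NSRange conversions use a plain String
        guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }
        let fullRange = NSRange(string.startIndex..., in: string)
        var index = string.startIndex
        var pieces: [String] = []
        for match in regex.matches(in: string, range: fullRange) {
            guard let range = Range(match.range, in: string) else { continue }
            // Text between the previous separator and this one.
            pieces.append(String(string[index..<range.lowerBound]))
            index = range.upperBound
        }
        // Text after the last separator.
        pieces.append(String(string[index...]))
        return omittingEmpty ? pieces.filter { !$0.isEmpty } : pieces
    }
}

let sampleText = "This,is a #random @text"
let tail: Substring = sampleText.dropFirst(5) // "is a #random @text"
print(tail.regexSplit(by: "[^\\w_#@]"))
// ["is", "a", "#random", "@text"]
print("a, b".regexSplit(by: "[^\\w_#@]"))
// ["a", "", "b"]
print("a, b".regexSplit(by: "[^\\w_#@]", omittingEmpty: true))
// ["a", "b"]
```

Note that two adjacent separators (such as a comma followed by a space) produce an empty string between them, just as with Java's split; the omittingEmpty flag filters those out.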

However, macOS 13 and iOS 16 introduce a new API that is quite similar to the Java one:

let text = "This,is a #random @text written for debugging"
let textArray = Array(text.split(separator: /[^\w_#@]/))
print(textArray)

The forward slashes indicate a regex literal.
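If the pattern only exists as a string at runtime, or the bare-slash literal is not enabled in your build settings (depending on the toolchain and language mode it may need to be switched on), the same split can be written with a Regex built from a string. This is a sketch, and the function name splitTokens is made up for illustration; Regex(_:) and split(separator:) are the standard library API from Swift 5.7.

```swift
// Hypothetical wrapper (name is an assumption) around the Swift 5.7 Regex API.
@available(macOS 13.0, iOS 16.0, *)
func splitTokens(_ text: String) -> [String] {
    // Regex(_:) throws if the pattern is invalid; fall back to the unsplit text.
    guard let separator = try? Regex("[^\\w_#@]") else { return [text] }
    // split(separator:) omits empty subsequences by default and returns [Substring],
    // so each piece is converted to String before it leaves this scope.
    return text.split(separator: separator).map { String($0) }
}

if #available(macOS 13.0, iOS 16.0, *) {
    print(splitTokens("This,is a #random @text written for debugging"))
    // ["This", "is", "a", "#random", "@text", "written", "for", "debugging"]
}
```

Unlike the NSRegularExpression version, this variant drops the empty strings between adjacent separators, because omittingEmptySubsequences defaults to true.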

Upvotes: 3
