Sam

Reputation: 47

Removing only numbers at beginning of sentences in string - Swift

I need to remove numbers from within a string, but only if those numbers are at the beginning of a sentence. I don't want to remove numbers within sentences. For instance:

1There were 101 dalmatians in the room. 2 They had 2 parents. [new line]3 The parents were named Pongo and Perdita.

In the above text, I want to remove the numbers at the beginning of each sentence, whether there is a space after the number or not, including if the number is the first character on a new line. So, the text in the string needs to become:

There were 101 dalmatians in the room. They had 2 parents. [new line]The parents were named Pongo and Perdita.

Thanks for your help!

Upvotes: 1

Views: 452

Answers (3)

Rob

Reputation: 437562

You can enumerate the sentences, use a regular expression to trim the leading numbers, and build the final string.

E.g.

let string = """
    1There were 101 dalmatians in the room. 2 They had 2 parents.
    3 The parents were named Pongo and Perdita.
    """

var result: String = ""
string.enumerateSubstrings(in: string.startIndex..., options: .bySentences) { substring, _, _, _ in
    guard
        let trimmed = substring?.replacingOccurrences(of: #"^\d+\s*"#, with: "", options: .regularExpression)
    else { return }
    result.append(trimmed)
}
print(result)

There were 101 dalmatians in the room. They had 2 parents.
The parents were named Pongo and Perdita.

There are lots of permutations on the regex pattern. E.g. if you used #"^\d+\.?\)?\s*"#, it would also handle cases like “1. This is a test!” or “1) This is a test.” It just depends upon what variations you want to handle. But if you're just looking for digits only, with or without spaces, then #"^\d+\s*"# should be fine.
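As a rough sketch of how the two patterns differ, the snippet below applies the extended pattern to a few single sentences. The helper name `trimmingLeadingMarker` and the sample strings are made up for illustration; only the pattern itself comes from the text above.

```swift
import Foundation

// Hypothetical helper: strips a leading number, plus an optional "." or ")",
// plus any whitespace that follows, from the start of one sentence.
func trimmingLeadingMarker(_ sentence: String) -> String {
    sentence.replacingOccurrences(
        of: #"^\d+\.?\)?\s*"#,
        with: "",
        options: .regularExpression
    )
}

// Illustrative inputs only.
let samples = [
    "1. This is a test!",
    "1) This is a test.",
    "2 They had 2 parents.",
    "There were 101 dalmatians."
]

for sample in samples {
    print(trimmingLeadingMarker(sample))
}
```

Because the pattern is anchored with `^`, numbers later in the sentence (such as the 101 in the last sample) are left alone.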

Upvotes: 3

Leo Dabus

Reputation: 236360

Just for fun: you can enumerate the sentences, collect all the sentence ranges, and then replace those ranges in the original string. To remove the numbers from the beginning of each sentence, you can use the collection method drop(while:) to drop all leading non-letter characters from your substrings (sentences). The ranges are replaced in reverse order so that the earlier ranges stay valid while the string is mutated:

extension Bool {
    var negated: Bool { !self }
}

var string = """
1There were 101 dalmatians in the room. 2 They had 2 parents.
3 The parents were named Pongo and Perdita.
"""
var ranges: [Range<String.Index>] = []

string.enumerateSubstrings(in: string.startIndex..., options: .bySentences) { _, range, _, _ in
    ranges.append(range)
}

for range in ranges.reversed() {
    string.replaceSubrange(range, with: string[range].drop(while: \.isLetter.negated))
}

string  // "There were 101 dalmatians in the room. They had 2 parents.\nThe parents were named Pongo and Perdita."
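One thing to keep in mind is that dropping every non-letter also removes leading punctuation such as an opening parenthesis or quote. A minimal sketch of the difference, using a made-up sentence and an assumed alternative predicate that drops only digits and whitespace:

```swift
// Made-up sample sentence for illustration.
let sentence = "3 (Pongo) barked."

// Dropping every leading non-letter also removes the opening parenthesis.
let lettersOnly = String(sentence.drop(while: { !$0.isLetter }))

// Assumed alternative: drop only leading digits and whitespace.
let digitsOnly = String(sentence.drop(while: { $0.isNumber || $0.isWhitespace }))

print(lettersOnly)  // Pongo) barked.
print(digitsOnly)   // (Pongo) barked.
```

Which predicate is appropriate depends on whether sentences in your text can start with punctuation that should be kept.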

Upvotes: 1

Joakim Danielson

Reputation: 51945

Here is a not-so-fancy solution that uses a basic for loop to go through each character of the string, with a boolean to keep track of whether we are currently checking for a digit at the start of a sentence.

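// Sample input from the question
let text = """
1There were 101 dalmatians in the room. 2 They had 2 parents.
3 The parents were named Pongo and Perdita.
"""

var checkForDigit = true // True while we are still at the start of a sentence
var digitFound = false // Needed to skip the space that follows a removed digit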
var output = ""
for character in text {
    if character.isPunctuation {
        output.append(character)
        checkForDigit = true
        continue
    }

    if checkForDigit {
        if !character.isNumber {
            if digitFound && character.isWhitespace {
                continue
            }
            output.append(character)
            if !character.isWhitespace {
                checkForDigit = false
            }
        } else {
            digitFound = true
        }
        continue
    }
    digitFound = false
    output.append(character)
}

I have only tested it on the example in the question, so it might need some tweaking.
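One possible tweak, as a sketch only: `isPunctuation` is also true for commas, so a number that follows a comma (as in "dogs, 2 of them") would be removed as well. The variant below restarts the check only after ".", "!" or "?". The function name and the terminator set are assumptions made for illustration, not part of the loop above.

```swift
// Hypothetical helper; the terminator set is an assumption.
func removingSentenceLeadingNumbers(from text: String) -> String {
    let terminators: Set<Character> = [".", "!", "?"]
    var atSentenceStart = true  // still looking for the first real character of a sentence
    var digitFound = false      // a leading digit was dropped, so skip whitespace after it
    var output = ""

    for character in text {
        if terminators.contains(character) {
            // End of a sentence: keep the terminator and start checking again.
            output.append(character)
            atSentenceStart = true
            digitFound = false
            continue
        }
        if atSentenceStart {
            if character.isNumber {
                digitFound = true   // drop the leading digit
                continue
            }
            if digitFound && character.isWhitespace {
                continue            // drop whitespace right after a removed digit
            }
            output.append(character)
            if !character.isWhitespace {
                atSentenceStart = false
            }
            continue
        }
        output.append(character)
    }
    return output
}

print(removingSentenceLeadingNumbers(from: "They had dogs, 2 of them. 3 Done"))
// They had dogs, 2 of them. Done
```

This still treats every "." as the end of a sentence, so a decimal such as 3.5 inside a sentence would be mangled; handling that would need extra logic.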

Upvotes: 1
