MikeMaus

Reputation: 405

How to extract values from a string? (Swift)

I'm trying to analyze a string and decompose it into two clean values: taskValue and timeValue.

var str1 = "20 minutes (to do)/(for) some kind of task"
// other possibilities
var str2 = "1 hour 30 minutes for some kind of task"
var str3 = "do some kind of task for 1 hour"

How can I apply multiple regexes in one function? Maybe something like an array of regexes:

["[0-9]{1,} minutes", 
 "[0-9] hour", 
 "[0-9] hour, [0-9]{1,} minutes",
  ...]

The values returned from the function aren't clean: the task name is left with fragments such as "of ...", "for ...", "to ...", etc.

Can you give me advice on how to improve it? Would it be possible to use some machine learning with MLKit? How can I add a couple more regex patterns? Or should I check manually whether the string contains certain things?

// check it out
import Foundation

var str = "20 minutes to do some kind of task"
func decompose(_ inputText: String) -> (time: String, taskName: String) {

    let pattern = "[0-9]{1,} minutes"
    let regexOptions: NSRegularExpression.Options = [.caseInsensitive]
    // NSRange is measured in UTF-16 code units, so build it from the string's own indices
    // rather than from utf8.count.
    let range = NSRange(inputText.startIndex..<inputText.endIndex, in: inputText)

    var time = ""
    var taskName = inputText

    // try! is tolerable here only because the pattern is a fixed literal.
    let regex = try! NSRegularExpression(pattern: pattern, options: regexOptions)
    // Convert the match's NSRange back to a Range<String.Index> instead of offsetting
    // by Character count, which breaks on non-ASCII input.
    if let match = regex.firstMatch(in: inputText, options: [], range: range),
       let matchRange = Range(match.range, in: inputText) {

        time = String(inputText[matchRange])

        // taskName still has the same contents as inputText, so the range is valid for it.
        taskName.removeSubrange(matchRange)

    } else {
        print("No match.")
    }

    return (time, taskName)
}

print(decompose(str))

Overall, I want to learn how to do text analysis on the premise that we know the subject matter beforehand.

Upvotes: 1

Views: 229

Answers (1)

Ryszard Czech

Reputation: 18641

Use capture groups:

(\d+)\s*minute|(\d+)\s*hour

See regex proof. Then check which group matched and use the captured value as needed: if the first group matched, you have minutes; otherwise, the hours are in the second group.
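A minimal Swift sketch of the "check which group matched" step, assuming NSRegularExpression from Foundation. The helper name `firstDuration(in:)` and its return shape are hypothetical and only serve the illustration:

```swift
import Foundation

// Hypothetical helper: returns the first duration found in `text`
// as a number plus the unit that matched.
func firstDuration(in text: String) -> (value: Int, unit: String)? {
    let pattern = #"(\d+)\s*minute|(\d+)\s*hour"#
    guard let regex = try? NSRegularExpression(pattern: pattern,
                                               options: [.caseInsensitive]) else {
        return nil
    }
    // NSRange counts UTF-16 code units, so derive it from the string itself.
    let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
    guard let match = regex.firstMatch(in: text, options: [], range: fullRange) else {
        return nil
    }
    // A group that did not take part in the match has location NSNotFound,
    // and Range(_:in:) returns nil for it.
    if let r = Range(match.range(at: 1), in: text), let n = Int(text[r]) {
        return (n, "minute")
    }
    if let r = Range(match.range(at: 2), in: text), let n = Int(text[r]) {
        return (n, "hour")
    }
    return nil
}
```

Note that `firstMatch` stops at the earliest match, so for "1 hour 30 minutes for some kind of task" this sketch reports only the hour part.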

EXPLANATION

--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  minute                   'minute'
--------------------------------------------------------------------------------
 |                        OR
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  hour                     'hour'
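
Building on the same pattern, here is one possible sketch that covers all three sample strings from the question. It collects every match rather than only the first, sums the result into minutes, and then trims connector words from the task name. The function name `decomposeTask`, the optional plural `s?`, and the short list of connector words ("to do", "for", "to", "of") are assumptions made for this example, not part of the answer above:

```swift
import Foundation

// Hypothetical rewrite of the question's decompose(_:).
func decomposeTask(_ input: String) -> (minutes: Int, taskName: String) {
    // Group 1 captures minutes, group 2 captures hours.
    let durationRegex = try! NSRegularExpression(
        pattern: #"(\d+)\s*minutes?|(\d+)\s*hours?"#,
        options: [.caseInsensitive])
    let fullRange = NSRange(input.startIndex..<input.endIndex, in: input)

    // Sum every duration found, converting hours to minutes.
    var totalMinutes = 0
    for match in durationRegex.matches(in: input, options: [], range: fullRange) {
        if let r = Range(match.range(at: 1), in: input), let n = Int(input[r]) {
            totalMinutes += n
        } else if let r = Range(match.range(at: 2), in: input), let n = Int(input[r]) {
            totalMinutes += n * 60
        }
    }

    // Remove the durations and collapse the whitespace they leave behind.
    var task = durationRegex.stringByReplacingMatches(
        in: input, options: [], range: fullRange, withTemplate: "")
    task = task.replacingOccurrences(of: #"\s+"#, with: " ",
                                     options: .regularExpression)
    // Strip connector words left at the start or the end (assumed word list).
    task = task.replacingOccurrences(of: #"^\s*((to do|for|to|of)\s+)+"#, with: "",
                                     options: [.regularExpression, .caseInsensitive])
    task = task.replacingOccurrences(of: #"\s+(for|to|of)\s*$"#, with: "",
                                     options: [.regularExpression, .caseInsensitive])
    return (totalMinutes, task.trimmingCharacters(in: .whitespaces))
}
```

With this sketch, "1 hour 30 minutes for some kind of task" yields 90 minutes and the task name "some kind of task". The connector list is deliberately small; real input would likely need it extended.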

Upvotes: 1
