Reputation: 65
I am writing an app that is receiving message blocks via TCP. A message block is composed of the following:
<<:--!!, followed by a six-digit message length and then the message text.
It seemed logical to use NSRegularExpression to extract the messages from the received data, so I ended up with the following playground code, which processes one string of received data:
import UIKit

struct Constants {
    static let messageHeaderPattern = "<<:--!!(\\d{6})(.+)"
}

let receivedData = "<<:--!!000010My message"
let regex = try! NSRegularExpression(pattern: Constants.messageHeaderPattern, options: []) // Define the regular expression
let range = NSMakeRange(0, receivedData.characters.count) // Define the range (all the string)
let matches = regex.matchesInString(receivedData, options: [], range: range) // Get the matches
print("Number of matches: \(matches.count)")
for match in matches {
    let locationOfMessageLength = match.rangeAtIndex(1).location
    let expectedLengthOfMessage = Int(receivedData.substringWithRange(Range(start: receivedData.startIndex.advancedBy(locationOfMessageLength),
                                                                            end: receivedData.startIndex.advancedBy(locationOfMessageLength + 6))))
    let locationOfMessage = match.rangeAtIndex(2).location
    let lengthOfMessage = match.rangeAtIndex(2).length
    let data = receivedData.substringWithRange(Range(start: receivedData.startIndex.advancedBy(locationOfMessage),
                                                     end: receivedData.startIndex.advancedBy(locationOfMessage + lengthOfMessage)))
    // data contains "My message"
}
This code works well, but only if there is one message in the string. To make it work for multiple messages, I changed the regular expression:
static let messageHeaderPattern = "(?:<<:--!!(\\d{6})(.+))+"
and the received data:
let receivedData = "<<:--!!000010My message<<:--!!000014Second message"
But there is still only one match, and data contains My message<<:--!!000014Second message.
What is wrong with my regular expression?
Upvotes: 1
Views: 336
Reputation: 11993
The message itself could contain <<:--!!\d{6}, so I don't think you can do this reliably with a regex alone. The safe solution is to use
^<<:--!!(\d{6})
to extract the length N, take the next N characters as the message, and repeat on whatever remains. If you want to live dangerously and are confident that <<:--!!\d{6} will never occur in the message, then this regex will do the trick:
(?<=<<:--!!\d{6})(.*?)(?=<<:--!!\d{6}|$)
Just remember that it will break if the delimiter occurs inside a message. To be safe, use the method in my first example.
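The length-driven approach could be sketched roughly as follows. This is only an illustration, not tested against a real TCP stream: extractMessages is a hypothetical helper name, and it assumes the six-digit field counts characters (not bytes) and that an incomplete trailing block should be kept for the next read.

```swift
import Foundation

/// Hypothetical helper: splits a buffer into messages by trusting the length
/// field instead of searching for the next delimiter.
func extractMessages(from buffer: String) -> (messages: [String], remainder: String) {
    let header = "<<:--!!"   // delimiter from the question
    let lengthDigits = 6     // width of the length field
    var messages: [String] = []
    var rest = Substring(buffer)

    while rest.hasPrefix(header) {
        let afterHeader = rest.dropFirst(header.count)
        let lengthField = afterHeader.prefix(lengthDigits)
        // Stop if the length field is incomplete or not purely numeric.
        guard lengthField.count == lengthDigits,
              lengthField.allSatisfy({ $0.isASCII && $0.isNumber }),
              let length = Int(lengthField) else { break }

        let body = afterHeader.dropFirst(lengthDigits)
        // Stop if the whole message has not arrived yet.
        guard body.count >= length else { break }

        messages.append(String(body.prefix(length)))
        rest = body.dropFirst(length)
    }
    // Whatever could not be parsed is returned so it can be retried later.
    return (messages, String(rest))
}

let received = "<<:--!!000010My message<<:--!!000014Second message"
let result = extractMessages(from: received)
print(result.messages)   // ["My message", "Second message"]
```

Because the parser never searches for the delimiter inside a message body, a message that happens to contain <<:--!! followed by six digits is still extracted intact.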
Upvotes: 1
Reputation: 439
Try restricting what the message group can match, so that (.+) does not swallow the second message:
"(?:<<:--!!(\\d{6})([a-zA-Z ]+))"
Upvotes: 0
Reputation: 431
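The greedy (.+) runs to the end of the string, so stop it at the next header with a lazy quantifier and a positive lookahead: static let messageHeaderPattern = "<<:--!!(\\d{6})(.+?)(?=<<:--!!|$)". A negative lookahead such as (?!<<:--!!) is not enough here, because the lazy (.+?) would then stop after a single character.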
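For reference, a lookahead-based pattern might be used like this in current Swift syntax. This is a sketch under assumptions: it uses a positive lookahead (?=...) so the lazy group runs up to the next header or the end of the string, it assumes the header never occurs inside a message and that messages contain no newlines, and the parsed array is only an illustrative name.

```swift
import Foundation

// Lazy message group, terminated by the next header or the end of the input.
let pattern = "<<:--!!(\\d{6})(.+?)(?=<<:--!!\\d{6}|$)"
let receivedData = "<<:--!!000010My message<<:--!!000014Second message"

let regex = try! NSRegularExpression(pattern: pattern, options: [])
// NSRange counts UTF-16 code units, so derive it from the string itself.
let fullRange = NSRange(receivedData.startIndex..., in: receivedData)

var parsed: [(length: Int, text: String)] = []
for match in regex.matches(in: receivedData, options: [], range: fullRange) {
    // Convert the NSRange of each capture group back to String indices.
    guard let lengthRange = Range(match.range(at: 1), in: receivedData),
          let textRange = Range(match.range(at: 2), in: receivedData),
          let length = Int(receivedData[lengthRange]) else { continue }
    parsed.append((length: length, text: String(receivedData[textRange])))
}

for item in parsed {
    print(item.length, item.text)
}
```

Comparing item.length with item.text.count is a cheap sanity check that a match was not cut short by a header-like sequence inside a message.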
Upvotes: 0