Mohamed Asane

Reputation: 25

Date parsing from unstructured string

I have an unstructured string, and I need to find the date in it.

Examples of expected inputs:

  1. "01/21/2012: text will be here"
  2. ";01/21/2012: text will be here"
  3. "text will be here. 01/21/2012: continues text"
  4. "text will be here. \n 01/21/2012: continues text"
  5. " text will be here 01/21/2012"

Note: The date can be in any format, such as 1st Jan 2012, 12-Jan-2012, 12/01/2012, etc.

Any help is greatly appreciated.

Upvotes: 1

Views: 451

Answers (4)

Calphalon

Reputation: 68

Split the string into contiguous blocks separated by spaces. It looks like String.Split(" "c) almost works, but you may need to account for the ":" characters.

On each block, check whether it is a date with DateTime.TryParse.

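    ' Requires "Imports System.Globalization" at the top of the file.
    Dim text(2) As String
    text(0) = "01/21/2012: text will be here"
    ' VB has no "\n" escape, so the line feed is concatenated explicitly.
    text(1) = "text will be here. " & vbLf & " 01/21/2012: continues text"
    text(2) = " text will be here 01/21/2012"

    For Each s As String In text
        ' Split on spaces and line feeds, dropping empty tokens.
        Dim tokens As String() = s.Split(New Char() {" "c, ControlChars.Lf}, StringSplitOptions.RemoveEmptyEntries)
        For Each token As String In tokens
            ' Strip surrounding punctuation such as ":" or ";".
            Dim candidate As String = token.Trim(":"c, ";"c, "."c, ","c)
            Dim dt As DateTime
            ' The invariant culture reads 01/21/2012 as month/day/year.
            If DateTime.TryParse(candidate, CultureInfo.InvariantCulture, DateTimeStyles.None, dt) Then
                Console.WriteLine(dt.ToString())
            End If
        Next
    Next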

Upvotes: 1

Simon. Li

Reputation: 429

I think regular expressions will help.

First, write down all the possible date formats. Second, convert them to regular expressions. Finally, match the regular expressions against the string.

Be aware that a regular expression cannot count: a single match can only extract a fixed number of dates (one, two, three, and so on). If the number of dates in the string is not fixed, you can generate the regular expression dynamically or apply the match repeatedly.
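As a rough illustration of the "one pattern per format, matched repeatedly" idea, here is a minimal sketch. The module name, the FindDateStrings helper, and the three patterns are hypothetical and only cover the formats mentioned in the question; Regex.Matches returns every occurrence, so the number of dates does not need to be known in advance.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module DateRegexSketch
    ' Hypothetical pattern list: one pattern per date format from the question.
    Private ReadOnly Patterns As String() = {
        "\b\d{1,2}/\d{1,2}/\d{4}\b",
        "\b\d{1,2}-[A-Za-z]{3}-\d{4}\b",
        "\b\d{1,2}(?:st|nd|rd|th)\s+[A-Za-z]{3,9}\s+\d{4}\b"
    }

    ' Returns every substring that looks like a date in one of the known formats.
    Function FindDateStrings(input As String) As List(Of String)
        Dim found As New List(Of String)
        For Each pattern As String In Patterns
            ' Regex.Matches yields all occurrences, however many there are.
            For Each m As Match In Regex.Matches(input, pattern)
                found.Add(m.Value)
            Next
        Next
        Return found
    End Function

    Sub Main()
        Dim sample As String = "text will be here. 01/21/2012: continues text, then 12-Jan-2012 and 1st Jan 2012"
        For Each d As String In FindDateStrings(sample)
            Console.WriteLine(d)
        Next
    End Sub
End Module
```

These patterns only check the shape of the text, so something like 99/99/2012 would still match; the matched substrings would need to be validated separately.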

Upvotes: 0

Sebastian Siek

Reputation: 2075

The best way would be to use a regular expression, but you would have to create rules for all the date formats. Otherwise, you could use a more generic regular expression, find all matches, and then validate/parse each one as a date.

Hope that gives you an idea of how to do it.
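The generic-pattern-then-validate approach could look something like the sketch below. The pattern, the format list, and the ExtractDates helper are assumptions chosen for the formats in the question, not a complete solution; DateTime.TryParseExact does the actual validation, so text that merely looks like a date is rejected.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Module DateCandidateSketch
    ' Loose pattern (an assumption): 1-2 digits, an optional ordinal suffix, a separator,
    ' a numeric or alphabetic middle part, the same separator again, and a 4-digit year.
    Private Const CandidatePattern As String =
        "\b(\d{1,2})(?:st|nd|rd|th)?([/\- ])(\d{1,2}|[A-Za-z]{3,9})\2(\d{4})\b"

    ' Formats accepted during validation (hypothetical list; extend as needed).
    Private ReadOnly Formats As String() = {
        "M/d/yyyy", "d-MMM-yyyy", "d MMM yyyy", "d MMMM yyyy"
    }

    Function ExtractDates(input As String) As List(Of DateTime)
        Dim dates As New List(Of DateTime)
        For Each m As Match In Regex.Matches(input, CandidatePattern)
            ' Rebuild the candidate without the ordinal suffix ("1st" becomes "1").
            Dim sep As String = m.Groups(2).Value
            Dim candidate As String = m.Groups(1).Value & sep & m.Groups(3).Value & sep & m.Groups(4).Value
            Dim parsed As DateTime
            ' Keep the candidate only if it parses as a real date.
            If DateTime.TryParseExact(candidate, Formats, CultureInfo.InvariantCulture,
                                      DateTimeStyles.None, parsed) Then
                dates.Add(parsed)
            End If
        Next
        Return dates
    End Function

    Sub Main()
        For Each d As DateTime In ExtractDates("text 01/21/2012: more text 1st Jan 2012, not a date: 99/99/2012")
            Console.WriteLine(d.ToString("yyyy-MM-dd"))
        Next
    End Sub
End Module
```

Note that a value such as 12/01/2012 is ambiguous between day-first and month-first; this sketch assumes month-first for slash-separated dates, which may not suit every input.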

Upvotes: 0

user1921

Reputation:

Why does the user input allow such free-form text to begin with? With input that open-ended, any string parsing you do is going to be spotty at best. What if the user enters numbers that look like dates, or a second date? How would you determine which one is the "date" you need to track?

Some more information on your problem may help with a solution, but right now I'd suggest requiring the date to be entered in its own input element.

Upvotes: 2
