wendy
wendy

Reputation: 105

Extract text from long string using Excel

What formula should I be using if I would like to extract certain text or date out from the long string? Text to column doesn't work because I have a list of inconsistent data. Using the below example, I would like to extract Female and the Date of Birth.

For example, {'Name:''John Doe', 'Age:''29', 'Gender:''Female','Date of birth:''1989-09-28', 'Country:'"United Kingdom'}

I have read MID, LENS, RIGHT and LEFT but really not sure how should I do this. Would be really appreciate if someone can help me in this!

Upvotes: 0

Views: 720

Answers (3)

A. Hurley
A. Hurley

Reputation: 1

The slight difficulty for using LEFT() and RIGHT() is that MALE and FEMALE have a different letter count. So, you need to use a little abstract thinking. To use LEFT() and RIGHT(), you can do the following (assuming information is in cell A1):

=RIGHT(LEFT(A1,FIND("',",A1,FIND("Gender",A1))-1),FIND("',",A1,FIND("Gender",A1))-FIND("Gender",A1)-9)

This will return "Female", or "Male" depending on what the individual inputs. (The -18 comes from the distance from the beginning of Gender to the beginning of (Fe)Male being used twice).

Similarly, we can do the same for the birthdate. However, since this is a field of constant length, we can use actual numbers.

=RIGHT(LEFT(A1,FIND("birth:''",A1)+17),10)

Hope this helps!

Upvotes: 0

Nestor
Nestor

Reputation: 27

There is no one function that will do it. You'll have to use a combination of functions. Assuming string is in cell A1 the following function will work for gender.

=MID(A1,FIND("Gender:",A1)+9,FIND("'",A1,FIND("Gender:",A1)+9)-FIND("'",A1,FIND("Gender:",A1)+8)-1)

You need to find the quote before and after the Gender value and then do the mid on it, Similar can be done for date of birth

Upvotes: 0

K.Dᴀᴠɪs
K.Dᴀᴠɪs

Reputation: 10139

You can create a custom UDF.

Public Function getDemographics(inputString As String, retType As Long) As String

    If retType < 0 Or retType > 1 Then
        getDemographics = "Invalid retType!"
        Exit Function
    End If

    Dim retArr(1)

    With CreateObject("VBScript.RegExp")
        .Pattern = "'Gender:''(\w+).*?'Date of birth:''([\d-]+)"
        If .test(inputString) Then
            With .Execute(inputString)(0)
                retArr(0) = .SubMatches(0)
                retArr(1) = .SubMatches(1)
                getDemographics = retArr(retType)
            End With
        End If
    End With

End Function

This uses regular expressions. You provide two arguments, inputString, which would be the

{'Name:''John Doe', 'Age:''29', 'Gender:''Female','Date of birth:''1989-09-28', 'Country:'"United Kingdom'}

and the retType. 0 is for the gender and 1 is for the DOB.

On your worksheet, you would just call the function like:

=getDemographics(A1,1)

Which would return:

1989-09-28

Upvotes: 2

Related Questions