Reputation: 543
Let's say I have the following string within a cell:
E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.
And I want to extract only the title from this. The approach I am considering is to write a script that says "Pull text from this string, but only if it is more than 50 characters long." This way it only returns the title, and not stuff like " Stark, T" and " Martell, P". The code I have so far is:
Sub TitleTest()
Dim txt As String
Dim Output As String
Dim i As Integer
Dim rng As Range
Dim j As Integer
Dim k As Integer
j = 5
Set rng = Range("A" & j) 'text is in cell A5
txt = rng.Value 'txt is string
i = 1
While j <= 10 'there are five references between A5 and A10
k = InStr(i, txt, ".") - InStr(i, txt, ". ") + 1 'k is supposed to be the length of the string returned, but I can't differenciate one "." from the other.
Output = Mid(txt, InStr(i, txt, "."), k)
If Len(Output) < 100 Then
i = i + 1
ElseIf Len(Output) > 10 Then
Output = Mid(txt, InStr(i, txt, "."), InStr(i, txt, ". "))
Range("B5") = Output
j = j + 1
End If
Wend
End Sub
Of course, this would work well if it wasn't two "." I was trying to full information from. Is there a way to write the InStr
function in such a way that it won't find the same character twice? Am I going about this in the wrong way?
Thanks in advance,
EDIT: Another approach that might work (if possible), is if I could have one character be " any lower case letter
." and ".". Would even this be possible? I can't find any example of how this could be achieved...
Upvotes: 1
Views: 5583
Reputation: 43575
Here you go, it works exactly as you wish. Judging from your code I am sure that you can adapt it for your needs quite quickly:
Option Explicit
Sub ExtractTextSub()
Debug.Print ExtractText("E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.")
End Sub
Public Function ExtractText(str_text As String) As String
Dim arr As Variant
Dim l_counter As Long
arr = Split(str_text, ".")
For l_counter = LBound(arr) To UBound(arr)
If Len(arr(l_counter)) > 50 Then
ExtractText = arr(l_counter)
End If
Next l_counter
End Function
Edit: 5 votes in no time made me improve my code a bit :) This would return the longest string, without thinking of the 50 chars. Furthermore, on Error handlaer and a constant for the point. Plus adding a point to the end of the extract.
Option Explicit
Public Const STR_POINT = "."
Sub ExtractTextSub()
Debug.Print ExtractText("E. Stark, T. Lannister, A. Martell, P Baelish, B. Dondarrion, and J. Mormont. Increased levels of nudity across Westeros contributes to its sporadic seasonal climate. Nat. Proc. Aca. Sci. (2011) 3: 142-149.")
End Sub
Public Function ExtractText(str_text As String) As String
On Error GoTo ExtractText_Error
Dim arr As Variant
Dim l_counter As Long
Dim str_longest As String
arr = Split(str_text, STR_POINT)
For l_counter = LBound(arr) To UBound(arr)
If Len(arr(l_counter)) > Len(ExtractText) Then
ExtractText = arr(l_counter)
End If
Next l_counter
ExtractText = ExtractText & STR_POINT
On Error GoTo 0
Exit Function
ExtractText_Error:
MsgBox "Error " & Err.Number & Err.Description
End Function
Upvotes: 6