Reputation: 49
I have a list of adresses that i need to clean up.
The goal is to add a space before the numbers as shown in below examples.
16AVCHARLESDAGAULLECS10525 16 Avcharlesdagaullecs 10525
1BDHIPPOLYTEMARQUES 1 Bdhippolytemarques
20GARRICKSTREET4THFLOOR 20 Garrickstreet 4Thfloor
2109ZAC 2109 Zac
2-4VANDRIESSTREET4 2-4 Vandriesstreet 4
5:ETVÄRG.19.E 5:Etvärg. 19.E
901ACEHIGHENTTOWE9TH233 901 Acehighenttowe 9Th 233
I have found a script that puts spaces in between letters but that is not my goal. Here is the code I have so far. It does not give the result abowe.
Function Add_Spaces(ByVal sText As String) As String
Dim CharNum As Long
Dim FixedText As String
Dim CharCode As Long
FixedText = Left(sText, 1)
For CharNum = 2 To Len(sText)
CharCode = Asc(Mid(sText, CharNum, 1))
If CharCode >= 65 And CharCode <= 90 Then
FixedText = FixedText & " " & Mid(sText, CharNum, 1) 'This needs to be rewritten
Else
FixedText = FixedText & Mid(sText, CharNum, 1)
End If
Next CharNum
Add_Spaces = FixedText
End Function
Do you have any idea on how i can solve this?
Here is a solution for regex that might give a lead: R separate words from numbers in string
Upvotes: 2
Views: 150
Reputation: 60379
This seems to work on all of your examples:
The regular expression splits on the change from number to letter and vice versa, and also makes an exception for TH
and .
Those exceptions may lead to problems in other data, but seems to work for what you present. Whether your variability is such that this will be an issue is unknown.
Note that I used early-binding for the Regex (tools/references set to include Microsoft VBScript Regular Expressions 5.5), but you can change this to late-binding if you will be distributing this code.
Edit: Note that I did NOT convert this to proper case, but that could be done if really required
`Set Reference to Microsoft VBScript Regular Expressions 5.5
Option Explicit
Function replacer(s As String) As String
Dim RE As RegExp
Const sPat As String = "(\dTH|\d)(?!TH)(?=[A-Z])|([A-Z.])(?=\d)"
Const sRepl As String = "$1$2 "
Dim sTemp As String
Set RE = New RegExp
With RE
.Global = True
.Pattern = sPat
.IgnoreCase = True
replacer = .Replace(s, sRepl)
End With
End Function
Formula would be, eg:
A2: =replacer(A2)
If PROPER
case is required, then change formula to:
A2: =PROPER(replacer(A2))
this seems to work better on your data, here in the USA, than does the VBA StrConv
function.
Upvotes: 6
Reputation: 42256
This function would also work on all examples. It also avoid splitting in case of 'TH' following the number:
Private Function SeparateNumbersFromString(x As String) As String
Dim i As Long, j As Long, strInt As String, strFin As String
For i = 1 To Len(x)
strInt = ""
If IsNumeric(Mid(x, i, 1)) Then
For j = i To Len(x)
strInt = strInt & Mid(x, j, 1)
If Not IsNumeric(strInt) Or (Right(strInt, 1) = "-" And _
IsNumeric(left(strInt, Len(strInt) - 1))) Or _
(Right(strInt, 1) = "." And _
IsNumeric(left(strInt, Len(strInt) - 1))) Then
strFin = IIf(strFin = "", strFin, strFin & " ") & _
left(strInt, Len(strInt) - 1) & _
IIf(UCase(Mid(x, j, 2)) = "TH", "", " ")
strInt = ""
i = j - 1
Exit For
End If
If j >= Len(x) Then strFin = strFin & " " & strInt: GoTo Ending
Next j
Else
strFin = strFin & Mid(x, i, 1)
End If
Next i
Ending:
SeparateNumbersFromString = strFin
End Function
Upvotes: 1
Reputation: 96771
Consider:
Public Function OutString(Instring As String) As String
Dim L As Long, i As Long, CH As String
L = Len(Instring)
OutString = Left(Instring, 1)
For i = 2 To L
CH = Mid(Instring, i, 1)
If CH Like "[0-9]" And Not Right(OutString, 1) Like "[0-9]" Then
OutString = OutString & " " & CH
Else
OutString = OutString & CH
End If
Next i
End Function
The code is similar to crawling along the string with a pointer. If the character to the right of the pointer is a numeral and the character to the left of the pointer is not a numeral, then insert a space.
Upvotes: 1
Reputation: 12279
Undoubtedly, Regex is a much tidier way of doing it - but if you wanted to alter your existing code to achieve the desired result I think this might work:
Function Add_Spaces(ByVal sText As String) As String
Dim CharNum As Long
Dim FixedText As String
Dim CharCode As Long
Dim lastCharCode As Long
FixedText = Left(sText, 1)
For CharNum = 2 To Len(sText)
CharCode = Asc(Mid(sText, CharNum, 1))
lastCharCode = Asc(Mid(sText, CharNum - 1, 1))
If (CharCode >= 65) <> (lastCharCode >= 65) Then
FixedText = FixedText & " " & Mid(sText, CharNum, 1) 'This needs to be rewritten
Else
FixedText = FixedText & Mid(sText, CharNum, 1)
End If
Next CharNum
Add_Spaces = Application.WorksheetFunction.Proper(FixedText)
End Function
Upvotes: 2
Reputation: 522471
Try using two regex replacements on the following patterns:
([A-Za-z])([0-9])
([0-9])([A-Za-z])
and replace with $1 $2
:
Dim Regex As System.Text.RegularExpressions.Regex
Dim input As String = "16AVCHARLESDAGAULLECS10525"
Dim output As String = Regex.Replace(input, "([A-Za-z])([0-9])", "$1 $2")
output = Regex.Replace(output, "([0-9])([A-Za-z])", "$1 $2")
Console.WriteLine(output)
This prints:
16 AVCHARLESDAGAULLECS 10525
The strategy here is to match every two boundary characters, in separate capture groups. A boundary here is a number followed by a letter, or vice-versa. Then, we replace with those two captured characters, with a space intercolated in between them.
Upvotes: 2