Reputation: 95
I am working on some software that cleans up data before sending it into another system. The data comes from all around the world and contains a variety of characters that have to be replaced. For example ‘, : ; @
The system that accepts the parsed data has very strict character set. It allows the letters A to Z (upper case only) the numerals 0 to 9 the special characters / -. Space < =
The data arrives in Excel spreadsheets so I have written the following code in a visual basic macro.
fhl_str contains the data to be cleansed
fhl_str = Replace(fhl_str, ",", " ")
fhl_str = Replace(fhl_str, "'", " ")
fhl_str = Replace(fhl_str, ":", " ")
fhl_str = Replace(fhl_str, ";", " ")
fhl_str = ucase(fhl_str)
Now, each time a new unwanted character arrives we have to add a new line of code. e.g. fhl_str = Replace(fhl_str, "@", " ")
My question is Could I reverse the logic so that the macro looks for A to Z and 0 to 9 and deletes anything else. That way my code would be future proof for new unwanted characters.
Thanks
Upvotes: 1
Views: 970
Reputation: 11755
Here's some VBA that will do it if you find regex difficult to understand. It uses the ASCII code to determine the only characters to allow. If your scope changes you can modify the ASCII numbers in the Case
statement.
Public Function RemoveSpecial(s As String) As String
Dim sResult As String
Dim nIndex As Integer
s = UCase$(s)
For nIndex = 1 To Len(s)
Select Case Asc(Mid$(s, nIndex, 1))
Case 65 To 90, 45 To 57, 32, 60 To 61
sResult = sResult & Mid$(s, nIndex, 1)
Case Else
sResult = sResult & " "
End Select
Next
RemoveSpecial = sResult
End Function
Usage:
Debug.Print RemoveSpecial("TeSt<>=.@@")
or something like:
Range("A1") = RemoveSpecial("TeSt<>=.@@")
Upvotes: 1
Reputation: 96771
If you want to replace bad characters with a single space:
Sub KeepOnlyTheGood()
Dim i As Long, L As Long, v As String, CH As String
Dim r As Range
For Each r In Selection
t = ""
v = r.Value
L = Len(v)
For i = 1 To L
CH = Mid(v, i, 1)
If CH Like "[0-9A-Z]" Or CH = "/" Or CH = "-" Or CH = "." Or CH = " " Or CH = "<" Or CH = "=" Then
t = t & CH
Else
t = t & " "
End If
Next i
r.Value = t
Next r
End Sub
Upvotes: 1