Reputation: 9194
I have read some topics explaining how to do this, but the approach they describe would be incredibly slow. The explanation is here: https://www.extendoffice.com/documents/excel/651-excel-remove-non-numeric-characters.html
It involves iterating through each cell in a range, then iterating through the characters in that cell and removing any that do not match [0-9].
Any suggestions to do this more efficiently?
One that comes to mind is loading the cell contents into an array, iterating through it, and splitting each entry into its own array to iterate through.
Upvotes: 2
Views: 14987
Reputation: 431
Using regex (you need to tick the library "Microsoft VBScript Regular Expressions 5.5" under Tools > References):
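Public Function GetNumericValue(cell As Range) As String
    ' The parameter is named cell so it does not shadow the Range type
    Dim myRegExp As RegExp
    Set myRegExp = New RegExp
    myRegExp.Global = True      ' replace every match, not just the first
    myRegExp.Pattern = "\D"     ' any non-digit character
    GetNumericValue = myRegExp.Replace(cell.Value, "")
End Function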
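If adding the reference is not an option, a late-bound variant should also work. This is only a sketch: DigitsOnly is a hypothetical name that is not part of the answer above, and it assumes the VBScript.RegExp COM class is available on the machine (which normally means Windows).

```vba
' Hypothetical late-bound variant; DigitsOnly is an illustrative name.
Public Function DigitsOnly(ByVal text As String) As String
    ' Static keeps the RegExp object alive between calls, so it is created only once.
    Static re As Object
    If re Is Nothing Then
        ' Late binding: no Tools > References entry is needed.
        Set re = CreateObject("VBScript.RegExp")
        re.Global = True        ' replace every match, not just the first
        re.Pattern = "\D"       ' any character that is not a digit
    End If
    DigitsOnly = re.Replace(text, vbNullString)
End Function
```

It can be called from a cell as =DigitsOnly(A1) or from other VBA code. Caching the object avoids paying the creation cost on every call, which matters when the function is used in many cells.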
Upvotes: 3
Reputation: 22195
For the VBA side of things (note the loops), I decided to satisfy my own curiosity about the performance of a couple different methods. All of them pull the range into an array and work on it in place. The linked article will get killed in speed by any of these, simply due to the overhead in reading and writing single cell values.
For the first method, I optimized the code from the linked article "a bit":
Private Sub MidMethod(values() As Variant)
    Dim r As Long, c As Long, i As Long
    Dim temp As String, output As String
    For r = LBound(values, 1) To UBound(values, 1)
        For c = LBound(values, 2) To UBound(values, 2)
            output = vbNullString
            For i = 1 To Len(values(r, c))
                temp = Mid$(values(r, c), i, 1)
                If temp Like "[0-9]" Then
                    output = output & temp
                End If
            Next
            values(r, c) = output
        Next
    Next
End Sub
For the second method I used RegExp.Replace:
Private Sub RegexMethod(values() As Variant)
    Dim r As Long, c As Long
    With New RegExp
        .Pattern = "[^0-9]"
        .MultiLine = True
        .Global = True
        For r = LBound(values, 1) To UBound(values, 1)
            For c = LBound(values, 2) To UBound(values, 2)
                values(r, c) = .Replace(values(r, c), vbNullString)
            Next
        Next
    End With
End Sub
Finally, for the last method I used a Byte array:
Private Sub ByteArrayMethod(values() As Variant)
    Dim r As Long, c As Long, i As Long
    Dim chars() As Byte
    For r = LBound(values, 1) To UBound(values, 1)
        For c = LBound(values, 2) To UBound(values, 2)
            ' Assigning a String to a Byte array yields its UTF-16 bytes, two per character
            chars = values(r, c)
            values(r, c) = vbNullString
            ' Step 2 visits only the low byte of each character
            For i = LBound(chars) To UBound(chars) Step 2
                ' 48 to 57 are the character codes for "0" to "9"
                If chars(i) > 47 And chars(i) < 58 Then
                    values(r, c) = values(r, c) & Chr$(chars(i))
                End If
            Next
        Next
    Next
End Sub
Then I used this code to benchmark them against 1000 cells, each containing a random mix of 25 letters and numbers:
Private Sub Benchmark()
    Dim data() As Variant, start As Double, i As Long
    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        MidMethod data
    Next
    Debug.Print "Mid: " & Timer - start
    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        RegexMethod data
    Next
    Debug.Print "Regex: " & Timer - start
    start = Timer
    For i = 1 To 5000
        data = ActiveSheet.Range("A1:J100").Value
        ByteArrayMethod data
    Next
    Debug.Print "Byte(): " & Timer - start
End Sub
The results (in seconds, for 5000 passes over the range) weren't horribly surprising - the Regex method is by far the fastest (but none of them are what I'd call "fast"):
Mid: 24.3359375
Regex: 8.31640625
Byte(): 22.5625
Note that I have no idea how this compares to @SiddharthRout's cool formula method in that I can't run it through my testing harness. The www.extendoffice.com code would also probably still be running, so I didn't test it.
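The benchmark only reads the range; to actually clean a sheet the array has to be written back as well. A minimal sketch of that round trip follows. StripNonDigitsInRange is a hypothetical name, the regex is created late-bound so the block stands on its own, and it assumes a single contiguous range (for a multi-area range, .Value only returns the first area).

```vba
' Hypothetical wrapper, not part of the benchmarked code above.
Public Sub StripNonDigitsInRange(ByVal target As Range)
    Dim data() As Variant
    Dim re As Object
    Dim r As Long, c As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "\D"

    If target.CountLarge = 1 Then
        ' A single cell returns a scalar rather than an array, so build a 1x1 array by hand.
        ReDim data(1 To 1, 1 To 1)
        data(1, 1) = target.Value
    Else
        data = target.Value              ' one read for the whole range
    End If

    For r = LBound(data, 1) To UBound(data, 1)
        For c = LBound(data, 2) To UBound(data, 2)
            ' Leave error cells such as #N/A untouched.
            If Not IsError(data(r, c)) Then
                data(r, c) = re.Replace(CStr(data(r, c)), vbNullString)
            End If
        Next
    Next

    target.Value = data                  ' one write for the whole range
End Sub
```

Called as StripNonDigitsInRange ActiveSheet.Range("A1:J100"). Note that Excel converts digit strings back to numbers on write unless the cells are formatted as Text, so leading zeros are lost otherwise.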
Upvotes: 3
Reputation: 149315
No need for VBA or for looping. An Excel formula can achieve what you want.
=NPV(-0.9,,IFERROR(MID(A1,1+LEN(A1)-ROW(OFFSET(A$1,,,LEN(A1))),1)%,""))
This is an array formula. You have to press Ctrl + Shift + Enter to enter it.
Explanation:
In NPV, each term is multiplied by the inverse of (1+rate)^n, where n is the position of the term in the series.
By using different values for rate, we can get different results. In this case, using -0.9 gives us 1 + rate = 1 + -0.9 = 0.1.
(1+rate)^n for n = 1 to 5: {0.1;0.01;0.001;0.0001;0.00001}
Inverse of above: {10;100;1000;10000;100000}
NPV also skips text values, so the non-numeric characters (which IFERROR turns into "") drop out of the series and only the digits are weighted by these powers of ten.
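To make the arithmetic concrete, here is a rough VBA sketch of what the formula appears to compute. It is an illustration only, not how Excel implements NPV: DigitsViaPowersOfTen is a hypothetical name, and the sketch assumes that the empty second argument of NPV occupies period 1 with a value of zero, so the first digit found lands on period 2. The % in the formula divides each digit by 100, which cancels that shift.

```vba
' Hypothetical emulation of the NPV trick; illustrative only.
Public Function DigitsViaPowersOfTen(ByVal text As String) As Double
    Dim i As Long, n As Long
    Dim ch As String
    Dim total As Double

    n = 1                                   ' assumption: the empty NPV argument uses up period 1
    For i = Len(text) To 1 Step -1          ' the MID(...) expression walks the string right to left
        ch = Mid$(text, i, 1)
        If ch Like "[0-9]" Then             ' non-digits are skipped and do not advance n
            n = n + 1
            ' digit% is digit / 100; dividing by 0.1 ^ n is the same as multiplying by 10 ^ n
            total = total + (CDbl(ch) / 100) * 10 ^ n
        End If
    Next
    DigitsViaPowersOfTen = total
End Function
```

For "a1b2c3" the terms are 3/100 * 10^2, 2/100 * 10^3 and 1/100 * 10^4, which add up to roughly 123 (floating-point rounding can leave a tiny error). Because the result is a number rather than text, leading zeros are dropped and only about 15 significant digits are reliable.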
Disclaimer: I did not come up with this formula. I saw it a long time ago and simply fell in love with it. Since then it has been a part of my databank.
Upvotes: 3