Reputation: 4772
This seems like such a simple requirement that I feel like I am missing something obvious.
I have an Excel spreadsheet containing "dirty" text data: the cell values have unwanted leading and trailing spaces, commas and newlines. I would like to TRIM all of those characters from references to these cells.
Note: I don't want to replace all those characters, since they legitimately appear within the cell text. It is only at the start or end of the cell text (i.e. value) that I want to trim them off.
The text data consists of names of people and schools, for cleaning and importing into a CRM.
So, is there a function built in, or do I need to write one? I feel spoiled by the number of string filtering functions in PHP ;-)
Upvotes: 2
Views: 9167
Reputation: 4903
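I tried this in two steps.
For removing leading and trailing spaces, use the built-in function directly (note that TRIM also collapses runs of spaces inside the text to a single space, and does not touch newlines):
=TRIM(A1)
For removing one leading and one trailing comma:
=MID(A1,IF(LEFT(A1)=",",2,1),MAX(0,LEN(A1)-(LEFT(A1)=",")-(RIGHT(A1)=",")))
or, if the text itself contains no spaces (this version turns every remaining space into a comma):
=SUBSTITUTE(TRIM(SUBSTITUTE(A1,","," "))," ",",")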
Upvotes: 0
Reputation: 4772
I have found this code, which I pasted in as a module into my spreadsheet:
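Option Explicit

' Regex-based replace that can be called from a worksheet cell.
Function ReReplace(ReplaceIn, _
        ReplaceWhat As String, ReplaceWith As String, _
        Optional IgnoreCase As Boolean = False)
    Dim RE As Object
    ' Late binding, so no library reference has to be added to the project
    Set RE = CreateObject("vbscript.regexp")
    RE.IgnoreCase = IgnoreCase
    RE.Pattern = ReplaceWhat
    RE.Global = True    ' replace every match, not just the first
    ReReplace = RE.Replace(ReplaceIn, ReplaceWith)
End Function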
This provides a replace function that supports regular expressions (why doesn't Excel do that itself? It has only been around since 1987 - I had it on my Atari ST, not that you could add more than ten cells before it crashed!). This cell function is able to do the trimming I need:
=ReReplace('source worksheet'!cell_reference, "^[\s,]+|[\s,]+$", "")
This works beautifully.
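If the formula is filled down a long column, creating a new regexp object on every call may become noticeable. A minimal sketch of one way to avoid that, assuming the pattern never changes: TrimDirt and TrimDirtDemo are hypothetical names, not part of the code I found, and the Static variable simply keeps one regexp object alive between calls.

```vba
' Hypothetical wrapper: trims leading and trailing whitespace and commas,
' reusing a single RegExp object across calls.
Function TrimDirt(ByVal dirty As String) As String
    Static RE As Object
    If RE Is Nothing Then
        Set RE = CreateObject("vbscript.regexp")
        RE.Global = True                  ' handle both the leading and the trailing run
        RE.Pattern = "^[\s,]+|[\s,]+$"    ' MultiLine is left off, so ^ and $ mean the whole value
    End If
    TrimDirt = RE.Replace(dirty, "")
End Function

' Quick check from the Immediate window: run TrimDirtDemo
Sub TrimDirtDemo()
    ' The comma and space inside the name are kept; only the ends are trimmed
    Debug.Print "[" & TrimDirt(" ,, Smith, John ," & vbLf) & "]"   ' prints [Smith, John]
End Sub
```

In a cell this would be used as =TrimDirt(A1), in the same way as the ReReplace formula above.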
(Note: this answer moved from the question text, where it really should not have been.)
Upvotes: 1
Reputation: 11
Recursive function to remove trailing commas and spaces. Pure VBA.
Function removetrailcomma(txt As String) As String
    ' Drop the last character while it is a space or a comma
    If Right(txt, 1) = " " Or Right(txt, 1) = "," Then
        removetrailcomma = removetrailcomma(Left(txt, Len(txt) - 1))
    Else
        removetrailcomma = txt
    End If
End Function
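The function above only looks at the end of the text and only at spaces and commas. A loop-based sketch that trims both ends and takes the set of characters as an argument (roughly what PHP's trim does with its second parameter) might look like this. TrimChars and its default character set are assumptions made for illustration, still pure VBA with no regexp.

```vba
' Hypothetical helper: strips any character found in charsToTrim
' from both ends of txt, leaving the interior untouched.
Function TrimChars(ByVal txt As String, _
        Optional ByVal charsToTrim As String = "") As String
    Dim first As Long
    Dim last As Long

    ' Assumed default set: space, comma, tab, carriage return, line feed
    If Len(charsToTrim) = 0 Then charsToTrim = " ," & vbTab & vbCr & vbLf

    first = 1
    last = Len(txt)

    ' Advance past unwanted characters at the start
    Do While first <= last
        If InStr(1, charsToTrim, Mid$(txt, first, 1), vbBinaryCompare) = 0 Then Exit Do
        first = first + 1
    Loop

    ' Back up past unwanted characters at the end
    Do While last >= first
        If InStr(1, charsToTrim, Mid$(txt, last, 1), vbBinaryCompare) = 0 Then Exit Do
        last = last - 1
    Loop

    ' If everything was trimmed, the length is 0 and Mid$ returns ""
    TrimChars = Mid$(txt, first, last - first + 1)
End Function
```

Used from a cell as =TrimChars(A1), or with an explicit set such as =TrimChars(A1, ",; ").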
Upvotes: 0
Reputation: 55702
This is well suited to a regexp.
The code below, adapted from this article, uses this regexp
"[,\s]*(.+?)[,\s]*$"
to remove any leading and/or trailing whitespace and commas while leaving any such characters within the text body intact.
It will replace your existing data in situ.
Sub RemoveDirt()
    Dim rng1 As Range
    Dim rngArea As Range
    Dim lngRow As Long
    Dim lngCol As Long
    Dim lngCalc As Long
    Dim objReg As Object
    Dim X()

    On Error Resume Next
    Set rng1 = Application.InputBox("Select range to strip leading/trailing spaces and commas from", "User select", Selection.Address, , , , , 8)
    If rng1 Is Nothing Then Exit Sub
    On Error GoTo 0

    'See Patrick Matthews excellent article on using Regular Expressions with VBA
    Set objReg = CreateObject("vbscript.regexp")
    objReg.MultiLine = True
    objReg.Pattern = "[,\s]*(.+?)[,\s]*$"

    'Speed up the code by turning off screenupdating and setting calculation to manual
    'Disable any code events that may occur when writing to cells
    With Application
        lngCalc = .Calculation
        .ScreenUpdating = False
        .Calculation = xlCalculationManual
        .EnableEvents = False
    End With

    'Test each area in the user selected range
    'Non contiguous range areas are common when using SpecialCells to define specific cell types to work on
    For Each rngArea In rng1.Areas
        'The most common outcome is used for the True outcome to optimise code speed
        If rngArea.Cells.Count > 1 Then
            'If there is more than one cell then set the variant array to the dimensions of the range area
            'Using Value2 provides a useful speed improvement over Value. On my testing it was 2% on blank cells, up to 10% on non-blanks
            X = rngArea.Value2
            For lngRow = 1 To rngArea.Rows.Count
                For lngCol = 1 To rngArea.Columns.Count
                    'strip the leading and trailing dirt, keeping only the captured text
                    X(lngRow, lngCol) = objReg.Replace(X(lngRow, lngCol), "$1")
                Next lngCol
            Next lngRow
            'Dump the updated array sans dirt over the initial range
            rngArea.Value2 = X
        Else
            'caters for a single cell range area. No variant array required
            rngArea.Value = objReg.Replace(rngArea.Value, "$1")
        End If
    Next rngArea

    'cleanup the Application settings
    With Application
        .ScreenUpdating = True
        .Calculation = lngCalc
        .EnableEvents = True
    End With
    Set objReg = Nothing
End Sub
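One caveat worth checking before running this on a mixed range: because the macro reads the cell values into an array and writes the array back, any formulas in the selected cells are replaced by their values. A minimal sketch of one way around that, assuming the dirty data is typed-in text rather than formula results, is to select only the text constants first and then run RemoveDirt, accepting the default range it offers. SelectTextConstants is a hypothetical helper name, not part of the article's code.

```vba
' Hypothetical helper: select only cells that hold text constants,
' so a clean-up macro run on the selection skips formulas and numbers.
Sub SelectTextConstants()
    Dim rngText As Range

    ' SpecialCells raises a run-time error when no cell matches
    On Error Resume Next
    Set rngText = ActiveSheet.UsedRange.SpecialCells(xlCellTypeConstants, xlTextValues)
    On Error GoTo 0

    If rngText Is Nothing Then
        MsgBox "No text constants found on this sheet."
    Else
        rngText.Select
    End If
End Sub
```

On sheets where the text cells are very scattered the resulting selection has many areas and a long address, so treat this as a starting point rather than a finished tool.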
Upvotes: 2