Reputation: 49
I have a table which contains some bad data I am trying to clean up.
An example of the fields is below
36234735HAN876
2342JOE9823
554444PUT003
What I want to do is remove all the numeric characters before the first alphabetical character so it would look like the below:
HAN876
JOE9823
PUT003
What would be the best way to achieve this? I have used the below method but this can only be used to extract ALL numeric from the string, not the ones before the alphabetical characters
How to get the numeric part from a string using T-SQL?
Upvotes: 0
Views: 884
Reputation: 95554
Use PATINDEX
to find the first non-numeric character (or first alpha character, depending on the logic) and STUFF
to remove them:
SELECT STUFF(V.YourString,1,ISNULL(NULLIF(PATINDEX('%[^0-9]%',V.YourString),0)-1,0),'')
FROM (VALUES('36234735HAN876'),
('2342JOE9823'),
('554444PUT003'),
('ABC123'))V(YourString)
If the logic is the first alpha character, instead of the first non-numeric, then the pattern would be [A-z]
.
The NULLIF
and ISNULL
are in there for when/if the string starts with a alpha/non-numeric and thus doesn't cause STUFF
to error due to the 3rd parameter being -1
. The is demonstrated with the additional example I put into the sample data ('ABC123'
).
Upvotes: 1
Reputation: 16423
You could achieve this using PATINDEX
to locate the first position of an alphabetical character in the string, and then use SUBSTRING
to only return the characters after that position:
CREATE TABLE #temp (val VARCHAR(50));
INSERT INTO #temp VALUES ('36234735HAN876'), ('2342JOE9823'), ('554444PUT003'), ('TEST1234');
SELECT val,
SUBSTRING(val, PATINDEX('%[A-Z]%', val), LEN(val)) AS output
FROM #temp;
DROP TABLE #temp;
Outputs:
val output
36234735HAN876 HAN876
2342JOE9823 JOE9823
554444PUT003 PUT003
TEST1234 TEST1234
Note that I have created a temporary table with a column named val
. You should change this to work with whatever the actual column is called.
About case sensitivity: If you are using a non-case sensitive collation this will work without issue. If your collation is case sensitive then you may need to alter the pattern being matched to cater for upper- and lower-case letters.
Upvotes: 1