Doug B
Doug B

Reputation: 180

SQL Server 2012: Remove text from end of string

I'm new to SQL so please forgive me if I use incorrect terminology and my question sounds confused.

I've been tasked with writing a stored procedure which will be sent 3 variables as strings (varchar I think). I need to take two of the variables and remove text from the end of the variable and only from the end.

The strings/text I need to remove from the end of the variables are

For example this string

Global Widgets LLC

would become

Global Widgets

However it should only apply once so

Global Widgets Corporation LLC

Should become

Global Widgets Corporation

I then need to use the altered variables to do a SQL query.

This is to be used as a backup for an integration piece we have which makes a callout to another system. The other system takes the same variables and uses Regex to remove the strings from the end of variables.

I've tried different combinations of PATINDEX, SUBSTRING, REPLACE, STUFF but cannot seem to come up with something that will do the job.

===============================================================

Edit: I want to thank everyone for the answers provided so far, but I left out some information that I didn't think was important but judging by the answers seems like it would affect the processing.

My proc will start something like

ALTER PROC [dbo].[USP_MyDatabaseTable] @variableToBeAltered nvarchar(50)
AS

I will then need to remove all , and . characters. I've already figured out how to do this. I will then need to do the processing on @variableToBeAltered (technically there will be two variables) to remove the strings I listed previously. I must then remove all spaces from @variableToBeAltered. (Again I figured that part out). Then finally I will use @variableToBeAltered in my SQL query something like

SELECT [field1] AS myField
    ,[field2] AS myOtherField
FROM [MyData].[dbo].[MyDatabaseTable] 
WHERE [field1] = (@variableToBeAltered);

I hope this information is more useful.

Upvotes: 2

Views: 3731

Answers (4)

Jevl
Jevl

Reputation: 974

use the replace function in SQL 2012,

declare @var1 nvarchar(20) = 'ACME LLC' 
declare @var2 nvarchar(20) = 'LLC' 

SELECT CASE 
            WHEN ((PATINDEX('%'+@var2+'%',@var1) <= (LEN(@var1)-LEN(@var2))) 
                Or (SUBSTRING(@var1,PATINDEX('%'+@var2+'%',@var1)-1,1) <> SPACE(1))) 
                THEN @var1 
            ELSE 
                REPLACE(@var1,@var2,'') 
            END

Here is another way to overcome the 'Runco Co' situation.

declare @var1 nvarchar(20) = REVERSE('Runco Co') 
declare @var2 nvarchar(20) = REVERSE('Co')

Select REVERSE(
    CASE WHEN(CHARINDEX(' ',@var1) > LEN(@var2)) THEN
        SUBSTRING(@var1,PATINDEX('%'+@var2+'%',@var1)+LEN(@var2),LEN(@var1)-LEN(@var2))
    ELSE
        @var1
    END
)

Upvotes: 0

DhruvJoshi
DhruvJoshi

Reputation: 17126

You can use a query like below:

-- Assuming that you can maintain all patterns in a table or a temp table
CREATE TABLE tbl(pattern varchar(100))
INSERT INTO tbl values
('co'),('llc'),('beta')

--@a stores the string you need to manipulate, @lw & @b are variables to aid
DECLARE @a nvarchar(100), @b nvarchar(100), @lw varchar(100)
SET @a='alpha beta gamma'
SET @b=''

-- @t is a flag 
DECLARE @t int
SET @t=0

-- Below is a loop
WHILE(@t=0 OR LEN(@a)=0 )
  BEGIN
    -- Store the current last word in the @lw variable
    SET @lw=reverse(substring(reverse(@a),1, charindex(' ', reverse(@a)) -1)) 

    -- check if the word is in pattern dictionary. If yes, then Voila!
    SELECT @t=1 FROM tbl WHERE @lw like pattern

    -- remove the last word from @a
    SET @a=LEFT(@a,LEN(@a)-LEN(@lw))

    IF (@t<>1) 
     BEGIN
      -- all words which were not pattern are joined back onto this stack
       SET @b=CONCAT(@lw,@b)
     END
  END

-- get back the remaining word
SET @a=CONCAT(@a,@b)
SELECT @a
drop table tbl

Do note that this method overcomes Tom's problem of

if you have a company name like "Watco" then the "co" would be a false positive here. I'm not sure what can be done about that other than maybe making your suffixes include a leading space.

Upvotes: 1

JosephStyons
JosephStyons

Reputation: 58685

Building on the answer given by Tom H, but applying across the entire table:

set nocount on;
declare @suffixes table(tag nvarchar(20));
insert into @suffixes values('co');
insert into @suffixes values('corp');
insert into @suffixes values('corporation');
insert into @suffixes values('company');
insert into @suffixes values('lp');
insert into @suffixes values('llc');
insert into @suffixes values('ltd');
insert into @suffixes values('limited');

declare @companynames table(entry nvarchar(100),processed bit default 0);
insert into @companynames values('somecompany llc',0);
insert into @companynames values('business2 co',0);
insert into @companynames values('business3',0);
insert into @companynames values('business4 lpx',0);

while exists(select * from @companynames where processed = 0)
begin
  declare @currentcompanyname nvarchar(100) = (select top 1 entry from @companynames where processed = 0);
  update @companynames set processed = 1 where entry = @currentcompanyname;
  update @companynames
  set entry = SUBSTRING(entry, 1, LEN(entry) - LEN(tag))
  from @suffixes
  where entry like '%' + tag
end
select * from @companynames

Upvotes: 1

Tom H
Tom H

Reputation: 47454

I'd keep all of your suffixes in a table to make this a little easier. You can then perform code like this either within a query or against a variable.

DECLARE @company_name VARCHAR(50) = 'Global Widgets Corporation LLC'

DECLARE @Suffixes TABLE (suffix VARCHAR(20))
INSERT INTO @Suffixes (suffix) VALUES ('LLC'), ('CO'), ('CORP'), ('CORPORATION'), ('COMPANY'), ('LP'), ('LTD'), ('LIMITED')

SELECT @company_name = SUBSTRING(@company_name, 1, LEN(@company_name) - LEN(suffix))
FROM @Suffixes
WHERE @company_name LIKE '%' + suffix

SELECT @company_name

The keys here are that you are only matching with strings that end in the suffix and it uses SUBSTRING rather than REPLACE to avoid accidentally removing copies of any of the suffixes from the middle of the string.

The @Suffixes table is a table variable here, but it makes more sense for you to just create it and fill it as a permanent table.

The query will just find the one row (if any) that matches its suffix with the end of your string. If a match is found then the variable will be set to a substring with the length of the suffix removed from the end. There will usually be a trailing space, but for a VARCHAR that will just get dropped off.

There are still a couple of potential issues to be aware of though...

First, if you have a company name like "Watco" then the "co" would be a false positive here. I'm not sure what can be done about that other than maybe making your suffixes include a leading space.

Second, if one suffix ends with one of your other suffixes then the ordering that they get applied could be a problem. You could get around this by only applying the row with the greatest length for suffix, but it gets a little more complicated, so I've left that out for now.

Upvotes: 4

Related Questions