Reputation: 453
I am looking to take a string of directory path and parse information out of it into existing columns on another table. This is for the purpose of creating a staging table for reporting. It will be parsing many directory paths if the ProjectName is applicable to the change in structure.
Data Example:
Table1_Column1
ProjectName\123456_ProjectShortName\Release_1\Iteration\etc
Expected Output:
Table2_Column1, Table2_Column2
123456 ProjectShortName
I've figured out how to parse some strings by character but it seems a bit clunky and inefficient. Is there a better structure to go about this? To add some more to it, this is just one column I need to manipulate before shifting it over there are three other columns that are being directly shifted to the staging table based on the ProjectName.
Is it better to just create a UDF to split then call it within the job that will move the data or is there another way?
Upvotes: 0
Views: 91
Reputation: 29647
Here's a method without a UDF.
It uses charindex and substring to get the parts from that path string.
An example using a table variable:
declare @T table (Table1_Column1 varchar(100));
insert into @T values
('ProjectName\123456_ProjectShortName\Release_1\Iteration\etc'),
('OtherProjectName\789012_OtherProjectShortName\Release_2\Iteration\xxx');
select
case
when FirstBackslashPos > 0 and FirstUnderscorePos > 0
then substring(Col1,FirstBackslashPos+1,FirstUnderscorePos-FirstBackslashPos-1)
end as Table1_Column1,
case
when FirstUnderscorePos > 0 and SecondBackslashPos > 0
then substring(Col1,FirstUnderscorePos+1,SecondBackslashPos-FirstUnderscorePos-1)
end as Table1_Column2
from (
select
Table1_Column1 as Col1,
charindex('\',Table1_Column1) as FirstBackslashPos,
charindex('_',Table1_Column1) as FirstUnderscorePos,
charindex('\',Table1_Column1,charindex('\',Table1_Column1)+1) as SecondBackslashPos
from @T
) q;
If you want to calculate only one into a variable
declare @ProjectPath varchar(100);
set @ProjectPath = 'ProjectName\123456_ProjectShortName\Release_1\Iteration\etc';
declare @FirstBackslashPos int = charindex('\',@ProjectPath);
declare @FirstUnderscorePos int = charindex('_',@ProjectPath,@FirstBackslashPos);
declare @SecondBackslashPos int = charindex('\',@ProjectPath,@FirstBackslashPos+1);
declare @ProjectNumber varchar(30) = case when @FirstBackslashPos > 0 and @FirstUnderscorePos > 0 then substring(@ProjectPath,@FirstBackslashPos+1,@FirstUnderscorePos-@FirstBackslashPos-1)end;
declare @ProjectShortName varchar(30) = case when @FirstUnderscorePos > 0 and @SecondBackslashPos > 0 then substring(@ProjectPath,@FirstUnderscorePos+1,@SecondBackslashPos-@FirstUnderscorePos-1) end;
select @ProjectNumber as ProjectNumber, @ProjectShortName as ProjectShortName;
But i.m.h.o. it might be worth the effort to add some CLR that brings true regex matching to the SQL server. Since CHARINDEX and PATINDEX are not as flexible as regex.
Upvotes: 1
Reputation: 81950
The following is a SUPER fast Parser but it is limited to 8K bytes. Notice the Returned Sequence Number... Perhaps you can key off of that because I am still not clear on the logic for why column1 is 123456 and not ProjectName
Declare @String varchar(max) = 'ProjectName\123456_ProjectShortName\Release_1\Iteration\etc'
Select * from [dbo].[udf-Str-Parse-8K](@String,'\')
Returns
RetSeq RetVal
1 ProjectName
2 123456_ProjectShortName
3 Release_1
4 Iteration
5 etc
The UDF if needed
CREATE FUNCTION [dbo].[udf-Str-Parse-8K] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(N) As (Select Top (IsNull(DataLength(@String),0)) Row_Number() over (Order By (Select NULL)) From (Select N=1 From cte1 a,cte1 b,cte1 c,cte1 d) A ),
cte3(N) As (Select 1 Union All Select t.N+DataLength(@Delimiter) From cte2 t Where Substring(@String,t.N,DataLength(@Delimiter)) = @Delimiter),
cte4(N,L) As (Select S.N,IsNull(NullIf(CharIndex(@Delimiter,@String,s.N),0)-S.N,8000) From cte3 S)
Select RetSeq = Row_Number() over (Order By A.N)
,RetVal = Substring(@String, A.N, A.L)
From cte4 A
);
--Much faster than str-Parse, but limited to 8K
--Select * from [dbo].[udf-Str-Parse-8K]('Dog,Cat,House,Car',',')
--Select * from [dbo].[udf-Str-Parse-8K]('John||Cappelletti||was||here','||')
Upvotes: 0