Reputation: 421
I have a table.
create table tblProduct
(
ProductID int primary key identity(1000,1),
ProductName varchar(100),
ProductDescription nvarchar(max)
)
In this table, there are 1000 records like this...
ProductID=1001
ProductName='Apple i6'
ProductDescription='Lorem Ipsum is simply dummy text of the printing and typesetting industry. **Product of USA** Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.'
ProductID=1002
ProductName='Micromax Canvas'
ProductDescription='Scrambled it to make a type specimen bookLorem Ipsum is simply dummy text of the printing and typesetting industry. **Product of INDIA** Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.'
ProductID=1003
ProductName='Oppo Z3'
ProductDescription='Lorem Ipsum is simply dummy text of the printing and typesetting industry. **Product of INDIA** Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.'
and so on....
Now I want to find only country name (no duplicate and no other word) from the productDescription
column of the tblProduct
...
Output must be something like this:
Country Name - Total(Group By)
USA - 1
INDIA - 2
Note: "product of XXX*" will be available almost all rows of productdescription
column.
*xxx is the country name.
Upvotes: 0
Views: 7759
Reputation: 175716
You can use string manipulation functions LEFT/RIGHT/PATINDEX
:
WITH cte AS
(
SELECT
r = RIGHT(ProductDescription, LEN(ProductDescription) -
PATINDEX('%Product of%' ,ProductDescription) - 10)
FROM #tblProduct
WHERE PATINDEX('%Product of%' ,ProductDescription) > 0
)
SELECT country = LEFT(r, CHARINDEX(' ', r)-1), COUNT(*) AS Total
FROM cte
GROUP BY LEFT(r, CHARINDEX(' ', r)-1);
But you have to think about corner cases:
Russian Federation
U.S./USA/United States of America...
) you will get multiple groups, you need data cleansingNote: "product of XXX*" will be available almost all rows of productdescription column.
If you know all countries in advance it will be much easier. Just create table countries:
name master
'U.S.' 'USA'
'USA' 'USA'
'India' 'India'
SELECT c.master, COUNT(*) AS total
FROM #tblProduct p
JOIN countries c
ON p.Description LIKE '%Product of ' + c.name + '%'
GROUP BY c.master;
EDIT:
WITH cte AS
(
SELECT r = RIGHT(ProductDescription, LEN(ProductDescription) - PATINDEX('%Product of <strong>%' ,ProductDescription) - 18)
FROM #tblProduct
WHERE PATINDEX('%Product of <strong>%' ,ProductDescription) > 0
)
SELECT country = LEFT(r, CHARINDEX('<', r)-1), COUNT(*) AS Total
FROM cte
GROUP BY r,LEFT(r, CHARINDEX('<', r)-1);
Upvotes: 2