Justin
Justin

Reputation: 4539

How to use substring with case statement in order to filter for two specific category and group all others in a single category

Beginner. So I am trying to take a data set like

Id, Name, Team, Games, PPG
1, Joe Smith, Wolves, 5, 14.1
2, Mike Bender, Bears, 6, 7.1
3, Jordan Shoeheart, Blazers, 5, 9.0
4, Sara Eckert, Tigers, 5, 4.1
5, Jason Beard, Bears, 6, 22.1,
6, Noel Redding, Wolves, 4, 1.5
7, Jared Herring, Wolves, 6, 5.5
8, Jennifer Bradley, Bears, 4, 3.2
9, Erica Medley, Tigers, 5, 8.3
10, Justin Awesome, Sharks, 6, 0.3

and answer the question, how count how many instances (records) for the Bears, how many for the Wolves, and how many for all other teams in TSQL. I need to use substrings because the data comes in various formats (extras chars on beginning or end at times). I am confused because I need to end up with a key value pair that would look something like this:

 Key        | Value
 -----------------
 Wolves     |  3
 Bears      |  3
 All Others |  4

I have tried something like:

SELECT CASE WHEN tbl.team LIKE '%Wolves%' THEN COUNT(tbl.id) end as [Value],
       CASE WHEN tbl.team LIKE '%Wolves%' THEN 'Wolves' END AS [Key]
       -- this isn't right because I couldn't at least get the values for the others right?

I am learning so I appreciate the help

Upvotes: 0

Views: 99

Answers (2)

Jason A. Long
Jason A. Long

Reputation: 4442

This should do the trick...

IF OBJECT_ID('tempdb..#TestData', 'U') IS NOT NULL 
DROP TABLE #TestData;

CREATE TABLE #TestData (
    Id INT NOT NULL PRIMARY KEY, 
    Name VARCHAR(50) NOT NULL,
    Team VARCHAR(50) NOT null,
    Games INT NOT NULL,
    PPG DECIMAL(9,2) NOT NULL 
    );
INSERT #TestData (Id, Name, Team, Games, PPG) VALUES
    (1, 'Joe Smith', 'Wolves', 5, 14.1),
    (2, 'Mike Bender', 'Bears', 6, 7.1),
    (3, 'Jordan Shoeheart', 'Blazers', 5, 9.0),
    (4, 'Sara Eckert', 'Tigers', 5, 4.1),
    (5, 'Jason Beard', 'Bears', 6, 22.1),
    (6, 'Noel Redding', 'Wolves', 4, 1.5),
    (7, 'Jared Herring', 'Wolves', 6, 5.5),
    (8, 'Jennifer Bradley', 'Bears', 4, 3.2),
    (9, 'Erica Medley', 'Tigers', 5, 8.3),
    (10, 'Justin Awesome', 'Sharks', 6, 0.3);

--==================================================

SELECT 
    gn.GroupName,
    PlayerCount = COUNT(1)
FROM
    #TestData td
    CROSS APPLY ( VALUES (CASE WHEN td.Team IN ('Bears', 'Wolves') THEN td.Team ELSE 'All Others' END) ) gn (GroupName)
GROUP BY 
    gn.GroupName;

Results...

GroupName  PlayerCount
---------- -----------
All Others 4
Bears      3
Wolves     3

Edit... (showing how to "fuzzy" match the team name)

SELECT 
    gn.GroupName,
    PlayerCount = COUNT(1)
FROM
    #TestData td
    CROSS APPLY ( VALUES (CASE 
                                WHEN td.Team LIKE '%Bears%' THEN 'Bears'
                                WHEN td.Team LIKE '%Wolves%' THEN 'Wolves' 
                                ELSE 'All Others' 
                            END) ) gn (GroupName)
GROUP BY 
    gn.GroupName;

HTH, Jason

Upvotes: 3

Oxana Urdaneta
Oxana Urdaneta

Reputation: 1

Based on what you need, all you care about is the counts per team. There is a function called TRIM which takes care of removing extra spaces on both sides of your text column. You can do

SELECT COUNT(1) AS Instances, TRIM(t.Team) AS team_without_extra_spaces
FROM   mytable t 
GROUP BY TRIM(t.Team) 

Note: If you are using Oracle, Mysql or Postgres this will work. If you are using SQL Server you will have to do LTRIM (left trim) and RTRIM (right trim) something like LTRIM(RTRIM(t.Team)).

Don't know how beginner you are so the GROUP BY serves to Group together all rows that have the same value, in this case you are saying to group by team so all Bears, all Wolves, etc become one group and then the COUNT(1) only gives you the count, per group. Hope it makes sense! You can read more about group by here: https://www.w3schools.com/sql/sql_groupby.asp

Upvotes: 0

Related Questions