SGP
SGP

Reputation: 394

SQL Server: In a Select how to add column with ids for duplicate values

I need help to figure out how to do this in sql server, I've been searching tirelessly and up to now no luck, I'm sure this is not a duplicate question.

Let's say I have this result in a select:

  date value name
  16:02 429 nameIdontKnow1
  16:03 430 nameIdontKnow1
  16:04 433 nameIdontKnow1
  16:02 435 nameIdontKnow2
  16:03 430 nameIdontKnow2
  ...

What I need, is that in the select, I need to get an id for the identical values, but I don't know what will appear in the name column because the select selects from a top 10 of a bigger set. This is for a graph where the id will be used for the graph series and I cannot use the name column because I don't know what will be appearing there

  date value name           id
  16:02 429 nameIdontKnow1  1
  16:03 430 nameIdontKnow1  1
  16:04 433 nameIdontKnow1  1
  16:02 435 nameIdontKnow2  2
  16:03 430 nameIdontKnow2  2
  ...

so what I need is that it assigns the same id based on the different repeated values of the name column but not hardcoded "nameIdondtKnow1" in a case switch or something, because I don't know what values will be there at any given time.

for instance, next time the query runs it could be

  date value name           id
  16:02 429 nameIdontKnow6  1
  16:03 430 nameIdontKnow6  1
  16:02 433 nameIdontKnow8  2
  16:04 435 nameIdontKnow6  1
  16:02 430 nameIdontKnow9  3
  ...

my select looks like this

  Select CONVERT(VARCHAR(5),date,114) date, value, name, (idIdontknowhowtomake) as id 
  from tableA 
  where name in (
        select top 10 name from tableA 
        where date > dateadd(hour, -1, GETDATE()) 
        group by name order by max(value) 
   )

The total set of different names for now is 20, and my select will always return only 10 of those 20 which have in the last hour the highest number in the column value. The graph where it is going to was made in silverlight and cannot be changed, and since it would be published the ideal thing is that the query doesn't need to be changed every time a new name is added to the "graph pool" and since it will always select only 10 of the possible values I cannot use a switch case.

Thank you all in advance!

Upvotes: 0

Views: 59

Answers (2)

shawnt00
shawnt00

Reputation: 17915

So compute your top 10 and bring along the minimum value for each of the potential names in the top. Then number those rows according to the captured minimum by using dense_rank().

with m as (
    select top 10
        date, value name,
        min(value) over (partition by name) as min_per_name
    from T
    order by date -- ??
)
select *, dense_rank() over (order by min_per_name) as id
from m;

ErikE's query is fundamentally different. One thing to note is that his query is returning a top 10 ranking of names and then all rows with that name would appear in the final result, very likely resulting in more than ten rows. It also doesn't appear that name has anything to do with the ranking of your results anyways.

You can see a difference here: http://rextester.com/ZBHMN31712

Upvotes: 1

ErikE
ErikE

Reputation: 50241

You're looking for windowing functions, specifically, DENSE_RANK.

SELECT
   Date = Convert(varchar(5), date, 114),
   Value,
   Name,
   Id = DENSE_RANK() OVER (ORDER BY Name) 
FROM dbo.TableA 
WHERE
   Name IN (
      SELECT TOP 10 Name FROM dbo.TableA 
      WHERE Date > DateAdd(hour, -1, GetDate()) 
      GROUP BY Name
      ORDER BY Max(Value) 
   );

Also, best practice is to use schema names (dbo.TableName). Try to format indents and capitalization consistently.

Note that your query may not return 10 distinct Name values, if the top 10 items have duplicates. There is some subtlety there in what will be returned.

Upvotes: 1

Related Questions