Raptor776
Raptor776

Reputation: 190

SQL remove rows where top 1% of values

I am using the following query to remove all rows where volume is in the top 1%. I am structuring my query using the following stackoverflow question: Select top 10 percent, also bottom percent in SQL Server.

However, my query is generating an error. I was hoping for some input on what has to be changed.

CREATE TABLE TEST AS

WITH PERCENTILE AS
(
    SELECT SEGMENT,
           VOLUME,
           OUTLIER_VOL = NTILE(100) OVER (ORDER BY VOLUME)
    FROM OFFER_PERIOD_SEGMENT
)

SELECT *
FROM PERCENTILE
WHERE OUTLIER_VOL NOT IN (99,100)

I am receiving the following error:

CLI prepare error: [Oracle][ODBC][Ora]ORA-00923: FROM keyword not found where expected

Upvotes: 0

Views: 1169

Answers (2)

Raptor776
Raptor776

Reputation: 190

If someone stumbles upon this in the future, I want to add that I was calculating the percentiles incorrectly. The code below is used to calculate the percentiles, whereas in the above scenario you are creating 100 equal sized buckets in which your data is placed:

CREATE TABLE TEST AS

WITH PERCENTILE AS
(
    SELECT SEGMENT,
           VOLUME,
           PERCENT_RANK() OVER (ORDER BY VOLUME) AS OUTLIER_VOL
    FROM OFFER_PERIOD_SEGMENT
)

SELECT *
FROM PERCENTILE
WHERE OUTLIER_VOL < 0.99

Upvotes: 1

sticky bit
sticky bit

Reputation: 37472

Try to change

OUTLIER_VOL = NTILE(100) OVER (ORDER BY VOLUME)

to:

NTILE(100) OVER (ORDER BY VOLUME) OUTLIER_VOL

That <column alias> = <value> syntax is special to SQL Server I believe.

Upvotes: 3

Related Questions