Veraaa
Veraaa

Reputation: 301

Calculate percentages of columns in Oracle SQL

I have three columns, all consisting of 1's and 0's. For each of these columns, how can I calculate the percentage of people (one person is one row/ id) who have a 1 in the first column and a 1 in the second or third column in oracle SQL?

For instance:

id    marketing_campaign    personal_campaign     sales
 1       1                    0                    0 
 2       1                    1                    0
 1       0                    1                    1
 4       0                    0                    1 

So in this case, of all the people who were subjected to a marketing_campaign, 50 percent were subjected to a personal campaign as well, but zero percent is present in sales (no one bought anything).

Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.

This is a fictional example, so I realize that in this example there are many other ways to do this, but I hope anyone can help!

The outcome that I'm looking for is something like this:

percentage marketing_campaign/ personal campaign = 50 %
percentage marketing_campaign/sales = 0%
etc (for all the three column combinations)

Upvotes: 1

Views: 13854

Answers (2)

Thomas G
Thomas G

Reputation: 10226

You can get your percentages like this :

SELECT COUNT(*),
       ROUND(100*(SUM(personal_campaign) / sum(count(*)) over ()),2) perc_personal_campaign,
       ROUND(100*(SUM(sales) / sum(count(*)) over ()),2) perc_sales
FROM (  
  SELECT ID,
    CASE 
      WHEN SUM(personal_campaign) > 0 THEN 1
      ELSE 0
    end  AS personal_campaign,
    CASE 
      WHEN SUM(sales) > 0 THEN 1
      ELSE 0
    end  AS sales   
  FROM the_table
  WHERE ID IN
    (SELECT ID FROM the_table WHERE marketing_campaign = 1)
  GROUP BY ID   
)

I have a bit overcomplicated things because your data is still unclear to me. The subquery ensures that all duplicates are cleaned up and that you only have for each person a 1 or 0 in marketing_campaign and sales


About your second question :

Ultimately, I want to find out the order in which people get to the sales moment. Do they first go from marketing campaign to a personal campaign and then to sales, or do they buy anyway regardless of these channels.

This is impossible to do in this state because you don't have in your table, either :

  • a unique row identifier that would keep the order in which the rows were inserted
  • a timestamp column that would tell when the rows were inserted.

Without this, the order of rows returned from your table will be unpredictable, or if you prefer, pure random.

Upvotes: 1

krokodilko
krokodilko

Reputation: 36127

Use count, sum and case expressions, together with basic arithmetic operators +,/,*

  • COUNT(*) gives a total count of people in the table
  • SUM(column) gives a sum of 1 in given column
  • case expressions make possible to implement more complex conditions

The common pattern is X / COUNT(*) * 100 which is used to calculate a percent of given value ( val / total * 100% )

An example:

SELECT 
    -- percentage of people that have 1 in marketing_campaign column
    SUM( marketing_campaign ) / COUNT(*) * 100 As marketing_campaign_percent,

    -- percentage of people that have 1 in sales column
    SUM( sales ) / COUNT(*) * 100 As sales_percent,

    -- complex condition:
    -- percentage of people (one person is one row/ id) who have a 1 
    -- in the first column and a 1 in the second or third column 
    COUNT( 
           CASE WHEN marketing_campaign = 1
                AND (  personal_campaign = 1 OR sales = 1 )
           THEN 1 END
       ) / COUNT(*) * 100 As complex_condition_percent
FROM table;

Upvotes: 1

Related Questions