Mona Jalal
Mona Jalal

Reputation: 38315

Why percentage is not working properly in SQLite3?

I have the following code for the following question however percentage is happening just to be zero:

SELECT p.state, (p.popestimate2011/sum(p.popestimate2011)) * 100
FROM pop_estimate_state_age_sex_race_origin p
WHERE p.age >= 21
GROUP BY p.state;

enter image description here

Also here's the table schema:

sqlite> .schema pop_estimate_state_age_sex_race_origin
CREATE TABLE pop_estimate_state_age_sex_race_origin (
   sumlev NUMBER, 
   region NUMBER, 
   division NUMBER, 
   state NUMBER, 
   sex NUMBER, 
   origin NUMBER, 
   race NUMBER, 
   age NUMBER, 
   census2010pop NUMBER, 
   estimatesbase2010 NUMBER, 
   popestimate2010 NUMBER, 
   popestimate2011 NUMBER, 
   PRIMARY KEY(state, age, sex, race, origin), 
   FOREIGN KEY(sumlev) REFERENCES SUMLEV(sumlev_cd), 
   FOREIGN KEY(region) REFERENCES REGION(region_cd), 
   FOREIGN KEY(division) REFERENCES DIVISION(division_cd), 
   FOREIGN KEY(sex) REFERENCES SEX(sex_cd), 
   FOREIGN KEY(race) REFERENCES RACE(race_cd), 
   FOREIGN KEY(origin) REFERENCES ORIGIN(origin_cd));

So when I run the query it just shows 0 for the percentage:

stat  p.popestimate
----  -------------
1     0            
2     0            
4     0            
5     0            
6     0            
8     0            
9     0            
10    0            
11    0            
12    0            
13    0            
15    0            
16    0            
17    0            
18    0            
19    0            
20    0            
21    0            
22    0            
23    0    

I was trying to write it using nested queries by didn't get anywhere too:

    SELECT p.state, 100.0 * sum(p.popestimate2011) / total_pop AS percentage
FROM pop_estimate_state_age_sex_race_origin p
JOIN (SELECT state, sum(p2.popestimate2011) AS total_pop
      FROM pop_estimate_state_age_sex_race_origin p2) s ON (s.state = p.state)
WHERE age >= 21
GROUP BY p.state, total_pop
ORDER BY p.state;

The current problem I am having is that it just shows one row as result and just shows the result for the last state number (state ID=56):

56    0.131294163192301

Upvotes: 0

Views: 945

Answers (4)

josepn
josepn

Reputation: 367

The NUMBER type does not exist in SQLite. SQLite interprets as INTEGER and decimals are lost in an integer division

(p.popestimate2011 / sum (p.popestimate2011))

is always 0.

Change the type of the column popestimate2011 REAL or use CAST (...)

(CAST (p.popestimate2011 AS REAL) / SUM (p.popestimate2011))

Upvotes: 0

PM 77-1
PM 77-1

Reputation: 13352

You need something along these lines (not tested):

SELECT state,  SUM(popestimate2011) / 
               (SELECT SUM(popestimate2011) 
                FROM pop_estimate_state_age_sex_race_origin 
                WHERE age > 21)))      
               * 100 as percentage
FROM pop_estimate_state_age_sex_race_origi
WHERE age >= 21
GROUP by state      
;

Upvotes: 0

Larry Lustig
Larry Lustig

Reputation: 51008

Here's an approach (not tested) that does not require an inner query. It makes a single pass over the table, aggregating by state, and using CASE to calculate the numerator of population aged over 20 and denominator of total state population.

 SELECT 
    state,
    (SUM(CASE WHEN age >= 21 THEN popestimate2011 ELSE 0) / SUM(popestimate2011)) * 100
 FROM pop_estimate_state_age_sex_race_origin 
 GROUP BY state

Upvotes: 3

Larry Lustig
Larry Lustig

Reputation: 51008

I'm not sure why your SQL statement is executing at all. You are including the non-aggregated column value popestimate2011 in a GROUP BY select and that should generate an error.

A closer reading of the SQLite documentation indicates that it does, in fact, support random value selection for non-aggregate columns in the result expression list (a feature also offered by MySQL). This explains:

  1. Why your SELECT statement is able to execute (a random value is chosen for the non-aggregated popestimate2011 reference).
  2. Why you are seeing a result of 0: the random value chosen is probably the first occurring row and if the rows were added to the database in order that row probably has an age value of 0. Since the numerator in your division would then be 0, the result is also 0.

As to the meat of your calculation it's not clear from your table definition whether the data in your base table is already aggregated or not and, if so, what the age column represents (an average? the grouping factor for that row?)

Finally, SQLite does not have a NUMBER data type. These columns will get the default affinity of NUMERIC which is probably what you want but might not be.

Upvotes: 1

Related Questions