Delicious

Reputation: 968

US Census API - Get The Population of Every City in a State Using Python

I'm having an issue getting the population of every city in a specific state. I do get populations for the cities, but when I sum them I don't get the same number as the population of the state.

I got my API key, used the P0010001 variable for total population, used FIPS code 25 for the state of Massachusetts, and requested the population at the geography level "place", which I understand to mean city.

Here is the Python 3 code I used:

import urllib.request
import ast


class Census:
    def __init__(self, key):
        self.key = key

    def get(self, fields, geo, year=2010, dataset='sf1'):
        fields = [','.join(fields)]
        base_url = 'http://api.census.gov/data/%s/%s?key=%s&get=' % (str(year), dataset, self.key)
        query = fields
        for item in geo:
            query.append(item)
        add_url = '&'.join(query)
        url = base_url + add_url
        print(url)
        req = urllib.request.Request(url)
        response = urllib.request.urlopen(req)
        return response.read()

c = Census('<mykey>')
state = c.get(['P0010001'], ['for=state:25'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&for=state:25
county = c.get(['P0010001'], ['in=state:25', 'for=county:*'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&in=state:25&for=county:*
city = c.get(['P0010001'], ['in=state:25', 'for=place:*'])
# url: http://api.census.gov/data/2010/sf1?key=<mykey>&get=P0010001&in=state:25&for=place:*

# Cast result to list type
state_result = ast.literal_eval(state.decode('utf8'))
county_result = ast.literal_eval(county.decode('utf8'))
city_result = ast.literal_eval(city.decode('utf8'))

def count_pop_county():
    count = 0
    for item in county_result[1:]:
        count += int(item[0])
    return count

def count_pop_city():
    count = 0
    for item in city_result[1:]:
        count += int(item[0])
    return count

And here are the results:

print(state)
# b'[["P0010001","state"],\n["6547629","25"]]'

print('Total state population:', state_result[1][0])
# Total state population: 6547629

print('Population in all counties', count_pop_county())
# Population in all counties 6547629

print('Population in all cities', count_pop_city())
# Population in all cities 4615402

I'm reasonably sure that 'place' is the city, e.g.

# Get population of Boston (FIPS is 07000)
boston = c.get(['P0010001'], ['in=state:25', 'for=place:07000'])
print(boston)
# b'[["P0010001","state","place"],\n["617594","25","07000"]]'

What am I doing wrong or misunderstanding? Why is the sum of populations by place not equal to the population of the state?

List of example API calls

Upvotes: 6

Views: 5940

Answers (2)

Halbert

Reputation: 324

@Delicious -- the Census has several levels of geographic division available. I'm not immediately sure where the data API stops (the Census goes down to individual blocks, but I believe the API does not, for human-subjects reasons), but census tracts, county subdivisions, and ZCTAs (ZIP Code Tabulation Areas -- basically a ZIP code for the map) all cover complete geographic ranges and include the unincorporated population at the sub-county level.

You can play with these various levels (and with a mapping tool) at the census data website: factfinder.census.gov --> Advanced Search.
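
As a rough sketch of asking for a sub-county level other than "place", the snippet below only builds the request URL and sends nothing. The helper `build_url` is hypothetical, and the geography name `county subdivision` with an `in=state:25` clause is an assumption rather than something taken from the question; check the dataset's geography list for what it actually accepts.

```python
from urllib.parse import urlencode

BASE_URL = 'http://api.census.gov/data/2010/sf1'  # endpoint as used in the question


def build_url(fields, for_clause, in_clause=None, key=None):
    """Build a Census API query URL (hypothetical helper; no request is sent)."""
    params = {'get': ','.join(fields), 'for': for_clause}
    if in_clause:
        params['in'] = in_clause
    if key:
        params['key'] = key
    # Leave ':', '*' and ',' readable; spaces are encoded as '+'
    return BASE_URL + '?' + urlencode(params, safe=':*,')


# Assumed geography level; the name may differ in the real dataset
url = build_url(['P0010001'], 'county subdivision:*', 'state:25')
print(url)
```

If the level is accepted, the response should have the same header-plus-rows shape as the "place" query, so the same summing code would apply.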

Upvotes: 1

Alex Martelli

Reputation: 881437

if I sum the population in every city I don't get the same number as the population of the state.

That's because not everybody lives in a city -- there are rural "unincorporated areas" in many counties that are not part of any city, and people do live there.

So, this is not a programming problem!-)
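
To put a number on the gap, here is a minimal sketch that parses responses in the format printed in the question and subtracts the place total from the state total. The helper `total_population` is hypothetical, and it uses `json.loads` instead of the question's `ast.literal_eval` on the assumption that the API returns JSON. All figures are copied from the question.

```python
import json


def total_population(raw):
    """Sum the first column of a Census API response, skipping the header row."""
    rows = json.loads(raw.decode('utf8'))
    return sum(int(row[0]) for row in rows[1:])


# Raw responses copied from the question
state_raw = b'[["P0010001","state"],\n["6547629","25"]]'
boston_raw = b'[["P0010001","state","place"],\n["617594","25","07000"]]'

state_total = total_population(state_raw)
boston_total = total_population(boston_raw)
place_total = 4615402  # sum over all places, as reported in the question

outside_places = state_total - place_total
print('Living outside any place:', outside_places)
```

With the question's numbers this comes to 1,932,227 people counted in the state but not in any "place".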

Upvotes: 8
