dennis vlaar

Reputation: 53

Python function to select all municipalities

I want to import some data from CBS, the Dutch statistics databank, and I need to select all the municipalities. Each one has a code that starts with GM followed by four digits.

Do I have to type them all in, or is there a quicker way to select them all at once?

import cbsodata
import pandas as pd

# Download a selection of the data
data = pd.DataFrame(
    cbsodata.get_data('70072ned',
                      filters="RegioS eq 'GM0003', 'GM0004', 'GM0005'",
                      select=['RegioS', 'Vrouwen_3', 'Mannen_2']))
print(data.head())

Upvotes: 0

Views: 153

Answers (2)

Jonathan Leon

Reputation: 5648

It looks like you can use OData querying here. Try the following; once you see how it works, change GM019 to just GM and it will return what you need.

pd.DataFrame(
     cbsodata.get_data('70072ned',
     filters="startswith(RegioS, 'GM019')"))

This returns every region whose code starts with GM019.

The same data, restricted to selected columns:

pd.DataFrame(
     cbsodata.get_data('70072ned', 
     filters="startswith(RegioS, 'GM019')",
     select=['RegioS', 'Vrouwen_3', 'Mannen_2']))

Side note: when I requested everything (no filters or select), the dataset wasn't large, but the download still took a couple of minutes.
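To make the "change GM019 to just GM" step concrete, here is a minimal sketch. The helper names `startswith_filter` and `is_municipality_code` are made up for illustration and are not part of cbsodata. The regex check is an optional local sanity check on the returned codes, based on the "GM plus four digits" pattern from the question; the `strip()` is only there in case codes come back padded with spaces, which is an assumption.

```python
import re

# Hypothetical helper (not part of cbsodata): builds an OData
# startswith filter string for a given column and prefix.
def startswith_filter(column: str, prefix: str) -> str:
    return f"startswith({column}, '{prefix}')"

# Per the question, a municipality code is "GM" followed by four digits.
GM_CODE = re.compile(r"GM\d{4}")

def is_municipality_code(code: str) -> bool:
    # strip() guards against padded codes; that padding is an assumption.
    return GM_CODE.fullmatch(code.strip()) is not None

municipality_filter = startswith_filter("RegioS", "GM")
print(municipality_filter)  # startswith(RegioS, 'GM')
```

You would then pass `filters=municipality_filter` to `cbsodata.get_data('70072ned', ...)` exactly as in the snippets above, and could use `is_municipality_code` on the RegioS column afterwards if you want to double-check the result.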

Upvotes: 1

vladsiv

Reputation: 2936

I'm not sure how cbsodata.get_data works, but it seems to me that you could generate the filter string.

filters = "RegioS eq " + ", ".join(["'GM" + str(i).zfill(4) + "'" for i in range(3, 8)])

This gives you:

"RegioS eq 'GM0003', 'GM0004', 'GM0005', 'GM0006', 'GM0007'"

You can then pass this as the filters argument.

Example:

filters = "RegioS eq " + ", ".join(["'GM" + str(i).zfill(4) + "'" for i in range(3, 8)])

data = pd.DataFrame(
    cbsodata.get_data(
        "70072ned",
        filters=filters,
        select=["RegioS", "Vrouwen_3", "Mannen_2"],
    )
)
print(data.head())
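One caveat: I haven't checked how the CBS service handles the comma-separated form, but standard OData $filter syntax spells out each comparison and joins them with `or`. If the comma version is rejected, a variant along these lines may work. The helper name `or_filter` is hypothetical, and the range 3 to 7 is just the same illustrative range as above, not a list of real municipality codes.

```python
from typing import Iterable

# Hypothetical helper (not part of cbsodata): joins one "eq" comparison
# per code with "or", as standard OData $filter syntax expects.
def or_filter(column: str, codes: Iterable[str]) -> str:
    return " or ".join(f"{column} eq '{code}'" for code in codes)

# Same illustrative range as above: GM0003 through GM0007.
codes = [f"GM{i:04d}" for i in range(3, 8)]
filters = or_filter("RegioS", codes)
print(filters)
```

The resulting string can be passed as `filters=filters` in the same `cbsodata.get_data` call. For a long list of codes this string gets big quickly, so a prefix filter is probably the better fit when you really want every municipality.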

Upvotes: 1
