Reputation: 1105
I am willing to show Google Analytics and Google Search Console data directly into Superset through their API.
How can I do so?
I couldn't find a Google Analytics datasource. I couldn't find a Google Search Console datasource either.
I can't find a way to display in Superset data retrieved from an API, only data stored in a database. I must be missing something, but I can't find anything in the docs related to authenticating & querying external APIs.
Upvotes: 8
Views: 16070
Reputation: 389
There is this project named shillelagh by one of Superset's contributors. This gives a SQL interface to REST APIs. This same package is used in Apache Superset to connect with gsheets.
New adapters are relatively easy to implement. There's a step-by-step tutorial that explains how to create a new adapter to an API or filetype in shillelagh.
The package shillelagh underlying uses SQLite Virtual Tables by using the SQLite wrapper APSW
Upvotes: 8
Reputation: 691
Superset can’t query external data API’s directly. Superset has to work with a supported database or data engine (https://superset.incubator.apache.org/installation.html#database-dependencies). This means that you need to find a way to fetch data out of the API and store it in a supported database / data engine. Some options:
Build a little Python pipeline that will query the data API, flatten the data to something tabular / relational, and upload that data to a supported data source - https://superset.incubator.apache.org/installation.html#database-dependencies - and set up Superset so it can talk to that database / data engine.
For more robust solutions, you may want to work with an devops / infrastructure to stand up a workflow scheduler like Apache Airflow (https://airflow.apache.org/) to regularly ping this API and store it in a database of some kind that Superset can talk to.
If you want to regularly query data from a popular 3rd party API, I also recommend checking out Meltano and learning more about Singer taps. These will handle some of the heavy lifting of fetching data from an API regularly and storing it in a database like Postgres. The good news is that there's a Singer tap for Google Analytics - https://github.com/singer-io/tap-google-analytics
Either way, Superset is just a thin layer above your database / data engine. So there’s no way around the reality that you need to find a way to extract data out of an API and store it in a compatible data source.
Upvotes: 9
Reputation: 175
Redash is an alternative to Superset for that task, but it doesn't have the same features. Here is a compared list of integrations for both tools: https://discuss.redash.io/t/a-comparison-of-redash-and-superset/1503
A quick alternative is paying for a third party service like: https://www.stitchdata.com/integrations/google-analytics/superset/
Upvotes: 2
Reputation: 5535
There is no such connector available by default.
A recommended solution would be storing your Google Analytics and Search Console data in a database, you could write a script that pulls data every 4 hours or whichever interval works for you.
Also, you shouldn't store all data but only the dimension/metrics you wish to see in your reports.
Upvotes: 1