Public API Architecture

Question

I've been working on a data transparency project, and one of its initiatives is to provide access to our data through APIs. So far, we've decided to use an API manager platform to expose and manage all the APIs. An API gateway (provided by the platform) will control all the requests.

Our main doubt right now is how the public APIs should be constructed. How should the architecture be planned? For example (a very simplified example):

(1) Replicating the production database:

(2) Not replicating the production database:

In case (1) (replicating), how can I deal with real-time data? For example, bus location?

In case (2), what kinds of concerns might I have (performance, security...)?

I've been trying to find well-known public API cases (like the Twitter API), but I haven't found anything yet about public API architecture and implementation.

Mavi Domates · Accepted Answer

I think a few conditions should be established first.

1. Your production is not (should not be) separate from your APIs.

Most companies consume their own external APIs one way or another. Why wouldn't they? Those APIs are reliable, monitored, distributed, and load balanced, so they make a great service to build on. They might have some internal versions of those APIs for testing / release ring purposes, but primarily their front end would be built on the same stack that you can call. So you shouldn't consider the API entirely separate from what you have in production. In this day and age, it's really just a part of the system.

2. Database replication

Both of those diagrams look incorrect to me. The top one looks incorrect because you are explicitly reading from another instance, and the bottom one looks incorrect because you don't have replication. Replication is a very lengthy topic; the short lesson is that you should definitely have replication. See https://en.wikipedia.org/wiki/Consistency_model for the different consistency models that apply when replicating data. You will have to make some trade-offs here. As an example, depending on your load and load pattern, you can replicate your DB, allow reads to be multi-regional, keep writes single-region, and deploy your business layer (which also powers your API) to multiple regions. (That is, if you have enough traffic.)
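To make the "multi-regional reads, single-region writes" idea a bit more concrete, here is a minimal sketch of how a business layer might pick a database endpoint per request. Everything in it is hypothetical (the `Endpoint` and `ReplicaRouter` names, the region labels, the round-robin policy); it is only an illustration of the routing decision, not how any particular database or API manager does it.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class Endpoint:
    """A database endpoint (hypothetical; name and region are illustrative)."""
    name: str
    region: str


class ReplicaRouter:
    """Send writes to the single primary and reads to a nearby replica."""

    def __init__(self, primary: Endpoint, replicas: List[Endpoint]) -> None:
        self.primary = primary
        self.replicas = list(replicas)
        # One round-robin counter per client region.
        self._counters: Dict[str, Iterator[int]] = {}

    def endpoint_for(self, operation: str, client_region: str) -> Endpoint:
        # Writes always go to the primary (single-region writes).
        if operation == "write":
            return self.primary
        # Reads prefer replicas in the caller's region, then any replica,
        # and finally the primary if no replica is configured at all.
        local = [r for r in self.replicas if r.region == client_region]
        candidates = local or self.replicas or [self.primary]
        counter = self._counters.setdefault(client_region, itertools.count())
        return candidates[next(counter) % len(candidates)]
```

Note that a read served by a replica can lag behind the primary; how much lag is acceptable is exactly the consistency trade-off mentioned above.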

3. Design concerns

Well, there are a few design issues with this, but let's address your questions first.

In case (1) (Replicating), how may I deal with real time data? For example: Bus Location?

Check out the different consistency types and select the one which fits your service best.
What's your ideal real-time experience? Is this a map application where you push notifications of the bus location to devices, or are you expecting users to connect and query to see when the next bus is? Depending on your needs, you might make different design decisions here. (Also, I highly doubt that your bus location is going to come from a DB.) Neither of them really affects the replication, though; consistency, read/write ratios, and overall load / load pattern are far more important in this case.
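As a rough illustration of bus locations not coming from the replicated DB, here is a minimal sketch of an in-memory "latest position" store that the API could query. The names (`BusPosition`, `LatestPositionStore`), the 30-second freshness limit, and the idea that vehicles report positions with a timestamp are all assumptions made for the example, not something from the question.

```python
import time
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class BusPosition:
    """One position report from a vehicle (hypothetical shape)."""
    bus_id: str
    lat: float
    lon: float
    reported_at: float  # Unix time in seconds


class LatestPositionStore:
    """Keep only the newest report per bus and hide stale ones."""

    def __init__(self, max_age_seconds: float = 30.0) -> None:
        self.max_age_seconds = max_age_seconds
        self._latest: Dict[str, BusPosition] = {}

    def update(self, position: BusPosition) -> None:
        # Ignore reports that arrive out of order (older than what we hold).
        current = self._latest.get(position.bus_id)
        if current is None or position.reported_at >= current.reported_at:
            self._latest[position.bus_id] = position

    def get(self, bus_id: str, now: Optional[float] = None) -> Optional[BusPosition]:
        # Return None for unknown buses and for positions that are too old
        # to be presented as "real time".
        now = time.time() if now is None else now
        position = self._latest.get(bus_id)
        if position is None or now - position.reported_at > self.max_age_seconds:
            return None
        return position
```

The same store could back either experience described above: a "where is the bus" query endpoint calls `get`, while a push-based map would notify subscribers from inside `update`.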

In case (2), what kind of concerns I may have? (performance, security...)

You shouldn't use case 2.

I think you should identify your users' and system's requirements before everything else. It's also very hard to design just from those diagrams, since we can't really know whether we're designing this system for a local district or for the whole world. Ideally you would have redundancies for everything (multiple servers serving network traffic, multiple DBs, multiple CDNs for your static content, etc.), so you'd have a higher quality of service and a much smaller chance of falling over. Sometimes even entire regions of cloud services go down due to natural disasters, so replicating across different regions is a good idea, but your system might not really need it. In all cases, your public API should not be separate from your production.
