Reputation: 77
I've been working on a data transparency project, and one of its initiatives is to provide access to our data through APIs. So far, we've decided to use an API manager platform to expose and manage all the APIs. An API gateway (provided by the platform) will control all the requests.
Our main doubt right now is how the public APIs should be constructed. How should their architecture be planned? For example (very simplified):
(1) Replicating the production database:
(2) Not replicating the production database:
In case (1) (Replicating), how may I deal with real time data? For example: Bus Location?
In case (2), what kind of concerns might I have? (performance, security...)
I've been trying to find well-known public API cases (like the Twitter API), but I haven't found anything yet about public API architecture and implementation.
Upvotes: 2
Views: 520
Reputation: 4521
I think a few conditions should be established first.
Most companies consume their own external APIs one way or another. Why wouldn't they? Those APIs are reliable, monitored, distributed, and load balanced, so they make a great service to build on. Companies might keep internal versions of those APIs for testing or release-ring purposes, but their front end is primarily built on the same stack that you can call. So you shouldn't consider the API entirely separate from what you have in production. In this day and age, it's really just a part of the system.
Both of those diagrams look incorrect to me. The top one looks incorrect because you are explicitly reading from another instance, and the bottom one looks incorrect because you don't have replication. Replication is a very lengthy topic; the short lesson is that you should definitely have it. See this link for the different consistency models that apply when replicating data: https://en.wikipedia.org/wiki/Consistency_model You will have to make some trade-offs here. As an example, depending on your load and load pattern, you can replicate your DB, allow reads to be multi-regional, keep writes single-region, and deploy your business layer, which also powers your API, to multiple regions. (That is, if you have enough traffic.)
Well, there are a few design issues with this, but let's address your questions first.
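To make the "single-region writes, multi-region reads" idea a bit more concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `Region` and `ReplicatedStore` classes, the region names, and the explicit `replicate()` call that stands in for whatever asynchronous replication your database actually provides. It only shows the trade-off you accept, namely that a regional read can be stale until replication catches up.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """Hypothetical stand-in for a database instance in one region."""
    name: str
    rows: dict = field(default_factory=dict)


class ReplicatedStore:
    """Single-region writes, multi-region reads, replication on demand."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = {r.name: r for r in replicas}
        self._pending = []  # writes not yet copied to the replicas

    def write(self, key, value):
        # All writes go to the single primary region.
        self.primary.rows[key] = value
        self._pending.append((key, value))

    def read(self, key, region):
        # Reads are served by the caller's regional replica when one exists,
        # so they may be stale until replicate() has run.
        source = self.replicas.get(region, self.primary)
        return source.rows.get(key)

    def replicate(self):
        # Stand-in for asynchronous replication: apply pending writes everywhere.
        for key, value in self._pending:
            for replica in self.replicas.values():
                replica.rows[key] = value
        self._pending.clear()


store = ReplicatedStore(Region("eu-primary"), [Region("us"), Region("asia")])
store.write("bus-42", (52.52, 13.40))
stale = store.read("bus-42", "us")   # replica has not caught up yet
store.replicate()
fresh = store.read("bus-42", "us")   # replica now has the write
```

For something like a bus location, how long that stale window is allowed to be is exactly the consistency trade-off mentioned above.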
In case (1) (Replicating), how may I deal with real time data? For example: Bus Location?
In case (2), what kind of concerns might I have? (performance, security...)
You shouldn't use case 2.
I think you should identify your users' and your system's requirements before anything else. It's also very hard to design from those diagrams alone, since we can't really know whether we're designing this system for a local district or for a world-wide audience. Ideally you would have redundancy for everything (multiple servers serving network traffic, multiple DBs, multiple CDNs for your static content, etc.), so you'd have a higher quality of service and a much smaller chance of falling over. Sometimes even entire regions of cloud services go down due to natural disasters, so replicating across different regions is a good idea, but your system might not really need it. In all cases, your public API should not be separate from your production.
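As a rough illustration of what that redundancy buys you, here is a small failover sketch. The backends are plain hypothetical functions (`region_down`, `region_up`) rather than real servers, and in practice a load balancer or the API gateway would usually do this for you, so treat it as a sketch of the idea and not a recommended implementation.

```python
def call_with_failover(backends, request):
    """Try each redundant backend in order; return the first successful reply."""
    errors = []
    for backend in backends:
        try:
            return backend(request)
        except ConnectionError as exc:
            errors.append(exc)  # remember the failure and try the next one
    raise ConnectionError(f"all {len(backends)} backends failed: {errors}")


def region_down(request):
    # Hypothetical backend simulating a region-wide outage.
    raise ConnectionError("region unavailable")


def region_up(request):
    # Hypothetical healthy backend.
    return f"ok: {request}"


reply = call_with_failover([region_down, region_up], "GET /buses/42")
```

With only one backend in the list, the same outage would surface to the caller as an error, which is the "falling over" case.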
Upvotes: 2