Reputation: 13
I'm building a Laravel application that offers an authoring tool to customers. Each customer will get their own subdomain, e.g.:
customer-a.my-tool.com
customer-b.my-tool.com
My tool is hosted on Amazon in multiple regions, for performance but mostly for privacy-law reasons (GDPR++). Each customer has their data in only one region: Australian customers in Australia, European customers in Europe, and so on. So each customer's users must be directed to the correct region; if a European user ends up being served by the US region, their data won't be there.
We could solve this manually using DNS by simply pointing each subdomain to the correct IP, but we don't want to do this for two reasons: (1) updating the DNS might take up to 60 seconds, and we don't want the customer to wait; (2) the sites we've researched appear to use wildcard domains, for instance Slack and atlassian.net, and we know that atlassian.net also has multiple regions.
So the question is:
How can we use a wildcard domain and still route the traffic to the regions where the content is located?
Upvotes: 1
Views: 841
Reputation: 179404
How can we use a wildcard domain and still route the traffic to the regions where the content is located?
It is, in essence, not possible to do everything you are trying to do, given all of the constraints you are imposing: automatically, instantaneously, consistently, and with zero overhead, zero cost, and zero complexity.
But that isn't to say it's entirely impossible.
You have asserted that other vendors are using a "wildcard domain," but that concept is subtly different from what I suspect you believe it entails. Observing a wildcard in DNS, like *.example.com, does not prove anything to the exclusion of other possibilities, because wildcard records are overridden by more specific records.
For a tangible example that you can observe yourself: *.s3.amazonaws.com has a DNS wildcard. If you query some-random-non-existent-bucket.s3.amazonaws.com, you will find that it's a valid DNS record, and it routes to S3 in us-east-1. If you then create a bucket by that name in another region and query the DNS a few minutes later, you'll find that it has begun returning a record that points to the S3 endpoint in the region where you created the bucket. Yes, it was and is a wildcard record, but now there's a more specific record that overrides the wildcard. The override will persist for at least as long as the bucket exists.
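You can verify this yourself with a simple lookup; here's a minimal sketch using Python with the dnspython package (the bucket name is just a made-up placeholder, as above):

```python
# Resolve a made-up bucket subdomain under the S3 wildcard.
# Requires: pip install dnspython
import dns.resolver

name = "some-random-non-existent-bucket.s3.amazonaws.com"
for rr in dns.resolver.resolve(name, "A"):
    # This resolves even though no such bucket exists -- the wildcard answers.
    print(rr.address)
```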
Architecturally, other vendors that segregate their data by regions (rather than replicating it, which is another possibility, but not applicable to your scenario) must necessarily be doing something along one of these lines:
creating specific DNS records and accepting the delay until the DNS is ready or
implementing what I'll call a "hybrid" environment that behaves one way initially and a different way eventually: it uses specific DNS records to override a wildcard, and it has the ability to temporarily deliver a misrouted request to the correct cluster via a reverse proxy, allowing instantaneously correct behavior until the DNS propagates, or
operating an ongoing "two-tier" environment, using a wildcard with no more specific records to override it: an outer tier that is distributed globally, accepts any request, and uses internal routing records to deliver the request to an inner tier -- the correct regional cluster.
The first option really doesn't seem unreasonable. Waiting a short time for your own subdomain to be created seems reasonably common. But there are other options.
The second option, the hybrid environment, would simply require that the location your wildcard points to by default be able to do some kind of database lookup to determine where the request should go, and then proxy the request there. Yes, you would pay for inter-region transport if you implement this yourself in EC2, but only until the DNS update takes effect. Inter-region bandwidth between any two AWS regions costs substantially less than data transfer to the Internet -- far less than "double" the cost.
This might be accomplished in any number of ways that are relatively straightforward.
You must, almost by definition, have a master database of the site configuration somewhere, and that database could be consulted by the service that provides the proxying -- HAProxy and Nginx both support proxying, and both support Lua integrations that could be used to look up routing information, which could be cached and reused for as long as needed to handle the temporarily "misrouted" requests. (HAProxy also has static-but-updatable map tables and dynamic "stick" tables that can be manipulated at runtime by specially-crafted requests; Nginx may offer similar facilities.)
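To make the idea concrete, here is a minimal Python sketch of that lookup-cache-and-forward logic, standing in for what the Lua integration would do inside HAProxy or Nginx. The routing table, cluster hostnames, and TTL are all hypothetical placeholders; in practice the lookup would hit your master configuration database.

```python
"""Sketch of the hybrid proxy: find which regional cluster owns the
requested subdomain, cache the answer, and forward the request."""
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

ROUTING = {  # stand-in for the master configuration database
    "customer-a.my-tool.com": "https://eu-cluster.example.internal",
    "customer-b.my-tool.com": "https://us-cluster.example.internal",
}
CACHE = {}       # host -> (origin, fetched_at)
CACHE_TTL = 300  # seconds; long enough to ride out DNS propagation

def origin_for(host):
    cached = CACHE.get(host)
    if cached and time.time() - cached[1] < CACHE_TTL:
        return cached[0]
    origin = ROUTING.get(host)  # a real implementation queries the DB here
    CACHE[host] = (origin, time.time())
    return origin

class MisroutedRequestProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        origin = origin_for(self.headers.get("Host", "").split(":")[0])
        if origin is None:
            self.send_error(404, "Unknown subdomain")
            return
        # Relay the request to the correct regional cluster.
        with urllib.request.urlopen(origin + self.path) as upstream:
            self.send_response(upstream.status)
            for name, value in upstream.getheaders():
                if name.lower() not in ("transfer-encoding", "connection"):
                    self.send_header(name, value)
            self.end_headers()
            self.wfile.write(upstream.read())

if __name__ == "__main__":
    HTTPServer(("", 8080), MisroutedRequestProxy).serve_forever()
```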
But EC2 isn't the only way to handle this.
Lambda@Edge allows a CloudFront distribution to select a back-end based on logic -- such as a query to a DynamoDB table, or a call to another Lambda function that can query a relational database. Your "wildcard" CloudFront distribution could implement such a lookup, caching results in memory (container reuse allows very simple in-memory caching using nothing more than an object in a global variable). Once the DNS record propagates, the requests would go directly from the browser to the appropriate back-end. CloudFront is marketed as a CDN, but it is in fact a globally-distributed reverse proxy with an optional response-caching capability. This capability may not be obvious at first.
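As a hedged sketch of what such an origin-request trigger might look like (Lambda@Edge supports Python as well as Node.js; the table name, key schema, and fallback origin below are all assumptions):

```python
"""Sketch of a Lambda@Edge origin-request handler that routes each
subdomain to its regional cluster. Table and attribute names are
hypothetical; adapt them to your own routing store."""
import boto3

# Module-level state survives container reuse -- a free in-memory cache.
_cache = {}
_table = boto3.resource("dynamodb", region_name="us-east-1").Table("subdomain-routing")

def handler(event, context):
    request = event["Records"][0]["cf"]["request"]
    host = request["headers"]["host"][0]["value"]

    origin = _cache.get(host)
    if origin is None:
        item = _table.get_item(Key={"host": host}).get("Item")
        # e.g. {"host": "customer-a.my-tool.com", "origin": "eu-cluster.my-tool.com"}
        origin = item["origin"] if item else "default-cluster.my-tool.com"
        _cache[host] = origin

    # Send the request to the regional cluster (assumes the distribution's
    # default origin is a custom origin); the original Host header is kept
    # so the back-end application can identify the tenant.
    request["origin"]["custom"]["domainName"] = origin
    return request
```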
In fact, CloudFront and Lambda@Edge could be used in a scenario such as yours in either the "hybrid" environment or the "two-tier" environment. The outer tier is CloudFront -- which automatically routes requests to the edge on the AWS network that is nearest the viewer, at which point a routing decision can be made at the edge to determine the correct cluster of your inner tier to handle the request. You don't pay for anything twice here, since bandwidth from EC2 to CloudFront costs nothing. Nor will this hurt site performance, beyond the time needed for that initial database lookup; once your active containers have the result cached, the responsiveness of the site will not be impaired. CloudFront generally improves the responsiveness of sites even when most of the content is dynamic, because it optimizes both the network path and the protocol exchanges between the viewer and your back-end, with optimized TCP stacks and connection reuse (particularly helpful at reducing the multiple round trips required by TLS handshakes).
In fact, CloudFront seems to offer an opportunity to have it both ways -- an initially hybrid capability that automatically morphs into a two-tier infrastructure -- because CloudFront distributions also have wildcard functionality with overrides: a distribution with *.example.com handles all requests unless a distribution with a more specific domain name is provisioned, at which point the more specific distribution starts handling the traffic. CloudFront takes a few minutes before the new distribution overrides the wildcard, but when the switchover happens, it's clean. A few minutes after the new distribution is configured, you make a parallel DNS change to the newly assigned hostname for the new distribution, but CloudFront is designed in such a way that you do not have to tightly coordinate this change -- all endpoints will handle all domains, because CloudFront doesn't use the endpoint to make the routing decision; it uses SNI and the HTTP Host header.
This seems almost like a no-brainer. A default, wildcard CloudFront distribution is pointed to by a default, wildcard DNS record, and uses Lambda@Edge with a database lookup to identify which of your clusters handles a given subdomain; this is followed by the deployment -- automated, of course -- of a distribution for each of your customers, which already knows how to forward the request to the correct cluster, so no further database queries are needed once the subdomain is fully live. You'll need to ask AWS Support to increase your account's limit for the number of CloudFront distributions from the default of 200, but that should not be a problem.
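Automating that per-customer deployment is a single API call. Here is a hedged boto3 sketch with the DistributionConfig trimmed to the fields that matter for the routing story; the certificate ARN, origin hostname, and cache settings are placeholders you would supply:

```python
"""Sketch: provision a dedicated CloudFront distribution per customer."""
import time
import boto3

cloudfront = boto3.client("cloudfront")

def create_customer_distribution(subdomain, origin_domain, cert_arn):
    config = {
        "CallerReference": f"{subdomain}-{int(time.time())}",
        "Comment": f"Dedicated distribution for {subdomain}",
        "Enabled": True,
        "Aliases": {"Quantity": 1, "Items": [subdomain]},
        "Origins": {
            "Quantity": 1,
            "Items": [{
                "Id": "regional-cluster",
                "DomainName": origin_domain,  # e.g. the EU cluster's endpoint
                "CustomOriginConfig": {
                    "HTTPPort": 80,
                    "HTTPSPort": 443,
                    "OriginProtocolPolicy": "https-only",
                },
            }],
        },
        "DefaultCacheBehavior": {
            "TargetOriginId": "regional-cluster",
            "ViewerProtocolPolicy": "redirect-to-https",
            # Forward the Host header so the app can identify the tenant.
            "ForwardedValues": {
                "QueryString": True,
                "Cookies": {"Forward": "all"},
                "Headers": {"Quantity": 1, "Items": ["Host"]},
            },
            "TrustedSigners": {"Enabled": False, "Quantity": 0},
            "MinTTL": 0,
        },
        "ViewerCertificate": {
            "ACMCertificateArn": cert_arn,
            "SSLSupportMethod": "sni-only",
        },
    }
    response = cloudfront.create_distribution(DistributionConfig=config)
    # Point the customer's DNS record at this hostname once deployed.
    return response["Distribution"]["DomainName"]
```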
There are multiple ways to accomplish that database lookup. As mentioned before, the Lambda@Edge function can invoke a second Lambda function inside your VPC to query the database for routing instructions, or you could push the domain-location config to a DynamoDB global table, which would replicate your routing instructions across multiple DynamoDB regions (currently Virginia, Ohio, Oregon, Ireland, and Frankfurt); DynamoDB can be queried directly from a Lambda@Edge function.
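If you go the global-table route, the only edge-side wrinkle is picking the nearest replica. A hedged sketch follows; the replica list matches the regions named above, and the table name is an assumption:

```python
"""Sketch: query the nearest DynamoDB global-table replica from Lambda@Edge."""
import os
import boto3

REPLICA_REGIONS = {"us-east-1", "us-east-2", "us-west-2", "eu-west-1", "eu-central-1"}

def routing_table():
    # Lambda sets AWS_REGION to the region actually executing the function,
    # which for Lambda@Edge is the region serving the edge location that
    # received the request.
    region = os.environ.get("AWS_REGION", "us-east-1")
    if region not in REPLICA_REGIONS:
        region = "us-east-1"  # fall back to the primary replica
    return boto3.resource("dynamodb", region_name=region).Table("subdomain-routing")

def lookup_origin(host):
    item = routing_table().get_item(Key={"host": host}).get("Item")
    return item["origin"] if item else None
```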
Upvotes: 2