Reputation: 2775
I need to set up AWS S3 access points for each data_uri passed in the inference variable, in order to provide cross-account URIs. I need to parse just the bucket name out of each data_uri and then create a resource for each one. How would I go about doing that?
Here is what I have so far:
resource "aws_s3_access_point" "s3_access_point" {
  count    = var.create ? 1 : 0
  for_each = var.inference
  bucket   = split("/", replace(each.value.image_uri, "s3://", ""))[0]
  name     = format("%s-%s", split("/", replace(each.value.image_uri, "s3://", ""))[0], "-access-point")
}
The variable would look like this:
{
  "inference": [
    {
      "data_uri": "s3://my_bucket/model.tar.gz"
    },
    {
      "data_uri": "s3://my_bucket_2/model.tar.gz"
    },
    {
      "data_uri": "s3://my_bucket_3/model.tar.gz"
    }
  ]
}
Upvotes: 1
Views: 1095
Reputation: 7546
I would recommend using split if the naming is this consistent. Also, you cannot mix count and for_each; use one or the other as the case requires.
note: This answer has been reworked based on the comments below
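locals {
  inference = [
    { "data_uri" : "s3://my_bucket/model.tar.gz" },
    { "data_uri" : "s3://my_bucket_2/model.tar.gz" },
    { "data_uri" : "s3://my_bucket_3/model.tar.gz" }
  ]
  # Map each data_uri to its bucket name: take the part after "//",
  # then everything up to the first "/".
  uri_bucket_map = {
    for x in local.inference : x.data_uri =>
    split("/", split("//", x.data_uri)[1])[0]
  }
}
resource "aws_s3_access_point" "s3_access_point" {
  for_each = local.uri_bucket_map
  bucket   = each.value
  # Access point names must be unique within an account and region, so derive
  # one per bucket instead of sharing a single variable across all instances.
  name = "${each.value}-access-point"
}
output "s3_access_point" {
  value = { for uri, ap in aws_s3_access_point.s3_access_point : uri => ap.arn }
}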
In this iteration, we generate a map from data_uri to bucket_name so that you can access both when needed. You could of course still use the toset version of the URIs and calculate the bucket within the resource, but since you preferred being able to access the bucket names elsewhere, I went with the map.
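For comparison, the toset variant mentioned above might look roughly like the sketch below. The names bucket_names and by_bucket are made up for illustration, and it assumes the same local.inference list as the main example.

```hcl
locals {
  # Set of bucket names parsed from each data_uri. toset() also removes
  # duplicates, so two URIs in the same bucket produce one entry.
  bucket_names = toset([
    for x in local.inference : split("/", split("//", x.data_uri)[1])[0]
  ])
}

# Hypothetical resource name, used here only to keep the variant separate.
resource "aws_s3_access_point" "by_bucket" {
  for_each = local.bucket_names
  bucket   = each.key
  name     = "${each.key}-access-point"
}
```

The trade-off is that instances are then keyed by bucket name rather than by data_uri, so any output built from them would map buckets, not URIs, to ARNs.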
After the resources are created, we then generate an output that is a map from data_uri to access point ARN.
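If the var.create toggle from the question is still needed, one possible way to keep it without count is to filter inside the for_each expression itself. This is only a sketch: it assumes var.create is a bool variable and reuses local.uri_bucket_map from the example above, and the per-bucket name is an illustrative choice.

```hcl
variable "create" {
  type    = bool
  default = true
}

resource "aws_s3_access_point" "s3_access_point" {
  # When var.create is false the "if" clause drops every element, leaving an
  # empty map and therefore zero instances (the for_each equivalent of
  # count = var.create ? 1 : 0).
  for_each = {
    for uri, bucket in local.uri_bucket_map : uri => bucket
    if var.create
  }
  bucket = each.value
  name   = "${each.value}-access-point"
}
```

Filtering in the for expression avoids writing a conditional whose two branches have different types, which Terraform can reject.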
Upvotes: 3
Reputation: 74084
The documentation for Terraform's regex function includes an example containing a pattern for matching the scheme and authority parts of URLs, which is a simplification of the pattern given in RFC 3986 Appendix B:
> regex("^(?:(?P<scheme>[^:/?#]+):)?(?://(?P<authority>[^/?#]*))?", "https://terraform.io/docs/")
{
  "authority" = "terraform.io"
  "scheme" = "https"
}
These S3 URLs seem to follow the usual URL productions and so one way to parse these would be to run them through that same regex pattern:
> regex("^(?:(?P<scheme>[^:/?#]+):)?(?://(?P<authority>[^/?#]*))?", "s3://my_bucket/model.tar.gz")
{
  "authority" = "my_bucket"
  "scheme" = "s3"
}
To do this I'd first derive a new value from local.inference which expands out those URLs:
locals {
  data_uris = [
    for o in local.inference : merge(
      { uri = o.data_uri },
      regex("^(?:(?P<scheme>[^:/?#]+):)?(?://(?P<authority>[^/?#]*))?", o.data_uri),
    )
  ]
}
The merge in the above is to combine the regex results with the original input URIs to get a single simple object for each of the inputs:
[
  {
    uri = "s3://my_bucket/model.tar.gz",
    scheme = "s3"
    authority = "my_bucket"
  },
  {
    uri = "s3://my_bucket_2/model.tar.gz",
    scheme = "s3"
    authority = "my_bucket_2"
  },
  {
    uri = "s3://my_bucket_3/model.tar.gz",
    scheme = "s3"
    authority = "my_bucket_3"
  },
]
We can then do a little more filtering/projection inside for_each to transform this into a set of bucket names, to filter them all out if var.create isn't set, and to skip any that aren't S3 URLs:
resource "aws_s3_access_point" "s3_access_point" {
  for_each = toset([
    for ap in local.data_uris : ap.authority
    if var.create && ap.scheme == "s3"
  ])
  bucket = each.key
  name   = "${each.key}-access-point"
}
The use of regex here is admittedly not ideal because it's pretty unclear what that regex pattern is doing. If you move forward with this then I would suggest including source code comments noting that this regex is parsing the URI and possibly linking to RFC 3986 Appendix B for its explanation of the pattern.
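As a rough illustration of that suggestion, the pattern could be pulled into its own commented local value. The name uri_pattern is hypothetical, and the sketch assumes the same local.inference list used earlier.

```hcl
locals {
  # Parses the scheme and authority parts out of a URI. This is a simplified
  # form of the pattern given in RFC 3986, Appendix B:
  #   https://www.rfc-editor.org/rfc/rfc3986#appendix-B
  # For "s3://my_bucket/model.tar.gz" it yields scheme = "s3" and
  # authority = "my_bucket", where the authority is the bucket name.
  uri_pattern = "^(?:(?P<scheme>[^:/?#]+):)?(?://(?P<authority>[^/?#]*))?"

  data_uris = [
    for o in local.inference : merge(
      { uri = o.data_uri },
      regex(local.uri_pattern, o.data_uri),
    )
  ]
}
```

Naming the pattern once also keeps the long regex string out of the for expression, which makes the expression itself easier to read.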
Upvotes: 1