Duncan Drennan

Reputation: 920

How do I efficiently configure AWS IoT rules to write data to a device specific Timestream database?

I want to store IoT device data in a device specific table in a Timestream database. This will allow us to give users access to their particular device data only based on a specific IAM or Cognito policy.

Data from the devices would look something like this with the id being a Timestream dimension and temperature a measure.

{
  "ts": 1619815725,
  "id": "device_12345",
  "temperature": 47.2
}

and it will be published to the topic

devices/data/device_12345

There is a Timestream database, device_data, containing one table per device, in this case device_12345.

Now we can create a rule which pushes the data from that device to the particular table, like this:

[Image: device to timestream rule]

which selects the relevant data from the exact endpoint. The action looks like this:

[Image: timestream rule action]

The role is configured to only allow writes to the specific table in the Timestream database. A policy can be attached to the device to allow only the specific device to write to that particular MQTT endpoint (which prevents other devices from accidentally writing to it).

A policy can then be configured for a user to only allow read access to that table to query data from their device only.

In this case the following have to be in place for each device:

  1. A policy which limits the MQTT publish endpoint to the device endpoint (in this case devices/data/device_12345)
  2. A rule which pushes the data to Timestream, with both the specific device endpoint and the specific device database table set correctly
  3. An IAM policy which allows that rule to write data to the device specific table in the database

Now the questions:

  1. Is it possible to configure a generic rule which looks at the device ID, or some information about the IoT thing and pushes it to the correct database table based on this info? e.g. can the rule look at the id dimension and based on that push it to the correct table?
  2. How can this be automated if we have to set up each of those items for every one of the millions of devices?
  3. Is having millions of device specific policies and rules the most effective/efficient way to do this?

Upvotes: 0

Views: 1660

Answers (1)

Duncan Drennan

Reputation: 920

It is possible to configure a set of policies and rules which can be attached to the IoT thing and the user to bound the access that devices have as well as the access that an end user has.

On a high level the following can be done:

  1. The IoT thing has a certificate which is uniquely linked to a device
  2. A policy can be linked to the certificate which only allows the device to publish to particular topics
  3. An IoT Core rule can be set up to push data to a device specific Timestream table based on the topic the device publishes to
  4. An end user can be given read rights to access a specific Timestream table

The smallest IAM resource resolution for a Timestream database is a table (see https://docs.aws.amazon.com/timestream/latest/developerguide/security_iam_service-with-iam.html), so the only way to limit user access to their own data is to contain device data within its own table and then give the user access rights to the tables with their device(s) data.

1. IoT certificate

This is part of the basic AWS IoT Core thing setup. Once the device has a private key and certificate it can connect to IoT Core and publish/subscribe according to the policy attached to the certificate.

2. Set up a generic policy using IoT Thing policy variables

AWS IoT Thing policy variables info can be found here: https://docs.aws.amazon.com/iot/latest/developerguide/thing-policy-variables.html

The following policy can be attached to any certificate and will only allow the device to publish to the topic devices/data/<thing name>, e.g. devices/data/device_12345 for the Thing device_12345. The thing policy variable ${iot:Connection.Thing.ThingName} is substituted with the actual thing name. This is a minimal policy which allows connecting and publishing only to the one topic.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "iot:Connect",
      "Resource": "arn:aws:iot:us-east-1:123456789:client/${iot:Connection.Thing.ThingName}"
    },
    {
      "Effect": "Allow",
      "Action": "iot:Publish",
      "Resource": "arn:aws:iot:us-east-1:123456789:topic/devices/data/${iot:Connection.Thing.ThingName}"
    }
  ]
}

Any device with this policy attached to its certificate will only be able to publish to the topic with its own thing name. This is a generic policy which can be attached to any Thing due to the policy variable substitution.
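To see what the substitution does in practice, here is a small Python sketch that mimics it locally. It only illustrates the effect; it is not how AWS IoT evaluates policies internally, and the helper names allowed_publish_topic and may_publish are made up for this example.

```python
# Illustration only: mimics the effect of the thing policy variable locally.
# AWS IoT performs the real substitution and authorization on the broker side.

THING_NAME_VARIABLE = "${iot:Connection.Thing.ThingName}"
POLICY_TOPIC_RESOURCE = (
    "arn:aws:iot:us-east-1:123456789:topic/devices/data/" + THING_NAME_VARIABLE
)


def allowed_publish_topic(thing_name: str) -> str:
    """Return the single topic the policy resolves to for this thing (hypothetical helper)."""
    resource = POLICY_TOPIC_RESOURCE.replace(THING_NAME_VARIABLE, thing_name)
    # Everything after ":topic/" in the ARN is the MQTT topic name.
    return resource.split(":topic/", 1)[1]


def may_publish(thing_name: str, topic: str) -> bool:
    """True if a device connected as thing_name could publish to topic under this policy."""
    return topic == allowed_publish_topic(thing_name)


if __name__ == "__main__":
    print(allowed_publish_topic("device_12345"))
    print(may_publish("device_12345", "devices/data/device_67890"))
```

The point of the sketch is that the same policy text yields a different allowed topic per Thing, so no per-device policy document is needed.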

3. Configure an IoT core rule to push data to a Timestream database table

You will need to create a Timestream database and a device specific table where the table has the same name as the Thing, e.g. database device_data with a table device_12345.

The first thing to define is the IoT SQL query that selects the data to be passed to the Timestream action in the rule. In this case the SQL is:

SELECT temperature FROM 'devices/data/+'

The + is a wildcard which matches exactly one topic level, i.e. it will match data on devices/data/device_12345 but not devices/data/device_12345/more_data. The query therefore selects the data published by any device.
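The single-level behaviour of + can be checked with a few lines of Python. This is a simplified stand-in written for illustration (the function name matches_topic_filter is hypothetical, and the multi-level # wildcard is deliberately not handled); the real matching is done by AWS IoT.

```python
def matches_topic_filter(topic_filter: str, topic: str) -> bool:
    """Match an MQTT topic against a filter that may contain '+' wildcards.

    Simplified: handles only the single-level '+' wildcard, not '#'.
    """
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    # '+' stands for exactly one level, so the number of levels must agree.
    if len(filter_levels) != len(topic_levels):
        return False
    return all(f == "+" or f == t for f, t in zip(filter_levels, topic_levels))


if __name__ == "__main__":
    for candidate in ("devices/data/device_12345",
                      "devices/data/device_12345/more_data"):
        print(candidate, matches_topic_filter("devices/data/+", candidate))
```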

In the Timestream action the database is device_data and the tableName uses the SQL function topic() (which is available in the context of the action) to obtain the device name: topic(3) returns the third segment of the topic, which here is the Thing name.

This rule must be set up using the AWS CLI or API, as substitution templates are only available through those.

The JSON for setting the rule is,

{
    "sql": "SELECT temperature FROM 'devices/data/+'",
    "actions": [
        {
            "timestream": {
                "roleArn": "arn:aws:iam::123456789:role/service-role/devices_to_timestream",
                "databaseName": "device_data",
                "tableName": "${topic(3)}",
                "dimensions": [
                    {
                        "name": "id",
                        "value": "${id}"
                    }
                ],
                "timestamp": {
                    "value": "${ts}",
                    "unit": "SECONDS"
                }
            }
        }
    ],
    "ruleDisabled": false,
    "awsIotSqlVersion": "2015-10-08"
}
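To make the substitution templates in the rule concrete, the following Python sketch resolves them for the sample message from the question. It is a local approximation under the assumption that ${topic(n)} picks the n-th topic segment and ${key} reads a top-level payload field; resolve and preview_record are hypothetical helpers, not part of any AWS SDK.

```python
import re

# The Timestream action fields from the rule that use substitution templates.
ACTION = {
    "databaseName": "device_data",
    "tableName": "${topic(3)}",
    "dimensions": [{"name": "id", "value": "${id}"}],
    "timestamp": {"value": "${ts}", "unit": "SECONDS"},
}

TEMPLATE = re.compile(r"\$\{([^}]+)\}")
TOPIC_CALL = re.compile(r"topic\((\d+)\)")


def resolve(template, topic, payload):
    """Resolve ${topic(n)} and ${payload_key} templates (simplified local stand-in)."""

    def substitute(match):
        expression = match.group(1)
        topic_call = TOPIC_CALL.fullmatch(expression)
        if topic_call:
            # topic(n) is 1-indexed: topic(3) is the third segment of the topic.
            return topic.split("/")[int(topic_call.group(1)) - 1]
        return str(payload[expression])

    return TEMPLATE.sub(substitute, template)


def preview_record(topic, payload):
    """Show where a message would land if the rule behaves as described."""
    return {
        "database": ACTION["databaseName"],
        "table": resolve(ACTION["tableName"], topic, payload),
        "dimensions": {
            d["name"]: resolve(d["value"], topic, payload)
            for d in ACTION["dimensions"]
        },
        "time": resolve(ACTION["timestamp"]["value"], topic, payload),
        "time_unit": ACTION["timestamp"]["unit"],
    }


if __name__ == "__main__":
    message = {"ts": 1619815725, "id": "device_12345", "temperature": 47.2}
    print(preview_record("devices/data/device_12345", message))
```

Because the table name comes from the topic rather than from the payload, a device cannot redirect its data to another table by changing the id field in its message.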

Note that an IAM role is needed to provide write access to the database tables, and it must be set up before configuring the rule. Create the role referenced by roleArn in the rule (devices_to_timestream above) and attach a custom policy that allows writing to the tables of the database. The policy is:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "timestream:WriteRecords",
            "Resource": "arn:aws:timestream:us-east-1:123456789:database/device_data/table/*"
        },
        {
            "Effect": "Allow",
            "Action": "timestream:DescribeEndpoints",
            "Resource": "*"
        }
    ]
}

This policy allows writing to any table as it will be used by many different Things to write data. Although this policy allows writing to any table, the Thing certificate policy and the rule query limit which table a particular Thing can write to.

The IoT Rule can be written from the command line using this command:

aws iot create-topic-rule --rule-name devices_to_timestream --topic-rule-payload file://devices_to_timestream.json

where devices_to_timestream.json has the contents of the rule listed above.

4. Give the user access to their data

An access policy can be attached to a user identity to give the end user access to their device data, following the pattern in the Timestream identity-based policy examples.
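As a rough sketch of what such a policy could look like, the Python below generates a read-only policy for a list of device tables. The action list is my reading of the AWS example for running a Select query and should be checked against the current documentation; user_read_policy is a hypothetical helper, and the region and account ID are the placeholders used elsewhere in this answer.

```python
import json


def user_read_policy(region, account_id, database, tables):
    """Build an identity-based policy allowing queries on the given tables only."""
    table_arns = [
        f"arn:aws:timestream:{region}:{account_id}:database/{database}/table/{table}"
        for table in tables
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Table-level read actions, scoped to the user's device tables.
                "Effect": "Allow",
                "Action": [
                    "timestream:Select",
                    "timestream:DescribeTable",
                    "timestream:ListMeasures",
                ],
                "Resource": table_arns,
            },
            {
                # Actions assumed not to support table-level resources.
                "Effect": "Allow",
                "Action": [
                    "timestream:DescribeEndpoints",
                    "timestream:SelectValues",
                    "timestream:CancelQuery",
                ],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    policy = user_read_policy("us-east-1", "123456789", "device_data", ["device_12345"])
    print(json.dumps(policy, indent=2))
```

A user with several devices would simply get several table ARNs in the first statement.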

In summary:

This configuration provides,

  • One generic policy which can be attached to any Thing to allow it to publish to a topic with a Thing specific name
  • One generic rule which pushes all data to Thing specific tables in a Timestream database
  • User access to only the Thing data linked to their AWS identity

Beyond what is described here you would need,

  1. An automated method to create new tables when a new Thing is created or connects for the first time
  2. A mechanism to attach access rights to user identities
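For the first item, one possible approach (an assumption on my part, not something tested as part of this answer) is a small function that creates the table for a Thing if it does not exist yet. The sketch expects a client that behaves like boto3.client("timestream-write"); the function name ensure_device_table and the retention values are placeholders.

```python
def ensure_device_table(client, thing_name, database="device_data",
                        memory_hours=24, magnetic_days=365):
    """Create the Thing's table if it does not exist yet; return True if created.

    `client` is expected to behave like boto3.client("timestream-write").
    Retention values are placeholder assumptions, not recommendations.
    """
    try:
        client.create_table(
            DatabaseName=database,
            TableName=thing_name,
            RetentionProperties={
                "MemoryStoreRetentionPeriodInHours": memory_hours,
                "MagneticStoreRetentionPeriodInDays": magnetic_days,
            },
        )
        return True
    except client.exceptions.ConflictException:
        # The table already exists, so there is nothing to do.
        return False
```

In a deployment this might be called from a Lambda function with client = boto3.client("timestream-write"), triggered for example when a Thing is registered; that wiring is left open here.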

Upvotes: 3
