Posts Tagged IoT
IoT Data Analytics at the Edge: Exploring the convergence of IoT, Data Analytics, and Edge Computing with Grafana, Mosquitto, and TimescaleDB on ARM-based devices
Posted by Gary A. Stafford in Analytics, AWS, Cloud, IoT, Python, Raspberry Pi, Software Development on April 15, 2021
This post is a revised version of an earlier post, featuring major version updates of TimescaleDB (v1.7.4-pg12 to v2.0.0-pg12), Grafana (v7.1.5 to v7.5.2), and Mosquitto (v1.6.12 to v2.0.9). All source code and SQL scripts have been revised. Note that TimescaleDB currently has a limitation/bug affecting Docker on ARM for versions later than v2.0.0.

The Edge
Edge computing is a fast-growing technology trend, which involves pushing compute capabilities to a network’s edge. Wikipedia describes edge computing as a distributed computing paradigm that brings computation and data storage closer to the location needed to improve response times and save bandwidth. The term edge commonly refers to a compute node at the edge of a network (edge device), sitting close to the data source and between that data source and external systems such as the Cloud. In his recent post, 3 Advantages (And 1 Disadvantage) Of Edge Computing, well-known futurist Bernard Marr argues reduced bandwidth requirements, reduced latency, and enhanced security and privacy as three primary advantages of edge computing.
David Ricketts, Head of Marketing at Quiss Technology PLC, in his post, Cloud and Edge Computing — The Stats You Need to Know for 2018, estimates that the global edge computing market is expected to reach USD 6.72 billion by 2022 at a compound annual growth rate of a whopping 35.4 percent. Realizing the market potential, many major Cloud providers, edge device manufacturers, and integrators are rapidly expanding their edge compute capabilities. AWS, for example, currently offers more than a dozen services in their edge computing category.
Internet of Things
Edge computing is frequently associated with the Internet of Things (IoT). IoT devices, industrial equipment, and sensors generate data transmitted to other internal and external systems, often by way of edge nodes, such as an IoT Gateway. IoT devices typically generate time-series data. According to Wikipedia, a time series is a set of data points indexed in time order — a sequence taken at successive equally spaced points in time. IoT devices typically generate continuous high-volume streams of time-series data, often on a scale of millions of data points per second. IoT data characteristics require IoT platforms to minimally support temporal accuracy, high-volume ingestion and processing, efficient data compression and downsampling, and real-time querying capabilities.
Edge devices such as IoT Gateways, which aggregate and transmit IoT data from these devices to external systems, are generally lower-powered, with limited processors, memory, and storage. Accordingly, IoT platforms must satisfy all the requirements of IoT data while simultaneously supporting resource-constrained environments.
IoT Analytics at the Edge
Leading Cloud providers AWS, Azure, Google Cloud, IBM Cloud, Oracle Cloud, and Alibaba Cloud all offer IoT services. Many offer IoT services with edge computing capabilities. AWS offers AWS IoT Greengrass. Greengrass provides local compute, messaging, data management, sync, and machine learning (ML) inference capabilities to edge devices. Azure offers Azure IoT Edge. Azure IoT Edge provides the ability to run artificial intelligence (AI), Azure and third-party services, and custom business logic on edge devices using standard containers. Google Cloud offers Edge TPU. Edge TPU (Tensor Processing Unit) is Google’s purpose-built application-specific integrated circuit (ASIC), designed to run AI at the edge.
IoT Analytics
Many Cloud providers also offer IoT analytics as part of their suite of IoT services, although not at the edge. AWS offers AWS IoT Analytics, while Azure has Azure Time Series Insights. Google provides IoT analytics, indirectly, through downstream analytic systems and ad hoc analysis using Google BigQuery or advanced analytics and machine learning with Cloud Machine Learning Engine. These services generally all require data to be transmitted to the Cloud for analytics.

The ability to analyze real-time, streaming IoT data at the edge is critical to a rapid feedback loop. IoT edge analytics can accelerate anomaly detection and remediation, improve predictive maintenance capabilities, and expedite proactive inventory replenishment.
IoT Edge Analytics Stack
In my opinion, the ideal IoT edge analytics stack is comprised of lightweight, purpose-built, easily deployable and manageable, platform- and programming language-agnostic, open-source software components. The minimal IoT edge analytics stack should include:
- Lightweight message broker;
- Purpose-built time-series database;
- ANSI-standard SQL ad-hoc query engine;
- Data visualization tool;
- Simple deployment and management framework.
Each component should be purpose-built for IoT.
Lightweight Message Broker
We will use Eclipse Mosquitto as our message broker. According to the project’s description, Mosquitto is an open-source message broker that implements the Message Queuing Telemetry Transport (MQTT) protocol versions 5.0, 3.1.1, and 3.1. Mosquitto is lightweight and suitable for use on all devices, from low-power single-board computers (SBCs) to full-powered servers.
MQTT Client Library
We will interact with Mosquitto using Eclipse Paho. According to the project, the Eclipse Paho project provides open-source, mainly client-side implementations of MQTT and MQTT-SN in a variety of programming languages. MQTT and MQTT for Sensor Networks (MQTT-SN) are lightweight publish/subscribe messaging transports for TCP/IP and connectionless protocols, such as UDP, respectively.
We will be using Paho’s Python Client. The Paho Python Client provides a client class with support for both MQTT v3.1 and v3.1.1 on Python 2.7 or 3.x. The client also provides helper functions to make publishing messages to an MQTT server straightforward.
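To illustrate the client's simplicity, below is a minimal sketch of publishing a single message with the Paho Python client; the broker host, port, and topic are placeholder values.

import json
import paho.mqtt.client as mqtt

# placeholder connection settings for the edge node's Mosquitto broker
BROKER_HOST = "192.168.1.12"
BROKER_PORT = 1883
TOPIC = "sensor/data"

client = mqtt.Client()
client.connect(BROKER_HOST, BROKER_PORT, keepalive=60)

# publish one JSON-encoded sensor reading
payload = json.dumps({"device_id": "device-01", "temperature": 22.5})
client.publish(TOPIC, payload, qos=0)
client.disconnect()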
Time-Series Database
Time-series databases are optimal for storing IoT data. According to InfluxData, makers of a leading time-series database, InfluxDB, a time-series database (TSDB), is a database optimized for time-stamped or time-series data. Time series data are simply measurements or events that are tracked, monitored, downsampled, and aggregated over time. Jiao Xian of Alibaba Cloud has authored an insightful post on the time-series database ecosystem, What Are Time Series Databases? A few leading Cloud providers offer purpose-built time-series databases, though they are not available at the edge. AWS offers Amazon Timestream, and Alibaba Cloud offers Time Series Database.
InfluxDB is an excellent choice for a time-series database. It was my first choice, along with TimescaleDB, when developing this stack. However, InfluxDB Flux’s apparent incompatibilities with some ARM-based architectures ruled it out for inclusion in the stack for this particular post.
We will use TimescaleDB as our time-series database. TimescaleDB is the leading open-source relational database for time-series data. Described as ‘PostgreSQL for time-series,’ TimescaleDB is based on PostgreSQL, which provides full ANSI SQL, rock-solid reliability, and a massive ecosystem. TimescaleDB claims to achieve 10–100x faster queries than PostgreSQL, InfluxDB, and MongoDB, with native optimizations for time-series analytics.
TimescaleDB is designed for performing analytical queries, both through its native support for PostgreSQL’s full range of SQL functionality and additional functions native to TimescaleDB. These time-series optimized functions include Median/Percentile, Cumulative Sum, Moving Average, Increase, Rate, Delta, Time Bucket, Histogram, and Gap Filling.
Ad-hoc Data Query Engine
We have the option of using psql, the terminal-based front-end to PostgreSQL, to execute ad-hoc queries against TimescaleDB. The psql front-end enables you to enter queries interactively, issue them to PostgreSQL, and see the query results.

We also have the option of using pgAdmin, specifically the biarms/pgadmin4 Docker version, to execute ad-hoc queries and perform most other database tasks. pgAdmin is the most popular open-source administration and development platform for PostgreSQL. While several popular Docker versions of pgAdmin only support Linux AMD64 architectures, the biarms/pgadmin4 Docker version supports ARM-based devices.


Data Visualization
For data visualization, we will use Grafana. Grafana allows you to query, visualize, alert on, and understand metrics no matter where they are stored. With Grafana, you can create, explore, and share dashboards, fostering a data-driven culture. Grafana allows you to define thresholds visually and get notified via Slack, PagerDuty, and more. Grafana supports dozens of data sources, including MySQL, PostgreSQL, Elasticsearch, InfluxDB, TimescaleDB, Graphite, Prometheus, Google BigQuery, GraphQL, and Oracle. Grafana is extensible through a large collection of plugins.

Edge Deployment and Management Platform
Docker introduced the current industry standard for containers in 2013. Docker containers are a standardized unit of software that allows developers to isolate apps from their environment. We will use Docker to deploy the IoT edge analytics stack, referred to herein as the GTM Stack, composed of containerized versions of Grafana, TimescaleDB, Eclipse Mosquitto, and pgAdmin, to an ARMv7-based edge node. The acronym, GTM, comes from the three primary OSS projects composing the stack. The abbreviation also suggests Greenwich Mean Time, relating to the precise time-series nature of IoT data.

Running Docker Engine in swarm mode, we can use Docker to deploy the complete IoT edge analytics stack to the swarm, running on the edge node. The deploy command accepts a stack description in the form of a Docker Compose file, a YAML file used to configure the application’s services. With a single command, we can create and start all the services from the configuration file.
Source Code
All source code for this post is available on GitHub. Use the following command to git clone a local copy of the project. Note that the updated version of the source code for this post is in the v2021-03 branch.
git clone --branch v2021-03 --single-branch --depth 1 \
https://github.com/garystafford/iot-analytics-at-the-edge.git
IoT Devices
For this post, I have deployed three Linux ARM-based IoT devices, each connected to a sensor array. Each sensor array contains multiple analog and digital sensors. The sensors record temperature, humidity, air quality (liquefied petroleum gas (LPG), carbon monoxide (CO), and smoke), light, and motion. For more information on the IoT device and sensor hardware involved, please see my previous post on towardsdatascience.com, Getting Started with IoT Analytics on AWS: Analyze environmental sensor data from IoT devices in near real-time with AWS IoT Analytics.
Each ARM-based IoT device runs a small Python3-based script, sensor_data_to_mosquitto.py, shown below.
The IoT devices’ script implements the Eclipse Paho MQTT Python client library. An MQTT message containing simultaneous readings from each sensor is sent to a Mosquitto topic on the edge node at a configurable frequency.
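The script’s gist is not embedded in this version of the post; a condensed sketch of its logic is shown below, with hypothetical sensor-reading helpers standing in for the actual sensor libraries. See sensor_data_to_mosquitto.py in the GitHub project for the complete implementation.

import json
import time
from datetime import datetime

import paho.mqtt.client as mqtt

# hypothetical helpers wrapping the analog/digital sensor libraries
from sensors import (get_temperature, get_humidity, get_lpg, get_co,
                     get_smoke, get_light, get_motion)

BROKER_HOST = "192.168.1.12"  # edge node running Mosquitto (placeholder)
TOPIC = "sensor/data"
FREQUENCY = 5  # seconds between readings (configurable)

client = mqtt.Client()
client.connect(BROKER_HOST, 1883, keepalive=60)
client.loop_start()

while True:
    # one simultaneous reading from each sensor in the array
    payload = json.dumps({
        "device": "device-01",
        "ts": datetime.utcnow().isoformat(),
        "temperature": get_temperature(),
        "humidity": get_humidity(),
        "lpg": get_lpg(),
        "co": get_co(),
        "smoke": get_smoke(),
        "light": get_light(),
        "motion": get_motion(),
    })
    client.publish(TOPIC, payload, qos=0)
    time.sleep(FREQUENCY)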
Below are the actual sensor readings sent by the IoT device as an MQTT message to the Mosquitto topic.
IoT Edge Node
For this post, I have deployed a single Linux ARM-based edge node. The three IoT devices containing sensor arrays communicate with the edge node over Wi-Fi. IoT devices could easily use an alternative communication protocol, such as BLE, LoRaWAN, or Ethernet. For more information on BLE and LoRaWAN, please see some of my previous posts: LoRa and LoRaWAN for IoT: Getting Started with LoRa and LoRaWAN Protocols for Low Power, Wide Area Networking of IoT, and BLE and GATT for IoT: Getting Started with Bluetooth Low Energy (BLE) and Generic Attribute Profile (GATT) Specification for IoT.
The edge node also runs a small Python3-based script, mosquitto_to_timescaledb.py, shown below.
Like the IoT devices, the edge node’s script implements the Eclipse Paho MQTT Python client library. The script pulls MQTT messages off a Mosquitto topic(s), serializes the message payload to JSON, and writes the payload’s data to the TimescaleDB database. The edge node’s script accepts several arguments, which allow you to configure the necessary Mosquitto and TimescaleDB connection settings.
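Again, a condensed sketch of the script’s logic follows, assuming the table and column names used elsewhere in this post; the hard-coded connection settings shown here are placeholders for the script’s actual command-line arguments.

import json

import paho.mqtt.client as mqtt
import psycopg2

# placeholder Mosquitto and TimescaleDB connection settings,
# normally supplied as script arguments
conn = psycopg2.connect(host="localhost", port=5432, dbname="demo_iot",
                        user="postgres", password="changeme")

def on_message(client, userdata, msg):
    # deserialize the JSON message payload and write it to TimescaleDB
    data = json.loads(msg.payload.decode("utf-8"))
    with conn.cursor() as cur:
        cur.execute(
            """INSERT INTO sensor_data
               (time, device_id, temperature, humidity, lpg, co, smoke, light, motion)
               VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s);""",
            (data["ts"], data["device"], data["temperature"], data["humidity"],
             data["lpg"], data["co"], data["smoke"], data["light"], data["motion"]))
    conn.commit()

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883, keepalive=60)
client.subscribe("sensor/data")
client.loop_forever()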
Why not use Telegraf?
Telegraf is a plugin-driven agent that collects, processes, aggregates, and writes metrics. There is a Telegraf output plugin, the PostgreSQL and TimescaleDB Output Plugin for Telegraf, produced by TimescaleDB. The plugin can replace the need to manage and maintain the above script. However, I chose not to use it because it is not yet an official Telegraf plugin. If the plugin was included in a Telegraf release, I would certainly encourage its use.
Script Management
Both the Linux-based IoT devices and the edge node run the systemd system and service manager. To ensure the Python scripts keep running in the case of a system restart, we define a systemd unit for each. Units are objects that systemd knows how to manage: standardized representations of system resources that can be managed by the suite of daemons and manipulated by the provided utilities. Each script has a systemd unit file. Below, we see the gtm_stack_mosquitto unit file, gtm_stack_mosquitto.service.
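The unit file itself is not embedded in this version of the post; a minimal sketch of what such a unit might contain follows, with assumed paths. See gtm_stack_mosquitto.service in the GitHub project for the actual file.

[Unit]
Description=GTM Stack: publish sensor data to Mosquitto
After=network-online.target

[Service]
# path to the Python script is an assumption
ExecStart=/usr/bin/python3 /home/pi/sensor_data_to_mosquitto.py
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target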
The gtm_stack_mosq_to_tmscl unit file, gtm_stack_mosq_to_tmscl.service, is nearly identical.
To install the gtm_stack_mosquitto.service systemd unit file on each IoT device, use the following commands:
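A typical sequence, with an assumed location for the unit file, looks like this:

sudo cp gtm_stack_mosquitto.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable gtm_stack_mosquitto.service
sudo systemctl start gtm_stack_mosquitto.service
# confirm the service started successfully
systemctl status gtm_stack_mosquitto.service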
Installing the gtm_stack_mosq_to_tmscl.service unit file on the edge node is nearly identical.
Docker Stack
The edge node runs the GTM Docker stack, stack.yml, in a swarm. As discussed earlier, the stack contains four containers: Eclipse Mosquitto, TimescaleDB, Grafana, and pgAdmin. The Mosquitto, TimescaleDB, and Grafana containers have paths within the containers bind-mounted to directories on the edge device. With bind-mounting, the container’s configuration and data will persist if the containers are removed and re-created. The containers are running on an isolated overlay network.
The GTM Docker stack is installed using the following commands on the edge node. We will assume Docker and git are pre-installed on the edge node for this post.
First, we will create several local directories on the edge device, which will be used to bind-mount to the Docker container’s directories. Below, we see the bind-mounted local directories with the eventual container’s contents stored within them.

Next, we copy the custom Mosquitto configuration file, mosquitto.conf, included in the project to the edge device’s correct location.
Lastly, we initialize the Docker swarm and deploy the stack.
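The commands are not embedded in this version of the post; a sketch of the sequence, with assumed directory names and stack name, looks like the following:

# local directories to bind-mount into the containers (names assumed)
mkdir -p ~/data/mosquitto/config ~/data/timescaledb ~/data/grafana ~/data/pgadmin

# copy the custom Mosquitto configuration into place
cp mosquitto.conf ~/data/mosquitto/config/

# initialize the swarm and deploy the GTM Stack from the Compose file
docker swarm init
docker stack deploy --compose-file stack.yml iot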

The docker service ls command, showing the running GTM Stack containers

TimescaleDB Setup
With the GTM stack running, we need to create a single Timescale hypertable, sensor_data, in the TimescaleDB demo_iot database to hold the incoming IoT sensor data. Hypertables, according to TimescaleDB, are designed to be easy to manage and to behave like standard PostgreSQL tables. Hypertables are comprised of many interlinked “chunk” tables. Commands made to the hypertable automatically propagate changes down to all of the chunks belonging to that hypertable.
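A sketch of the DDL follows; the exact column list is defined in the project’s statements.sql file, so treat the columns below as representative of the sensor payload rather than authoritative.

-- create the sensor_data table, then convert it to a hypertable
-- partitioned on the time column
CREATE TABLE sensor_data (
  time        timestamptz      NOT NULL,
  device_id   text             NOT NULL,
  temperature double precision,
  humidity    double precision,
  lpg         double precision,
  co          double precision,
  smoke       double precision,
  light       boolean,
  motion      boolean
);

SELECT create_hypertable('sensor_data', 'time');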
I suggest using psql to execute the required DDL statements, which will create the hypertable and the subsequent views and database user permissions. All SQL statements are included in the project’s statements.sql file. One way to use psql is to install it on your local workstation, then use psql to connect to the remote edge node. I prefer to instantiate a local PostgreSQL Docker container instance running psql. I then use the local container’s psql client to connect to the edge node’s TimescaleDB database. For example, from my local machine, I run the following docker run command to connect to the TimescaleDB database on the edge node, located locally at 192.168.1.12.
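A command similar to the following should work; the database credentials are placeholders.

docker run -it --rm postgres:12-alpine \
  psql -h 192.168.1.12 -p 5432 -U postgres -d demo_iot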
Although not as practical, you can also access psql from within the TimescaleDB Docker container, running on the actual edge node, using the following docker exec command.
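Assuming the TimescaleDB service is named timescaledb in the stack, the command looks something like this:

docker exec -it \
  $(docker ps -q --filter name=timescaledb) \
  psql -U postgres -d demo_iot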
TimescaleDB Continuous Aggregates
For this post’s demonstration, we will create four TimescaleDB materialized views, which will be queried from a Grafana Dashboard. The materialized views are TimescaleDB Continuous Aggregates. According to Timescale, aggregate queries which touch large swathes of time-series data can take a long time to compute because the system needs to scan large amounts of data on every query execution. To make these queries faster, a continuous aggregate allows materializing the computed aggregates, while also providing means to continuously, and with low overhead, keep them up-to-date as the underlying source data changes.
For example, we generate sensor data every five seconds from the three IoT devices in this post. When visualizing a 24-hour period in Grafana, using continuous aggregates with an interval of one minute, we would reduce the total volume of data queried from 51,840 rows to 4,320 rows, a reduction of over 91%. The larger the time period or the greater the number of IoT devices being analyzed, the more significant the savings, and the greater the positive impact on query performance.
A time_bucket on the time partitioning column of the hypertable is required for all continuous aggregate views. The time_bucket function, in this case, has a bucket width (interval) of 1 minute. The interval is configurable.
To automatically refresh the four materialized views, we will create four corresponding continuous aggregate policies. In this demonstration, the continuous aggregate policies create a refresh window between one week ago and one hour ago, with a refresh interval of one hour.
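Below is a sketch of one of the four continuous aggregates and its policy. The view, bucket, and column names match the Grafana queries shown later in the post; the remaining columns and views are defined in statements.sql.

-- per-minute temperature and humidity averages, per device
CREATE MATERIALIZED VIEW temperature_humidity_summary_minute
WITH (timescaledb.continuous) AS
SELECT device_id,
       time_bucket(INTERVAL '1 minute', time) AS bucket,
       AVG(temperature) AS avg_temp,
       AVG(humidity) AS avg_humidity
FROM sensor_data
GROUP BY device_id, bucket
WITH NO DATA;

-- refresh window between one week ago and one hour ago, refreshed hourly
SELECT add_continuous_aggregate_policy('temperature_humidity_summary_minute',
    start_offset => INTERVAL '1 week',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour');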
Advanced Analytic Queries
The ability to perform ad-hoc queries on time-series IoT data is an essential feature of the IoT edge analytics stack. We can use psql, pgAdmin, or even our own IDE to perform ad-hoc queries against the TimescaleDB database on the edge node. Below are examples of typical ad-hoc queries a data analyst might perform on IoT sensor data. These example queries demonstrate TimescaleDB’s advanced analytical capabilities for working with time-series data, including Moving Average, Delta, Time Bucket, and Histogram.
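For instance, a moving-average query over the raw hypertable might look like the sketch below, which computes a three-hour moving average of temperature per device for the last 24 hours.

SELECT device_id,
       time_bucket('1 hour', time) AS one_hour,
       AVG(temperature) AS avg_temp,
       AVG(AVG(temperature)) OVER (PARTITION BY device_id
           ORDER BY time_bucket('1 hour', time)
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg_temp
FROM sensor_data
WHERE time > now() - INTERVAL '24 hours'
GROUP BY device_id, one_hour
ORDER BY device_id, one_hour;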
Data Visualization with Grafana
Using the TimescaleDB continuous aggregates we have created, we can quickly build a richly featured dashboard in Grafana. Below we see a typical IoT Dashboard you might build to monitor the post’s IoT sensor data in near real-time. An exported version, dashboard_external_export.json, is included in the GitHub project.


Limiting Grafana’s Access to IoT Data
Following the Grafana recommendation for database user permissions, we create a grafanareader PostgreSQL user and limit the user’s access to the sensor_data table and the four views we created. Grafana will use this user’s credentials to perform SELECT queries of the TimescaleDB demo_iot database.
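A sketch of the corresponding statements, with a placeholder password, follows; the full set is in statements.sql.

CREATE USER grafanareader WITH PASSWORD 'changeme';  -- placeholder password
GRANT USAGE ON SCHEMA public TO grafanareader;
GRANT SELECT ON sensor_data TO grafanareader;
GRANT SELECT ON temperature_humidity_summary_minute TO grafanareader;
-- ...and likewise for the three remaining continuous aggregate views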
Using PostgreSQL in Grafana
Grafana’s documentation includes a comprehensive set of instructions for Using PostgreSQL in Grafana. To connect to the TimescaleDB database from Grafana, we use the PostgreSQL data source plugin.

The data displayed in each Panel in the Grafana Dashboard is based on a SQL query. For example, the Average Temperature Panel might use a query similar to the example below. This particular query also converts Celsius to Fahrenheit. Note the use of Grafana Macros (e.g., $__time(), $__timeFilter()). Macros can be used within a query to simplify syntax and allow for dynamic parts.
SELECT
$__time(bucket),
device_id AS metric,
((avg_temp * 1.8) + 32) AS avg_temp
FROM temperature_humidity_summary_minute
WHERE
$__timeFilter(bucket)
ORDER BY 1,2
Below, we see another example from the Average Humidity Panel. In this particular query, we might choose to limit the humidity data to a valid range of 0%–100%.
SELECT
$__time(bucket),
device_id AS metric,
avg_humidity
FROM temperature_humidity_summary_minute
WHERE
$__timeFilter(bucket)
AND avg_humidity >= 0.0
AND avg_humidity <= 100.0
ORDER BY 1,2
Mobile Friendly
Grafana dashboards are mobile-friendly. Below we see two views of the dashboard, using the Chrome mobile browser on an Apple iPhone.

Grafana Alerts
Grafana allows Alerts to be created based on the Rules you define in each Panel. If data values match the Rule’s conditions, which you pre-define, such as a temperature reading above a certain threshold for a set amount of time, an alert is sent to your choice of destinations. According to the Rule shown below, if the average temperature exceeds 75°F for a period of 5 minutes, an alert is sent.

As demonstrated below, when the laboratory temperature began to exceed 75°F, the alert entered a ‘Pending’ state. When the temperature exceeded 75°F for the pre-determined period of 5 minutes, the alert status changed to ‘Alerting’, and an alert was sent. When the temperature dropped back below 75°F for the pre-determined period of 5 minutes, the alert status changed from ‘Alerting’ to ‘OK’, and a subsequent notification was sent.

There are currently twenty alert notifiers available out-of-the-box with Grafana, including Slack, email, PagerDuty, webhooks, VictorOps, Opsgenie, and Microsoft Teams. We can use Grafana Alerts to notify the proper resources, in near real-time, if an issue is detected based on the data. Below, we see an actual series of high-temperature alerts sent by Grafana to the Slack channel, followed by subsequent notifications as the temperature returned to normal.

Conclusion
This post explored the development of an IoT edge analytics stack comprised of lightweight, purpose-built, easily deployable and manageable platform- and programming language-agnostic, open-source software components. These components included Docker containerized versions of Grafana, TimescaleDB, Eclipse Mosquitto, and pgAdmin, referred to as the GTM Stack. Using the GTM stack, we collected, analyzed, and visualized IoT data without first shipping the data to Cloud or other external systems.
This blog represents my own viewpoints and not those of my employer, Amazon Web Services (AWS). All product names, logos, and brands are the property of their respective owners.
AWS IoT Core for LoRaWAN, AWS IoT Analytics, and Amazon QuickSight
Posted by Gary A. Stafford in Analytics, AWS, Cloud, IoT, Python, Software Development on April 15, 2021
In the following post, we will learn how to monitor indoor air quality (IAQ) using a private LoRaWAN sensor device network. The devices transmit their sensor telemetry to AWS through a LoRaWAN gateway using the newly released AWS IoT Core for LoRaWAN service. We will then analyze and visualize the sensor data using AWS IoT Analytics and Amazon QuickSight.

Introduction
On December 15, 2020, AWS announced support for Semtech’s low-power, long-range wide area network (LoRaWAN) connectivity. LoRaWAN devices and gateways can now connect to AWS IoT Core using AWS IoT Core for LoRaWAN. AWS IoT Core for LoRaWAN is a fully managed feature that enables customers to connect wireless devices that use the LoRaWAN protocol with the AWS cloud. Using AWS IoT Core, customers can now set up a private LoRaWAN network by securely connecting their LoRaWAN devices and gateways to the AWS cloud — without developing or operating a LoRaWAN Network Server (LNS).

The LoRa Alliance describes the LoRaWAN specification as a Low Power, Wide Area (LPWA) networking protocol designed to wirelessly connect battery operated ‘things’ to the internet in regional, national, or global networks, and targets key Internet of Things (IoT) requirements such as bi-directional communication, end-to-end security, mobility and localization services. The LoRaWAN specification defines both the device-to-infrastructure (LoRa) physical layer parameters and the (LoRaWAN) protocol, providing seamless interoperability between manufacturers.
According to Wikipedia, LoRa (Long Range) is a proprietary low-power wide-area network modulation technique. It is based on spread spectrum modulation techniques derived from chirp spread spectrum (CSS) technology. It was developed by Cycleo of Grenoble, France, and acquired by Semtech, the founding member of the LoRa Alliance, and it is patented.
AWS-qualified Hardware
Along with the AWS IoT Core for LoRaWAN announcement, AWS released a list of qualified gateways found in the AWS Partner Device Catalog. The catalog helps customers discover qualified hardware that works with AWS services to help build and deliver successful IoT solutions. AWS IoT Core for LoRaWAN supports open-source gateway-LNS protocol software called LoRa Basics Station, an implementation of a LoRa packet forwarder. The AWS IoT Core for LoRaWAN gateway qualification program enables customers to source pre-tested LoRaWAN gateways and developer kits that meet the required LoRa Basics Station specifications.

LoRaWAN Gateway
In this post, we will use the AWS-qualified LoRaWAN-compliant MiniHub Pro gateway by Browan Communications. At $109, MiniHub Pro is a low-cost gateway that uses LoRa Basics Station to forward RF packets received from LoRaWAN devices (uplinks) to the LoRaWAN Network Server (LNS), part of the AWS IoT Core for LoRaWAN service. The MiniHub Pro is based on FreeRTOS, the real-time operating system for microcontrollers.

LoRaWAN Devices
In addition to the gateway, we will use a set of two Model TBHV110 915 MHz Healthy Home Sensor IAQ devices (aka Tabs Healthy Home), also by Browan Communications Inc. According to Browan, the Healthy Home Sensor utilizes LoRaWAN connectivity (LoRaWAN 1.0.3) to communicate the temperature, relative humidity, volatile organic compound (VOC) levels, and indoor air quality (IAQ) of the surrounding environment. The Healthy Home Sensor costs $60 per sensor. The gateway and two devices used in this post were purchased from Cal-Chip Connected Devices at a total retail cost of $229 plus tax and shipping.

IAQ Index
The Healthy Home Sensor determines an IAQ Index, a reading between 0 and 500, indicating general air quality. According to the CDC, monitoring and maintaining good indoor air quality and proper ventilation have become essential to reduce the potential airborne spread of COVID-19.

Source Code
The source code for this post is available on GitHub. Use the following command to git clone a local copy of the project.
git clone --branch main --single-branch --depth 1 \
https://github.com/garystafford/lorawan-iot-core-demo.git
The Lambda function and its associated AWS resources are deployed using the AWS Serverless Application Model (SAM). The primary AWS resources used in this demonstration are shown in the architectural diagram below.

Adding Gateways and Devices
In combination with the manufacturer’s product manual, the AWS documentation walks you through adding your LoRaWAN gateways and devices to AWS IoT Core for LoRaWAN. Adding gateways and devices is straightforward on AWS, as long as you have the appropriate GatewayEUI, DevEUI, AppEUI (aka JoinEUI), and AppKey supplied by the manufacturer or retailer.
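Registration can be done entirely in the AWS Management Console. For reference, a roughly equivalent AWS CLI sketch is shown below; all names, EUIs, keys, and profile IDs are placeholders.

aws iotwireless create-wireless-gateway \
    --name "minihub-pro-gateway" \
    --lorawan GatewayEui="<gateway-eui>",RfRegion="US915"

aws iotwireless create-wireless-device \
    --type LoRaWAN \
    --name "healthy-home-sensor-1" \
    --destination-name "<destination-name>" \
    --lorawan "DevEui=<dev-eui>,DeviceProfileId=<device-profile-id>,ServiceProfileId=<service-profile-id>,OtaaV1_0_x={AppKey=<app-key>,AppEui=<app-eui>}"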

Over-the-Air Activation (OTAA)
In this post, the gateway and sensor devices are connected to AWS IoT Core for LoRaWAN using Over-the-Air Activation (OTAA). According to both The Things Network and AWS, OTAA is preferred over Activation by Personalization (ABP) as it is more secure. When using OTAA, temporary session keys are derived from root keys on each activation. ABP uses static keys, is less secure, and should not be used if not explicitly required by the use case.

The LoRaWAN gateway is added to AWS IoT Core for LoRaWAN. The gateway is then configured locally using the information received from IoT Core. With the MiniHub Pro, you can enable FreeRTOS Over-the-Air Updates and Configuration and Update Server (CUPS).


Below, we see that the MiniHub Pro gateway has been successfully added to IoT Core and has exchanged uplink frames (RF packets) through that connection with the AWS IoT Core for LoRaWAN, which is acting as the LoRaWAN Network Server (LNS).

We then add the two Healthy Home Sensor devices in a similar fashion.

Below, we see that one of the two Healthy Home Sensor devices has successfully connected to AWS IoT Core for LoRaWAN and has also transmitted uplink frames via the MiniHub Pro gateway.

IoT Destinations
Messages from the devices are received from the gateway and routed to an IoT rule using a Destination. According to AWS, a destination describes the AWS IoT rule that processes the data from a wireless device so that AWS IoT services can use the data. Many devices can share a single destination.

IoT Rules
IoT rules, according to AWS, tell AWS IoT what to do when it receives messages from your devices. Rules extract data from messages, evaluate expressions using message data, and invoke one or more actions when the rule’s conditions are met.

Using AWS IoT Core for LoRaWAN, the device’s messages are base64 encoded binary messages. The IoT rule’s query statement is a SQL statement that determines which messages are forwarded to Actions. In the case of LoRaWAN, the query statement contains an inline call to an AWS Lambda function. The function accepts a number of input parameters. It decodes the base64 encoded message, decodes and translates the binary message, and builds a Python dictionary with the results. Finally, the dictionary is serialized to JSON and returned to the IoT rule.
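A query statement following this pattern might look like the sketch below; the Lambda function name, region, and account ID are placeholders.

SELECT aws_lambda("arn:aws:lambda:us-east-1:111111111111:function:lorawan-payload-decoder",
       {"PayloadData": PayloadData,
        "WirelessDeviceId": WirelessDeviceId,
        "WirelessMetadata": WirelessMetadata}) AS payload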
Healthy Home Sensor Messages
LoRa uses a license-free sub-gigahertz radio frequency of 915 MHz in North America. The Healthy Home Sensor device transmits binary messages over this radio frequency using the LoRaWAN 1.0.3 specification. Binary messages sent by the Healthy Home Sensor device to the MiniHub Pro gateway have a payload length of 11 bytes, as follows:

The device encrypts its binary message, containing sensor data, using AES128 CTR mode before transmitting it. AWS IoT Core for LoRaWAN decrypts the binary message, then encodes the decrypted binary message payload as a base64 string.
The final JSON-format message delivered by AWS IoT Core for LoRaWAN to the IoT rule contains the device’s base64 encoded binary message (PayloadData), along with additional information about the source LoRaWAN device and the LoRaWAN gateway that transmitted the message.
The IoT rule’s query statement calls a Lambda function to decode and transform the base64 encoded binary message. The Lambda calls the correct decoder, which contains logic to decode each byte of the binary message. AWS has developed the IoT Rule decoder Lambda along with several binary message payload decoders, freely available on GitHub, including:
- Browan Tabs Object Locator
- Dragino LHT65, LGT92, LSE01, LBT1, LDS01
- Axioma W1
- Elsys
- Globalsat LT-100
- NAS Pulse Reader UM3080
Since AWS did not include the Browan Healthy Home Sensor device in the list of provided decoders, I have created a new Python-based decoder using the message payload information detailed in the device’s product manual.
The decoder first decodes the base64 encoded binary message payload.
# base64 encoded binary message
AAs0LNsCAQB/ADM=
# base64 decoded 11-byte binary message
00000000 00001011 00110100 00101100 11011011 00000010 00000001 00000000 01111111 00000000 00110011
# as hexadecimal
00 0b 34 2c db 02 01 00 7f 00 33
# as decimal
0 11 52 44 219 2 1 0 127 0 51
The decoder then processes each byte of the binary message payload, bit-by-bit, and applies any additional logic to derive the final sensor values. Decoding the base64 encoded binary message payload and placing it back into the original message, we are left with the following message structure:
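A skeletal version of such a decoder is shown below. The base64 handling mirrors the steps above, but the byte offsets and scaling are illustrative only; the authoritative field layout is in the device’s product manual and the project’s decoder.

import base64

def decode(payload_data):
    # decode the base64 encoded payload into its 11 raw bytes,
    # e.g., 'AAs0LNsCAQB/ADM=' -> 00 0b 34 2c db 02 01 00 7f 00 33
    data = base64.b64decode(payload_data)

    # illustrative bit-level extraction; actual offsets and scaling
    # are defined in the product manual
    return {
        "status": data[0],
        "battery_volts": (25 + (data[1] & 0x0f)) / 10.0,
        "board_temp_c": (data[2] & 0x7f) - 32,
        "rh_pct": data[3] & 0x7f,
    }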
IoT Rule Actions
Messages returned by the rule’s query and processed by the Lambda function are passed to an AWS IoT rule action. AWS IoT rule actions specify what to do when a rule is triggered. This rule’s action sends a message to an AWS IoT Analytics channel.

AWS IoT Analytics
IoT Analytics, according to AWS, is a fully-managed service that makes it easy to run and operationalize sophisticated analytics on massive volumes of IoT data. IoT Analytics consists of five primary components: channels, pipelines, data stores, datasets, and notebooks.

The IoT rule sends messages to the IoT Analytics channel. The channel publishes the data to a pipeline. The pipeline consumes messages from the channel and enables you to process and filter the messages before storing them in the data store. The data store receives and stores the messages. You retrieve data from a data store by creating a SQL dataset or a container dataset. The SQL dataset contains a SQL query, which is executed against the data store. According to the documentation, both Amazon Athena and AWS IoT Analytics’ SQL expressions are based on PrestoDB.

The SQL query I have written is specific to the data collected by the Healthy Home Sensor device. In Amazon QuickSight, we will use this dataset to construct an IAQ monitoring dashboard.
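As a sketch, the dataset’s SQL might resemble the following PrestoDB-flavored query; the data store name and retention window are assumptions.

SELECT *
FROM iaq_datastore
WHERE __dt >= current_date - interval '7' day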
Amazon QuickSight
Amazon QuickSight, according to AWS, is a scalable, serverless, embeddable, machine learning-powered Business Intelligence (BI) service built for the cloud. QuickSight lets you easily create and publish interactive BI dashboards that include Machine Learning-powered insights. AWS IoT Analytics provides direct integration with Amazon QuickSight.

Using the ‘AWS IoT Analytics’ data source, we can build a QuickSight data set by importing the IoT Analytics dataset built in the previous step. In QuickSight, we are able to preview the data, change field types, apply filters, add calculated fields (e.g., sensor_location), and exclude fields if desired.

The final data is then saved (cached) in SPICE, QuickSight’s Super-fast, Parallel, In-memory Calculation Engine.
Using this dataset, we can build a QuickSight Analysis. The analysis will use a variety of visual types to display sensor data, such as temperature, relative humidity, volatile organic compound (VOC) levels, indoor air quality (IAQ), the sensor device’s battery levels, and the gateway’s SNR (Signal-to-Noise Ratio) and RSSI (Received Signal Strength Indication).

We can then securely share the Analysis we have built with data consumers as a QuickSight Dashboard, accessible through a web browser or using the Amazon QuickSight mobile app.

Below, we see how QuickSight is able to visualize indoor air quality data for multiple sensor locations over the course of a single day. In this visual, sensor data is captured and displayed in five-minute increments. The visual contains reference lines indicating both good and bad IAQ thresholds. Analyzing the results, we can quickly see how external and internal environmental conditions, HVAC systems, air filtration devices, external ventilation through windows and doors, and human activity all impact IAQ throughout the day in different areas of the dwelling.

Conclusion
In this post, we learned how easy it is to activate, configure, and start collecting sensor telemetry from LoRaWAN devices and gateways using the newly released AWS IoT Core for LoRaWAN service. We observed the seamless integration of AWS IoT Core, IoT Analytics, and Amazon QuickSight to transform, store, and visualize sensor data. Extending this example to add notifications and auto-remediation based on sensor data is equally as easy with services such as AWS IoT Events.
GTM Stack: Exploring IoT Data Analytics at the Edge with Grafana, Mosquitto, and TimescaleDB on ARM-based Architectures
Posted by Gary A. Stafford in Big Data, IoT, Python, Raspberry Pi, Software Development on October 12, 2020
In the following post, we will explore the integration of several open-source software applications to build an IoT edge analytics stack, designed to operate on ARM-based edge nodes. We will use the stack to collect, analyze, and visualize IoT data without first shipping the data to the Cloud or other external systems.

The Edge
Edge computing is a fast-growing technology trend, which involves pushing compute capabilities to the edge. Wikipedia describes edge computing as a distributed computing paradigm that brings computation and data storage closer to the location needed to improve response times and save bandwidth. The term edge commonly refers to a compute node at the edge of a network (edge device), sitting in close proximity to the source of data and between that data source and external systems such as the Cloud.
In his recent post, 3 Advantages (And 1 Disadvantage) Of Edge Computing, well-known futurist Bernard Marr argues reduced bandwidth requirements, reduced latency, and enhanced security and privacy as three primary advantages of edge computing. Due to techniques like data downsampling, Marr advises one potential disadvantage of edge computing is that important data could end up being overlooked and discarded in the quest to save bandwidth and reduce latency.
David Ricketts, Head of Marketing at Quiss Technology PLC, estimates in his post, Cloud and Edge Computing — The Stats You Need to Know for 2018, the global edge computing market is expected to reach USD 6.72 billion by 2022 at a compound annual growth rate of a whopping 35.4 percent. Realizing the market potential, many major Cloud providers, edge device manufacturers, and integrators are rapidly expanding their edge compute capabilities. AWS, for example, currently offers more than a dozen services in the edge computing category.
Internet of Things
Edge computing is frequently associated with the Internet of Things (IoT). IoT devices, industrial equipment, and sensors generate data, which is transmitted to other internal and external systems, often by way of edge nodes, such as an IoT Gateway. IoT devices typically generate time-series data. According to Wikipedia, a time series is a set of data points indexed in time order — a sequence taken at successive equally spaced points in time. IoT devices typically generate continuous high-volume streams of time-series data, often on the scale of millions of data points per second. IoT data characteristics require IoT platforms to minimally support temporal accuracy, high-volume ingestion and processing, efficient data compression and downsampling, and real-time querying capabilities.
The IoT devices and the edge devices, such as IoT Gateways, which aggregate and transmit IoT data from these devices to external systems, are generally lower-powered, with limited processor, memory, and storage capabilities. Accordingly, IoT platforms must satisfy all the requirements of IoT data while simultaneously supporting resource-constrained environments.
IoT Analytics at the Edge
Leading Cloud providers AWS, Azure, Google Cloud, IBM Cloud, Oracle Cloud, and Alibaba Cloud all offer IoT services. Many offer IoT services with edge computing capabilities. AWS offers AWS IoT Greengrass. Greengrass provides local compute, messaging, data management, sync, and ML inference capabilities to edge devices. Azure offers Azure IoT Edge. Azure IoT Edge provides the ability to run AI, Azure and third-party services, and custom business logic on edge devices using standard containers. Google Cloud offers Edge TPU. Edge TPU (Tensor Processing Unit) is Google’s purpose-built application-specific integrated circuit (ASIC), designed to run AI at the edge.
IoT Analytics
Many Cloud providers also offer IoT analytics as part of their suite of IoT services, although not at the edge. AWS offers AWS IoT Analytics, while Azure has Azure Time Series Insights. Google provides IoT analytics, indirectly, through downstream analytic systems and ad hoc analysis using Google BigQuery or advanced analytics and machine learning with Cloud Machine Learning Engine. These services generally all require data to be transmitted to the Cloud for analytics.

The ability to analyze IoT data at the edge, as data is streamed in real-time, is critical to a rapid feedback loop. IoT edge analytics can accelerate anomaly detection, improve predictive maintenance capabilities, and expedite proactive inventory replenishment.
The IoT Edge Analytics Stack
In my opinion, the ideal IoT edge analytics stack is comprised of lightweight, purpose-built, easily deployable and manageable, platform- and programming language-agnostic, open-source software components. The minimal IoT edge analytics stack should include a lightweight message broker, a time-series database, an ANSI-standard ad-hoc query engine, and a data visualization tool. Each component should be purpose-built for IoT.
Lightweight Message Broker
We will use Eclipse Mosquitto as our message broker. According to the project’s description, Mosquitto is an open-source message broker that implements the Message Queuing Telemetry Transport (MQTT) protocol versions 5.0, 3.1.1, and 3.1. Mosquitto is lightweight and suitable for use on all devices, from low-power single-board computers (SBCs) to full servers.
MQTT Client Library
We will interact with Mosquitto using Eclipse Paho. According to the project, the Eclipse Paho project provides open-source, mainly client-side implementations of MQTT and MQTT-SN in a variety of programming languages. MQTT and MQTT for Sensor Networks (MQTT-SN) are lightweight publish/subscribe messaging transports for TCP/IP and connectionless protocols, such as UDP, respectively.
We will be using Paho’s Python Client. The Paho Python Client provides a client class with support for both MQTT v3.1 and v3.1.1 on Python 2.7 or 3.x. The client also provides helper functions to make publishing messages to an MQTT server straightforward.
Time-Series Database
Time-series databases are optimal for storing IoT data. According to InfluxData, makers of a leading time-series database, InfluxDB, a time-series database (TSDB), is a database optimized for time-stamped or time-series data. Time series data are simply measurements or events that are tracked, monitored, downsampled, and aggregated over time. Jiao Xian, of Alibaba Cloud, has authored an insightful post on the time-series database ecosystem, What Are Time Series Databases? A few leading Cloud providers offer purpose-built time-series databases, though they are not available at the edge. AWS offers Amazon Timestream and Alibaba Cloud offers Time Series Database.
InfluxDB is an excellent choice for a time-series database. It was my first choice, along with TimescaleDB, when developing this stack. However, InfluxDB Flux’s apparent incompatibilities with some ARM-based architectures ruled it out for inclusion in the stack for this particular post.
We will use TimescaleDB as our time-series database. TimescaleDB is the leading open-source relational database for time-series data. Described as ‘PostgreSQL for time-series,’ TimescaleDB is based on PostgreSQL, which provides full ANSI SQL, rock-solid reliability, and a massive ecosystem. TimescaleDB claims to achieve 10–100x faster queries than PostgreSQL, InfluxDB, and MongoDB, with native optimizations for time-series analytics.
TimescaleDB is designed for performing analytical queries, both through its native support for PostgreSQL’s full range of SQL functionality, as well as additional functions native to TimescaleDB. These time-series optimized functions include Median/Percentile, Cumulative Sum, Moving Average, Increase, Rate, Delta, Time Bucket, Histogram, and Gap Filling.
Ad-hoc Data Query Engine
We have the option of using psql, the terminal-based front-end to PostgreSQL, to execute ad-hoc queries against TimescaleDB. The psql front-end enables you to enter queries interactively, issue them to PostgreSQL, and see the query results.

We also have the option of using pgAdmin, specifically the biarms/pgadmin4 Docker version, to execute ad-hoc queries and perform most other database tasks. pgAdmin is the most popular open-source administration and development platform for PostgreSQL. While several popular Docker versions of pgAdmin only support Linux AMD64 architectures, the biarms/pgadmin4 Docker version supports ARM-based devices.


Data Visualization
For data visualization, we will use Grafana. Grafana allows you to query, visualize, alert on, and understand metrics no matter where they are stored. With Grafana, you can create, explore, and share dashboards, fostering a data-driven culture. Grafana allows you to define thresholds visually and get notified via Slack, PagerDuty, and more. Grafana supports dozens of data sources, including MySQL, PostgreSQL, Elasticsearch, InfluxDB, TimescaleDB, Graphite, Prometheus, Google BigQuery, GraphQL, and Oracle. Grafana is extensible through a large collection of plugins.

Edge Deployment and Management Platform
Docker introduced the current industry standard for containers in 2013. Docker containers are a standardized unit of software that allows developers to isolate apps from their environment. We will use Docker to deploy the IoT edge analytics stack, referred to herein as the GTM Stack, composed of containerized versions of Eclipse Mosquitto, TimescaleDB, and Grafana, along with pgAdmin, to an ARM-based edge node. The acronym, GTM, comes from the three primary OSS projects composing the stack. The acronym also suggests Greenwich Mean Time, relating to the precise time-series nature of IoT data.

Running Docker Engine in swarm mode, we can use Docker to deploy the complete IoT edge analytics stack to the swarm, running on the edge node. The deploy command accepts a stack description in the form of a Docker Compose file, a YAML file used to configure the application’s services. With a single command, we can create and start all the services from the configuration file.
Source Code
All source code for this post is available on GitHub. Use the following command to git clone a local copy of the project.
IoT Devices
For this post, I have deployed three Linux ARM-based IoT devices, each connected to a sensor array. Each sensor array contains multiple analog and digital sensors. The sensors record temperature, humidity, air quality (liquefied petroleum gas (LPG), carbon monoxide (CO), and smoke), light, and motion. For more information on the IoT device and sensor hardware involved, please see my previous post: Getting Started with IoT Analytics on AWS.
Each ARM-based IoT device is running a small Python3-based script, sensor_data_to_mosquitto.py, shown below.
The IoT devices’ script implements the Eclipse Paho MQTT Python client library. An MQTT message containing simultaneous readings from each sensor is sent to a Mosquitto topic on the edge node, at a configurable frequency.
IoT Edge Node
For this post, I have deployed a single Linux ARM-based edge node. The three IoT devices, containing sensor arrays, communicate with the edge node over Wi-Fi. The IoT devices could easily use an alternative communication protocol, such as BLE, LoRaWAN, or Ethernet. For more information on BLE and LoRaWAN, please see some of my previous posts: LoRa and LoRaWAN for IoT: Getting Started with LoRa and LoRaWAN Protocols for Low Power, Wide Area Networking of IoT and BLE and GATT for IoT: Getting Started with Bluetooth Low Energy (BLE) and Generic Attribute Profile (GATT) Specification for IoT.
The edge node is also running a small Python3-based script, mosquitto_to_timescaledb.py, shown below.
Similar to the IoT devices, the edge node’s script implements the Eclipse Paho MQTT Python client library. The script pulls MQTT messages off a Mosquitto topic(s), serializes the message payload to JSON, and writes the payload’s data to the TimescaleDB database. The edge node’s script accepts several arguments, which allow you to configure necessary Mosquitto and TimescaleDB variables.
Why not use Telegraf?
Telegraf is a plugin-driven agent that collects, processes, aggregates, and writes metrics. There is a Telegraf output plugin, the PostgreSQL and TimescaleDB Output Plugin for Telegraf, produced by TimescaleDB. The plugin could replace the need to manage and maintain the above script. However, I chose not to use it because it is not yet an official Telegraf plugin. If the plugin was included in a Telegraf release, I would certainly encourage its use.
Script Management
Both the Linux-based IoT devices and the edge node run the systemd system and service manager. To ensure the Python scripts keep running in the case of a system restart, we define a systemd unit. Units are the objects that systemd knows how to manage: a standardized representation of system resources that can be managed by the suite of daemons and manipulated by the provided utilities. Each script has a systemd unit file. Below, we see the gtm_stack_mosquitto unit file, gtm_stack_mosquitto.service.
The gtm_stack_mosq_to_tmscl unit file, gtm_stack_mosq_to_tmscl.service, is nearly identical.
To install the gtm_stack_mosquitto.service systemd unit file on each IoT device, use the following commands.
Installing the gtm_stack_mosq_to_tmscl.service unit file on the edge node is nearly identical.
Docker Stack
The edge node runs the GTM Docker stack, stack.yml, in a swarm. As discussed earlier, the stack contains four containers: Eclipse Mosquitto, TimescaleDB, and Grafana, along with pgAdmin. The Mosquitto, TimescaleDB, and Grafana containers have paths within the containers, bind-mounted to directories on the edge device. With bind-mounting, the container’s data will persist if the containers are removed and re-created. The containers are running on their own isolated overlay network.
The GTM Docker stack is installed using the following commands on the edge node. We will assume Docker and git are pre-installed on the edge node for this post.
First, we create the proper local directories on the edge device, which will be used to bind-mount to the container’s directories. Below, we see the bind-mounted local directories with the eventual container’s contents stored within them.

Next, we copy the custom Mosquitto configuration file, mosquitto.conf, included in the project, to the correct location on the edge device. Lastly, we initialize the Docker swarm and deploy the stack.

The docker container ls command, showing the running GTM Stack containers

The docker stats command, showing the resource consumption of GTM Stack containers

TimescaleDB Setup
With the GTM stack running, we need to create a single Timescale hypertable, sensor_data, in the TimescaleDB demo_iot database, to hold the incoming IoT sensor data. Hypertables, according to TimescaleDB, are designed to be easy to manage and to behave like standard PostgreSQL tables. Hypertables are comprised of many interlinked “chunk” tables. Commands made to the hypertable automatically propagate changes down to all of the chunks belonging to that hypertable.
I suggest using psql to execute the required DDL statements, which will create the hypertable, as well as the subsequent views and database user permissions. All SQL statements are included in the project’s statements.sql file. One way to use psql is to install it on your local workstation, then use psql to connect to the remote edge node. I prefer to instantiate a local PostgreSQL Docker container instance running psql. I then use the local container’s psql client to connect to the edge node’s TimescaleDB database. For example, from my local machine, I run the following docker run command to connect to the TimescaleDB database on the edge node, located locally at 192.168.1.12.
Although not always practical, you could also access psql from within the TimescaleDB Docker container, running on the actual edge node, using the following docker exec command.
TimescaleDB Continuous Aggregates
For this post’s demonstration, we need to create four TimescaleDB database views, which will be queried from an eventual Grafana Dashboard. The views are TimescaleDB Continuous Aggregates. According to Timescale, aggregate queries that touch large swathes of time-series data can take a long time to compute because the system needs to scan large amounts of data on every query execution. TimescaleDB continuous aggregates automatically calculate the results of a query in the background and materialize the results.
For example, in this post, we generate sensor data every five seconds from the three IoT devices. When visualizing a 24-hour period in Grafana, using continuous aggregates with an interval of one minute, we would reduce the total volume of data queried from ~51,840 rows to ~4,320 rows, a reduction of over 91%. The larger the time period or the greater the number of IoT devices being analyzed, the more significant the savings, and the greater the positive impact on query performance.
A time_bucket on the time partitioning column of the hypertable is required in all continuous aggregate views. The time_bucket function, in this case, has a bucket width (interval) of 1 minute. The interval is configurable.
Limiting Grafana’s Access to IoT Data
Following the Grafana recommendation for database user permissions, we create a grafanareader PostgreSQL user and limit the user’s access to the sensor_data table and the four views we created. Grafana will use this user’s credentials to perform SELECT queries of the TimescaleDB demo_iot database.
Grafana Dashboards
Using the TimescaleDB continuous aggregates we have created, we can quickly build a richly-featured dashboard in Grafana. Below we see a typical IoT Dashboard you might build to monitor the post’s IoT sensor data in near real-time. An exported version, dashboard_external_export.json, is included in the GitHub project.


Grafana’s documentation includes a comprehensive set of instructions for Using PostgreSQL in Grafana. To connect to the TimescaleDB database from Grafana, we use a PostgreSQL data source.

The data displayed in each Panel in the Grafana Dashboard is based on a SQL query. For example, the Average Temperature Panel might use a query similar to the example below. This particular query also converts Celsius to Fahrenheit. Note the use of Grafana Macros (e.g., $__time(), $__timeFilter()). Macros can be used within a query to simplify syntax and allow for dynamic parts.
SELECT
$__time(bucket),
device_id AS metric,
((avg_temp * 1.8) + 32) AS avg_temp
FROM temperature_humidity_summary_minute
WHERE
$__timeFilter(bucket)
ORDER BY 1,2
Below, we see another example from the Average Humidity Panel. In this particular query, we might choose to validate the humidity data is within an acceptable range of 0%–100%.
SELECT
$__time(bucket),
device_id AS metric,
avg_humidity
FROM temperature_humidity_summary_minute
WHERE
$__timeFilter(bucket)
AND avg_humidity >= 0.0
AND avg_humidity <= 100.0
ORDER BY 1,2
Mobile Friendly
Grafana dashboards are mobile-friendly. Below we see two views of the dashboard, using the Chrome mobile browser on an Apple iPhone.

Grafana Alerts
Grafana allows Alerts to be created based on Rules you define in each Panel. If data values match the Rule’s conditions, which you pre-define, such as a temperature reading above a certain threshold for a set amount of time, an alert is sent to your choice of destinations. According to the Rule shown below, if the average temperature exceeds 75°F for a period of 5 minutes, an alert is sent.

As demonstrated below, when the temperature in the laboratory began to exceed 75°F, the alert entered a ‘Pending’ state. If the temperature exceeded 75°F for the pre-determined period of 5 minutes, the alert status changed to ‘Alerting’, and an alert was sent. When the temperature dropped back below 75°F for the pre-determined period of 5 minutes, the alert status changed from ‘Alerting’ to ‘OK’, and a subsequent notification was sent.

There are currently 18 destinations available out-of-the-box with Grafana, including Slack, email, PagerDuty, webhooks, HipChat, and Microsoft Teams. We can use Grafana Alerts to notify the proper resources, in near real-time, if an issue is detected, based on the data. Below, we see an actual series of high-temperature alerts sent by Grafana to a Slack channel, followed by subsequent notifications as the temperature returned to normal.

Ad-hoc Queries
The ability to perform ad-hoc queries on the time-series IoT data is an essential feature of the IoT edge analytics stack. We can use psql or pgAdmin to perform ad-hoc queries against the TimescaleDB database. Below are examples of typical ad-hoc queries we might perform on the IoT sensor data. These example queries demonstrate TimescaleDB’s advanced analytical capabilities for working with time-series data, including Moving Average, Delta, Time Bucket, and Histogram.
Conclusion
In this post, we have explored the development of an IoT edge analytics stack, comprised of lightweight, purpose-built, easily deployable and manageable, platform- and programming language-agnostic, open-source software components. These components included Docker containerized versions of Eclipse Mosquitto, TimescaleDB, Grafana, and pgAdmin, referred to as the GTM Stack. Using the GTM stack, we collected, analyzed, and visualized IoT data, without first shipping the data to Cloud or other external systems.
This blog represents my own viewpoints and not those of my employer, Amazon Web Services. All product names, logos, and brands are the property of their respective owners.
Collecting and Analyzing IoT Data in Near Real-Time with AWS IoT, LoRa, and LoRaWAN
Posted by Gary A. Stafford in AWS, Bash Scripting, IoT, Python, Raspberry Pi, Serverless, Software Development on August 26, 2020
Introduction
In a recent post published on ITNEXT, LoRa and LoRaWAN for IoT: Getting Started with LoRa and LoRaWAN Protocols for Low Power, Wide Area Networking of IoT, we explored the use of the LoRa (Long Range) and LoRaWAN protocols to transmit and receive sensor data, over a substantial distance, between an IoT device, containing several embedded sensors, and an IoT gateway. In this post, we will extend that architecture to the Cloud, using AWS IoT, a broad and deep set of IoT services, from the edge to the Cloud. We will securely collect, transmit, and analyze IoT data using the AWS cloud platform.

LoRa and LoRaWAN
According to the LoRa Alliance, Low-Power, Wide-Area Networks (LPWAN) are projected to support a major portion of the billions of devices forecasted for the Internet of Things (IoT). LoRaWAN is designed from the bottom up to optimize LPWANs for battery lifetime, capacity, range, and cost. LoRa and LoRaWAN permit long-range connectivity for IoT devices in different types of industries. According to Wikipedia, LoRaWAN defines the communication protocol and system architecture for the network, while the LoRa physical layer enables the long-range communication link.
AWS IoT
AWS describes AWS IoT as a set of managed services that enable ‘internet-connected devices to connect to the AWS Cloud and lets applications in the cloud interact with internet-connected devices.’ AWS IoT services span three categories: Device Software, Connectivity and Control, and Analytics.
In this post, we will focus on three AWS IoT services, one from each category, including AWS IoT Device SDKs, AWS IoT Core, and AWS IoT Analytics. According to AWS, the AWS IoT Device SDKs include open-source libraries and developer and porting guides with samples to help you build innovative IoT products or solutions on your choice of hardware platforms. AWS IoT Core is a managed cloud service that lets connected devices easily and securely interact with cloud applications and other devices. AWS IoT Core can process and route messages to AWS endpoints and other devices reliably and securely. Finally, AWS IoT Analytics is a fully-managed IoT analytics service, designed specifically for IoT, which collects, pre-processes, enriches, stores, and analyzes IoT device data at scale.
To learn more about AWS IoT, specifically the AWS IoT services we will be exploring within this post, I recommend reading my recent post published on Towards Data Science, Getting Started with IoT Analytics on AWS.
Hardware Selection
In this post, we will use the following hardware.
IoT Device with Embedded Sensors
An Arduino single-board microcontroller will serve as our IoT device. The 3.3V AI-enabled Arduino Nano 33 BLE Sense board (Amazon: USD 36.00), released in August 2019, comes with the powerful nRF52840 processor from Nordic Semiconductors, a 32-bit ARM Cortex-M4 CPU running at 64 MHz, 1MB of CPU Flash Memory, 256KB of SRAM, and a NINA-B306 stand-alone Bluetooth 5 low energy (BLE) module.

The Sense contains an impressive array of embedded sensors:
- 9-axis Inertial Sensor (LSM9DS1): 3D digital linear acceleration sensor, a 3D digital angular rate sensor, and a 3D digital magnetic sensor
- Humidity and Temperature Sensor (HTS221): Capacitive digital sensor for relative humidity and temperature
- Barometric Sensor (LPS22HB): MEMS nano pressure sensor: 260–1260 hectopascal (hPa) absolute digital output barometer
- Microphone (MP34DT05): MEMS audio sensor omnidirectional digital microphone
- Gesture, Proximity, Light Color, and Light Intensity Sensor (APDS9960): Advanced Gesture detection, Proximity detection, Digital Ambient Light Sense (ALS), and Color Sense (RGBC).
The Arduino Sense is an excellent, low-cost single-board microcontroller for learning about the collection and transmission of IoT sensor data.
IoT Gateway
An IoT Gateway, according to TechTarget, is a physical device or software program that serves as the connection point between the Cloud and controllers, sensors, and intelligent devices. All data moving to the Cloud, or vice versa, goes through the gateway, which can be either a dedicated hardware appliance or software program.
LoRa Gateways, to paraphrase The Things Network, form the bridge between devices and the Cloud. Devices use low power networks like LoRaWAN to connect to the Gateway, while the Gateway uses high bandwidth networks like WiFi, Ethernet, or Cellular to connect to the Cloud.

A third-generation Raspberry Pi 3 Model B+ single-board computer (SBC) will serve as our LoRa IoT Gateway. This Raspberry Pi model features a 1.4GHz Cortex-A53 (ARMv8) 64-bit quad-core processor System on a Chip (SoC), 1GB LPDDR2 SDRAM, dual-band wireless LAN, Bluetooth 4.2 BLE, and Gigabit Ethernet (Amazon: USD 42.99).

LoRa Transceiver Modules
To transmit the IoT sensor data between the IoT device, containing the embedded sensors, and the IoT gateway, I have used the REYAX RYLR896 LoRa transceiver module (Amazon: USD 19.50 x 2). The transceiver modules are commonly referred to as a universal asynchronous receiver-transmitter (UART). A UART is a computer hardware device for asynchronous serial communication in which the data format and transmission speeds are configurable.
According to the manufacturer, REYAX, the RYLR896 contains the Semtech SX1276 long-range, low power transceiver. The RYLR896 module provides ultra-long range spread spectrum communication and high interference immunity while minimizing current consumption. Each RYLR896 module contains a small, PCB integrated, helical antenna. This transceiver operates at both the 868 and 915 MHz frequency ranges. In this demonstration, we will be transmitting at 915 MHz for North America.
The Arduino Sense (IoT device) transmits data, using one of the RYLR896 modules (shown below front). The Raspberry Pi (IoT Gateway), connected to the other RYLR896 module (shown below rear), receives the data.

LoRaWAN Security
The RYLR896 is capable of AES 128-bit data encryption. Using the Advanced Encryption Standard (AES), we will encrypt the data sent from the IoT device to the IoT gateway, using a 32 hex digit password (128 bits / 4 bits per hex digit = 32 hex digits).
Provisioning AWS Resources
To start, we will create the necessary AWS IoT and associated resources on the AWS cloud platform. Once these resources are in place, we can then proceed to configure the IoT device and IoT gateway to securely transmit the sensor data to the Cloud.
All the source code for this post is on GitHub. Use the following command to git clone a local copy of the project.
git clone --branch master --single-branch --depth 1 --no-tags \
https://github.com/garystafford/iot-aws-lora-demo.git
AWS CloudFormation
The CloudFormation template, iot-analytics.yaml, will create an AWS IoT CloudFormation stack containing the following resources.
- AWS IoT Thing
- AWS IoT Thing Policy
- AWS IoT Core Topic Rule
- AWS IoT Analytics Channel, Pipeline, Data store, and Data set
- AWS Lambda and Lambda Permission
- Amazon S3 Bucket
- Amazon SageMaker Notebook Instance
- AWS IAM Roles
Please be aware of the costs involved with the AWS resources used in the CloudFormation template before continuing. To create the AWS CloudFormation stack from the included CloudFormation template, execute the following AWS CLI command.
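The command will resemble the following sketch; the stack name and template path are illustrative, and the --capabilities flag is required because the template creates IAM roles.

aws cloudformation create-stack \
  --stack-name lora-iot-demo \
  --template-body file://cloudformation/iot-analytics.yaml \
  --capabilities CAPABILITY_NAMED_IAM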
The resulting CloudFormation stack should contain 16 AWS resources.

Additional Resources
Unfortunately, AWS CloudFormation cannot create all the AWS IoT resources we require for this demonstration. To complete the AWS provisioning process, execute the following series of AWS CLI commands, aws_cli_commands.md. These commands will create the remaining resources, including an AWS IoT Thing Type, Thing Group, Thing Billing Group, and an X.509 Certificate.
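As a sketch of what those commands do, using the Thing Type, Thing Group, and Thing Billing Group names referenced later in this post (the certificate file names are illustrative; the exact commands are in aws_cli_commands.md):

aws iot create-thing-type --thing-type-name "LoRaIoTGateway"
aws iot create-thing-group --thing-group-name "LoRaIoTGateways"
aws iot create-billing-group --billing-group-name "IoTGateways"
aws iot create-keys-and-certificate --set-as-active \
  --certificate-pem-outfile lora-iot-gateway-01.cert.pem \
  --public-key-outfile lora-iot-gateway-01.public.key \
  --private-key-outfile lora-iot-gateway-01.private.key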
IoT Device Configuration
With the AWS resources deployed, we can configure the IoT device and IoT Gateway.
Arduino Sketch
For those not familiar with Arduino, a sketch is the name that Arduino uses for a program. It is the unit of code that is uploaded into non-volatile flash memory and runs on an Arduino board. The Arduino language is a set of C and C++ functions. All standard C and C++ constructs supported by the avr-g++ compiler should work in Arduino.
For this post, the sketch, lora_iot_demo_aws.ino, contains the code necessary to collect and securely transmit the environmental sensor data, including temperature, relative humidity, barometric pressure, Red, Green, and Blue (RGB) color, and ambient light intensity, using the LoRaWAN protocol.
AT Commands
Communication with the RYLR896's long-range modem is done using AT commands. AT commands are instructions used to control a modem. AT is an abbreviation of ATtention, and every command line starts with AT, which is why modem commands are called AT commands, according to Developer's Home. A complete list of AT commands can be downloaded as a PDF from the RYLR896 product page.
To efficiently transmit the environmental sensor data from the IoT sensor to the IoT gateway, the sketch concatenates the sensor ID and the sensor values together in a single string. The string will be incorporated into an AT command, sent to the RYLR896 LoRa transceiver module. To make it easier to parse the sensor data on the IoT gateway, we will delimit the sensor values with a pipe (|), as opposed to a comma. According to REYAX, the maximum length of the LoRa payload is approximately 330 bytes.
Below, we see an example of an AT command used to send the sensor data from the IoT sensor and the corresponding unencrypted data received by the IoT gateway. Both contain the LoRa transmitter Address ID, the payload length (59 bytes in the example below), and the payload. The data received by the IoT gateway also includes the Received signal strength indicator (RSSI) and Signal-to-noise ratio (SNR).
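The example below is a sketch, reconstructed from the gateway's debug log shown later in this post; the actual values change with every reading.

// sent from the IoT device (AT+SEND=<address>,<payload length>,<data>)
AT+SEND=116,59,0447383033363932003C0034|23.46|41.89|99.38|230|692|833|1116

// received by the IoT gateway (+RCV=<address>,<length>,<data>,<RSSI>,<SNR>)
+RCV=116,59,0447383033363932003C0034|23.46|41.89|99.38|230|692|833|1116,-48,39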
Receiving Data on IoT Gateway
The Raspberry Pi will act as a LoRa IoT gateway, receiving the environmental sensor data from the IoT device, the Arduino, and sending the data to AWS. The Raspberry Pi runs a Python script, rasppi_lora_receiver_aws.py, which receives the data from the Arduino Sense, decrypts it, parses the sensor values, serializes the data to a JSON payload, and finally transmits the payload in an MQTT-protocol message to AWS. The script uses pyserial, the Python Serial Port Extension, which encapsulates access to the serial port for communication with the RYLR896 module. The script uses the AWS IoT Device SDK for Python v2 to communicate with AWS.
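A condensed sketch of the script's core loop is shown below, assuming the RYLR896 is attached to /dev/ttyAMA0 and the certificate file names used earlier; the full script in the GitHub project also handles module configuration, logging, and the color values.

import json
import time

import serial  # pyserial: access to the gateway's serial port
from awscrt import mqtt
from awsiot import mqtt_connection_builder

# connect to AWS IoT Core over MQTT with X.509 mutual TLS (SDK v2)
mqtt_connection = mqtt_connection_builder.mtls_from_path(
    endpoint="a1b2c3d4e5678f-ats.iot.us-east-1.amazonaws.com",  # your AWS IoT endpoint
    cert_filepath="lora-iot-gateway-01.cert.pem",
    pri_key_filepath="lora-iot-gateway-01.private.key",
    ca_filepath="AmazonRootCA1.pem",
    client_id="lora-iot-gateway-01",
    clean_session=False,
    keep_alive_secs=30,
)
mqtt_connection.connect().result()

# read +RCV lines from the RYLR896 and publish the parsed sensor values
lora = serial.Serial("/dev/ttyAMA0", baudrate=115200, timeout=5)
while True:
    line = lora.readline().decode(errors="ignore").strip()
    if not line.startswith("+RCV="):
        continue
    # +RCV=<address>,<length>,<data>,<RSSI>,<SNR>; the data is pipe-delimited
    data = line.split(",")[2]
    device_id, temperature, humidity, pressure, *color = data.split("|")
    payload = {
        "ts": time.time(),
        "data": {
            "device_id": device_id,
            "gateway_id": "00000000f62051ce",  # the gateway's unique id
            "temperature": float(temperature),
            "humidity": float(humidity),
            "pressure": float(pressure),
            # the four color values are parsed from the remainder of the payload
        },
    }
    mqtt_connection.publish(
        topic="lora-iot-demo",
        payload=json.dumps(payload),
        qos=mqtt.QoS.AT_LEAST_ONCE,
    )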
Running the IoT Gateway Python Script
To run the Python script on the Raspberry Pi, we will use a helper shell script, rasppi_lora_receiver_aws.sh. The shell script helps construct the arguments required to execute the Python script.
To run the helper script, we execute the following command, substituting the input parameter, the AWS IoT endpoint, with your endpoint.
sh ./rasppi_lora_receiver_aws.sh \
a1b2c3d4e5678f-ats.iot.us-east-1.amazonaws.com
You should see console output similar to the following.

The script starts by configuring the RYLR896 module and outputting that configuration to a log file, output.log. If successful, we should see the following debug information logged.
DEBUG:root:Connecting to a1b2c3d4e5f6-ats.iot.us-east-1.amazonaws.com with client ID 'lora-iot-gateway-01'
DEBUG:root:Connecting to REYAX RYLR896 transceiver module
DEBUG:root:Connected!
DEBUG:root:Address set? +OK
DEBUG:root:Network Id set? +OK
DEBUG:root:AES-128 password set? +OK
DEBUG:root:Module responding? +OK
DEBUG:root:Address: +ADDRESS=116
DEBUG:root:Network id: +NETWORKID=6
DEBUG:root:UART baud rate: +IPR=115200
DEBUG:root:RF frequency: +BAND=915000000
DEBUG:root:RF output power: +CRFOP=15
DEBUG:root:Work mode: +MODE=0
DEBUG:root:RF parameters: +PARAMETER=12,7,1,4
DEBUG:root:AES128 password of the network: +CPIN=92A0ECEC9000DA0DCF0CAAB0ABA2E0EF
The sensor data is also written to the log file for debugging purposes. The first line in the log (shown below) is the raw decrypted data received from the IoT device via LoRaWAN. The second line is the JSON-serialized payload, sent securely to AWS, using the MQTT protocol.
DEBUG:root:b'+RCV=116,59,0447383033363932003C0034|23.46|41.89|99.38|230|692|833|1116,-48,39\r\n'
DEBUG:root:{'ts': 1598305503.7041512, 'data': {'humidity': 41.89, 'temperature': 23.46, 'device_id': '0447383033363932003C0034', 'gateway_id': '00000000f62051ce', 'pressure': 99.38, 'color': {'red': 230.0, 'blue': 833.0, 'ambient': 1116.0, 'green': 692.0}}}
DEBUG:root:b'+RCV=116,59,0447383033363932003C0034|23.46|41.63|99.38|236|696|837|1127,-49,35\r\n'
DEBUG:root:{'ts': 1598305513.7918658, 'data': {'humidity': 41.63, 'temperature': 23.46, 'device_id': '0447383033363932003C0034', 'gateway_id': '00000000f62051ce', 'pressure': 99.38, 'color': {'red': 236.0, 'blue': 837.0, 'ambient': 1127.0, 'green': 696.0}}}
DEBUG:root:b'+RCV=116,59,0447383033363932003C0034|23.44|41.57|99.38|232|686|830|1113,-48,32\r\n'
DEBUG:root:{'ts': 1598305523.8556132, 'data': {'humidity': 41.57, 'temperature': 23.44, 'device_id': '0447383033363932003C0034', 'gateway_id': '00000000f62051ce', 'pressure': 99.38, 'color': {'red': 232.0, 'blue': 830.0, 'ambient': 1113.0, 'green': 686.0}}}
DEBUG:root:b'+RCV=116,59,0447383033363932003C0034|23.51|41.44|99.38|205|658|802|1040,-48,36\r\n'
DEBUG:root:{'ts': 1598305528.8890748, 'data': {'humidity': 41.44, 'temperature': 23.51, 'device_id': '0447383033363932003C0034', 'gateway_id': '00000000f62051ce', 'pressure': 99.38, 'color': {'red': 205.0, 'blue': 802.0, 'ambient': 1040.0, 'green': 658.0}}}
AWS IoT Core
The Raspberry Pi-based IoT gateway will be registered with AWS IoT Core. IoT Core allows users to connect devices quickly and securely to AWS.
Things
According to AWS, IoT Core can reliably scale to billions of devices and trillions of messages. Registered devices are referred to as things in AWS IoT Core. A thing is a representation of a specific device or logical entity. Information about a thing is stored in the registry as JSON data.
Below, we see an example of the Thing created by CloudFormation. The Thing, lora-iot-gateway-01, represents the physical IoT gateway. We have assigned the IoT gateway a Thing Type, LoRaIoTGateway, a Thing Group, LoRaIoTGateways, and a Thing Billing Group, IoTGateways.

In a real IoT environment containing hundreds, thousands, or even millions of IoT devices, gateways, and sensors, these classification mechanisms, Thing Type, Thing Group, and Thing Billing Group, help organize IoT assets.

Device Gateway and Message Broker
IoT Core provides a Device Gateway, which manages all active device connections. The Device Gateway currently supports MQTT, WebSockets, and HTTP 1.1 protocols. Behind the Device Gateway is a high-throughput pub/sub Message Broker, which securely transmits messages to and from all IoT devices and applications with low latency. Below, we see a typical AWS IoT Core architecture containing multiple Topics, Rules, and Actions.

AWS IoT Security
AWS IoT Core provides mutual authentication and encryption, ensuring all data exchanged between AWS and the devices is secure by default. In the demonstration, all data is sent securely using Transport Layer Security (TLS) 1.2 with X.509 digital certificates on port 443. Below, we see an example of an X.509 certificate assigned to the Thing, lora-iot-gateway-01, which represents the physical IoT gateway. The X.509 certificate and the private key, generated previously using the AWS CLI, are installed on the IoT gateway.

Authorization of the device to access any resource on AWS is controlled by AWS IoT Core Policies. These policies are similar to AWS IAM Policies. Below, we see an example of an AWS IoT Core Policy, LoRaDevicePolicy, which is assigned to the IoT gateway.

AWS IoT Core Rules
Once an MQTT message is received from the IoT gateway (a thing), we use AWS IoT Rules to send message data to an AWS IoT Analytics Channel. Rules give your devices the ability to interact with AWS services. Rules are analyzed, and Actions are performed based on the MQTT topic stream. Below, we see an example rule that forwards our messages to an IoT Analytics Channel.

Rule query statements are written in standard Structured Query Language (SQL). The data source for the Rule query is an IoT Topic.
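A sketch of such a Rule query statement is shown below. It selects and flattens the nested sensor values from the lora-iot-demo Topic and stamps the time the message was received; the exact statement is defined in the CloudFormation template.

SELECT data.device_id AS device_id,
  data.gateway_id AS gateway_id,
  data.temperature AS temperature,
  data.humidity AS humidity,
  data.pressure AS pressure,
  data.color.red AS red,
  data.color.green AS green,
  data.color.blue AS blue,
  data.color.ambient AS ambient,
  ts,
  parse_time("yyyy-MM-dd'T'HH:mm:ss.SSS", timestamp(), "UTC") AS msg_received
FROM 'lora-iot-demo'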
AWS IoT Analytics
AWS IoT Analytics is composed of five primary components: Channels, Pipelines, Data stores, Data sets, and Notebooks. These components enable you to collect, prepare, store, analyze, and visualize your IoT data.

Below, we see a typical AWS IoT Analytics architecture. IoT messages are received from AWS IoT Core through a Rule Action. Amazon QuickSight provides business intelligence and visualization. Amazon QuickSight ML Insights adds anomaly detection and forecasting.

IoT Analytics Channel
An AWS IoT Analytics Channel pulls messages or data into IoT Analytics from other AWS sources, such as Amazon S3, Amazon Kinesis, or AWS IoT Core. Channels store data for IoT Analytics Pipelines. Both Channels and Data stores support storing data in your own Amazon S3 bucket or an IoT Analytics service-managed S3 bucket. In the demonstration, we are using a service-managed S3 bucket.
When creating a Channel, you also decide how long to retain the data. For the demonstration, we have set the data retention period to 21 days. Channels are generally not used for the long-term storage of data; typically, you would only retain data in the Channel for the period you need to analyze. For the long-term storage of IoT message data, I recommend using an AWS IoT Core Rule to send a copy of the raw IoT data to Amazon S3, using a service such as Amazon Kinesis Data Firehose.
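If the Channel were created with the AWS CLI rather than CloudFormation, the retention period would be set with a command similar to the following sketch; the channel name is an assumption, based on the naming of the demonstration's other IoT Analytics resources.

aws iotanalytics create-channel \
  --channel-name iot_analytics_channel \
  --retention-period unlimited=false,numberOfDays=21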

IoT Analytics Pipeline
An AWS IoT Analytics Pipeline consumes messages from one or more Channels. Pipelines transform, filter, and enrich the messages before storing them in IoT Analytics Data stores. A Pipeline is composed of an ordered list of activities. Logically, you must specify both a Channel (source) and a Datastore (destination) activity. Optionally, you may choose as many as 23 additional activities in the pipelineActivities array.
In our demonstration's Pipeline, iot_analytics_pipeline, we have specified three additional activities: DeviceRegistryEnrich, Filter, and Lambda. Other activity types include Math, SelectAttributes, RemoveAttributes, and AddAttributes.

The Filter activity ensures the sensor values are not Null or otherwise erroneous; if they are, the message is dropped. The Lambda Pipeline activity executes an AWS Lambda function to transform the messages in the pipeline. Messages are sent in an event object to the Lambda. The message is modified, and the event object is returned to the activity.

The Python-based Lambda function easily handles typical IoT data transformation tasks, including converting the temperature from Celsius to Fahrenheit, the pressure from kilopascals (kPa) to inches of Mercury (inHg), and the 12-bit RGBA values to 8-bit color values (0–255). The Lambda function also rounds all the values down to between 0 and 2 decimal places of precision.
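A sketch of such a transformation function is shown below, assuming the flattened attribute names produced by the Rule query; the deployed function is created by the CloudFormation template.

def lambda_handler(event, context):
    # IoT Analytics invokes the function with a batch (list) of messages
    # and expects the modified list back
    for message in event:
        # Celsius to Fahrenheit
        message["temperature"] = round(message["temperature"] * 1.8 + 32, 2)
        # kilopascals (kPa) to inches of Mercury (inHg): 1 kPa = 0.2953 inHg
        message["pressure"] = round(message["pressure"] * 0.2953, 2)
        # 12-bit color values (0-4095) to 8-bit color values (0-255)
        for color in ("red", "green", "blue", "ambient"):
            message[color] = int(message[color] / 4095 * 255)
    return event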
The demonstration’s Pipeline also enriches the IoT data with metadata from the IoT device’s AWS IoT Core Registry. The metadata includes additional information about the device that generated the IoT data, including the custom attributes such as LoRa transceiver manufacturer and model, and the IoT gateway manufacturer.

A notable feature of Pipelines is the ability to reprocess messages. If you make changes to the Pipeline, which often happens during the data preparation stage, you can reprocess any or all of the IoT data in the associated Channel and overwrite the IoT data in the Data set.
IoT Analytics Data store
An AWS IoT Analytics Data store stores prepared data from an AWS IoT Analytics Pipeline in a fully-managed database. Both Channels and Data stores support storing IoT data in your own Amazon S3 bucket or an IoT Analytics managed S3 bucket. In the demonstration, we are using a service-managed S3 bucket to store the IoT data in our Data store, iot_analytics_data_store.

IoT Analytics Data set
An AWS IoT Analytics Data set automatically provides regular, up-to-date insights for data analysts by querying a Data store using standard SQL. Periodic updates are implemented using a cron expression. For the demonstration, we are updating our Data set, iot_analytics_data_set, at a 15-minute interval. The time interval can be increased or reduced, depending on the desired 'near real-time' nature of the IoT data being analyzed.
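For reference, a 15-minute update schedule is expressed in AWS's six-field cron syntax similar to the following (illustrative; the actual trigger is defined in the CloudFormation template). This expression fires at minutes 0, 15, 30, and 45 of every hour.

cron(0/15 * * * ? *)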
Below, we see messages in the Result preview pane of the Data set. Note the SQL query used to obtain the messages, which queries the Data store. The Data store, as you will recall, contains the transformed messages from the Pipeline.

IoT Analytics Data sets also support sending content results, which are materialized views of your IoT Analytics data, to an Amazon S3 bucket.

The CloudFormation stack created an encrypted Amazon S3 Bucket. This bucket receives a copy of the messages from the IoT Analytics Data set whenever the cron expression runs the scheduled update.

IoT Analytics Notebook
An AWS IoT Analytics Notebook allows users to perform statistical analysis and machine learning on IoT Analytics Data sets using Jupyter Notebooks. The IoT Analytics Notebook service includes a set of notebook templates that contain AWS-authored machine learning models and visualizations. Notebook Instances can be linked to GitHub or another source code repository. Notebooks created with IoT Analytics Notebook can also be accessed directly through Amazon SageMaker. For the demonstration, the Notebook Instance is cloned from our project's GitHub repository.

The repository contains a sample Jupyter Notebook, LoRa_IoT_Analytics_Demo.ipynb, based on the conda_python3 kernel. This preinstalled environment includes the default Anaconda installation and Python 3.

The Notebook uses pandas, matplotlib, and plotly to manipulate and visualize the sample IoT data stored in the IoT Analytics Data set.
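As a sketch of how the Notebook might pull the Data set content into pandas (the dataset name comes from this demonstration; the actual cells are in LoRa_IoT_Analytics_Demo.ipynb):

import boto3
import pandas as pd

# retrieve a pre-signed URI to the Data set's latest CSV content
client = boto3.client("iotanalytics")
content = client.get_dataset_content(datasetName="iot_analytics_data_set")
data_uri = content["entries"][0]["dataURI"]

# load the IoT messages into a DataFrame for analysis and visualization
df = pd.read_csv(data_uri)
df[["temperature", "humidity"]].describe()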



The Notebook can be modified, and the changes pushed back to GitHub. You could easily fork the demonstration’s GitHub repository and modify the CloudFormation template to point to your source code repository.

Amazon QuickSight
Amazon QuickSight provides business intelligence (BI) and visualization. Amazon QuickSight ML Insights adds anomaly detection and forecasting. We can use Amazon QuickSight to visualize the IoT message data, stored in the IoT Analytics Data set.
Amazon QuickSight has both a Standard and an Enterprise Edition. AWS provides a detailed product comparison of each edition. For the post, I am demonstrating the Enterprise Edition, which includes additional features, such as ML Insights, hourly refreshes of SPICE (super-fast, parallel, in-memory, calculation engine), and theme customization.
Please be aware of the costs of Amazon QuickSight if you choose to follow along with this part of the demo. Although there is an Amazon QuickSight API, Amazon QuickSight is not automatically enabled or configured with CloudFormation or using the AWS CLI in this demonstration.
QuickSight Data Sets
Amazon QuickSight has a wide variety of data source options for creating Amazon QuickSight Data sets, including the ones shown below. Do not confuse Amazon QuickSight Data sets with IoT Analytics Data sets; they are two different service features.

For the demonstration, we will create an Amazon QuickSight Data set that will use our IoT Analytics Data set, iot_analytics_data_set.

Amazon QuickSight gives you the ability to view and modify QuickSight Data sets before visualizing. QuickSight even provides a wide variety of functions, enabling us to perform dynamic calculations on the field values. For this demonstration, we will leave the data unchanged since all transformations were already completed in the IoT Analytics Pipeline.

QuickSight Analysis
Using the QuickSight Data set, built from the IoT Analytics Data set as a data source, we create a QuickSight Analysis. The QuickSight Analysis console is shown below. An Analysis is primarily a collection of Visuals (aka Visual types). QuickSight provides several Visual types. Each visual is associated with a Data set. Data for the QuickSight Analysis or each visual within the Analysis can be filtered. For the demo, I have created a simple QuickSight Analysis, including a few typical QuickSight visuals.

QuickSight Dashboards
To share a QuickSight Analysis, we can create a QuickSight Dashboard. Below, we see a few views of the QuickSight Analysis, shown above, as a Dashboard. Although viewers of the Dashboard cannot edit the visuals, they can apply filtering and interactively drill-down into data in the Visuals.



Amazon QuickSight ML Insights
According to Amazon, ML Insights leverages AWS's machine learning (ML) and natural language capabilities to gain deeper insights from data. QuickSight's ML-powered Anomaly Detection continuously analyzes data to discover anomalies and variations inside of the aggregates, giving you the insights to act when business changes occur. QuickSight's ML-powered Forecasting can be used to predict your business metrics accurately and to perform interactive what-if analysis with point-and-click simplicity. QuickSight's built-in algorithms make it easy for anyone to use ML that learns from your data patterns to provide you with accurate predictions based on historical trends.
Below, we see the ML Insights tab (left) in the demonstration's QuickSight Analysis. Individually detected anomalies can be added to the QuickSight Analysis, like Visuals, and configured to tune the detection parameters. Observe the temperature, humidity, and barometric pressure anomalies identified by ML Insights, based on their Anomaly Score and a minimum delta of five percent. These anomalies accurately reflected an actual failure of the IoT device, caused by overheating during testing, which resulted in abnormal sensor readings.

Receiving the Messages on AWS
To confirm the IoT gateway is sending messages, we can use a packet analyzer, like tcpdump, on the IoT gateway. Running tcpdump on the IoT gateway, below, we see outbound encrypted MQTT messages being sent to AWS on port 443.
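A command similar to the following sketch was used, assuming the gateway is connected over WiFi on the wlan0 interface.

sudo tcpdump -i wlan0 -n dst port 443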

To confirm those messages are being received from the IoT gateway on AWS, we can use the AWS IoT Core Test feature and subscribe to the lora-iot-demo topic. We should see messages flowing in from the IoT gateway at approximately 5-second intervals.

The JSON payload structure of the incoming MQTT messages will look similar to the example below. The device_id is the unique id of the IoT device that transmitted the message using LoRaWAN. The gateway_id is the unique id of the IoT gateway that received the message using LoRaWAN and sent it to AWS. A single IoT gateway would usually manage messages from multiple IoT devices, each with a unique id.
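The sketch below is reconstructed from the first debug log entry shown earlier.

{
  "ts": 1598305503.7041512,
  "data": {
    "device_id": "0447383033363932003C0034",
    "gateway_id": "00000000f62051ce",
    "temperature": 23.46,
    "humidity": 41.89,
    "pressure": 99.38,
    "color": {
      "red": 230.0,
      "green": 692.0,
      "blue": 833.0,
      "ambient": 1116.0
    }
  }
}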
The SQL query used by the AWS IoT Rule described earlier transforms and flattens the nested JSON payload structure before passing it to the AWS IoT Analytics Channel.
We can measure the near real-time nature of the IoT data using the ts and msg_received data fields. The ts data field is the date and time when the sensor reading occurred on the IoT device, while the msg_received data field is the date and time when the message was received on AWS. The delta between the two values is a measure of how near real-time the sensor readings are being streamed to the AWS IoT Analytics Channel. In the below example, the difference between ts (2020-08-27T11:45:31.986) and msg_received (2020-08-27T11:45:32.074) is 88 ms.
Final IoT Data Message Structure
Once the message payload passes through the AWS IoT Analytics Pipeline and lands in the AWS IoT Analytics Data set, its final data structure looks as follows. Note that the device’s attribute metadata has been added from the AWS IoT Core device registry. Regrettably, the metadata is not well-formatted JSON and will require additional transformation to be usable.
A set of sample messages is included in the GitHub project’s sample_messages directory.
Conclusion
In this post, we explored the use of the LoRa and LoRaWAN protocols to transmit environmental sensor data from an IoT device to an IoT gateway. Given its low energy consumption, long-distance transmission capabilities, and well-developed protocols, LoRaWAN is an ideal long-range wireless protocol for IoT devices. We then demonstrated how to use AWS IoT Device SDKs, AWS IoT Core, and AWS IoT Analytics to securely collect, analyze, and visualize streaming messages from the IoT device, in near real-time.
This blog represents my own viewpoints and not those of my employer, Amazon Web Services.
LoRa and LoRaWAN for IoT: Getting Started with LoRa and LoRaWAN Protocols for Low Power, Wide Area Networking of IoT
Posted by Gary A. Stafford in C++ Development, IoT, Python, Raspberry Pi, Software Development, Technology Consulting on August 10, 2020
Introduction
According to the LoRa Alliance, Low-Power, Wide-Area Networks (LPWAN) are projected to support a major portion of the billions of devices forecasted for the Internet of Things (IoT). LoRaWAN is designed from the bottom up to optimize LPWANs for battery lifetime, capacity, range, and cost. LoRa and LoRaWAN permit long-range connectivity for Internet of Things (IoT) devices in different types of industries. According to Wikipedia, LoRaWAN defines the communication protocol and system architecture for the network, while the LoRa physical layer enables the long-range communication link.
LoRa
Long Range (LoRa), the low-power wide-area network (LPWAN) protocol developed by Semtech, sits at layer 1, the physical layer, of the seven-layer OSI model (Open Systems Interconnection model) of computer networking. The physical layer defines the means of transmitting raw bits over a physical data link connecting network nodes. LoRa uses license-free sub-gigahertz radio frequency (RF) bands, including 433 MHz, 868 MHz (Europe), 915 MHz (Australia and North America), and 923 MHz (Asia). LoRa enables long-range transmissions with low power consumption.
LoRaWAN
LoRaWAN, maintained by the LoRa Alliance, is a cloud-based medium access control (MAC) sublayer (layer 2) protocol that acts mainly as a network layer (layer 3) routing protocol, managing communication between LPWAN gateways and end-node devices. The MAC sublayer and the logical link control (LLC) sublayer together make up layer 2, the data link layer, of the OSI model.
LoRaWAN is often cited as having greater than a 10-km-wide coverage area in rural locations. However, according to other sources, it is generally more limited. According to the Electronic Design article, 11 Myths About LoRaWAN, a typical LoRaWAN network range depends on numerous factors—indoor or outdoor gateways, the payload of the message, the antenna used, etc. On average, in an urban environment with an outdoor gateway, you can expect up to 2- to 3-km-wide coverage, while in the rural areas it can reach beyond 5 to 7 km. LoRa’s range depends on the “radio line-of-sight.” Radio waves in the 400 MHz to 900 MHz range may pass through some obstructions, depending on their composition, but will be absorbed or reflected otherwise. This means that the signal can potentially reach as far as the horizon, as long as there are no physical barriers to block it.
In the following hands-on post, we will explore the use of the LoRa and LoRaWAN protocols to transmit and receive sensor data, over a substantial distance, between an IoT device, containing a number of embedded sensors, and an IoT gateway.
Recommended Hardware
For this post, I have used the following hardware.
IoT Device with Embedded Sensors
I have used an Arduino single-board microcontroller as an IoT sensor, actually an array of sensors. The 3.3V AI-enabled Arduino Nano 33 BLE Sense board (Amazon: USD 36.00), released in August 2019, comes with the powerful nRF52840 processor from Nordic Semiconductors, a 32-bit ARM Cortex-M4 CPU running at 64 MHz, 1MB of CPU Flash Memory, 256KB of SRAM, and a NINA-B306 stand-alone Bluetooth 5 low energy (BLE) module.
The Sense also contains an impressive array of embedded sensors:
- 9-axis Inertial Sensor (LSM9DS1): 3D digital linear acceleration sensor, a 3D digital angular rate sensor, and a 3D digital magnetic sensor
- Humidity and Temperature Sensor (HTS221): Capacitive digital sensor for relative humidity and temperature
- Barometric Sensor (LPS22HB): MEMS nano pressure sensor: 260–1260 hectopascal (hPa) absolute digital output barometer
- Microphone (MP34DT05): MEMS audio sensor omnidirectional digital microphone
- Gesture, Proximity, Light Color, and Light Intensity Sensor (APDS9960): Advanced Gesture detection, Proximity detection, Digital Ambient Light Sense (ALS), and Color Sense (RGBC).
The Arduino Sense is an excellent, low-cost single-board microcontroller for learning about the collection and transmission of IoT sensor data.
IoT Gateway
An IoT Gateway, according to TechTarget, is a physical device or software program that serves as the connection point between the Cloud and controllers, sensors, and intelligent devices. All data moving to the Cloud, or vice versa, goes through the gateway, which can be either a dedicated hardware appliance or software program.
I have used a third-generation Raspberry Pi 3 Model B+ single-board computer (SBC) to serve as an IoT Gateway. This Raspberry Pi model features a 1.4GHz Cortex-A53 (ARMv8) 64-bit quad-core processor System on a Chip (SoC), 1GB LPDDR2 SDRAM, dual-band wireless LAN, Bluetooth 4.2 BLE, and Gigabit Ethernet (Amazon: USD 42.99).
To follow along with the post, you could substitute any Linux-based machine for the Raspberry Pi to run the included sample Python script.
LoRa Transceiver Modules
To transmit the IoT sensor data between the IoT device, containing the embedded sensors, and the IoT gateway, I have used the REYAX RYLR896 LoRa transceiver module (Amazon: USD 19.50 x 2). The transceiver modules are commonly referred to as a universal asynchronous receiver-transmitter (UART). A UART is a computer hardware device for asynchronous serial communication in which the data format and transmission speeds are configurable.
According to the manufacturer, REYAX, the RYLR896 contains the Semtech SX1276 long-range, low power transceiver. The RYLR896 module provides ultra-long range spread spectrum communication and high interference immunity while minimizing current consumption. This transceiver operates at both the 868 and 915 MHz frequency ranges. In this post, we will be transmitting at 915 MHz for North America. Each RYLR896 module contains a small, PCB integrated, helical antenna.
Security
The RYLR896 is capable of AES 128-bit data encryption. Using the Advanced Encryption Standard (AES), we will encrypt the data sent from the IoT device to the IoT gateway, using a 32 hex digit password (128 bits / 4 bits per hex digit = 32 hex digits). Using hexadecimal notation, the password is limited to digits 0–9 and characters A–F.
USB to TTL Serial Converter Adapter
Optionally, to configure, test, and debug the RYLR896 LoRa transceiver module directly from your laptop, you can use a USB to TTL serial converter adapter. I currently use the IZOKEE FT232RL FTDI USB to TTL Serial Converter Adapter Module for 3.3V and 5V (Amazon: USD 9.49 for 2). The 3.3V RYLR896 module easily connects to the USB to TTL Serial Converter Adapter using the TXD/TX, RXD/RX, VDD/VCC, and GND pins. We use serial communication to send and receive data through TX (transmit) and RX (receive) pins. The wiring is shown below: VDD to VCC, GND to GND, TXD to RX, and RXD to TX.
The FT232RL has support for baud rates up to 115,200 bps, which is the speed we will use to communicate with the RYLR896 module.
Arduino Sketch
For those not familiar with Arduino, a sketch is the name that Arduino uses for a program. It is the unit of code that is uploaded into non-volatile flash memory and runs on an Arduino board. The Arduino language is a set of C and C++ functions. All standard C and C++ constructs supported by the avr-g++ compiler should work in Arduino.
For this post, the sketch, lora_iot_demo.ino, contains all the code necessary to collect and securely transmit the environmental sensor data, including temperature, relative humidity, barometric pressure, RGB color, and ambient light intensity, using the LoRaWAN protocol. All code for this post, including the sketch, can be found on GitHub.
/*
  Description: Transmits Arduino Nano 33 BLE Sense sensor telemetry over LoRaWAN,
  including temperature, humidity, barometric pressure, and color,
  using REYAX RYLR896 transceiver modules
  http://reyax.com/wp-content/uploads/2020/01/Lora-AT-Command-RYLR40x_RYLR89x_EN.pdf
  Author: Gary Stafford
*/
#include <Arduino_HTS221.h>
#include <Arduino_LPS22HB.h>
#include <Arduino_APDS9960.h>

const int UPDATE_FREQUENCY = 5000;      // update frequency in ms
const float CALIBRATION_FACTOR = -4.0;  // temperature calibration factor (Celsius)
const int ADDRESS = 116;
const int NETWORK_ID = 6;
const String PASSWORD = "92A0ECEC9000DA0DCF0CAAB0ABA2E0EF";
const String DELIMITER = "|";

void setup()
{
  Serial.begin(9600);
  Serial1.begin(115200); // default baud rate of module is 115200
  delay(1000);           // wait for LoRa module to be ready
  // the address, network ID, and password need to be the same for receiver and transmitter
  Serial1.print((String)"AT+ADDRESS=" + ADDRESS + "\r\n");
  delay(200);
  Serial1.print((String)"AT+NETWORKID=" + NETWORK_ID + "\r\n");
  delay(200);
  Serial1.print("AT+CPIN=" + PASSWORD + "\r\n");
  delay(200);
  Serial1.