MQTT API¶

How to access the emalyticscloud.com MQTT API.

Why MQTT?¶

MQTT is a lightweight publish/subscribe messaging protocol. Originally designed for machine-to-machine (M2M) telemetry in low-bandwidth environments, MQTT has become one of the main protocols for data collection in Internet of Things (IoT) deployments [1,2].

For simple messaging use cases, MQTT has multiple advantages over HTTP [3] and other protocols:

Due to its binary encoding and minimal packet overhead, MQTT is 20x faster than HTTP, uses 50x less traffic, and consumes 20% less energy than HTTP according to the direct performance comparison presented in [4].
Contrary to HTTP's client/server architecture, MQTT's publish/subscribe pattern decouples data sources from data sinks through a third party, the MQTT broker. According to [5], decoupling means that sources never directly talk to sinks which implies that:
1. they do not need to know each other,
2. they do not need to run at the same time, and
3. they do not require synchronization
All this makes it easy to build flexible 1-to-many, many-to-1, and many-to-many data pipelines.
MQTT has message filtering built-in, i.e., data sinks can subscribe to arbitrary subsets of the data collected from the sources, specified through a hierarchical topic scheme [6].

When you are building an application that streams data into or from the emalyticscloud.com platform, MQTT is probably a better choice than the HTTP API.

Part of the emalyticscloud.com platform is an MQTT broker which serves as the single logical point of data ingress to emalyticscloud.com. For example, all data collected in the field through the emalyticscloud Edge Devices is ingested into emalyticscloud.com through this MQTT broker and, in turn, can also be subscribed to. The MQTT broker is clustered, i.e., distributed over multiple independent servers, to ensure seamless scalability and high availability.

MQTT Broker¶

The MQTT API is provided by an MQTT broker. To use the MQTT API, you need to connect to this broker with a unique client id using login credentials for authentication and authorization.

Connecting¶

If you are using the emalyticscloud.com cloud platform, the MQTT broker's URL is

mqtt.emalyticscloud.com

If you are using a dedicated emalyticscloud.com platform instance, remember to use the correct subdomain

mqtt.<REALM>.emalyticscloud.com

where <REALM> is specific to your dedicated platform.

Dedicated platforms and subdomains

In the following and throughout this documentation, we will use URLs that refer to the emalyticscloud.com cloud platform. If you are using a dedicated platform, remember to use the correct subdomain as explained above.

The MQTT broker accepts connections on two ports:

Port 8883 accepts plain MQTT, i.e., connections that transport MQTT directly via TLS. This is the standard case and, if in doubt, this port is the right choice.
Port 9001 accepts websockets connections, i.e., connections that transport the MQTT protocol within the websockets protocol which in turn is transported via TLS. MQTT via websockets is the right choice when you want to send/receive MQTT data directly from a web browser. Read more about MQTT over websockets here and here.

MQTT brokers only accept TLS 1.2 and TLS 1.3 encrypted connections, i.e., all plain TCP connections are rejected. The MQTT broker authenticates itself towards the client using an X.509 certificate issued by Let's Encrypt. Your operating system (OS) will accept this certificate if the root certificates of Let's Encrypt are installed as a trusted root certification authority (CA) in your OS. Don't worry, this is probably the case and you don't have to do anything.

Test your connection

To test the connection to the MQTT broker, run the following command on your command line:

openssl s_client -connect mqtt.emalyticscloud.com:8883 -servername mqtt.emalyticscloud.com

If the output starts with the warning verify error:num=20:unable to get local issuer certificate, you can fix it by providing the option -CAfile or -CApath and pointing it to the right location for your OS, e.g., -CAfile /etc/ssl/cert.pem on macOS.
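The same check can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names make_tls_context and probe_broker are illustrative, and the host and port defaults are the ones given above for the cloud platform.

```python
import socket
import ssl
from typing import Optional


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context that trusts the root CAs of the OS."""
    context = ssl.create_default_context()  # verifies certificate chain and hostname
    # The broker only accepts TLS 1.2 and TLS 1.3, so refuse anything older.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def probe_broker(host: str = "mqtt.emalyticscloud.com", port: int = 8883,
                 timeout: float = 5.0) -> Optional[str]:
    """Open a TLS connection to the broker and return the negotiated TLS version."""
    context = make_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # server_hostname enables SNI and hostname verification.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.version()


# Example (requires network access):
# print(probe_broker())
```

If the certificate cannot be verified, wrap_socket raises ssl.SSLCertVerificationError instead of printing a warning.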

Client Identifiers¶

Clients are identified by a unique client_id. As per the MQTT 3.1.1 specification (Section 3.1.3.1), every broker must accept client identifiers of 1 to 23 alphanumeric characters (0-9, a-z, A-Z). Most MQTT brokers also support longer client identifiers from the full range of UTF-8 characters. Only the characters /, +, and #, which have a special meaning in MQTT, are disallowed for security reasons.

It is important to note that there can only be one connection per client_id per broker. If two clients connect with the same client_id, the older connection is terminated in favor of the newer one. This restriction does not extend to your login credentials (see Authentication). You can open multiple connections using the same login credentials as long as you use a different client_id for each concurrent connection.

Choose a client_id and avoid special characters. Postfix the client_id with a random string or integer to ensure it is unique across concurrent connections.
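One way to follow this advice is sketched below. The helper name make_client_id and the default suffix length are assumptions made for illustration. The sketch stays within the 23 alphanumeric characters that every MQTT 3.1.1 broker must accept.

```python
import secrets
import string

ALPHANUMERIC = string.ascii_letters + string.digits
MAX_CLIENT_ID_LENGTH = 23  # length every MQTT 3.1.1 broker must accept


def make_client_id(prefix: str, suffix_length: int = 8) -> str:
    """Return `prefix` followed by a random alphanumeric suffix."""
    if not prefix.isascii() or not prefix.isalnum():
        raise ValueError("prefix must be non-empty and ASCII alphanumeric")
    if suffix_length < 1 or len(prefix) + suffix_length > MAX_CLIENT_ID_LENGTH:
        raise ValueError("client id must not exceed 23 characters")
    suffix = "".join(secrets.choice(ALPHANUMERIC) for _ in range(suffix_length))
    return prefix + suffix
```

Calling make_client_id("myapp") once per connection gives each concurrent connection its own identifier, even if all of them share the same login credentials.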

Authentication¶

The MQTT brokers only accept connections from authenticated clients. To authenticate, the MQTT client has to present login credentials (username and password) to the MQTT broker within the initial MQTT handshake, i.e., as part of the CONNECT message.

Client credentials can be obtained with limited and unlimited validity.

Credentials with unlimited validity are provided only on request by the emalyticscloud staff. Please email us at support@emalyticscloud.com.
Credentials with limited validity can be created through the emalyticscloud.com HTTP API. Please refer to the corresponding guide for further instructions and details.

Authorization¶

The MQTT broker authorizes clients to subscribe and publish based on topics.

Once connected and authenticated, the client can publish or subscribe to one or multiple topics, but not without authorization. To subscribe, the client needs read access to that topic. To publish, the client needs write access. Note that write access does not imply read access.

Authorization is specified through a list of topics (following exactly MQTT's topic syntax and semantics) where for each topic it is specified whether the user has read and/or write access. Make sure to familiarize yourself with MQTT's topic structure, especially with hierarchy levels and the # wildcard.
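The following sketch shows how such a list of topic grants could be evaluated for a concrete topic. It is an illustration of MQTT's wildcard semantics, not the broker's actual implementation. The names Grant and is_allowed are hypothetical, and the special handling of topics starting with $ is left out.

```python
from typing import Iterable, NamedTuple


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if `topic` matches `topic_filter` under MQTT wildcard rules."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # Multi-level wildcard: matches the parent level and everything below.
            # It is only valid as the last level of a filter.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False  # "+" matches exactly one level, anything else must be equal
    return len(filter_levels) == len(topic_levels)


class Grant(NamedTuple):
    """One authorization entry (hypothetical structure)."""
    topic_filter: str
    read: bool
    write: bool


def is_allowed(grants: Iterable[Grant], topic: str, action: str) -> bool:
    """Check whether any grant permits `action` ("read" or "write") on `topic`."""
    if action not in ("read", "write"):
        raise ValueError("action must be 'read' or 'write'")
    return any(
        topic_matches(g.topic_filter, topic) and getattr(g, action)
        for g in grants
    )
```

With a single grant Grant("lbg/proj/#", read=True, write=False), where lbg and proj are placeholder names, is_allowed returns True for reading lbg/proj/datapoint_1 and False for writing to it, which mirrors the rule that read and write access are granted independently.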

Topic hierarchy¶

All MQTT topics on emalyticscloud.com have a hierarchy that consists of two main parts, i.e., a fixed prefix and a variable postfix.

The prefix has two hierarchy levels and is assigned by emalyticscloud:

<load-balancing-group>/<project-handle>

The top level hierarchy, the load-balancing-group, is fixed and assigned by emalyticscloud. It serves to separate different customers and projects and ensures that each customer and project is guaranteed separate and sufficient processing and communication resources.
The second level hierarchy, the project-handle, is a fixed (human-readable) string assigned by emalyticscloud that uniquely identifies your project. This hierarchy separates different projects on the same load balancing group.

As an emalyticscloud.com customer, you receive authorization to this prefix, i.e., to the topic <load-balancing-group>/<project-handle>/#. This means you can publish and subscribe to "anything below" the project level of the topic hierarchy.

The postfix matching the # can generally have arbitrary length and structure as long as it is a UTF-8 string.

For example, if you have purchased an emalyticscloud Edge Device, this device collects data from different datapoints on your building network and publishes them to the postfixes datapoint_1, ..., datapoint_n. Via MQTT, you thus have datapoint-level publish/subscribe access to the datapoints of your building. For efficiency reasons, the Edge Device uses short identifiers of 4 to 12 characters generated from the full datapoint names, e.g., the 8-character hash identifier of the datapoint my_very_veeeeeery_long_datapoint_id is just VD0pZLej.
emalyticscloud's hash identifier creation
1. Build a SHA-1 hash of the UTF-8 encoded datapoint id.
2. Base62-encode the hash.
3. Keep the first hash_id_length characters, where hash_id_length is configured individually per project (default = 8).
Here's a sample implementation in Python using the pybase62 module.
```
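import base62
import hashlib

def base62id(s: str, hash_id_length: int = 8) -> str:
    """Return the short hash identifier of the datapoint id `s`."""
    # SHA-1 digest of the UTF-8 encoded id, base62-encoded, cut to hash_id_length characters
    digest = hashlib.sha1(s.encode('utf-8')).digest()
    return base62.encodebytes(digest)[:hash_id_length]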
```
If you ingest data yourself, you can publish to arbitrary postfixes since the # wildcard at the end of your topic authorization matches any number of sublevels.

It is important to note two things about publishing your own data via MQTT:

The postfix is only used for routing messages on the broker, e.g., you can use it to group data for different subscribers. The postfix does not determine which time series the data is stored to. This is determined by the payload of your messages (see below).
emalyticscloud does not prevent you from writing data to datapoints that are at the same time written by the emalyticscloud Edge Device. If you have a datapoint_A on your local building network that is discovered by the Edge Device and you also write to datapoint_A yourself, this data will be stored and intermingled in the same time-series.

Sparkplug B¶

Sparkplug B is an open-source specification designed to make MQTT (Message Queuing Telemetry Transport) a truly interoperable and effective protocol for industrial applications, particularly in the context of the Industrial Internet of Things (IIoT). While MQTT is a lightweight and efficient messaging protocol, it lacks a standardized way to define the structure and context of the data being transmitted. Sparkplug B fills this gap by providing a set of rules for the MQTT topic namespace, the payload, and state management.

At its core, Sparkplug B establishes a "source of truth" for the industrial environment. It defines a standardized message format using Google's Protocol Buffers (Protobuf), which is a highly efficient binary format. This ensures that the data is not only compact and fast to transmit, but also self-describing, so that any application or device can easily understand it without prior configuration.

This inherent capability for self-description is where the connection to the Digital Twin concept becomes crucial. A digital twin is a virtual representation of a physical asset, process, or system. It's a living, digital model that is continuously synchronized with real-world data from its physical counterpart. Sparkplug B is a key enabler for creating these digital twins because it provides the necessary framework to:

Broadcast the Digital Model: At startup, a Sparkplug B compliant device (an "Edge Node") publishes an NBIRTH (Node Birth) message. This message contains a complete, contextualized representation of all the data points (metrics) it will report. This is essentially the birth of the digital twin: it provides the structure and metadata, such as units, data types, and display names, to any subscribing application.

Maintain State Awareness: Sparkplug B's robust state management ensures that any application consuming the data is always aware of the real-time status of all connected devices. If a device disconnects, a clear notification is sent, allowing the digital twin to accurately reflect the offline status of the physical asset.

Enable Real-Time Synchronization: By leveraging MQTT's "Report by Exception" model, Sparkplug B only sends data when a value changes. This is highly efficient and ensures that the digital twin is always up-to-date with the most relevant information without overwhelming the network with continuous polling.

Sparkplug B in the context of emalyticscloud¶

In Emalytics and emalyticscloud.com, a transparent driver is available that allows you to work with Sparkplug B in the familiar Emalytics context. No special knowledge of Sparkplug B is required. The driver includes the following features in particular.

Status management
Fail-safe in case of connection failure
Sending data
Receiving data from the cloud (Controls)
Metadata

Example Sparkplug B message-type (NBIRTH, DBIRTH)¶

Within the "meta" PropertySet, you'll find all available tags provided by Emalytics Automation, such as Haystack or Brickschema information.
This information is only available on the NBIRTH and DBIRTH topics.
The content is defined in a SparkplugB PropertySet, and the number of tags can vary depending on the specific application.

| Name | Type | Value | Property List/Set | Description |
| --- | --- | --- | --- | --- |
| `pointType` | String | Example: `bool` | `propertyList = properties` | SparkplugB Basic Types. Only for Emalytics applications. |
| `currentPriority` | UInt8 | 1-17 | `propertyList = properties` | Indicates the priority with which the value was written in the application. |
| `facets` | String | Example: `"trueText=Ja,true\\|falseText=Nein,false"` | `propertyList = properties` | Emalytics Automation specific value for units and Min-Max values. Only for Emalytics applications. |
| `writable` | Bool | `true/false` | `propertyList = properties` | `true` = The metric is writable via the broker. `false` = The metric cannot be written via the broker. |
| `mode` | String | `add` \| `update` \| `clean` | `propertyList = properties` | (optional) If the `mode` value is present, there is another PropertySet `meta`. Add = Transfers only the new tags to the host application. Update = Tags that need to be updated in the host application. Clean = List of tags to be deleted in the host application. |
| `meta` | PropertySet | List with Properties | `propertyList = properties` | (optional, only if `mode` is set) List with tags, e.g., Haystack or Brick, ... |
| `n:displayName` | String | Example: `"Local Station 1"` | `propertyList = properties`, `propertySet = meta` | `n:displayName` n = Namespace displayName = Value |
| `n:point` | unknown | `null` | `propertyList = properties`, `propertySet = meta` | Tags of type Marker are represented with the data type `unknown` and the value `null`. |

Example Sparkplug B payload (JSON-formatted)¶

{
    "timestamp": 1750764408284,
    "metrics": [
        {
            "name": "metricName",
            "timestamp": 1750764408251,
            "dataType": "Boolean",
            "properties": {
                "mode": {
                    "type": "String",
                    "value": "add, update"
                },
                "pointType": {
                    "type": "String",
                    "value": "bool"
                },
                "meta": {
                    "type": "PropertySet",
                    "value": {
                        "n:point": {
                            "type": "Unknown",
                            "value": null
                        },
                        "n:displayName": {
                            "type": "String",
                            "value": "Test_Bool_Monitor"
                        }
                    }
                },
                "currentPriority": {
                    "type": "UInt8",
                    "value": 10
                },
                "facets": {
                    "type": "String",
                    "value": "trueText=s:true|falseText=s:false"
                },
                "writable": {
                    "type": "Boolean",
                    "value": false
                }
            },
            "value": false
        },
        {
            "name": "bdSeq",
            "timestamp": 1750764408284,
            "dataType": "Int64",
            "value": 16
        },
        {
            "name": "Node Control/Rebirth",
            "timestamp": 1750764408284,
            "dataType": "Boolean",
            "value": false
        }
    ],
    "seq": 232
}
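As a minimal sketch of how a host application might read the tags from such a payload once it has been decoded to JSON, the function below collects the contents of the optional "meta" PropertySet per metric. The function name extract_meta_tags is illustrative, and the sample is a shortened version of the payload above.

```python
import json
from typing import Any, Dict


def extract_meta_tags(payload: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Map each metric name to the tags found in its optional "meta" PropertySet."""
    tags_by_metric: Dict[str, Dict[str, Any]] = {}
    for metric in payload.get("metrics", []):
        meta = metric.get("properties", {}).get("meta")
        if not meta or meta.get("type") != "PropertySet":
            continue  # metrics such as bdSeq carry no meta tags
        tags_by_metric[metric["name"]] = {
            tag: entry.get("value") for tag, entry in meta["value"].items()
        }
    return tags_by_metric


# Shortened version of the example payload.
SAMPLE = json.loads("""
{
    "timestamp": 1750764408284,
    "metrics": [
        {
            "name": "metricName",
            "dataType": "Boolean",
            "properties": {
                "meta": {
                    "type": "PropertySet",
                    "value": {
                        "n:point": {"type": "Unknown", "value": null},
                        "n:displayName": {"type": "String", "value": "Test_Bool_Monitor"}
                    }
                }
            },
            "value": false
        },
        {"name": "bdSeq", "dataType": "Int64", "value": 16}
    ],
    "seq": 232
}
""")

tags = extract_meta_tags(SAMPLE)
```

Marker tags such as n:point end up with the value None, so their presence is signaled by the key alone.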

Connection Rate Limiting¶

We as emalyticscloud often face situations where customers use non-unique client ids. Unfortunately, this leads to a "connection ping-pong" between the two (or more) clients that use the same id, which results in excessive connection rates. In order to ensure a smooth user experience, we have thus decided to technically enforce a connection rate limit on the MQTT broker.

Rate limiting looks at the connection attempts per IP address. It restricts access if too many connection attempts were made in the last window_sec seconds.

If there are more than throttle_threshold connection attempts in this window, each connection attempt is slowed down by throttle_delay_ms ms.
If more than leaky_connection_threshold connection attempts were made in this window, the connections are slowed down by throttle_delay_ms ms and additionally only every leaky_connection_factor-th connection is accepted at all and all other connections from this IP are dropped.

Please note that the following parameters are subject to change without prior notification:

Parameter	Value
window_sec	10
throttle_threshold	2
throttle_delay_ms	2000
leaky_connection_threshold	4
leaky_connection_factor	6
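The two thresholds can be read as three regimes. The sketch below classifies a new connection attempt using the parameter values from the table. It is only an interpretation of the description above: whether the current attempt counts towards the window and how the window boundaries are treated are assumptions, and the function name connection_regime is illustrative.

```python
from typing import Sequence

# Values from the table above (subject to change).
WINDOW_SEC = 10
THROTTLE_THRESHOLD = 2
LEAKY_CONNECTION_THRESHOLD = 4


def connection_regime(attempt_times: Sequence[float], now: float) -> str:
    """Classify a new connection attempt made at time `now` (in seconds).

    `attempt_times` holds the times of earlier attempts from the same IP address.
    Assumption: the current attempt counts towards the window.
    """
    recent = sum(1 for t in attempt_times if now - WINDOW_SEC < t <= now) + 1
    if recent > LEAKY_CONNECTION_THRESHOLD:
        return "leaky"      # delayed, and most attempts are dropped
    if recent > THROTTLE_THRESHOLD:
        return "throttled"  # delayed by throttle_delay_ms
    return "normal"
```

Under these assumptions, a client that reconnects once per second is throttled on its third attempt and enters the leaky regime on its fifth.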

Fair use¶

We as emalyticscloud do our best to ensure seamless scalability and the highest availability of our MQTT services. Since we give priority to a clean and simple user experience, we currently do not enforce any rate limits on the MQTT ingress and egress. Deliberately, this allows you to send bursts of data, e.g., to import a batch of historical data.

This being said, we will negotiate a quota with each customer that we consider the basis of fair use of our MQTT services. In favor of your user experience, this quota will be monitored but not strictly enforced. emalyticscloud reserves the right to technically enforce the fair use quota on repeated violations without prior notice.

FAQ¶

What are Quality of Service (QoS) levels and how can I use them?
There are three QoS levels which, as defined by the standard, determine the delivery guarantee for messages.

With QoS0, messages are sent without any acknowledgement of receipt. This method is therefore called fire and forget.
With QoS1, the message is sent repeatedly until it is received at least once.
With QoS2, the message is received exactly once.

The QoS level is chosen by the client, per message when publishing and per subscription when subscribing. Find the detailed explanation in this often referenced article of HiveMQ.

Does the MQTT broker buffer messages while my client is disconnected?
When a client-broker connection with QoS1 or QoS2 is interrupted, the missed messages can be sent to the client after it reconnects. Whether this happens depends on the broker's configuration.

My client can't connect or keeps reconnecting?
Usually, the error is one of the following:

Connecting on the wrong port.
Connecting to a TLS endpoint without TLS.
User's local or network firewall blocks outgoing MQTT connections.
Wrong client credentials.
Using a client id that is already connected. The broker will terminate the older connection. If the other client then reconnects, the two clients play ping-pong.
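Whatever the cause, a client that retries immediately in a tight loop makes things worse. A common mitigation is to wait between reconnect attempts with exponentially growing, randomized delays. The sketch below is a generic pattern, not a platform requirement, and the function name and default values are assumptions.

```python
import random
from typing import Iterator


def reconnect_delays(base: float = 1.0, cap: float = 60.0,
                     jitter: float = 0.5) -> Iterator[float]:
    """Yield reconnect delays in seconds that double up to `cap`, plus random jitter."""
    delay = base
    while True:
        # Random extra wait of up to jitter * delay so that two clients
        # do not retry in lockstep.
        yield delay + random.uniform(0, jitter * delay)
        delay = min(delay * 2, cap)
```

A client would sleep for next(delays) seconds before each reconnect attempt and create a fresh generator after a successful connection.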

References¶

[1] Introduction to MQTT: http://www.steves-internet-guide.com/mqtt/
[2] Introduction to MQTT: https://www.hivemq.com/blog/mqtt-essentials-part-1-introducing-mqtt/
[3] MQTT vs. HTTP: https://iotdunia.com/mqtt-and-http/
[4] MQTT vs. HTTP: https://flespi.com/blog/http-vs-mqtt-performance-tests
[5] MQTT publish/subscribe: https://www.hivemq.com/blog/mqtt-essentials-part2-publish-subscribe/
[6] MQTT topics: https://www.hivemq.com/blog/mqtt-essentials-part-5-mqtt-topics-best-practices/