SQL
You can use your SQL database as an Egress Sink and start directly receiving Messages from your IoT Devices into your SQL tables. Your Devices send JSON data, and you can specify an Egress Route where you define which JSON properties should map to which SQL tables and columns. Additionally, you can also specify conditions based on the Device JSON data.
See Consume Data in Your Systems page to learn more about configuring Egress Sinks and Routes.
Supported egress events:
Kind | Is supported |
---|---|
Messages | ✅ |
Batch Completions | ❌ |
Supported databases:
- ✅ PostgreSQL (including TimescaleDB extension).
- ✅ Any PostgreSQL-compatible database.
Configuration
To configure the SQL Egress Sink, the following parameters are required:
-
Connectiong String
- The connection string containing information about target server, database, and authentication. Following properties are mandatory:
Host
: Hostname of the target server.Database
: Name of the target database.Username
: Name of the database user.Password
: Password of the database user.
- Example:
Server=fpsql-tds-prd.postgres.database.azure.com;Database=example_db;Port=5432;User ID=test;Password=secret
.
- The connection string containing information about target server, database, and authentication. Following properties are mandatory:
The associated Egress Routes must provide so called Egress Mapping which specifies how the Messages are mapped to SQL rows. Please see details below.
Egress mapping
Egress Sink corresponds to your PostgreSQL Database, and Egress Route specifies which Streams should store the Messages in the PostgreSQL Database. The Egress Route contains config, specifying which JSON properties are stored to which tables and columns and which Messages should be ignored.
If one Message is mapped to multiple rows in one or more tables, the SQL rows are inserted within one transaction.
Column mapping
In this example:
- One SQL row is created for
table1
- Value for
col1
is obtained from Message’s JSON propertyproperty1
- Value for
col2
is obtained from Message’s JSON arrayproperty2
from the first object’s propertyinnerProperty1
. - Value for
col3
is obtained from the message’s metadata - Device ID.
- Value for
- One SQL row is created for
table2
- Value for
col1
is set to a constant valuemyString
. - Value for
col2
is obtained from the message’s metadata - Message ID.
- Value for
- Both SQL rows are inserted into the target PostgreSQL Database within one transaction.
{
"$schema": "https://schema-store.datamole.cloud/json/data-platform/sql-egress/egress-mapping.v1.schema.json",
"rules": [
{
"mapping": {
"table1.col1": "message.property1",
"table1.col2": "message.property2[0].innerProperty1",
"table1.col3": "ingressMetadata.deviceId",
"table2.col1": {
"const": "myString"
},
"table2.col2": "ingressMetadata.messageId"
}
}
]
}
When Device device-01
sends the following JSON Message with Message ID message-01
:
{
"property1": "example-value",
"property2": [
{
"innerProperty1": "inner-example-value1"
},
{
"innerProperty1": "inner-example-value2"
}
]
}
The SQL Database will look like this:
table1
:
col1 | col2 | col3 |
---|---|---|
example-value | inner-example-value1 | device-01 |
table2
:
col1 | col2 |
---|---|
myString | message-01 |
Property selectors
Below are possible selectors that define the input value for the columns.
message.property1
message.property1.property2
message.property1[0]
message[0].property1
Ingress metadata
Apart from properties from the message's content, you can also use following metadata properties as a value for any SQL column. Such properties are available under ingressMetadata.*
object (e.g. ingressMetadata.streamGroupName
).
Available metadata properties are:
Name | Type | Example | Description |
---|---|---|---|
ingressMetadata.streamGroupName | string | group-a | Name of the stream group the message was sent into. |
ingressMetadata.streamName | string | telemetry | Name of the stream the message was sent into. |
ingressMetadata.deviceId | string | robot-125 | Id of the device that sent the message. |
ingressMetadata.batchId | string | 2023-12-19 | Identifier of batch. It is provided by device or auto-filled by the platform (if configured). |
ingressMetadata.batchSliceId | string | logs | Identifier of batch slice (if it was provided by the device). |
ingressMetadata.messageId | string | m00767 | Identifier of the message. It is provided by device or auto-filled by the platform (if configured). |
ingressMetadata.workspaceId | string | 69f09b3f-ec0d-4b9e-a5ec-87150b935296 | Identifier of the Workspace that originating Device and Stream belong into. Formatted as GUID/UUID with 32 hexadecimal digits (lowercase) separated by hyphens. |
Conditions
Conditions can be specified to filter which messages are mapped to SQL rows. They can be specified on the global level, level of individual column mappings, or both.
The example below shows how to filter messages based on the presence of message.property1
on the global level and value of message.property2
on the mapping level:
{
"$schema": "https://schema-store.datamole.cloud/json/data-platform/sql-egress/egress-mapping.v1.schema.json",
"if": "message.property1",
"rules": [
{
"mapping": {
"table1.col1": "message.property1"
}
},
{
"if": {"selector": "message.property2", "greaterThan": 10},
"mapping": {
"table1.col2": "message.property2"
}
}
]
}
When if
is specified on both levels, the conditions are combined with AND semantics.
The if
property can have one of the following forms:
1. Simple string referencing a property in the message or metadata that must exist for the mapping to be applied.
if: "message.property1"
2. Object with selector
property referencing a property in the message or metadata that must satifies the condition defined by the other property such as greaterThan
. See details below.
if: {"selector": "message.property1", "greaterThan": 10}
3. Array of conditions in the form speficied by 1. or 2. with combined with AND semantics.
if: [
"message.property1",
"message.property2",
{"selector": "message.property3", "greaterThan": 10}
]
The example below shows more complex conditions and combining multiple conditions for a single column mapping.
{
"$schema": "https://schema-store.datamole.cloud/json/data-platform/sql-egress/egress-mapping.v1.schema.json",
"if": {
"selector": "message.property1",
"notIn": ["x", "y"]
},
"rules": [
{
"if": [
{
"selector": "message.property1",
"greaterThan": 10
},
{
"selector": "message.property2",
"equals": "x"
}
],
"mapping": {
"table1.col1": "message.property1"
}
}
]
}
Supported conditions
Following conditions are supported:
For any value
Condition | Value |
---|---|
exists | true or false |
isNull | true or false |
isNullOrNotExists | true or false |
The simplified condition string form mentioned above (e.g. if: "message.property"
) has the same semantics as if: {"selector": "message.property1", "exists": true}
.
For primitive values
Condition | Value |
---|---|
equals | <primitive-value> |
notEquals | <primitive-value> |
in | array of <primitive-value> |
notIn | array of <primitive-value> |
For numeric values
Condition | Value |
---|---|
greaterThan | <numeric-value> |
greaterThanOrEqual | <numeric-value> |
lessThan | <numeric-value> |
lessThanOrEqual | <numeric-value> |
For string values
Condition | Value |
---|---|
hasSubstring | <string> or array of <string> with OR semantics |
notHasSubstring | <string> or array of <string> with OR semantics |
hasPrefix | <string> or array of <string> with OR semantics |
notHasPrefix | <string> or array of <string> with OR semantics |
hasSuffix | <string> or array of <string> with OR semantics |
notHasSuffix | <string> or array of <string> with OR semantics |
For arrays
Condition | Value |
---|---|
arrayContains | <primitive-value> or array of <primitive-value> with OR semantics |
notArrayContains | <primitive-value> or array of <primitive-value> with OR semantics |
Data types and conversions
While JSON type system is quite simple, SQL type system is relatively complex. Given that the core functionality of SQL Egress is mapping JSON values to SQL values, data type conversion is covered in depth.
The conversions are applied on a best-effort basis to maximize usability but always in a safe and deterministic way. Conversions that are ambiguous or not safe are not used.
Supported PostgreSQL types
Type | From string | From number | From boolean | Arrays supported |
---|---|---|---|---|
bigint | Yes | Yes | Yes | Yes |
integer | Yes | Yes | Yes | Yes |
smallint | Yes | Yes | Yes | Yes |
numeric | Yes | Yes | Yes | Yes |
real | Yes | Yes | Yes | Yes |
double precision | Yes | Yes | Yes | Yes |
text | Yes | Yes | Yes | Yes |
character | Yes | Yes | Yes | Yes |
character varying | Yes | Yes | Yes | Yes |
date | Yes | Yes | No | No |
time | Yes | Yes | No | No |
time with time zone | Yes | Yes | No | No |
timestamp | Yes | Yes | No | No |
timestamp with time zone | Yes | Yes | No | No |
boolean | Yes | Yes | Yes | Yes |
uuid | Yes | No | No | No |
json | Yes | Yes | Yes | Yes |
jsonb | Yes | Yes | Yes | Yes |
JSON numbers to booleans
When converting a JSON number to a boolean
data type, the number must be 0
or 1
integer.
JSON booleans to strings or numbers
The logical values are converted to false
and true
string literals when converting JSON boolean values to string data types such as text
or varchar
.
When converting JSON boolean values to numeric data types such as bigint
or double precision
, the logical values are converted to 0
and 1
constants.
JSON strings to numbers
When converting JSON string values to any numeric data type, the string must represent a valid number in the decimal number system.
The dot (.
) can be used as a decimal separator for fractional types such as double precision
or numeric
. Notation starting with a decimal separator is supported (e.g. .1
= 0.1
).
No decimal separator is allowed for integer types such as bigint
or integer. Parsing integer numbers with decimal separators will result in an error.
For all types, the following rules apply:
- Leading signs (
+
,-
) are supported (e.g.-1
). - Leading and trailing whitespace is supported.
- Whitespace between digits is not supported (e.g.
1 000
). - Leading zeros are supported (e.g.
01
). - Exponential notation is supported (e.g.
1e3
). The significand/mantissa can be signed and have a decimal point. Exponent can be signed, but it cannot have a decimal point.
When converting JSON string to 64-bit integer data type (bigint
) and the string does not represent a valid number (as described above), it is also checked whether it represents a valid date-time value in ISO 8601 format (see details below). If a valid date-time is found, it is converted to a UNIX timestamp in milliseconds (e.g. 2024-02-22T18:21:00+02:00
-> 1708618860000
).
JSON strings to date & time
When converting JSON string values to data types representing date and time, such as timestamp
, the string must represent a valid date or time in the ISO 8601 format (e.g. 2024-02-26T17:18:33.412+01:00
).
More specifically, the format is as follows:
{year}-{month}-{day}T{hour}:{minute}:{second}.{fractional-second}{offset}
, where
{year}
is a four-digit year,{month}
is a two-digit month,{day}
is a two-digit day,{hour}
is a two-digit hour,{minute}
is a two-digit minute,{second}
is a two-digit second,{fractional-second}
is optional and can specified up to the precision of 100 nanoseconds (7 digits, e.g.2024-02-26T17:18:33.4120001
),{offset}
is optional and can be eitherZ
(UTC+0) or±{hours}:{minutes}
(e.g.2024-02-26T17:18:33Z
or2024-02-26T17:18:33+01:00
).
If the offset is not specified, UTC+0 is assumed.
When converting JSON string to date-only, resp. time-only types such as date
, resp. time
, the string format is similar to the complete date-time format above:
- Date-only:
{year}-{month}-{day}
(e.g.2024-02-26
). - Time-only:
{hour}:{minute}:{second}.{fractional-second}
(e.g.17:18:33.412
).
In both cases, the offset cannot be specified.
Obtaining date-only, resp. time-only value from a complete date-time value is also supported: for example, string 2024-02-26T17:18:33.412+01:00
would be converted to 2024-02-26
date-only, resp. 17:18:33.412
values. Time-only values are converted without any adjustments for the offset.
Obtaining date-time value from date-only value is supported. In such case, the time part is assumed to be 00:00:00Z
(e.g. 2024-02-26
will be converted to 2024-02-26T00:00:00Z
).
JSON numbers to date & time
When converting JSON numbers to data types related to date or time, such as date
or timestamp
, the number must be a non-negative integer and represent a valid UNIX timestamp in seconds or milliseconds.
UNIX time limitations
To support specifying UNIX timestamps in both seconds and milliseconds, the following rule is applied: if the value is larger than 126 230 400 000, the value is considered to be in milliseconds. Otherwise, the value is considered to be in seconds. This means that the timestamps earlier than 1974-01-01T00:00:00.000
cannot be represented in milliseconds, and timestamps later than 5970-01-31T00:00:00
cannot be represented in seconds.
JSON strings to UUID/GUID
When converting JSON string values to uuid
data type, the string must represent a valid UUID/GUID in the canonical form: 32 hexadecimal digits, divided into five groups (8-4-4-4-12) separated by hyphens, a total of 36 characters (e.g. 0f93c107-de01-4bbb-8033-d8da313f584e
).
Other notations such as {0f93c107-de01-4bbb-8033-d8da313f584e}
are not supported.
The json
and jsonb
types
In addition to the above types, the json
and jsonb
types are supported as well. Any incoming JSON primitive value (string, boolean, number or null) as well objects and arrays can be converted into these types.
If the target SQL column is nullable, the JSON null
value is converted to SQL NULL
. If the target SQL column is not nullable, the JSON null
value is stored as is.
The json
and jsonb
types used as PostgreSQL arrays (json[]
and jsonb[]
) are not supported.
Conflict policy
It might happen, that egress mapping yields an SQL row with conflicting primary key in the target DB. For such situations, conflict policy can be set through onConflictPolicy
egress mapping property:
{
"$schema": "https://schema-store.datamole.cloud/json/data-platform/sql-egress/egress-mapping.v1.schema.json",
"onConflictPolicy": "<policy>",
"rules": [
{
"mapping": {
"table1.col1": "message.property1"
}
}
]
}
The value for <policy>
can be:
DoUpdate
: The relevant columns of the row will be updated with new values. Default value.DoNothing
: Existing row will be kept intact and new row is discarded.
For PostgreSQL, the ON CONFLICT
clause is used to implement this feature with DO UPDATE
or DO NOTHING
action.
Preprocessing
Egress mapping might contain directive of preprocessing incoming messages. Currently supported feature is splitting of incoming messages can be split into multiple.
Preprocessing is configured through preprocessing egress mapping property:
{
"$schema": "https://schema-store.datamole.cloud/json/data-platform/sql-egress/egress-mapping.v1.schema.json",
"preprocessing": {
"split": "<split-type>"
},
"rules": [
{
"mapping": {
"table1.col1": "message.property1"
}
}
]
}
The <split-type>
value can be one of the following:
- None (default)
- Array: incoming message is expected to be JSON array where each item of this array will be extracted as individual message.
- Newlines: incoming message is split to multiple messages by new lines.
If setting split = Newlines
, following message will be split into three where each of the new messages is a valid JSON token:
{"property1": 42}
{"property1": 43}
{"property1": 44}
Similarly, if setting split = Array, following message will be split into three where each of the new messages is a valid JSON token:
[
{"property1": 42},
{"property1": 43},
{"property1": 44}
]
Limitations
- Total size of the message must not exceed 2 MiB.
- Depth of the JSON document must not exceed 16.
TimescaleDB integration
There is no special configuration needed to use TimescaleDB with SQL Egress.
Common issues
ERROR: invalid INSERT on the root table of hypertable <table-name>
: this error is typically caused by broken TimescaleDB extension installation. Please see the following GitHub issues for more information: