Delta Tables Schema - Binary
Tip: See the Delta Egress Sink page to learn more about sending data to Delta Tables in general.
When using this schema, a single table is created directly at the Directory Path
specified in the Egress Route configuration.
The incoming message's payload can be any sequence of bytes:
- Each message produces a single row in the Delta Table.
- The payload can be up to 16 MiB in size. Larger payloads are discarded, but the corresponding Delta Table row is still created with a null value in the payload column.
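The size rule above can be modelled in a few lines. This is an illustrative sketch only, not the platform's implementation: `MAX_PAYLOAD_BYTES` and `payload_column_value` are hypothetical names, and the limit is assumed to be inclusive ("up to 16 MiB").

```python
from typing import Optional

# 16 MiB, the limit stated above (assumed to be inclusive).
MAX_PAYLOAD_BYTES = 16 * 1024 * 1024


def payload_column_value(payload: bytes) -> Optional[bytes]:
    """Return what would end up in the payload column (hypothetical helper).

    Payloads within the limit are stored unmodified; larger ones are
    replaced by None (null), while the row itself is still written.
    """
    return payload if len(payload) <= MAX_PAYLOAD_BYTES else None


small = payload_column_value(b'{"property1": 42}')
oversized = payload_column_value(bytes(MAX_PAYLOAD_BYTES + 1))
```

Consumers of the table should therefore treat a null `payload` as "payload was dropped", not as "the message was empty".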
Schema
Output tables always contain the following columns:
Name | Type | Example | Description |
---|---|---|---|
payload | binary | b'{"property1": 42}' | Unmodified bytes of the message payload. Null if the payload exceeded the size limit. |
payload_content_type | string | application/json | Content type of the payload. It can be any string but it is recommended to use one of the common MIME types. |
kind | string | Message | Identifier that distinguishes between different kinds of events. Currently, this is always set to Message because no other event kinds are supported. |
stream_group_name | string | group-a | Name of the stream group the message was sent into. |
stream_name | string | telemetry | Name of the stream the message was sent into. |
site_id | string | factory-a51 | ID of the site where the device was located when it sent the message. Not always available, depending on the Stream configuration. |
device_id | string | robot-125 | ID of the device that sent the message. |
batch_id | string | 2023-12-19 | Identifier of the batch. Provided by the device or auto-filled by the platform (if configured). |
batch_slice_id | string | logs | Identifier of the batch slice (if provided by the device). |
message_id | string | m00767 | Identifier of the message. Provided by the device or auto-filled by the platform (if configured). |
workspace_id | string | 69f09b3f-ec0d-4b9e-a5ec-87150b935296 | Identifier of the Workspace that the originating Device and Stream belong to. Formatted as a GUID/UUID with 32 lowercase hexadecimal digits separated by hyphens. |
ingress_enqueued_date_time | timestamp | 2023-12-19T11:25:56.1408925+01:00 | Time when the Message was ingested by the platform. ISO 8601 format. |
ingress_enqueued_date | date | 2023-12-05 | The UTC date generated from ingress_enqueued_date_time. |
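
When reading rows back, two columns usually need interpretation: `payload` holds raw bytes whose meaning depends on `payload_content_type`, and `ingress_enqueued_date` is the UTC calendar date of the ingestion time, which can differ from the local date shown in the timestamp. The following is a minimal sketch, assuming rows are available as plain Python dictionaries. `decode_payload` and `utc_date` are hypothetical helpers, and handling only `application/json` and `text/*` specially is an arbitrary choice for illustration.

```python
import json
from datetime import date, datetime, timedelta, timezone
from typing import Any


def decode_payload(row: dict) -> Any:
    """Interpret the raw payload bytes according to payload_content_type."""
    payload = row["payload"]
    if payload is None:  # e.g. the payload exceeded the size limit
        return None
    content_type = row["payload_content_type"] or ""
    if content_type == "application/json":
        return json.loads(payload)
    if content_type.startswith("text/"):
        return payload.decode("utf-8")
    return payload  # unknown content type: leave the bytes untouched


def utc_date(ts: datetime) -> date:
    """Return the UTC calendar date of a timezone-aware timestamp."""
    return ts.astimezone(timezone.utc).date()


row = {"payload": b'{"property1": 42}', "payload_content_type": "application/json"}
decoded = decode_payload(row)

# 00:30 at UTC+01:00 still falls on the previous day in UTC.
local = datetime(2023, 12, 19, 0, 30, tzinfo=timezone(timedelta(hours=1)))
partition_date = utc_date(local)
```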
Spark SQL - interpretable schema
```sql
payload BINARY,
payload_content_type STRING,
kind STRING,
stream_group_name STRING,
stream_name STRING,
site_id STRING,
device_id STRING,
batch_id STRING,
batch_slice_id STRING,
message_id STRING,
workspace_id STRING,
ingress_enqueued_date_time TIMESTAMP,
ingress_enqueued_date DATE
```
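
The column list above is a DDL-formatted string, which PySpark generally accepts where a schema is expected. The sketch below keeps it in a constant, extracts the column names with a small hypothetical helper (`column_names`), and shows how an empty DataFrame with this schema could be created for local tests. `empty_frame` assumes an existing `SparkSession` is passed in and is not called here.

```python
from typing import List

# The schema from this page as a single DDL string.
BINARY_SCHEMA_DDL = """
payload BINARY,
payload_content_type STRING,
kind STRING,
stream_group_name STRING,
stream_name STRING,
site_id STRING,
device_id STRING,
batch_id STRING,
batch_slice_id STRING,
message_id STRING,
workspace_id STRING,
ingress_enqueued_date_time TIMESTAMP,
ingress_enqueued_date DATE
"""


def column_names(ddl: str) -> List[str]:
    """Extract column names from a simple 'name TYPE, name TYPE' list."""
    return [part.split()[0] for part in ddl.split(",") if part.strip()]


def empty_frame(spark):
    """Create an empty DataFrame with this schema (requires PySpark)."""
    return spark.createDataFrame([], schema=BINARY_SCHEMA_DDL)


names = column_names(BINARY_SCHEMA_DDL)
```

Note that `column_names` only handles flat column lists like this one; it would not cope with nested types that contain commas.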
Partition key columns
Tables are partitioned by the ingress_enqueued_date column.
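
Because the table is partitioned by date, queries that filter on `ingress_enqueued_date` let the engine skip partitions outside the requested range. The following is a sketch assuming PySpark with Delta Lake support is available. `date_predicate` and `read_range` are hypothetical helpers, and `table_path` stands for the Directory Path from the Egress Route configuration.

```python
from datetime import date


def date_predicate(start: date, end: date) -> str:
    """Build a Spark SQL predicate for the half-open range [start, end)."""
    return (
        f"ingress_enqueued_date >= DATE '{start.isoformat()}' "
        f"AND ingress_enqueued_date < DATE '{end.isoformat()}'"
    )


def read_range(spark, table_path: str, start: date, end: date):
    """Load only rows ingested in [start, end) (requires PySpark and Delta Lake)."""
    return spark.read.format("delta").load(table_path).where(date_predicate(start, end))


predicate = date_predicate(date(2023, 12, 1), date(2023, 12, 8))
```

Since the partition column holds UTC dates, range boundaries should also be expressed in UTC. Filtering only on `ingress_enqueued_date_time` may not give the same pruning benefit.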