Pulsar
Publish observability events to Apache Pulsar topics
status: beta
delivery: at-least-once
acknowledgements: no
egress: stream
state: stateless
Configuration
Example configurations
{
  "sinks": {
    "my_sink_id": {
      "type": "pulsar",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "endpoint": "pulsar://127.0.0.1:6650",
      "encoding": {
        "codec": "json"
      },
      "healthcheck": null,
      "topic": "topic-1234"
    }
  }
}
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"
[sinks.my_sink_id.encoding]
codec = "json"
---
sinks:
  my_sink_id:
    type: pulsar
    inputs:
      - my-source-or-transform-id
    endpoint: pulsar://127.0.0.1:6650
    encoding:
      codec: json
    healthcheck: null
    topic: topic-1234
{
  "sinks": {
    "my_sink_id": {
      "type": "pulsar",
      "inputs": [
        "my-source-or-transform-id"
      ],
      "auth": null,
      "endpoint": "pulsar://127.0.0.1:6650",
      "buffer": null,
      "encoding": {
        "codec": "json"
      },
      "healthcheck": null,
      "topic": "topic-1234"
    }
  }
}
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"
[sinks.my_sink_id.encoding]
codec = "json"
---
sinks:
  my_sink_id:
    type: pulsar
    inputs:
      - my-source-or-transform-id
    auth: null
    endpoint: pulsar://127.0.0.1:6650
    buffer: null
    encoding:
      codec: json
    healthcheck: null
    topic: topic-1234
auth
optional object
Options for the authentication strategy.
auth.token
optional string literal
The basic authentication password.
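As a rough sketch, the auth.token option could be set in TOML as shown here, reusing the sink from the example configurations. The token value is a placeholder chosen for illustration, not a real credential.

```toml
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"

[sinks.my_sink_id.encoding]
codec = "json"

# Assumption: placeholder value; replace it with your own token.
[sinks.my_sink_id.auth]
token = "my-placeholder-token"
```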
buffer
optional object
Configures the sink-specific buffer behavior.
buffer.max_events
optional uint
The maximum number of events allowed in the buffer.
Relevant when:
type = "memory"
default:
500 (events)
buffer.type
optional string literal enum
The buffer’s type and storage mechanism.
Enum options
Option | Description |
---|---|
disk | Stores the sink’s buffer on disk. This is less performant but durable: data will not be lost between restarts. Data is also held in memory to improve performance. WARNING: This may stall the sink if disk performance isn’t on par with the throughput. For comparison, AWS gp2 volumes are usually too slow for common cases. |
memory | Stores the sink’s buffer in memory. This is more performant, but less durable. Data will be lost if Vector is restarted forcefully. |
default:
memory
buffer.when_full
optional string literal enum
The behavior when the buffer becomes full.
Enum options
Option | Description |
---|---|
block | Applies back pressure when the buffer is full. This prevents data loss, but will cause data to pile up on the edge. |
drop_newest | Drops new data as it’s received. This data is lost. This should be used when performance is the highest priority. |
default:
block
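The buffer options above might be combined as in the following sketch, which keeps the default in-memory buffer, raises its capacity, and drops new events instead of applying back pressure. The max_events value of 1000 is an arbitrary illustration, not a recommendation.

```toml
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"

[sinks.my_sink_id.encoding]
codec = "json"

[sinks.my_sink_id.buffer]
type = "memory"            # the default buffer type
max_events = 1000          # illustrative value; the default is 500
when_full = "drop_newest"  # drop new events rather than block upstream
```

With when_full left at its default of block, the same buffer would apply back pressure instead of losing data.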
encoding
required object
Configures the encoding-specific sink behavior.
Note: When data in encoding is malformed, currently only a very generic error “data did not match any variant of untagged enum EncodingConfig” is reported. Follow this issue to track progress on improving these error messages.
encoding.codec
optional string literal enum
The encoding codec used to serialize the events before outputting.
Enum options
Option | Description |
---|---|
json | JSON encoded event. |
text | The message field from the event. |
encoding.except_fields
optional [string]
Prevent the sink from encoding the specified fields.
encoding.only_fields
optional [string]
Makes the sink encode only the specified fields.
encoding.timestamp_format
optional string literal enum
How to format event timestamps.
Enum options
Option | Description |
---|---|
rfc3339 | Formats as an RFC 3339 string. |
unix | Formats as a Unix timestamp. |
default:
rfc3339
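To show how the encoding options fit together, here is a sketch that emits JSON, keeps only a few fields, and writes timestamps in Unix format. The field names in only_fields are hypothetical and depend on what your events actually contain.

```toml
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"

[sinks.my_sink_id.encoding]
codec = "json"
# Assumption: these field names are examples only.
only_fields = [ "message", "host", "timestamp" ]
timestamp_format = "unix"  # the default is rfc3339
```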
endpoint
required string literal
The endpoint to which the Pulsar client should connect.
inputs
required [string]
A list of upstream source or transform IDs. Wildcards (*) are supported.
See configuration for more info.
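A wildcard in inputs might look like the following sketch. The component ID prefix app- is hypothetical; it assumes your upstream sources or transforms share that prefix.

```toml
[sinks.my_sink_id]
type = "pulsar"
# Assumption: upstream component IDs such as "app-nginx" or "app-api" exist.
inputs = [ "app-*" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"

[sinks.my_sink_id.encoding]
codec = "json"
```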
Telemetry
Metrics
buffer_byte_size
gauge
The number of bytes currently in the buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_discarded_events_total
counter
The number of events dropped by this non-blocking buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_events
gauge
The number of events currently in the buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_received_event_bytes_total
counter
The number of bytes received by this buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_received_events_total
counter
The number of events received by this buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_sent_event_bytes_total
counter
The number of bytes sent by this buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
buffer_sent_events_total
counter
The number of events sent by this buffer.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
component_received_event_bytes_total
counter
The number of event bytes accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
container_name
optional
The name of the container from which the data originated.
file
optional
The file from which the data originated.
host
optional
The hostname of the system Vector is running on.
mode
optional
The connection mode used by the component.
peer_addr
optional
The IP from which the data originated.
peer_path
optional
The pathname from which the data originated.
pid
optional
The process ID of the Vector instance.
pod_name
optional
The name of the pod from which the data originated.
uri
optional
The sanitized URI from which the data originated.
component_received_events_count
histogram
A histogram of the number of events passed in each internal batch in Vector’s internal topology. Note that this is separate from sink-level batching. It is mostly useful for low-level debugging of performance issues in Vector caused by small internal batches.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
container_name
optional
The name of the container from which the data originated.
file
optional
The file from which the data originated.
host
optional
The hostname of the system Vector is running on.
mode
optional
The connection mode used by the component.
peer_addr
optional
The IP from which the data originated.
peer_path
optional
The pathname from which the data originated.
pid
optional
The process ID of the Vector instance.
pod_name
optional
The name of the pod from which the data originated.
uri
optional
The sanitized URI from which the data originated.
component_received_events_total
counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
container_name
optional
The name of the container from which the data originated.
file
optional
The file from which the data originated.
host
optional
The hostname of the system Vector is running on.
mode
optional
The connection mode used by the component.
peer_addr
optional
The IP from which the data originated.
peer_path
optional
The pathname from which the data originated.
pid
optional
The process ID of the Vector instance.
pod_name
optional
The name of the pod from which the data originated.
uri
optional
The sanitized URI from which the data originated.
encode_errors_total
counter
The total number of errors encountered when encoding an event.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
events_in_total
counter
The number of events accepted by this component either from tagged origins like file and uri, or cumulatively from other origins. This metric is deprecated and will be removed in a future version. Use component_received_events_total instead.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
container_name
optional
The name of the container from which the data originated.
file
optional
The file from which the data originated.
host
optional
The hostname of the system Vector is running on.
mode
optional
The connection mode used by the component.
peer_addr
optional
The IP from which the data originated.
peer_path
optional
The pathname from which the data originated.
pid
optional
The process ID of the Vector instance.
pod_name
optional
The name of the pod from which the data originated.
uri
optional
The sanitized URI from which the data originated.
utilization
gauge
A ratio from 0 to 1 of the load on a component. A value of 0 indicates a completely idle component that is simply waiting for input. A value of 1 indicates a component that is never idle. This value is updated every 5 seconds.
component_id
required
The Vector component ID.
component_kind
required
The Vector component kind.
component_name
required
Deprecated, use component_id instead. The value is the same as component_id.
component_type
required
The Vector component type.
host
optional
The hostname of the system Vector is running on.
pid
optional
The process ID of the Vector instance.
How it works
Health checks
Health checks ensure that the downstream service is accessible and ready to accept data. This check is performed when the sink initializes. If the health check fails, an error is logged and Vector proceeds to start.
Require health checks
If you’d like to exit immediately upon a health check failure, you can pass the --require-healthy flag:
vector --config /etc/vector/vector.toml --require-healthy
Disable health checks
If you’d like to disable health checks for this sink, you can set the healthcheck option to false.
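Disabling the health check could be sketched in TOML like this, following the description above and reusing the sink from the example configurations. Treat the exact form as an assumption and check it against the option reference for your Vector version.

```toml
[sinks.my_sink_id]
type = "pulsar"
inputs = [ "my-source-or-transform-id" ]
endpoint = "pulsar://127.0.0.1:6650"
topic = "topic-1234"
# Assumption: a plain boolean, as the text above describes.
healthcheck = false

[sinks.my_sink_id.encoding]
codec = "json"
```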