AWS Kinesis Firehose Source

The Vector aws_kinesis_firehose source receives logs from AWS Kinesis Firehose.

Requirements

Configuration

[sources.my_source_id]
type = "aws_kinesis_firehose" # required
access_key = "A94A8FE5CCB19BA61C4C08" # optional, no default
address = "0.0.0.0:443" # required
  • common, optional, string

    access_key

    AWS Kinesis Firehose can be configured to pass along an access key to authenticate requests. If it is, set access_key to the same value. If access_key is not specified, Vector treats all requests as authenticated. See "Forwarding CloudWatch Log events" for more info.

    • Syntax: literal
  • common, required, string

    address

    The address to listen for connections on.

    • Syntax: literal
  • optional, table

    tls

    Configures the TLS options for incoming connections.

    • optional, string

      ca_file

      Absolute path to an additional CA certificate file, in DER or PEM format (X.509), or an in-line CA certificate in PEM format.

      • Syntax: literal
    • optional, string

      crt_file

      Absolute path to a certificate file used to identify this server, in DER or PEM format (X.509) or PKCS#12, or an in-line certificate in PEM format. If this is set, and is not a PKCS#12 archive, key_file must also be set. This is required if enabled is set to true.

      • Syntax: literal
    • optional, bool

      enabled

      Require TLS for incoming connections. If this is set, an identity certificate is also required.

      • Default: false
    • optional, string

      key_file

      Absolute path to a private key file used to identify this server, in DER or PEM format (PKCS#8), or an in-line private key in PEM format.

      • Syntax: literal
    • optional, string

      key_pass

      Pass phrase used to unlock the encrypted key file. This has no effect unless key_file is set.

      • Syntax: literal
    • optional, bool

      verify_certificate

      If true, Vector will require a TLS certificate from the connecting host and terminate the connection if the certificate is not valid. If false (the default), Vector will not request a certificate from the client.

      • Default: false

Output

This component outputs log events with the following fields:

{
"message" : "Started GET / for 127.0.0.1 at 2012-03-10 14:28:14 +0100",
"request_id" : "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
"source_arn" : "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
"timestamp" : "2020-10-10T17:07:36+00:00"
}
  • common, required, string

    message

    The raw record from the incoming payload.

    • Syntax: literal
  • common, required, string

    request_id

    The AWS Kinesis Firehose request ID, value of the X-Amz-Firehose-Request-Id header.

    • Syntax: literal
  • common, required, string

    source_arn

    The AWS Kinesis Firehose delivery stream that issued the request, value of the X-Amz-Firehose-Source-Arn header.

    • Syntax: literal
  • common, required, timestamp

    timestamp

    The exact time the event was ingested into Vector.

Telemetry

This component provides the following metrics that can be retrieved through the internal_metrics source. See the metrics section in the monitoring page for more info.

  • counter

    request_read_errors_total

    The total number of request read errors for this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    events_out_total

    The total number of events emitted by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.

  • counter

    requests_received_total

    The total number of requests received by this component. This metric includes the following tags:

    • component_kind - The Vector component kind.

    • component_name - The Vector component ID.

    • component_type - The Vector component type.

    • instance - The Vector instance identified by host and port.

    • job - The name of the job producing Vector metrics.
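
To inspect these counters locally, the internal_metrics source mentioned above can be wired to a sink. The fragment below is a minimal sketch, not a recommended setup: the component IDs vector_metrics and metrics_console are illustrative names, and it assumes the console sink (used elsewhere on this page for logs) also accepts metric events with the json codec.

```toml
# Illustrative component IDs; rename to fit your topology.
[sources.vector_metrics]
type = "internal_metrics"

# Assumption: the console sink prints metric events as JSON on stdout.
[sinks.metrics_console]
type = "console"
inputs = ["vector_metrics"]
encoding.codec = "json"
```

With this in place, counters such as requests_received_total and request_read_errors_total for the aws_kinesis_firehose source should appear in the emitted metrics, tagged with the component_name of that source.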

Examples

Given the following input:

{
"requestId": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
"timestamp": 1600110760138,
"records": [
{
"data": "H4sIABk1bV8AA52TzW7bMBCE734KQ2db/JdI3QzETS8FAtg91UGgyOuEqCQq5Mqua+TdS8lu0hYNUpQHAdoZDcn9tKfJdJo0EEL5AOtjB0kxTa4W68Xdp+VqtbheJrPB4A4t+EFiv6yzVLuHa+/6blARAr5UV+ihbH4vh/4+VN52aF37wdYIPkTDlyhF8SrabFsOWhIrtz+Dlnto8dV3Gp9RstshXKhMi0xpqk3GpNJccpFRKYw0WvCM5kIbzrVWipm4VK55rrSk44HGHLTx/lg2wxVYRiljVGWGCvPiuPRn2O60Se6P8UKbpOBZrulsk2xLhCEjljYJk2QFHeGU04KxQqpCsumcSko3SfQ+uoBnn8pTJmjKWZYyI0axAXx021G++bweS5136CpXj8WP6/UNYek5ycMOPPhReETsQkHI4XBIO2/bynZlXXkXwryrS9w536TWkab0XwED6e/tU2/R9eGS9NTD5VgEvnWwtQikcu0e/AO0FYyu4HpfwR3Gf2R0Btza9qxgiUNUISiLr30AP7fbyMzu7OWA803ynIzdfJ69B1EZpoVhsWMRZ8a5UVJoRoUyUlDNspxzZWiEnOXiXYiSvQOR5TnN/xsiNalmKZcy5Yr/yfB6+RZD/gbDC0IbOx8wQrMhxGGYx4lBW5X1wJBLkpO981jWf6EXogvIrm+rYYrKOn4Hgbg4b439/s8cFeVvcNwBtHBkOdWvQIdRnTxPfgCXvyEgSQQAAA=="
}
]
}

And the following configuration:

vector.toml
[sources.aws_kinesis_firehose]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"

The following Vector log event will be output:

[
{
"request_id": "ed1d787c-b9e2-4631-92dc-8e7c9d26d804",
"source_arn": "arn:aws:firehose:us-east-1:111111111111:deliverystream/test",
"timestamp": "2020-09-14T19:12:40.138Z",
"message": "{\"messageType\":\"DATA_MESSAGE\",\"owner\":\"111111111111\",\"logGroup\":\"test\",\"logStream\":\"test\",\"subscriptionFilters\":[\"Destination\"],\"logEvents\":[{\"id\":\"35683658089614582423604394983260738922885519999578275840\",\"timestamp\":1600110569039,\"message\":\"{\\\"bytes\\\":26780,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"157.130.216.193\\\",\\\"method\\\":\\\"PUT\\\",\\\"protocol\\\":\\\"HTTP/1.0\\\",\\\"referer\\\":\\\"https://www.principalcross-platform.io/markets/ubiquitous\\\",\\\"request\\\":\\\"/expedite/convergence\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":301,\\\"user-identifier\\\":\\\"-\\\"}\"},{\"id\":\"35683658089659183914001456229543810359430816722590236673\",\"timestamp\":1600110569041,\"message\":\"{\\\"bytes\\\":17707,\\\"datetime\\\":\\\"14/Sep/2020:11:45:41 -0400\\\",\\\"host\\\":\\\"109.81.244.252\\\",\\\"method\\\":\\\"GET\\\",\\\"protocol\\\":\\\"HTTP/2.0\\\",\\\"referer\\\":\\\"http://www.investormission-critical.io/24/7/vortals\\\",\\\"request\\\":\\\"/scale/functionalities/optimize\\\",\\\"source_type\\\":\\\"stdin\\\",\\\"status\\\":502,\\\"user-identifier\\\":\\\"feeney1708\\\"}\"}]}"
}
]

How It Works

Context

By default, the aws_kinesis_firehose source will augment events with helpful context keys as shown in the "Output" section.

Forwarding CloudWatch Log events

This source is the recommended way to ingest logs from AWS CloudWatch Logs via AWS CloudWatch Logs subscriptions. To set this up:

  1. Deploy Vector with a publicly exposed HTTP endpoint using this source. You will likely also want to use the aws_cloudwatch_logs_subscription_parser transform to extract the log events. Make sure to set access_key to secure this endpoint. Your configuration might look something like:

    [sources.firehose]
    # General
    type = "aws_kinesis_firehose"
    address = "127.0.0.1:9000"
    access_key = "secret"
    [transforms.cloudwatch]
    type = "aws_cloudwatch_logs_subscription_parser"
    inputs = ["firehose"]
    [sinks.console]
    type = "console"
    inputs = ["cloudwatch"]
    encoding.codec = "json"
  2. Create a Kinesis Firehose delivery stream in the region that contains the CloudWatch Logs groups you want to ingest.

  3. Set the stream to forward to your Vector instance via its HTTP Endpoint destination. Make sure to configure the same access_key you set earlier.

  4. Set up a CloudWatch Logs subscription to forward the events to your delivery stream.

State

This component is stateless, meaning its behavior is consistent across each input.

Transport Layer Security (TLS)

Vector uses OpenSSL for TLS protocols. You can adjust TLS behavior via the tls.* options.
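
As a rough illustration of the tls.* options described in the Configuration section, the fragment below enables TLS on the listener. It is a sketch under stated assumptions: the component ID firehose_tls, the access key value, and both file paths are hypothetical placeholders, and it assumes a PEM certificate with a separate key file, so key_file is set alongside crt_file.

```toml
# Hypothetical component ID and placeholder values throughout.
[sources.firehose_tls]
type = "aws_kinesis_firehose"
address = "0.0.0.0:443"
access_key = "secret"

# Require TLS for incoming connections; an identity certificate is then required.
tls.enabled = true
tls.crt_file = "/etc/vector/tls/server.crt" # hypothetical path, PEM format
tls.key_file = "/etc/vector/tls/server.key" # hypothetical path, needed because crt_file is not PKCS#12
```

verify_certificate is left at its default of false here, so Vector does not request a certificate from the connecting client.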