Concepts

To understand Vector, you must first understand its fundamental concepts. The following concepts are ordered progressively, starting with the individual unit of data (events) and broadening all the way to Vector's deployment topologies.

Events

"Events" represent the individual units of data in Vector. They must fit into one of the following types.

Logs

A "log" event is a generic key/value representation of an event.

Metrics

A "metric" event is a first-class representation of numerical operation performed on a time series. Vector's metric events are fully interoperable.

Components

"Component" is the generic term we use for sources, transforms, and sinks. Components ingest, transform, and route events. You compose components to create topologies.

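To make this concrete, below is a minimal sketch of a Vector TOML configuration that composes one component of each kind. The component names (in_logs, parse_json, out_console) are arbitrary labels chosen for illustration, not required names.

  # A minimal topology: one source, one transform, one sink.

  [sources.in_logs]             # ingest: tail files matching a glob
  type = "file"
  include = ["/var/log/**/*.log"]

  [transforms.parse_json]       # transform: parse each line as JSON
  type = "remap"
  inputs = ["in_logs"]
  source = '''
  . = parse_json!(string!(.message))
  '''

  [sinks.out_console]           # route: print events to stdout as JSON
  type = "console"
  inputs = ["parse_json"]
  encoding.codec = "json"

Note how each transform and sink names its upstream components in inputs; this is how the wiring between components is declared.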

Sources

Vector wouldn't be very useful if it couldn't ingest data. A "source" defines where Vector should pull data from, or how it should receive data pushed to it. A topology can have any number of sources, and as they ingest data they normalize it into events (described above). This sets the stage for easy and consistent processing of your data. Examples of sources include file, syslog, statsd, and stdin.

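As a sketch of the push-based case, the syslog source below listens for messages over TCP; the component name and port are illustrative placeholders:

  [sources.in_syslog]    # receive syslog messages pushed over TCP
  type = "syslog"
  mode = "tcp"
  address = "0.0.0.0:514"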

Transforms

A "transform" is responsible for mutating events as they are transported by Vector. This might involve parsing, filtering, sampling, or aggregating. You can have any number of transforms in your pipeline and how they are composed is up to you.

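For example, a filter transform drops every event that fails a condition. A sketch, assuming the in_syslog source from above and a level field on each event:

  [transforms.errors_only]   # keep only events whose level is "error"
  type = "filter"
  inputs = ["in_syslog"]
  condition = '.level == "error"'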

Sinks

A "sink" is a destination for events. Each sink's design and transmission method is dictated by the downstream service it is interacting with. For example, the socket sink will stream individual events, while the aws_s3 sink will buffer and flush data.

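That contrast looks roughly like this in configuration; the addresses, bucket name, and batch settings are illustrative placeholders:

  [sinks.out_socket]           # streams each event over TCP as it arrives
  type = "socket"
  inputs = ["errors_only"]
  mode = "tcp"
  address = "logs.example.com:9000"
  encoding.codec = "json"

  [sinks.out_s3]               # buffers events, then flushes them in batches
  type = "aws_s3"
  inputs = ["errors_only"]
  bucket = "my-log-archive"    # placeholder bucket name
  region = "us-east-1"
  compression = "gzip"
  batch.max_bytes = 10485760   # flush once ~10 MiB have accumulated...
  batch.timeout_secs = 300     # ...or after 5 minutes, whichever comes first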

Pipeline

A "Pipeline" is a directed acyclic graph of components. Each component is a node on the graph with directed edges. Data must flow in one direction, from sources to sinks. Components can produce zero or more events.

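The graph's edges are declared through each component's inputs list. For example, here is a sketch of a simple fan-out, where two sinks subscribe to the same source (all names illustrative):

  [sources.in_stdin]
  type = "stdin"

  # Both sinks list in_stdin as an input, so every event
  # flows along both edges of the graph.
  [sinks.out_console]
  type = "console"
  inputs = ["in_stdin"]
  encoding.codec = "json"

  [sinks.out_file]
  type = "file"
  inputs = ["in_stdin"]
  path = "/tmp/events.log"
  encoding.codec = "text"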

Roles

A "role" refers to a deployment role that Vector fills in order to create end-to-end pipelines.

Agent

The "agent" role is designed for deploying Vector to the edge, typically for data collection.

Aggregator

The "aggregator" role is designed to collect and process data from multiple upstream sources. These upstream sources could be other Vector agents or non-Vector agents such as Syslog-ng.

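A common pattern is to connect agents to an aggregator over Vector's native protocol using the vector source and sink. A sketch, with addresses and component names as placeholders:

  # On the aggregator: accept events sent by upstream Vector agents.
  [sources.from_agents]
  type = "vector"
  address = "0.0.0.0:9000"

  # On each agent: forward events to the aggregator.
  [sinks.to_aggregator]
  type = "vector"
  inputs = ["in_logs"]              # whatever the agent collects
  address = "aggregator.internal:9000"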

Topology

A "topology" refers to the end result of deploying Vector into your infrastructure. A topology may be as simple as deploying Vector as an agent, or it may be as complex as deploying Vector as an agent and routing data through multiple Vector aggregators.
