Supported Metrics

Warning

CTA Metrics is in its early stages and is considered experimental. Published metrics and their attributes are likely to change.

Metric and attribute names follow OpenTelemetry semantic conventions, with CTA-specific prefixes for internal domains (e.g., cta.taped, cta.scheduler).

Info

For how to configure CTA to publish metrics, see Enabling Metrics.

Metrics

| Metric Name | Type | Unit | Description | Attributes |
| --- | --- | --- | --- | --- |
| db.client.connection.count | UpDownCounter | 1 | The number of connections that are currently in the state described by the state attribute. | db.namespace, db.system.name, state |
| db.client.operation.duration | Histogram | ms | Duration of database client operations. | db.namespace, db.system.name, (error.type) |
| cta.frontend.request.duration | Histogram | ms | Duration the frontend takes to process a request. | event.name, (error.type) |
| cta.frontend.active_requests | UpDownCounter | 1 | Number of in-flight frontend requests. | event.name |
| cta.scheduler.operation.duration | Histogram | ms | Duration of a CTA scheduling operation. | cta.scheduler.operation.name |
| cta.objectstore.lock.acquire.duration | Histogram | ms | Duration taken to acquire an objectstore lock. | lock.type |
| cta.taped.transfer.file.count | Counter | 1 | Number of files transferred using the io medium in the given io direction. | cta.io.direction, cta.io.medium, (error.type) |
| cta.taped.transfer.file.size | Counter | by | Bytes transferred using the io medium in the given io direction. | cta.io.direction, cta.io.medium |
| cta.taped.transfer.active | UpDownCounter | 1 | Number of threads actively transferring using the io medium in the given io direction. | cta.io.direction, cta.io.medium |
| cta.taped.buffer.usage | Gauge | by | Bytes in use by the memory buffer in cta-taped. | |
| cta.taped.buffer.limit | Gauge | by | Total bytes available for the memory buffer in cta-taped. | |
| cta.taped.mount.duration | Histogram | s | Duration to mount a tape. | cta.io.direction |
| cta.taped.mount.type | UpDownCounter | 1 | Number of drive sessions with the given mount type. | cta.taped.mount.type |
| cta.taped.drive.status | UpDownCounter | 1 | Number of drives in a given state. | cta.taped.drive.state |
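
Counter metrics such as cta.taped.transfer.file.size only ever increase, so when they are exported cumulatively (as Prometheus does) throughput is derived from the increase between samples instead of being read directly. The following is a minimal Python sketch of that calculation, not CTA code: the Sample type, the counter_rate helper, and the sample values are hypothetical, and a real backend (for example PromQL's rate()) handles more edge cases.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One scrape of a cumulative counter (hypothetical helper type)."""
    timestamp_s: float  # scrape time in seconds
    value: float        # cumulative counter value, e.g. bytes transferred


def counter_rate(samples: list[Sample]) -> float:
    """Average per-second increase of a cumulative counter.

    A drop between two consecutive samples is treated as a counter reset
    (for example a process restart), in which case the new value itself
    is counted as the increase.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1].timestamp_s - samples[0].timestamp_s
    if elapsed <= 0:
        raise ValueError("timestamps must be strictly increasing overall")

    increase = 0.0
    for prev, curr in zip(samples, samples[1:]):
        delta = curr.value - prev.value
        increase += delta if delta >= 0 else curr.value
    return increase / elapsed


# Made-up samples of cta.taped.transfer.file.size taken 30 s apart.
scrapes = [Sample(0, 0), Sample(30, 3e9), Sample(60, 6e9)]
bytes_per_second = counter_rate(scrapes)
```

The same approach applies to cta.taped.transfer.file.count, where the result is files per second.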

Resource Attributes

| Attribute Name | Description |
| --- | --- |
| service.namespace | Logical namespace of the service emitting the metric. Equivalent to the instance name in CTA. |
| service.name | Name of the service emitting the metric (e.g. cta.taped, cta.frontend). |
| service.version | Version of the service emitting the metric. |
| service.instance.id | Unique identifier for the specific service instance. Useful when multiple replicas run under the same namespace. |
| process.title | Title of the process within the service. For cta.taped, this is set per drive. |
| host.name | Host on which the service is running. |
| cta.scheduler.namespace | Logical name of the scheduler backend in use (e.g. disk, tape). |
| tape.drive.name | Name of the tape drive (only exposed for cta-taped). |
| tape.library.logical.name | Name of the logical library of the tape drive (only exposed for cta-taped). |

Metric Attributes

| Attribute Name | Description |
| --- | --- |
| db.namespace | Database namespace (schema or logical grouping). |
| db.system.name | Name of the database system (e.g., postgresql, oracle). |
| cta.scheduler.operation.name | Name of the CTA scheduling operation (e.g. enqueueArchive, cancelRepack). |
| cta.frontend.requester.name | Name of the frontend event requester (e.g. user, subsystem, or service calling the API). |
| cta.io.direction | Direction of the transfer (read or write). |
| cta.io.medium | Medium used for io (disk or tape). |
| cta.taped.thread_pool.name | Name of the thread pool handling taped operations. |
| cta.taped.drive.state | State that the drive is in. |
| cta.taped.mount.type | Type of mount. |
| lock.type | Type of lock being acquired in the object store or internal resource (e.g., read, write). |
| event.name | Name of the event being tracked (e.g., frontend or scheduler event). |
| error.type | Classification of an error that occurred (e.g., network, timeout, permission_denied). |
| state | Operational or lifecycle state represented by the metric (e.g., active, queued, failed). |
| le | Histogram bucket upper bound (“less than or equal”), expressed in the unit of the histogram (ms for most metrics, s for cta.taped.mount.duration). |
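
Because each le bucket counts all observations less than or equal to its bound, the buckets are cumulative, and percentiles are estimated by interpolating inside the bucket that contains the target rank. The following Python sketch illustrates this under stated assumptions: histogram_quantile here is a hypothetical helper (loosely modelled on the PromQL function of the same name, not a reimplementation of it), the bucket values are made up, and observations are assumed to be non-negative and evenly spread within each bucket.

```python
from __future__ import annotations

import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (le, count) buckets.

    `buckets` must include the +Inf bucket, whose count is the total
    number of observations.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    if not ordered or ordered[-1][0] != math.inf:
        raise ValueError("buckets must include an le=+Inf bucket")

    total = ordered[-1][1]
    if total == 0:
        return math.nan  # no observations recorded

    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in ordered:
        if count >= rank:
            if upper_bound == math.inf:
                # Target falls in the overflow bucket: report the
                # highest finite bound, as no upper limit is known.
                return lower_bound
            in_bucket = count - lower_count
            if in_bucket == 0:
                return lower_bound
            # Linear interpolation within the bucket.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Made-up buckets for a duration histogram in ms: (le, cumulative count).
example = [(10.0, 2), (50.0, 6), (100.0, 9), (math.inf, 10)]
median_ms = histogram_quantile(0.5, example)
```

The estimate is only as fine as the bucket layout: any value between two adjacent le bounds is indistinguishable to the backend.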

Note on Resource vs Metric Attributes

Resource attributes describe the entity that produced the telemetry (service/process/host). Not every backend attaches them to every metric automatically.

  • In Prometheus, resource attributes are exposed as a separate time series named target_info (and related info series). They are not labels on each metric by default.
  • If you need resource attributes as labels on every series, either:
      • enable resource -> metric label conversion in your OpenTelemetry Collector pipeline, or
      • join metrics with target_info in PromQL (e.g., on(...) group_left(...)) at query time.
  • Prometheus label keys are sanitized to be valid identifiers (e.g., service.instance.id -> service_instance_id).
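
The sanitization and the query-time join can be sketched together in Python. This is an illustration under assumptions, not the exact behaviour of any exporter: sanitize_label and join_with_target_info are hypothetical helpers, the sanitization rule is simplified to replacing every character outside [a-zA-Z0-9_] with an underscore, job and instance are assumed to be the labels shared between the metric and target_info, and the metric name cta_taped_drive_status is assumed (exporters may append unit or type suffixes).

```python
from __future__ import annotations

import re


def sanitize_label(name: str) -> str:
    """Simplified sanitization: replace invalid characters with '_'."""
    return re.sub(r"[^a-zA-Z0-9_]", "_", name)


def join_with_target_info(
    expr: str,
    resource_attrs: list[str],
    match_on: tuple[str, ...] = ("job", "instance"),
) -> str:
    """Build a PromQL expression that copies resource attributes from
    target_info onto the series selected by `expr`.

    target_info has the value 1, so multiplying by it leaves the metric
    values unchanged while group_left pulls in the extra labels.
    """
    labels = ", ".join(sanitize_label(attr) for attr in resource_attrs)
    on = ", ".join(match_on)
    return f"{expr} * on ({on}) group_left ({labels}) target_info"


query = join_with_target_info(
    "cta_taped_drive_status",  # assumed Prometheus name of the metric
    ["tape.drive.name", "host.name"],
)
print(query)
# cta_taped_drive_status * on (job, instance) group_left (tape_drive_name, host_name) target_info
```

Joining at query time keeps per-series label cardinality low in storage, whereas Collector-side conversion makes every query simpler at the cost of wider series.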