Implement Service Ping
Service Ping consists of two kinds of data:
- Counters: Track how often a certain event happened over time, such as how many CI/CD pipelines have run. They are monotonic and always trend up.
- Observations: Facts collected from one or more GitLab instances, which can carry arbitrary data. There are no general guidelines for how to collect them, due to the individual nature of that data.
To implement a new metric in Service Ping, follow these steps:
- Implement the required counter
- Name and place the metric
- Test counters manually using your Rails console
- Generate the SQL query
- Optimize queries with #database-lab
- Add the metric definition to the Metrics Dictionary
- Add the metric to the Versions Application
- Create a merge request
- Verify your metric
- Set up and test Service Ping locally
Instrumentation classes
usage_data.rb is deprecated.
When you add or change a Service Ping metric, you must migrate metrics to instrumentation classes.
For information about the progress on migrating Service Ping metrics, see this epic.
For example, we have the following instrumentation class: lib/gitlab/usage/metrics/instrumentations/count_boards_metric.rb.
You should add it to usage_data.rb as follows:
boards: add_metric('CountBoardsMetric', time_frame: 'all'),
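To make the shape of a declarative count metric concrete, here is a minimal, self-contained sketch. `ToyDatabaseMetric`, `ToyCountBoardsMetric`, and the in-memory `BOARDS` array are hypothetical stand-ins invented for this illustration. They are not GitLab's base class or models, and the real class in count_boards_metric.rb may look different.

```ruby
# Hypothetical toy base class: stores a declared operation and a relation block.
# This is an illustration only, not GitLab's implementation.
class ToyDatabaseMetric
  class << self
    # Declares (or reads back) the operation to run. Only :count is supported here.
    def operation(name = nil)
      @operation = name if name
      @operation
    end

    # Declares (or reads back) the block that produces the rows to measure.
    def relation(&block)
      @relation = block if block
      @relation
    end
  end

  attr_reader :time_frame

  def initialize(time_frame: 'all')
    @time_frame = time_frame
  end

  # Runs the declared operation over the declared relation.
  def value
    rows = self.class.relation.call
    case self.class.operation
    when :count then rows.count
    else raise ArgumentError, "unsupported operation: #{self.class.operation.inspect}"
    end
  end
end

# In-memory stand-in for a boards table (assumption for this sketch).
BOARDS = [{ id: 1 }, { id: 2 }, { id: 3 }].freeze

class ToyCountBoardsMetric < ToyDatabaseMetric
  operation :count
  relation { BOARDS }
end

puts ToyCountBoardsMetric.new(time_frame: 'all').value
```

The point of the declarative style is that each metric states what to count and how, while the shared base class owns the execution logic.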
Types of counters
There are several types of counters for metrics:
- Batch counters: Used for counts, sums, and averages.
- Redis counters: Used for in-memory counts.
- Alternative counters: Used for settings and configurations.
Batch counters
For large tables, PostgreSQL can take a long time to count rows due to MVCC (Multi-version Concurrency Control). Batch counting is a counting method where a single large query is broken into multiple smaller queries. For example, instead of a single query over 1,000,000 records, with batch counting you can execute 100 queries of 10,000 records each. Batch counting is useful for avoiding database timeouts, because each batch query is significantly shorter than a single long-running query.
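The idea of splitting one large count into range-bounded smaller counts can be sketched in plain Ruby. This is an illustration only: the `batch_count` helper and the in-memory array of IDs are hypothetical stand-ins for a table and its primary key, and each loop iteration stands in for one short database query.

```ruby
# Counts rows by walking the primary-key range in fixed-size windows.
# `ids` is a hypothetical stand-in for a table's primary-key column.
# Returns [total_rows, number_of_batch_queries].
def batch_count(ids, batch_size:)
  return [0, 0] if ids.empty?

  start = ids.min   # in a database, this lookup relies on an index
  finish = ids.max
  total = 0
  queries = 0

  batch_start = start
  while batch_start <= finish
    batch_end = batch_start + batch_size
    # Stand-in for: SELECT COUNT(*) WHERE id >= batch_start AND id < batch_end
    total += ids.count { |id| id >= batch_start && id < batch_end }
    queries += 1
    batch_start = batch_end
  end

  [total, queries]
end

total, queries = batch_count((1..1000).to_a, batch_size: 100)
puts "#{total} rows counted in #{queries} batches"
```

Gaps in the ID sequence do not affect correctness, because every window counts only the rows that actually fall inside it.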
For GitLab.com, there are extremely large tables with 15 second query timeouts, so we use batch counting to avoid encountering timeouts. Here are the sizes of some GitLab.com tables:
Table | Row counts in millions |
---|---|
merge_request_diff_commits | 2280 |
ci_build_trace_sections | 1764 |
merge_request_diff_files | 1082 |
events | 514 |
Batch counting requires indexes on columns to calculate max, min, and range queries. In some cases, you must add a specialized index on the columns involved in a counter.
Ordinary batch counters
Create a new database metrics instrumentation class with the count operation for a given ActiveRecord_Relation.
Method:
add_metric('CountIssuesMetric', time_frame: 'all')
Examples:
Examples using usage_data.rb have been deprecated. We recommend using instrumentation classes.
Distinct batch counters
Create a new database metrics instrumentation class with the distinct_count operation for a given ActiveRecord_Relation.
Method:
add_metric('CountUsersAssociatingMilestonesToReleasesMetric', time_frame: 'all')
Examples:
Examples using usage_data.rb have been deprecated. We recommend using instrumentation classes.
Sum batch operation
Sums the values of a given ActiveRecord_Relation on a given column and handles errors.
Handles the ActiveRecord::StatementInvalid error.
Method:
add_metric('JiraImportsTotalImportedIssuesCountMetric')
Average batch operation
Averages the values of a given ActiveRecord_Relation on a given column and handles errors.
Method:
add_metric('CountIssuesWeightAverageMetric')
Examples:
Examples using usage_data.rb have been deprecated. We recommend using instrumentation classes.
Grouping and batch operations
The count, distinct_count, sum, and average batch counters can accept an ActiveRecord::Relation object, which groups by a specified column. With a grouped relation, the methods do batch counting, handle errors, and return a hash table of key-value pairs.
Examples:
count(Namespace.group(:type))
# returns => {nil=>179, "Group"=>54}
distinct_count(Project.group(:visibility_level), :creator_id)
# returns => {0=>1, 10=>1, 20=>11}
sum(Issue.group(:state_id), :weight)
# returns => {1=>3542, 2=>6820}
average(Issue.group(:state_id), :weight)
# returns => {1=>3.5, 2=>2.5}
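The shape of the returned hashes can be reproduced with plain Ruby collections. The sketch below only illustrates the key-value result format. The `grouped_count`, `grouped_sum`, and `grouped_average` helpers and the sample data are hypothetical, and the sketch does no batching or error handling.

```ruby
# Hypothetical in-memory rows standing in for database tables.
NAMESPACES = [
  { id: 1, type: nil },
  { id: 2, type: 'Group' },
  { id: 3, type: nil },
  { id: 4, type: 'Group' },
  { id: 5, type: 'Group' }
].freeze

ISSUES = [
  { id: 1, state_id: 1, weight: 3 },
  { id: 2, state_id: 1, weight: 4 },
  { id: 3, state_id: 2, weight: 2 },
  { id: 4, state_id: 2, weight: 3 }
].freeze

# Number of rows per distinct value of `key`.
def grouped_count(rows, key)
  rows.group_by { |row| row[key] }.transform_values(&:size)
end

# Sum of `column` per distinct value of `key`.
def grouped_sum(rows, key, column)
  rows.group_by { |row| row[key] }
      .transform_values { |group| group.sum { |row| row[column] } }
end

# Mean of `column` per distinct value of `key`.
def grouped_average(rows, key, column)
  rows.group_by { |row| row[key] }
      .transform_values { |group| group.sum { |row| row[column] }.to_f / group.size }
end

p grouped_count(NAMESPACES, :type)
p grouped_sum(ISSUES, :state_id, :weight)
p grouped_average(ISSUES, :state_id, :weight)
```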
Add operation
Sum the values given as parameters. Handles the StandardError.
Returns -1 if any of the arguments are -1.
Method:
add(*args)
Examples:
project_imports = distinct_count(::Project.where.not(import_type: nil), :creator_id)
bulk_imports = distinct_count(::BulkImport, :user_id)
add(project_imports, bulk_imports)
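The fallback behavior described above can be sketched as a standalone function. This is a simplified illustration, not the actual implementation: the `FALLBACK` constant name is hypothetical, and returning -1 when a StandardError is rescued is an assumption, because the text only states that the error is handled.

```ruby
# Hypothetical name for the sentinel that marks a failed counter.
FALLBACK = -1

# Sums the arguments, propagating the failure sentinel.
def add(*args)
  # If any input counter already failed, the total is not trustworthy.
  return FALLBACK if args.include?(FALLBACK)

  args.sum
rescue StandardError
  # Assumption: a handled error also yields the sentinel.
  FALLBACK
end

puts add(2, 3)        # both counters succeeded
puts add(2, FALLBACK) # one counter failed upstream
```

Propagating the sentinel keeps a partial total from being reported as if it were complete.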
Estimated batch counters
Introduced in GitLab 13.7.
Estimated batch counter functionality handles ActiveRecord::StatementInvalid errors when used through the provided estimate_batch_distinct_count method.
Errors return a value of -1.
When correctly used, the estimate_batch_distinct_count method enables efficient counting over columns that contain non-unique values, which cannot be assured by other counters.
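To see why non-unique columns need special handling in batches, consider the sketch below. Summing per-batch distinct counts overcounts any value that appears in more than one batch, so the per-batch results must be merged instead. In this sketch an exact `Set` stands in for the mergeable estimation structure. That is an assumption made for illustration, and the sample `NOTES` data and helper names are hypothetical.

```ruby
require 'set'

# Hypothetical rows: author_id values repeat across the id range.
NOTES = [
  { id: 1, author_id: 10 },
  { id: 2, author_id: 11 },
  { id: 3, author_id: 10 },
  { id: 4, author_id: 12 },
  { id: 5, author_id: 11 },
  { id: 6, author_id: 10 }
].freeze

# Naive approach: count distinct values per batch, then add the counts.
# Values seen in several batches are counted several times.
def naive_distinct_sum(rows, column, batch_size:)
  rows.each_slice(batch_size).sum { |batch| batch.map { |row| row[column] }.uniq.size }
end

# Mergeable approach: fold every batch into one shared structure,
# then read the distinct count once at the end.
def merged_distinct_count(rows, column, batch_size:)
  seen = Set.new
  rows.each_slice(batch_size) { |batch| seen.merge(batch.map { |row| row[column] }) }
  seen.size
end

puts naive_distinct_sum(NOTES, :author_id, batch_size: 2)
puts merged_distinct_count(NOTES, :author_id, batch_size: 2)
```

An exact set grows with the number of distinct values, which is why an estimating structure is attractive for very large tables.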
estimate_batch_distinct_count method
Method:
estimate_batch_distinct_count(relation, column = nil, batch_size: nil, start: nil, finish: nil)
The method includes the following arguments:
- relation: The ActiveRecord_Relation to perform the count.
- column: The column to perform the distinct count. The default is the primary key.
- batch_size: From Gitlab::Database::PostgresHll::BatchDistinctCounter::DEFAULT_BATCH_SIZE. Default value: 10,000.
- start: The custom start of the batch count, to avoid complex minimum calculations.
- finish: The custom end of the batch count, to avoid complex maximum calculations.
The method includes the following prerequisites:
- The supplied relation must include the primary key defined as the numeric column. For example: id bigint NOT NULL.
- The estimate_batch_distinct_count can handle a joined relation. To use its ability to count non-unique columns, the joined relation must not have a one-to-many relationship, such as has_many :boards.
- Both start and finish arguments should always represent primary key relationship values, even if the estimated count refers to another column, for example: estimate_batch_distinct_count(::Note, :author_id, start: ::Note.minimum(