bring in prometheus/parquet-common code to new package (#11490)
* bring in prometheus/parquet-common code to new package; replace efficient-go errors with pkg/errors; satisfy mimir-prometheus ChunkSeries interface
* revert breaking upgrade to thanos/objstore
* fix test require
* attempt to update go version for strange errors
* fix stringlabels issues
* update license headers with AGPL and upstream attribution
* fix errors.Is lints
* fix sort and cancel cause lints
* correct go.mod & vendor in from main to solve conflicts
* use env var to flag parquet promql acceptance
* fix deps from main again
* fix deps from main again
* fix deps from main again
* fix deps from main again
implement new parquet-converter service (#11499)
* bring in parquet-converter from parquet-mimir PoC
* make docs
* make reference-help
* stop using the compactor's config
* remove BlockRanges config, convert all levels of blocks
* drop unused BlockWithExtension struct
* rename ownBlock to own
* move index fetch outside of for loop
* lowercase logs
* wording: compact => convert
* some cleanup
* skip blocks for which compaction mark failed download
* simplify convertBlock function
* cleanup
* Write Compact Mark
* remove parquetIndex, we don't need it yet at least
* use MetaFetcher to discover blocks
* make reference-help and mark as experimental
* cleanup: we don't need indexes anymore
* revert index loader changes
* basic TestParquetConverter
* make reference-help
* lint
* happy linter
* make docs
* fix: correctly initialize memberlist KV for parquet converter
* lint: sort lines
* more wording fixes: compact => convert
* license header
* version 1
* remove parquet-converter from 'backend' and 'all' modules
it's experimental and meant to be run alone
* address docs feedback
* remove unused consts
* increase timeout for a test
TestPartitionReader_ShouldNotMissRecordsIfKafkaReturnsAFetchBothWithAnErrorAndSomeRecords
parquet-converter: Introduce metrics and ring test (#11600)
* parquet-converter: Introduce metrics and ring test
This commit introduces a ring test to verify that sharding is working as
expected.
It also introduces metrics to measure total conversions, failures and
durations.
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
converter: proper error handling to measure failures
parquet converter in docker compose (#11633)
* add parquet-converter to docker-compose microservices setup
* format jsonnet
fix(parquet converter): close TSDB block after conversion (#11635)
parquet: vendor back from parquet-common (#11644)
introduce store-gateway.parquet-enabled flag & docs (#11722)
upgrade prometheus parquet-common dependency (#11723)
parquet store-gateways introduce stores interface (#11724)
* declare Stores interface satisfied by BucketStores and future Parquet store
* add casts to for uses of existing impl which are not protected by interface
* stub out parquet bucket stores implementation
* most minimal initialization of Parquet Bucket Stores when flag is enabled
* license header
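The pattern above can be sketched as a small Go interface; the method set and signatures here are illustrative assumptions (the real `Stores` interface carries gRPC request/response types), but the shape is the same: both `BucketStores` and the new Parquet stores satisfy one interface, and casts are only needed where callers depend on the concrete type.

```go
package main

import "fmt"

// Stores is a sketch of the interface described above, satisfied by the
// existing BucketStores and the stubbed-out Parquet bucket stores.
type Stores interface {
	Series(userID string) ([]string, error)
	LabelNames(userID string) ([]string, error)
	LabelValues(userID, name string) ([]string, error)
}

// parquetBucketStores is a stub standing in for the Parquet implementation.
type parquetBucketStores struct{}

func (parquetBucketStores) Series(string) ([]string, error)     { return nil, nil }
func (parquetBucketStores) LabelNames(string) ([]string, error) { return []string{"__name__"}, nil }
func (parquetBucketStores) LabelValues(string, string) ([]string, error) {
	return nil, nil
}

// Compile-time check that the stub satisfies the interface.
var _ Stores = parquetBucketStores{}

func main() {
	var s Stores = parquetBucketStores{} // chosen at startup when the flag is enabled
	names, _ := s.LabelNames("tenant-1")
	fmt.Println(names)
}
```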
parquet: Scaffolding for parquet bucket store Series() (#11729)
* parquet: Scaffolding for parquet bucket store
* use parquetshardopener and be sure to close them
* gci pkg/storegateway/parquet_bucket_stores.go
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
---------
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
fix split between Parquet Stores and each tenant's Store (#11735)
fix split between Parquet Stores and each tenant's Store
parquet store-gateways blocks sync and lazy reader (#11759)
parquet-bucket-store: finish implementing Stores interface (#11772)
We're trying to mirror the existing bucket store structure for the parquet implementation, and in this PR I'm implementing some of the necessary methods, starting with building up the series sets for labels calls:
- Series
- LabelNames
- LabelValues
---------
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
Co-authored-by: Nicolas Pazos <nicolas.pazos-mendez@grafana.com>
Co-authored-by: Nicolás Pazos <npazosmendez@gmail.com>
fix(parquet): share `ReaderPoolMetrics` instance (#11851)
We create multiple instances of `ReaderPool`; passing the registry and creating the metrics on the fly in each one causes panics.
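The fix follows the usual create-once, share-everywhere pattern. A toy model of why it matters, with a registry that mimics Prometheus `MustRegister` semantics (duplicate registration panics); the types here are simplified stand-ins, not client_golang or the real `ReaderPool`:

```go
package main

import "fmt"

// registry mimics MustRegister semantics: registering the same metric
// name twice panics. This is a toy model, not client_golang.
type registry struct{ seen map[string]bool }

func (r *registry) mustRegister(name string) {
	if r.seen[name] {
		panic("duplicate metrics collector registration: " + name)
	}
	r.seen[name] = true
}

// readerPoolMetrics stands in for ReaderPoolMetrics: created exactly once.
type readerPoolMetrics struct{ name string }

func newReaderPoolMetrics(r *registry) *readerPoolMetrics {
	r.mustRegister("reader_pool_loads_total")
	return &readerPoolMetrics{name: "reader_pool_loads_total"}
}

// readerPool takes the shared metrics instead of the registry, so
// constructing many pools no longer re-registers (and panics).
type readerPool struct{ metrics *readerPoolMetrics }

func newReaderPool(m *readerPoolMetrics) *readerPool { return &readerPool{metrics: m} }

func main() {
	r := &registry{seen: map[string]bool{}}
	shared := newReaderPoolMetrics(r) // create the metrics once...
	pools := []*readerPool{newReaderPool(shared), newReaderPool(shared)} // ...share them
	fmt.Println(len(pools))
}
```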
fix(parquet store gateway): close things that should be closed (#11865)
feat(parquet store gateway): support download labels file without validating (#11866)
Signed-off-by: Nicolás Pazos <npazosmendez@gmail.com>
Co-authored-by: francoposa <franco@francoposa.io>
fix(parquet store gateway): pass blockReader to bucket block constructor (#11875)
fix: don't stop nil services
fix(parquet store gateways): correctly locate labels parquet files locally (#11894)
parquet bucket store: add some debug logging (#11925)
Adding a few log statements to the existing code path, with useful information for understanding when and why we return 0 series.
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
parquet store gateways: several fixes and basic tests (#11929)
Co-authored-by: francoposa <franco@francoposa.io>
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
parquet converter: include user id in converter counter metrics (#11966)
Adding user id to the converter metrics to better track converter
progress through tenants.
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
Parquet converter: Implement priority queue for block conversion (#11980)
This PR redesigns the parquet converter to use a non-blocking priority
queue that prioritises recently uploaded blocks for conversion.
* Priority Queue Implementation:
- Replaces blocking nested loops with a thread-safe priority queue using
container/heap
- Blocks are prioritized by ULID timestamp, ensuring older blocks are
processed first
* Separate block discovery:
- There is a new discovery goroutine that periodically discovers users
and blocks, enqueuing them for processing
- If a block was previously processed, it is marked as converted and skipped the next time it's discovered.
- There is a new configuration flag `parquet-converter.max-block-age` that gives us a rolling window of blocks so we don't queue up all the work at once. We can set it to 30 days so that only blocks up to 30 days old are converted, and once that work is completed we can increase the window again.
- There is a new processing goroutine that continuously consumes from
the priority queue and converts blocks
- Main Loop remains responsive and handles only service lifecycle events
* New metrics
- Since we added a priority queue, I added 5 new metrics for queue
monitoring:
- cortex_parquet_converter_queue_size - Current queue depth
- cortex_parquet_converter_queue_wait_time_seconds - Time blocks spend queued
- cortex_parquet_converter_queue_items_enqueued_total - Total blocks enqueued
- cortex_parquet_converter_queue_items_processed_total - Total blocks processed
- cortex_parquet_converter_queue_items_dropped_total - Total blocks dropped when the queue is closed
The idea is that the queue metrics tell us how much we need to scale up to deal with the pending work. Before this PR we had no visibility into how much work was left to be done; now we do.
---------
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
fix(parquet store gateway): obey query sharding matchers (#12018)
Inefficient, but at least correct query sharding. The new test on
sharding fails on the base branch.
It's not trivial to add caching to the hashes like the main path does,
because we don't have a `SeriesRef` to use as a cache key at the block
level (to match what the main path does). We could in theory use
something like the row number in the parquet file, but we don't have
easy access to that in this part of the code. In any case, the priority right now is correctness; we'll work on optimizing later as appropriate.
For reference, see how query sharding is handled on the main path:
https://github.com/grafana/mimir/blob/604775d447c0a9e893fa6930ef8f2d403ebe6757/pkg/storegateway/series_refs.go#L1021-L1047
fix(parquet store gateway): panic in Series call with SkipChunks (#12020)
`chunksIt` is `nil` when `SkipChunks` is `true`.
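The fix boils down to not touching the chunk iterator when chunks were skipped. A minimal sketch of the guard; the types and the `collectChunks` helper are illustrative, not Mimir's actual code:

```go
package main

import "fmt"

type chunk struct{ data []byte }

// chunkIterator is a stand-in for the iterator that is nil when
// SkipChunks is set on the Series request.
type chunkIterator struct {
	chunks []chunk
	i      int
}

func (it *chunkIterator) Next() bool {
	if it.i >= len(it.chunks) {
		return false
	}
	it.i++
	return true
}

// collectChunks guards against the nil iterator instead of dereferencing
// it unconditionally, which is the panic described above.
func collectChunks(it *chunkIterator, skipChunks bool) []chunk {
	if skipChunks || it == nil {
		return nil // labels-only response: there are no chunks to drain
	}
	var out []chunk
	for it.Next() {
		out = append(out, it.chunks[it.i-1])
	}
	return out
}

func main() {
	fmt.Println(collectChunks(nil, true) == nil) // no panic with SkipChunks
}
```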
parquet-converter debug log messages (#12021)
Co-authored-by: Jesus Vazquez <jesus.vazquez@grafana.com>
chore(parquet): Bump parquet-common dependency (#12023)
Brings the last commit from parquet-common
[0811a700a852759c16799358b4424d9888afec3f](prometheus-community/parquet-common@0811a70)
See link for the diff between the two commits
prometheus-community/parquet-common@76512c6...0811a70
---------
Co-authored-by: francoposa <franco@francoposa.io>
feature(parquet): Implement store-gateway limits (#12040)
This PR is based on the upstream work
prometheus-community/parquet-common#81
The idea is to implement a set of basic quota limiters that can protect the gateways against potentially bad queries.
Note we had to bring in bits of code that live in the querier upstream, because Mimir has its own chunk querier.
---------
Signed-off-by: Jesus Vazquez <jesus.vazquez@grafana.com>
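A sketch of the shape of such a quota limiter: every fetch from the labels and chunks parquet files reserves bytes against a per-query budget, and the query is aborted once the combined total crosses the limit. The `bytesLimiter` type and `Reserve` method are illustrative assumptions, not the upstream API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// bytesLimiter enforces a per-query cap on the total bytes fetched from
// the parquet labels and chunks files combined. A limit of 0 disables it.
type bytesLimiter struct {
	limit   uint64
	current atomic.Uint64
}

// Reserve accounts for n more fetched bytes and returns an error once the
// running total exceeds the limit, so the query can stop early.
func (l *bytesLimiter) Reserve(n uint64) error {
	if total := l.current.Add(n); l.limit > 0 && total > l.limit {
		return fmt.Errorf("exceeded fetched bytes limit: %d > %d", total, l.limit)
	}
	return nil
}

func main() {
	l := &bytesLimiter{limit: 1024}
	fmt.Println(l.Reserve(512))        // within budget
	fmt.Println(l.Reserve(600) != nil) // 1112 > 1024: query stops with a limit error
}
```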
Usage of ./cmd/mimir/mimir (excerpt):

        Maximum size in bytes that can be fetched from the parquet labels and chunks data files combined in a single query. If the size exceeds this value, the query will stop with a limit error.
        Max size - in bytes - of a gap for which the partitioner aggregates together two bucket GET object requests. (default 524288)
  -blocks-storage.bucket-store.posting-offsets-in-mem-sampling int
        Maximum time to wait for ring stability at startup. If the overrides-exporter ring keeps changing after this period of time, it will start anyway. (default 5m0s)
        Minimum time to wait for ring stability at startup, if set to positive value. Set to 0 to disable.
  -parquet-converter.conversion-interval duration
        The frequency at which the conversion runs. (default 1m0s)
  -parquet-converter.data-dir string
        Directory to temporarily store blocks during conversion. This directory is not required to persist between restarts. (default "./data-parquet-converter/")
        Comma-separated list of tenants that cannot have their TSDB blocks converted into Parquet. If specified, and the Parquet-converter would normally pick a given tenant to convert the blocks to Parquet (via -parquet-converter.enabled-tenants or sharding), it is ignored instead.
  -parquet-converter.discovery-interval duration
        The frequency at which user and block discovery runs. (default 5m0s)
        Comma-separated list of tenants that can have their TSDB blocks converted into Parquet. If specified, the Parquet-converter only converts these tenants. Otherwise, it converts all tenants. Subject to sharding.
  -parquet-converter.max-block-age duration
        Maximum age of blocks to convert. Blocks older than this will be skipped. Set to 0 to disable age filtering.
        Maximum time to wait for ring stability at startup. If the Parquet-converter ring keeps changing after this period of time, the Parquet-converter starts anyway. (default 5m0s)
        Comma-separated list of tenants that can be loaded by the store-gateway. If specified, only blocks for these tenants will be loaded by the store-gateway, otherwise all tenants can be loaded. Subject to sharding.
  -store-gateway.parquet-enabled
        [experimental] Whether to query Parquet files for blocks instead of the native Prometheus TSDB files.
  -store-gateway.sharding-ring.auto-forget-enabled
        [deprecated] When enabled, a store-gateway is automatically removed from the ring after failing to heartbeat the ring for a period longer than the configured -store-gateway.sharding-ring.auto-forget-unhealthy-periods times the configured -store-gateway.sharding-ring.heartbeat-timeout. This setting is deprecated. Set -store-gateway.sharding-ring.auto-forget-unhealthy-periods to 0 to disable auto-forget. (default true)
  -store-gateway.sharding-ring.auto-forget-unhealthy-periods int
cmd/mimir/help.txt.tmpl (20 lines added):

  -overrides-exporter.ring.store string
        Backend storage to use for the ring. Supported values are: consul, etcd, inmemory, memberlist, multi. (default "memberlist")
  -parquet-converter.conversion-interval duration
        The frequency at which the conversion runs. (default 1m0s)
  -parquet-converter.data-dir string
        Directory to temporarily store blocks during conversion. This directory is not required to persist between restarts. (default "./data-parquet-converter/")
  -parquet-converter.discovery-interval duration
        The frequency at which user and block discovery runs. (default 5m0s)
  -parquet-converter.max-block-age duration
        Maximum age of blocks to convert. Blocks older than this will be skipped. Set to 0 to disable age filtering.
  -parquet-converter.ring.consul.hostname string
        Hostname and port of Consul. (default "localhost:8500")