1️⃣ Metastore (Tarchia) #529
-
Performance considerations: T-digest is slower than Distogram by about a factor of 10, and Distograms can be combined, with some enhancements (already written). For each field, record source information. Record individual values, up to 8 unique values. We can't estimate cardinality over discrete datasets at speed (we could use HLL, but it isn't fast enough), so don't try.
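A minimal sketch of the two behaviours described here, assuming a simple Ben-Haim/Tom-Tov style histogram rather than the Distogram library itself: per-blob field profiles that can be merged, and individual values tracked only up to 8 uniques before cardinality estimation is abandoned. The class, its limits, and the bin count are illustrative assumptions.

```python
# Illustrative sketch (not the Distogram library): a mergeable streaming
# histogram plus capped distinct-value tracking, for numeric fields only.

class FieldProfile:
    MAX_UNIQUE = 8    # record individual values up to 8 uniques, then give up
    MAX_BINS = 32     # histogram resolution (an assumed, tunable value)

    def __init__(self):
        self.bins = []        # sorted list of [centroid, count]
        self.uniques = set()  # becomes None once MAX_UNIQUE is exceeded

    def add(self, value, count=1):
        if self.uniques is not None:
            self.uniques.add(value)
            if len(self.uniques) > self.MAX_UNIQUE:
                self.uniques = None  # too many distincts: don't estimate cardinality
        self.bins.append([float(value), count])
        self._compress()

    def merge(self, other):
        # merging is concatenating bins and re-compressing, which is why
        # per-blob profiles can be combined cheaply after the fact
        merged = FieldProfile()
        merged.bins = self.bins + other.bins
        merged._compress()
        if self.uniques is None or other.uniques is None:
            merged.uniques = None
        else:
            union = self.uniques | other.uniques
            merged.uniques = union if len(union) <= self.MAX_UNIQUE else None
        return merged

    def _compress(self):
        # repeatedly fuse the two closest centroids until within budget
        self.bins.sort()
        while len(self.bins) > self.MAX_BINS:
            i = min(range(len(self.bins) - 1),
                    key=lambda j: self.bins[j + 1][0] - self.bins[j][0])
            (c1, n1), (c2, n2) = self.bins[i], self.bins[i + 1]
            self.bins[i:i + 2] = [[(c1 * n1 + c2 * n2) / (n1 + n2), n1 + n2]]
```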
-
Detach the physical path from the logical name, allowing tables to have more consistent naming without moving the existing files. Alias columns to provide better compatibility across systems. Fill default values for schema evolution, based on the catalogue, not page 1.
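As an illustration, a catalogue entry might carry all three of these ideas at once; the field names below are hypothetical, not a proposed schema.

```python
# Illustrative catalogue entry: the logical name is what queries reference,
# the physical path is where the blobs actually live (files never move), and
# defaults cover columns added by schema evolution. Field names are hypothetical.
table_entry = {
    "logical_name": "sales.orders",
    "physical_path": "gs://legacy-bucket/raw/orders_v2/",
    "columns": [
        {
            "name": "order_id",
            "aliases": ["orderId", "ORDER_ID"],  # compatibility across systems
            "type": "INTEGER",
        },
        {
            "name": "discount",
            "type": "DOUBLE",
            "default": 0.0,  # filled from the catalogue when older blobs lack it
        },
    ],
}
```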
-
We have three distinct collections.
We're writing a dataset catalogue first; we can realize the benefits in planning before we start to do any work with statistics. The binder should map which field comes from which table as a priority. This will allow pushdowns in more complex queries, which in turn will allow us to do blob pruning, maybe only BRIN / small unique-values pruning.
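A sketch of the binder mapping fields to their source tables, which is the piece that makes pushdown decisions possible; the structures and names are assumptions for illustration.

```python
# Sketch: once the binder knows which relation each field came from, a
# predicate can be routed to the scan of that relation (a pushdown).
# All table and column names here are illustrative.
field_sources = {
    "orders.customer_id": "orders",
    "customers.customer_id": "customers",
    "customers.region": "customers",
}

def pushdown_target(predicate_field: str):
    """Return the table whose scan should receive this predicate."""
    return field_sources.get(predicate_field)

# e.g. WHERE customers.region = 'EU' can be applied while reading the
# 'customers' blobs, before the join, enabling blob pruning on that scan:
assert pushdown_target("customers.region") == "customers"
```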
-
Can we use something like a trie to get distribution details for strings? We could try to have a trie no more than three layers deep and count at the nodes.
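One way the three-layer counting trie could look, as a sketch rather than a committed design: counts at the prefix nodes approximate the string distribution without storing the values themselves.

```python
from collections import defaultdict

class PrefixTrie:
    """Depth-capped prefix counter: an approximate string distribution."""

    def __init__(self, max_depth=3):
        self.max_depth = max_depth
        self.counts = defaultdict(int)  # prefix -> number of values seen

    def add(self, value: str):
        # count the value at each node (prefix) down to max_depth
        for depth in range(1, min(self.max_depth, len(value)) + 1):
            self.counts[value[:depth]] += 1

    def estimate(self, prefix: str) -> int:
        """Approximate how many values start with prefix (truncated to max_depth)."""
        return self.counts.get(prefix[: self.max_depth], 0)

trie = PrefixTrie()
for word in ("alpha", "alps", "beta"):
    trie.add(word)
assert trie.estimate("al") == 2
```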
-
Tarchia will be the third system in the suite.
Mabel (or Opteryx) adding a blob should trigger statistics being built for it; ANALYZE should read the blobs rather than Tarchia. In effect this is ad hoc stats creation and refresh.
Initial implementation should just be via the API; Opteryx can support it via CREATE statements later.
Mabel should trigger a call to Tarchia directly, via Cloud Tasks, and via PubSub, to add new blobs.
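As a sketch of the direct path, assuming Tarchia exposes an HTTP API: the endpoint, payload, and function name below are all hypothetical, and the same message could just as well be delivered through Cloud Tasks or PubSub.

```python
import requests

def notify_blob_added(dataset: str, blob_path: str) -> None:
    # Hypothetical call Mabel would make after writing a blob; the URL and
    # payload shape are illustrative assumptions, not Tarchia's actual API.
    requests.post(
        "https://tarchia.example.com/v1/blobs",
        json={"dataset": dataset, "blob": blob_path},
        timeout=10,
    )
```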
Statistics have three key purposes, listed further below. For example, the Optimizer will be able to take advantage of new facts when planning after binding, such as knowing the range of a column to push a read filter to a joined table.
To do these things we need volume and count information, the bounds of the values (BRIN), and distributions of the values (a histogram). For non-blob stores, this would be a single set of statistics for the entire dataset.
We should probably start with an `ANALYZE TABLE` query to create metadata, recorded per partition, with nodes per blob within that partition. For non-blob stores we should just have min, max, count and AVG numbers, as full profiling may be slow. We should also implement `SHOW STATISTICS FOR table` (see https://trino.io/docs/current/sql/show-stats.html) to recall them. This will break the temporal filters. We should use the KVStore model, with extra interfaces to filter results.
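A sketch of what the KVStore model with a filtering interface might look like; the method names are assumptions, not an existing interface.

```python
from abc import ABC, abstractmethod
from typing import Callable, Iterator, Tuple

class KVStore(ABC):
    """Keys are partition paths; values are the recorded statistics."""

    @abstractmethod
    def get(self, key: str) -> dict: ...

    @abstractmethod
    def set(self, key: str, value: dict) -> None: ...

    @abstractmethod
    def scan(self, prefix: str) -> Iterator[Tuple[str, dict]]: ...

    def scan_where(
        self, prefix: str, predicate: Callable[[dict], bool]
    ) -> Iterator[Tuple[str, dict]]:
        # the 'extra interface to filter results'; a relational backend
        # (MySQL/Postgres) could push this predicate into a WHERE clause
        return ((k, v) for k, v in self.scan(prefix) if predicate(v))
```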
A KVStore should be written for a relational store (MySQL / Postgres).
The store should be referenceable by partition path, and contain the blobs in the partition, the fields in the blobs, and the attributes of the columns, with counts held at the partition level. This then allows (illustrated in the sketch below):
1. Fast COUNT, MIN, MAX and AVG responses.
2. Pruning blobs from the query if they don't contain values covering the range being searched for.
3. Cardinality and distribution approximations for query optimization (join ordering, filter ordering).
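A sketch of these three uses with per-blob min/max/count statistics; the paths and the layout of the stats records are illustrative.

```python
# Hypothetical per-blob stats for one column, keyed by blob path.
blob_stats = {
    "part=2024-01/blob-0001.parquet": {"min": 1.0, "max": 40.0, "count": 1000},
    "part=2024-01/blob-0002.parquet": {"min": 55.0, "max": 90.0, "count": 800},
}

# 1. fast aggregates without touching the blobs
total = sum(s["count"] for s in blob_stats.values())        # COUNT
low = min(s["min"] for s in blob_stats.values())            # MIN
high = max(s["max"] for s in blob_stats.values())           # MAX

# 2. BRIN-style pruning: WHERE x > 50 only needs blobs whose max exceeds 50
candidates = [path for path, s in blob_stats.items() if s["max"] > 50.0]
assert candidates == ["part=2024-01/blob-0002.parquet"]

# 3. the same records feed cardinality/distribution estimates to the optimizer
```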
The metastore API should be as declarative as practical to allow for a wide range of implementations underneath, e.g. 'here are these blobs, these predicates and aggregates; you work out what to do.'
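The 'you work out what to do' idea as a hypothetical request payload; the shape is illustrative, not a proposed wire format.

```python
# Declarative request: the caller states which blobs it has and which
# predicates and aggregates it wants; the implementation decides whether
# stored statistics suffice or the blobs must be read.
request = {
    "blobs": [
        "part=2024-01/blob-0001.parquet",
        "part=2024-01/blob-0002.parquet",
    ],
    "predicates": [("price", ">", 50.0)],
    "aggregates": ["COUNT(*)", "MIN(price)", "MAX(price)"],
}
```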