Dynamic Programming

class postbound.opt.dynprog.DynamicProgrammingEnumerator(*args, **kwargs)

A very basic dynamic programming-based plan enumerator.

The enumerator is basic in that it implements no sophisticated pruning rules or traversal strategies and considers only a small subset of possible operators. It enumerates all possible access paths and join paths and picks the cheapest one. Treat it as a starting point for when no better enumerator implementation is available (see Limitations below). Its main purpose is to spare users who are only interested in the cost model or the cardinality estimator from having to implement their own enumerator in order to use the TextBookOptimizationPipeline. For experiments based on PostgreSQL, a much more sophisticated implementation is available with the PostgresDynProg enumerator, which is selected automatically when the textbook pipeline is used with a Postgres target database.
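To make "enumerate all join paths and pick the cheapest one" concrete, here is a minimal, self-contained sketch of a bottom-up dynamic programming search over relation subsets. It does not use PostBOUND and is not its implementation: the function name `cheapest_plan`, the dictionary-based inputs, and the toy cost model (sum of intermediate result sizes) are all assumptions made for illustration.

```python
from itertools import combinations


def cheapest_plan(cardinalities, selectivities):
    """Return (cost, plan) for joining all relations without cross products.

    cardinalities: {relation: rows}; selectivities: {(rel_a, rel_b): factor}.
    Hypothetical toy cost model: scans cost their row count, joins cost
    their output size, and plan cost is the sum over all operators.
    """
    relations = sorted(cardinalities)
    # best[subset] = (cost, rows, plan) for the cheapest plan found so far.
    best = {frozenset([r]): (cardinalities[r], cardinalities[r], r) for r in relations}

    for size in range(2, len(relations) + 1):
        for subset in map(frozenset, combinations(relations, size)):
            for k in range(1, size):
                for left in map(frozenset, combinations(sorted(subset), k)):
                    right = subset - left
                    if left not in best or right not in best:
                        continue  # one side cannot be built without a cross product
                    # Join predicates that connect the two sides.
                    crossing = [
                        s for (a, b), s in selectivities.items()
                        if (a in left and b in right) or (a in right and b in left)
                    ]
                    if not crossing:
                        continue  # joining these sides would be a cross product
                    rows = best[left][1] * best[right][1]
                    for s in crossing:
                        rows *= s
                    cost = best[left][0] + best[right][0] + rows
                    if subset not in best or cost < best[subset][0]:
                        best[subset] = (cost, rows, (best[left][2], best[right][2]))

    cost, _, plan = best[frozenset(relations)]
    return cost, plan


cost, plan = cheapest_plan(
    {"a": 1000, "b": 10, "c": 100},
    {("a", "b"): 0.1, ("b", "c"): 0.1},
)
print(cost, plan)
```

Only the single cheapest plan per subset is kept, which is exactly why properties such as sort order cannot influence later decisions in a search of this shape.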

Limitations

Only the cheapest access paths are considered, without taking sort orders into account. This prevents free merge join optimizations, i.e. if an access path is more expensive but already sorted, it will be discarded in favor of a cheaper alternative, even though a later merge join might become much cheaper due to the sort order.
No optimizations to intermediates are considered, i.e. no materialization or memoization of subplans.
Only the basic scan and join operators are considered. For scans, this includes sequential scan, index scan, index-only scan and bitmap scan. For joins, this includes nested loop join, hash join and sort merge join. These can be further restricted through the supported_scan_ops and supported_join_ops parameters.
Only simple SPJ queries are supported. Importantly, the query may not contain any set operations, subqueries, CTEs etc. All joins must be inner equijoins and no cross products are allowed.
Aggregations, sorting, etc. are not considered. In this way, the enumerator is comparable to the join_search_hook of PostgreSQL. We assume that such “technicalities” are handled when creating appropriate hints for the target database or when executing the query on the target database at the latest.
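The first limitation can be illustrated with a small worked example. The numbers, the `AccessPath` class, and the constant sort and merge costs below are hypothetical and chosen only to show how keeping just the cheapest access path can lead to a more expensive merge join.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPath:
    """Hypothetical access path: a name, a cost, and whether output is sorted."""
    name: str
    cost: float
    is_sorted: bool


SORT_COST = 50.0   # assumed cost of sorting an unsorted input
MERGE_COST = 10.0  # assumed cost of the merge step itself


def merge_join_cost(path: AccessPath) -> float:
    """Cost of feeding this path into a merge join, sorting first if needed."""
    return path.cost + (0.0 if path.is_sorted else SORT_COST) + MERGE_COST


paths = [
    AccessPath("seq scan", 100.0, is_sorted=False),
    AccessPath("index scan", 120.0, is_sorted=True),
]

# What a cheapest-only enumerator keeps: the path with the lowest scan cost.
cheapest_only = min(paths, key=lambda p: p.cost)
# What an order-aware enumerator could pick for a subsequent merge join.
order_aware = min(paths, key=merge_join_cost)

print(cheapest_only.name, merge_join_cost(cheapest_only))
print(order_aware.name, merge_join_cost(order_aware))
```

Under these assumed costs, the cheaper scan ends up in the more expensive join because the sort has to be paid for later.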

Parameters:

supported_scan_ops (Optional[set[ScanOperator]], optional) – The set of scan operators to consider during enumeration. This should be a subset of the following operators: sequential scan, index scan, index-only scan, bitmap scan. Any other operators in the set are never considered. By default, all operators that are available on the target_db are allowed.
supported_join_ops (Optional[set[JoinOperator]], optional) – The set of join operators to consider during enumeration. This should be a subset of the following operators: nested loop join, hash join, sort merge join. Any other operators in the set are never considered. By default, all operators that are available on the target_db are allowed.
target_db (Optional[Database], optional) – The target database system for which the optimization pipeline is intended. If omitted, the database is inferred from the DatabasePool.
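The filtering semantics described for the operator parameters can be sketched as a set intersection. This is an illustration only: `ScanOp`, `ENUMERABLE_SCANS`, and `effective_scan_ops` are hypothetical stand-ins and not PostBOUND's ScanOperator or its internal logic.

```python
from enum import Enum
from typing import Optional


class ScanOp(Enum):
    """Hypothetical stand-in for illustration, not PostBOUND's ScanOperator."""
    SEQUENTIAL = "sequential scan"
    INDEX = "index scan"
    INDEX_ONLY = "index-only scan"
    BITMAP = "bitmap scan"
    TID = "tid scan"  # an operator the enumerator does not know about


# Operators the enumerator can plan with, per the parameter description.
ENUMERABLE_SCANS = {ScanOp.SEQUENTIAL, ScanOp.INDEX, ScanOp.INDEX_ONLY, ScanOp.BITMAP}


def effective_scan_ops(requested: Optional[set], available_on_db: set) -> set:
    """Resolve the scan operators that enumeration would actually use."""
    # Assumption: omitting the parameter falls back to what the database offers.
    candidates = available_on_db if requested is None else requested
    # Unknown operators are silently dropped rather than rejected.
    return candidates & ENUMERABLE_SCANS


print(effective_scan_ops({ScanOp.INDEX, ScanOp.TID}, set(ScanOp)))
```

The same pattern would apply to join operators with nested loop join, hash join, and sort merge join as the enumerable set.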

generate_execution_plan(query, *, cost_model, cardinality_estimator) → QueryPlan

Computes the optimal plan to execute the given query.

Parameters:

query (SqlQuery) – The query to optimize
cost_model (CostModel) – The cost model to compare different candidate plans
cardinality_estimator (CardinalityEstimator) – The cardinality estimator to calculate the sizes of intermediate results

Returns:

The query plan

Return type:

QueryPlan

Notes

The precise generation “style” (e.g. top-down vs. bottom-up, complete plans vs. plan fragments) is entirely up to the specific algorithm. It is therefore hard to provide a more expressive interface for the enumerator than just generating a plan. Generally, the enumerator should query the cost model to compare different candidates. The top-most operator of each candidate will usually not have a cost estimate set at the beginning, and it is the enumerator’s responsibility to set the estimate correctly. The jointree.update_cost_estimate function can help with this.
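The responsibility described here (ask the cost model, store the estimate on the top-most operator, then compare) can be sketched with plain Python objects. `Candidate`, `ToyCostModel`, and `pick_cheapest` are hypothetical names for illustration. They do not mirror PostBOUND's QueryPlan or CostModel interfaces, and the sketch deliberately avoids calling jointree.update_cost_estimate, whose signature is not shown on this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    """Hypothetical candidate plan: only the top-most operator is modelled."""
    operator: str
    input_cost: float            # cost already attached to the child plans
    output_rows: float
    cost: Optional[float] = None  # the top-most operator starts without an estimate


class ToyCostModel:
    """Assumed cost model: child cost plus a per-row factor for the operator."""
    per_row = {"hash join": 1.0, "nested loop join": 4.0}

    def estimate_cost(self, candidate: Candidate) -> float:
        return candidate.input_cost + self.per_row[candidate.operator] * candidate.output_rows


def pick_cheapest(candidates, cost_model):
    """Fill in missing cost estimates, then return the cheapest candidate."""
    for candidate in candidates:
        if candidate.cost is None:
            # The enumerator, not the cost model, stores the estimate on the plan.
            candidate.cost = cost_model.estimate_cost(candidate)
    return min(candidates, key=lambda c: c.cost)


candidates = [
    Candidate("nested loop join", input_cost=50.0, output_rows=100.0),
    Candidate("hash join", input_cost=80.0, output_rows=100.0),
]
best = pick_cheapest(candidates, ToyCostModel())
print(best.operator, best.cost)
```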

pre_check() → OptimizationPreCheck

Provides the requirements that the input query or database system must satisfy for the optimizer to work properly.

Returns:

The check instance. Can be an empty check if no specific requirements exist.

Return type:

OptimizationPreCheck
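One requirement stated under Limitations is that no cross products are allowed, which means the join graph of the query must be connected. The following sketch shows how such a condition could be tested. It is a standalone illustration, not the logic of OptimizationPreCheck, and the function name `is_connected` and its inputs are hypothetical.

```python
def is_connected(relations, join_edges):
    """Return True if every relation is reachable via join predicates."""
    relations = set(relations)
    if not relations:
        return True
    # Build an undirected adjacency map from the join predicates.
    adjacency = {r: set() for r in relations}
    for a, b in join_edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first traversal from an arbitrary relation.
    start = next(iter(relations))
    seen = {start}
    stack = [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == relations


print(is_connected({"a", "b", "c"}, [("a", "b"), ("b", "c")]))
print(is_connected({"a", "b", "c"}, [("a", "b")]))
```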

describe() → jsondict

Provides a JSON-serializable representation of the specific strategy, as well as important parameters.

Returns:

The description

Return type:

jsondict
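As a rough idea of what a JSON-serializable description involves, the sketch below builds a plain dictionary and round-trips it through the standard json module. The function `describe_enumerator` and the keys it emits are assumptions for illustration. The actual keys returned by describe() are not documented on this page.

```python
import json


def describe_enumerator(scan_ops, join_ops, database=None):
    """Build a JSON-serializable description (hypothetical keys)."""
    return {
        "name": "dynamic_programming",
        # Sets are not JSON-serializable, so emit sorted lists instead.
        "scan_ops": sorted(scan_ops),
        "join_ops": sorted(join_ops),
        "database": database,
    }


description = describe_enumerator(
    {"index scan", "sequential scan"}, {"hash join"}, database="postgres"
)
encoded = json.dumps(description)
print(encoded)
```

Converting operator sets to sorted lists keeps the output deterministic, which helps when descriptions are stored alongside experiment results.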
