Native optimization strategies#
Native strategies obtain execution plans from actual database management systems.
Instead of performing optimizations on their own, the native stages delegate all decisions to a specific database system. Afterwards, they analyze the query plan and encode the relevant information in a stage-specific format.
Notes
By combining native stages with different target database systems, the optimizers of the respective systems can be combined. For example, combining a join ordering stage with an Oracle backend and an operator selection stage with a Postgres backend would provide a combined query optimizer with Oracle’s join ordering algorithm and Postgres’ operator selection.
- exception postbound.opt.native.CostEstimationWarning#
- class postbound.opt.native.NativeCostModel(*args, **kwargs)#
Obtains the cost of a query plan by using the cost model of an actual database system.
- Parameters:
raise_on_error (bool) – Whether the cost model should raise an error if anything goes wrong during the estimation. For example, this can happen if the query plan cannot be executed on the target database system. If this is off (the default), failure results in an infinite cost.
verbose (bool, optional) – Whether the cost model should issue warnings if anything goes wrong during the estimation. This includes cases where the cost of some operators cannot be estimated by the target database system.
- estimate_cost(query: SqlQuery, plan: QueryPlan) float#
Computes the cost estimate for a specific plan.
The following conventions are used for the estimation: the root node of the plan will not have any cost set. However, all input nodes will have already been estimated by earlier calls to the cost model. Hence, while estimating the cost of the root node, all earlier costs will be available as inputs. It is further assumed that all nodes already have associated cardinality estimates. This method explicitly does not make any assumption regarding the relationship between query and plan. Specifically, it does not assume that the plan is capable of computing the entire result set nor a correct result set. Instead, the plan might just be a partial plan that computes a subset of the query (e.g. a join of some of the tables). It is the implementation’s responsibility to figure out the appropriate course of action.
It is not the responsibility of the cost model to set the estimate on the plan, this is the task of the enumerator (which can decide whether the plan should be considered any further).
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe
- class postbound.opt.native.NativeCardinalityEstimator(*args, **kwargs)#
Obtains the cardinality of a query plan by using the cardinality estimator of an actual database system.
- Parameters:
target_db (Optional[Database])
- calculate_estimate(query: SqlQuery, intermediate: TableReference | Iterable[TableReference]) Cardinality#
Determines the cardinality of a specific intermediate.
- Parameters:
query (SqlQuery) – The query being optimized
intermediate (TableReference | Iterable[TableReference]) – The intermediate for which the cardinality should be estimated. All filter predicates, etc. that are applicable to the intermediate can be assumed to be applied.
- Returns:
The estimated cardinality of the specific intermediate
- Return type:
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe
- initialize(target_db: Database, query: SqlQuery) None#
Hook method that is called before the actual optimization process starts.
This method can be overwritten to set up any necessary data structures, etc. and will be called before each query. The default implementation stores the target database and query as attributes for later use.
- class postbound.opt.native.NativeJoinOrderOptimizer(*args, **kwargs)#
Obtains the join order for an input query by using the optimizer of an actual database system.
- Parameters:
db_instance (db.Database) – The target database whose optimization algorithm should be used.
- optimize_join_order(query: SqlQuery) JoinTree | None#
Performs the actual join ordering process.
The join tree can be further annotated with an initial operator assignment, if that is an inherent part of the specific optimization strategy. However, this is generally discouraged and the multi-stage pipeline will discard such operators to prepare for the subsequent physical operator selection.
Other than the join order and operator assignment, the algorithm should add as much information to the join tree as possible, e.g. including join conditions and cardinality estimates that were calculated for the selected joins. This enables other parts of the optimization process to re-use that information.
- Parameters:
query (SqlQuery) – The query to optimize
- Returns:
The join order. If for some reason there is no valid join order for the given query (e.g. queries with just a single selected table), None can be returned. Otherwise, the selected join order has to be described using a JoinTree.
- Return type:
Optional[LogicalJoinTree]
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe
- class postbound.opt.native.NativePhysicalOperatorSelection(*args, **kwargs)#
Obtains the physical operators for an input query by using the optimizer of an actual database system.
Since this process normally is the second stage in the optimization pipeline, the operators are selected according to a specific join order. If no such order exists, it is also determined by the database system.
- Parameters:
db_instance (db.Database) – The target database whose optimization algorithm should be used.
- select_physical_operators(query: SqlQuery, join_order: JoinTree | None) PhysicalOperatorAssignment#
Performs the operator assignment.
- Parameters:
- Returns:
The operator assignment. If for some reason no operators can be assigned, an empty assignment can be returned
- Return type:
Notes
The operator selection should handle a None join order gracefully. This can happen if the query does not require any joins (e.g. processing of a single table.
Depending on the specific optimization settings, it is also possible to raise an error if such a situation occurs and there is no reasonable way to deal with it.
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe
- class postbound.opt.native.NativePlanParameterization(*args, **kwargs)#
Obtains the plan parameters for an inpuit querry by using the optimizer of an actual database system.
This process determines the parameters according to a join order and physical operators. If no such information exists, it is also determined by the database system.
- Parameters:
db_instance (db.Database) – The target database whose optimization algorithm should be used.
- generate_plan_parameters(query: SqlQuery, join_order: JoinTree | None, operator_assignment: PhysicalOperatorAssignment | None) PlanParameterization | None#
Executes the actual parameterization.
- Parameters:
query (SqlQuery) – The query to optimize
join_order (Optional[JoinTree]) – The selected join order for the query.
operator_assignment (Optional[PhysicalOperatorAssignment]) – The selected operators for the query
- Returns:
The parameterization. If for some reason no parameters can be determined, an empty parameterization can be returned
- Return type:
Notes
Since this is the final stage of the optimization process, a number of special cases have to be handled:
the previous phases might not have determined any join order or operator assignment
there might not have been a physical operator selection, but only a join ordering (which potentially included an initial selection of physical operators)
there might not have been a join order optimization, but only a selection of physical operators
both join order and physical operators might have been optimized (in which case only the actual operator assignment matters, not any assignment contained in the join order)
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe
- class postbound.opt.native.NativeOptimizer(*args, **kwargs)#
Obtains a complete query execution plan by using the optimizer of an actual database system.
- Parameters:
db_instance (db.Database) – The target database whose optimization algorithm should be used.
- optimize_query(query: SqlQuery) QueryPlan#
Constructs the optimized execution plan for an input query.
- describe() jsondict#
Provides a JSON-serializable representation of the specific strategy, as well as important parameters.
- Returns:
The description
- Return type:
jsondict
See also
OptimizationPipeline.describe