Query Validation

Module name: postbound.validation

Pre-checks make sure that the selected optimization strategies can actually be applied to the input query and the target database.

These checks should prevent two kinds of mistakes: optimizing queries that contain features the optimization algorithm does not support, and using optimization algorithms whose decisions the target database cannot enforce.

The OptimizationPreCheck defines the abstract interface that all checks should adhere to.

class postbound.validation.PreCheckResult(passed: bool = True, failure_reason: str | list[str] = '')#

Wrapper for a validation result.

The result is used in two different ways: to model the check for supported database systems for optimization strategies and to model the check for supported queries for optimization strategies.

The ensure_all_passed method can be used to quickly assert that no problems occurred.

Parameters:
  • passed (bool)

  • failure_reason (str | list[str])

passed#

Indicates whether the check passed, i.e. whether no problems were detected

Type:

bool

failure_reason#

Gives details about the problem(s) that were detected

Type:

str | list[str], optional

static with_all_passed() PreCheckResult#

Generates a check result without any problems.

Returns:

The check result

Return type:

PreCheckResult

static merge(checks: Iterable[PreCheckResult]) PreCheckResult#

Merges multiple check results into a single result.

The merged result passes only if all input checks passed. If any of the checks failed, the failure reasons are merged into a single list.

Parameters:

checks (Iterable[PreCheckResult]) – The check results to merge

Returns:

The merged check result

Return type:

PreCheckResult
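The merge semantics described above can be sketched as follows. This is a minimal illustration, not the library's implementation: ResultSketch and merge_sketch are hypothetical stand-ins that only mirror the documented passed and failure_reason fields, so that the snippet runs without PostBOUND installed.

```python
from __future__ import annotations

from collections.abc import Iterable
from dataclasses import dataclass


@dataclass
class ResultSketch:
    """Hypothetical stand-in for PreCheckResult (documented fields only)."""

    passed: bool = True
    failure_reason: str | list[str] = ""


def merge_sketch(checks: Iterable[ResultSketch]) -> ResultSketch:
    """Pass only if every input passed; otherwise collect all failure reasons."""
    all_passed = True
    reasons: list[str] = []
    for check in checks:
        if check.passed:
            continue
        all_passed = False
        # A failure reason may be a single string or a list of strings.
        if isinstance(check.failure_reason, str):
            reasons.append(check.failure_reason)
        else:
            reasons.extend(check.failure_reason)
    return ResultSketch() if all_passed else ResultSketch(False, reasons)


merged = merge_sketch([
    ResultSketch(),
    ResultSketch(False, "cross product"),
    ResultSketch(False, ["subquery", "set operation"]),
])
```

Here merged is a failed result whose failure_reason lists all three problems in input order.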

with_failure(failure: str | list[str]) PreCheckResult#

Generates a check result for a specific failure.

Parameters:

failure (str | list[str]) – The failure message(s)

Returns:

The check result

Return type:

PreCheckResult

ensure_all_passed(context: SqlQuery | Database | None = None) None#

Raises an error if the check contains any failures.

Depending on the context, a more specific error can be raised. The context is used to infer whether an optimization strategy does not work on a database system, or whether an input query is not supported by an optimization strategy.

Parameters:

context (SqlQuery | Database | None, optional) – An indicator of the kind of check that was performed. This influences the kind of error that will be raised in case of failure. Defaults to None if no further context is available.

Raises:
Return type:

None
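One plausible reading of the context-dependent behavior is sketched below. All names are hypothetical stand-ins for SqlQuery, Database, UnsupportedQueryError and UnsupportedSystemError. The mapping from context type to error type is inferred from the description above, and the generic RuntimeError for a missing context is purely an assumption, since the error raised in that case is not stated here.

```python
from __future__ import annotations


class QuerySketch:
    """Hypothetical stand-in for SqlQuery."""


class DatabaseSketch:
    """Hypothetical stand-in for Database."""


class QueryErrorSketch(RuntimeError):
    """Hypothetical stand-in for UnsupportedQueryError."""


class SystemErrorSketch(RuntimeError):
    """Hypothetical stand-in for UnsupportedSystemError."""


def ensure_all_passed_sketch(
    passed: bool,
    failure_reason: str | list[str],
    context: QuerySketch | DatabaseSketch | None = None,
) -> None:
    """Raise if the check failed; pick the error type from the context."""
    if passed:
        return
    if isinstance(context, QuerySketch):
        raise QueryErrorSketch(f"Query not supported: {failure_reason}")
    if isinstance(context, DatabaseSketch):
        raise SystemErrorSketch(f"Database not supported: {failure_reason}")
    # Assumption: without any context, fall back to a generic error.
    raise RuntimeError(f"Pre-check failed: {failure_reason}")
```

Passing the object that was checked as the context lets callers catch the more specific error type.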

exception postbound.validation.UnsupportedQueryError(query: SqlQuery, features: str | list[str] = '')#

Error to indicate that a specific query cannot be optimized by the selected algorithm.

Parameters:
  • query (SqlQuery) – The unsupported query

  • features (str | list[str], optional) – The features of the query that are unsupported. Defaults to an empty string

Return type:

None

exception postbound.validation.UnsupportedSystemError(db_instance: Database, reason: str = '')#

Error to indicate that a selected query plan cannot be enforced on a target system.

Parameters:
  • db_instance (Database) – The database system without a required feature

  • reason (str, optional) – The features that are not supported. Defaults to an empty string

Return type:

None

class postbound.validation.OptimizationPreCheck(name: str)#

The pre-check interface.

This is the type that all concrete pre-checks must implement. It contains two check methods: one for the database system and one for the input query. Both methods accept all input by default and must be overridden to execute the necessary checks.

Parameters:

name (str) – The name of the check. It should describe what features the check tests and will be used to represent the checks that are present in an optimization pipeline.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

check_supported_database_system(database_instance: Database) PreCheckResult#

Validates that a specific database system provides all features that are required by an optimization strategy.

Examples of such features can be support for cardinality hints or specific operators.

Parameters:

database_instance (Database) – The database to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

abstractmethod describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe
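The shape of a concrete pre-check can be sketched as follows. PreCheckSketch and ResultSketch are hypothetical stand-ins that mirror the interface documented above (default-pass check methods plus an abstract describe) so that the snippet runs on its own. NoUnionCheck is an invented example, and it inspects raw SQL text only because the stand-in has no real SqlQuery type.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class ResultSketch:
    """Hypothetical stand-in for PreCheckResult."""

    passed: bool = True
    failure_reason: str | list[str] = ""


class PreCheckSketch(ABC):
    """Hypothetical stand-in for OptimizationPreCheck."""

    def __init__(self, name: str) -> None:
        self.name = name

    def check_supported_query(self, query: str) -> ResultSketch:
        return ResultSketch()  # passes by default

    def check_supported_database_system(self, database_instance: object) -> ResultSketch:
        return ResultSketch()  # passes by default

    @abstractmethod
    def describe(self) -> dict:
        """Return a JSON-serializable description of the check."""


class NoUnionCheck(PreCheckSketch):
    """Invented example: rejects SQL text that mentions UNION."""

    def __init__(self) -> None:
        super().__init__("no-union")

    def check_supported_query(self, query: str) -> ResultSketch:
        if "union" in query.lower():
            return ResultSketch(False, "UNION is not supported")
        return ResultSketch()

    def describe(self) -> dict:
        return {"name": self.name}
```

Only the query check is overridden here, so the database check keeps its default behavior and passes for any input.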

class postbound.validation.EmptyPreCheck#

Dummy check that does not actually validate anything.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.CompoundCheck(checks: Iterable[OptimizationPreCheck])#

A compound check combines an arbitrary number of base checks and asserts that all of them are satisfied.

If multiple checks fail, the failure_reason of the result contains all individual failure reasons.

Parameters:

checks (Iterable[OptimizationPreCheck]) – The checks that must all be passed.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe
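The "all checks must pass, all failure reasons are kept" behavior can be illustrated with a deliberately simplified model. In this sketch a check is just a function from SQL text to a list of failure reasons (an empty list means the check passed). The function compound_sketch and both example checks are hypothetical and do not use the PostBOUND types.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable


def compound_sketch(
    checks: Iterable[Callable[[str], list[str]]],
) -> Callable[[str], list[str]]:
    """Combine checks; the result reports every individual failure reason."""
    all_checks = list(checks)

    def run_all(query: str) -> list[str]:
        reasons: list[str] = []
        for check in all_checks:
            # Keep going after a failure so that no reason is lost.
            reasons.extend(check(query))
        return reasons

    return run_all


def no_union(query: str) -> list[str]:
    return ["UNION is not supported"] if "union" in query.lower() else []


def no_select_star(query: str) -> list[str]:
    return ["SELECT * is not supported"] if "select *" in query.lower() else []


combined = compound_sketch([no_union, no_select_star])
```

A query that violates both rules yields both reasons, in the order in which the checks were supplied.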

postbound.validation.merge_checks(checks: OptimizationPreCheck | Iterable[OptimizationPreCheck], *more_checks) OptimizationPreCheck#

Combines all of the supplied checks into one compound check.

This method is smarter than creating a compound check directly. It eliminates duplicate checks as far as possible and ignores empty checks.

If there is only a single (unique) check, this check is returned directly.

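Parameters:
  • checks (OptimizationPreCheck | Iterable[OptimizationPreCheck]) – The checks to combine

  • *more_checks – Additional checks to combine

Returns: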

A check that combines all of the given checks.

Return type:

OptimizationPreCheck
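The deduplication rules can be sketched with plain strings in place of check objects. Everything here is hypothetical: check names stand in for checks, EMPTY stands in for an EmptyPreCheck, and a tuple stands in for a compound check. Returning EMPTY when nothing remains is an assumption, since that case is not described above.

```python
from __future__ import annotations

from collections.abc import Iterable

EMPTY = "empty"  # hypothetical stand-in for an EmptyPreCheck


def merge_checks_sketch(
    checks: str | Iterable[str], *more_checks: str
) -> str | tuple[str, ...]:
    """Drop empty and duplicate checks; unwrap a single remaining check."""
    first = [checks] if isinstance(checks, str) else list(checks)
    unique: list[str] = []
    for check in [*first, *more_checks]:
        if check == EMPTY or check in unique:
            continue
        unique.append(check)
    if not unique:
        return EMPTY  # assumption: nothing left to check
    if len(unique) == 1:
        return unique[0]  # a single unique check is returned directly
    return tuple(unique)  # stands in for a compound check


single = merge_checks_sketch(["cross-product", EMPTY, "cross-product"])
several = merge_checks_sketch("equi-join", "cross-product", "equi-join")
```

In this model single is just "cross-product", while several is a compound of "equi-join" and "cross-product".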

class postbound.validation.ImplicitQueryPreCheck#

Check to assert that an input query is an ImplicitSqlQuery.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.CrossProductPreCheck#

Check to assert that a query does not contain any cross products.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.VirtualTablesPreCheck#

Check to assert that a query does not contain any virtual tables.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.EquiJoinPreCheck(*, allow_conjunctions: bool = False, allow_nesting: bool = False)#

Check to assert that a query only contains equi-joins.

This does not restrict the filters in any way. The determination of joins is based on QueryPredicates.joins.

Parameters:
  • allow_conjunctions (bool)

  • allow_nesting (bool)

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.InnerJoinPreCheck#

Check to assert that a query only contains inner joins.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.SubqueryPreCheck#

Check to assert that a query does not contain any subqueries.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.DependentSubqueryPreCheck#

Check to assert that a query does not contain any dependent subqueries.

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.SetOperationsPreCheck#

Check to assert that a query does not contain any set operations (UNION, EXCEPT, etc.).

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.SupportedHintCheck(hints: HintType | ScanOperator | JoinOperator | IntermediateOperator | Iterable[HintType | ScanOperator | JoinOperator | IntermediateOperator])#

Check to assert that a number of operators are supported by a database system.

Parameters:

hints (HintType | PhysicalOperator | Iterable[HintType | PhysicalOperator]) – The operators and hints that have to be supported by the database system. Can be either a single hint, or an iterable of hints.

See also

HintService.supports_hint

check_supported_database_system(database_instance: Database) PreCheckResult#

Validates that a specific database system provides all features that are required by an optimization strategy.

Examples of such features can be support for cardinality hints or specific operators.

Parameters:

database_instance (Database) – The database to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

describe() dict#

Provides a JSON-serializable representation of the specific check, as well as important parameters.

Returns:

The description

Return type:

dict

See also

postbound.postbound.OptimizationPipeline.describe

class postbound.validation.CustomCheck(name: str = 'custom-check', *, query_check: Callable[[SqlQuery], PreCheckResult] | None = None, db_check: Callable[[Database], PreCheckResult] | None = None)#

Check to quickly implement arbitrary one-off checks.

The custom check is an alternative to directly implementing the OptimizationPreCheck interface. Implementing the interface is generally preferred since it is more readable and easier to understand. However, the custom check can be useful for checks that are used in only one place and do not justify a separate class.

Parameters:
  • name (str, optional) – The name of the check. Supplying a descriptive name is strongly recommended, even though a default value exists.

  • query_check (Optional[Callable[[SqlQuery], PreCheckResult]], optional) – Check to apply to each query

  • db_check (Optional[Callable[[Database], PreCheckResult]], optional) – Check to apply to the database

check_supported_query(query: SqlQuery) PreCheckResult#

Validates that a specific query does not contain any features that cannot be handled by an optimization strategy.

Examples of such features can be non-equi join predicates, dependent subqueries or aggregations.

Parameters:

query (SqlQuery) – The query to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult

check_supported_database_system(database_instance: Database) PreCheckResult#

Validates that a specific database system provides all features that are required by an optimization strategy.

Examples of such features can be support for cardinality hints or specific operators.

Parameters:

database_instance (Database) – The database to check

Returns:

A description of whether the check passed and an indication of the failures.

Return type:

PreCheckResult
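A callback-based check of this kind can be sketched as follows. CustomCheckSketch and ResultSketch are hypothetical stand-ins that mirror the constructor signature documented above. Treating a missing callback as "always passes" is an assumption, and reject_select_star is an invented rule that works on raw SQL text only because the stand-in has no real SqlQuery type.

```python
from __future__ import annotations

from collections.abc import Callable
from dataclasses import dataclass


@dataclass
class ResultSketch:
    """Hypothetical stand-in for PreCheckResult."""

    passed: bool = True
    failure_reason: str | list[str] = ""


class CustomCheckSketch:
    """Hypothetical stand-in for CustomCheck."""

    def __init__(
        self,
        name: str = "custom-check",
        *,
        query_check: Callable[[str], ResultSketch] | None = None,
        db_check: Callable[[object], ResultSketch] | None = None,
    ) -> None:
        self.name = name
        self._query_check = query_check
        self._db_check = db_check

    def check_supported_query(self, query: str) -> ResultSketch:
        # Assumption: a missing callback means the check passes.
        if self._query_check is None:
            return ResultSketch()
        return self._query_check(query)

    def check_supported_database_system(self, database_instance: object) -> ResultSketch:
        if self._db_check is None:
            return ResultSketch()
        return self._db_check(database_instance)


def reject_select_star(query: str) -> ResultSketch:
    """Invented one-off rule: flag queries that use SELECT *."""
    if "select *" in query.lower():
        return ResultSketch(False, "SELECT * is not supported")
    return ResultSketch()


no_star = CustomCheckSketch("no-select-star", query_check=reject_select_star)
```

Because no db_check is supplied, no_star accepts every database in this model and only restricts queries.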