Query Abstraction Layer

static create_function(fn: str, *, args: Sequence[SqlExpression] | None = None, kwargs: Mapping[str, SqlExpression] | None = None, distinct: bool = False, filter_where: AbstractPredicate | None = None, target_name: str = '') → BaseProjection#

Shorthand to create a project for an (arbitrary) function.

All parameters (except for the target name) are passed to the corresponding FunctionExpression, see its documentation for details.

Parameters:

fn (str)
args (Sequence[SqlExpression] | None)
kwargs (Mapping[str, SqlExpression] | None)
distinct (bool)
filter_where (AbstractPredicate | None)
target_name (str)

Return type:

static create_max(column: ColumnReference | SqlExpression, *, target_name: str = '') → BaseProjection#

Shorthand to create a projection with a MAX() aggregation.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

static create_min(column: ColumnReference | SqlExpression, *, target_name: str = '') → BaseProjection#

Shorthand to create a projection with a MIN() aggregation.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

static create_sum(column: ColumnReference | SqlExpression, *, target_name: str = '') → BaseProjection#

Shorthand to create a projection with a SUM() aggregation.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

static create_window(fn: FunctionExpression | str, *, partitioning: Sequence[SqlExpression | ColumnReference] | None = None, ordering: OrderBy | None = None, filter_condition: AbstractPredicate | None = None, target_name: str = '') → BaseProjection#

Shorthand to create a projection with a window function.

Parameters:

fn (FunctionExpression | str)
partitioning (Sequence[SqlExpression | ColumnReference] | None)
ordering (OrderBy | None)
filter_condition (AbstractPredicate | None)
target_name (str)

Return type:

property expression: SqlExpression#

Get the expression that forms the column.

Returns:: The expression
Return type:: SqlExpression

is_star() → bool#

Checks, whether this projection is a star projection.

Return type:: bool

static star() → BaseProjection#

Shortcut method to create a * (as in SELECT * FROM R) projection.

Returns:: The projection
Return type:: BaseProjection

property target_name: str#

Get the alias under which the column should be accessible.

Can be empty to indicate the absence of a target name.

Returns:: The name
Return type:: str

class postbound.qal.BetweenPredicate(column: SqlExpression, interval: tuple[SqlExpression, SqlExpression])#

A BETWEEN predicate is a special case of a conjunction of two binary predicates.

Each BETWEEN predicate has a structure of <col> BETWEEN <a> AND <b>, where <col> describes the (column) expression to which the condition should apply and <a> and <b> are the expressions that denote the valid bounds.

Each BETWEEN predicate can be represented by a conjunction of binary predicates: <col> BETWEEN <a> AND <b> is equivalent to <col> >= <a> AND <col> <= <b>.

Parameters:

column (SqlExpression) – The value that is checked by the predicate
interval (tuple[SqlExpression, SqlExpression]) – The allowed range in which the column values must lie. The range is inclusive at both endpoints. This has to be a pair (2-tuple) of expressions.

Raises:

ValueError – If the interval is not a pair of values.

Notes

A BETWEEN predicate can be a join predicate as in R.a BETWEEN 42 AND S.b. Furthermore, some systems even allow the <col> part to be an arbitrary expression. For example, in Postgres this is a valid query:

SELECT *
FROM R JOIN S ON R.a = S.b
WHERE 42 BETWEEN R.c AND S.d

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

property column: SqlExpression#

Get the column that is tested (R.a in SELECT * FROM R WHERE R.a BETWEEN 1 AND 42).

Returns:: The expression
Return type:: SqlExpression

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

property interval: tuple[SqlExpression, SqlExpression]#

Get the interval (as (lower, upper)) that is tested against.

Returns:: The allowed range of values. This interval is inclusive at both endpoints.
Return type:: tuple[SqlExpression, SqlExpression]

property interval_end: SqlExpression#

Get the upper bound of the interval that is tested against.

Returns:: The upper value. This value is inclusive, i.e. the comparison values must be less or equal.
Return type:: SqlExpression

property interval_start: SqlExpression#

Get the lower bound of the interval that is tested against.

Returns:: The lower value. This value is inclusive, i.e. the comparison values must be greater or equal.
Return type:: SqlExpression

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

class postbound.qal.BinaryPredicate(operation: MathOperator | LogicalOperator | CompoundOperator, first_argument: SqlExpression, second_argument: SqlExpression)#

A binary predicate combines exactly two base expressions with a comparison operation.

This is the most typical kind of predicate and appears in most joins (e.g. R.a = S.b) and many filters (e.g. R.a = 42).

Parameters:

operation (SqlOperator) – The operation that combines the input arguments
first_argument (SqlExpression) – The first comparison value
second_argument (SqlExpression) – The second comparison value

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

static equal(first_argument: SqlExpression, second_argument: SqlExpression) → BinaryPredicate#

Generates an equality predicate between two arguments.

Parameters:

first_argument (SqlExpression)
second_argument (SqlExpression)

Return type:

BinaryPredicate

property first_argument: SqlExpression#

Get the first argument of the predicate.

Returns:: The argument
Return type:: SqlExpression

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

property operation: MathOperator | LogicalOperator | CompoundOperator#

Get the operation that is used to obtain matching (pairs of) tuples.

Returns:: The operation if it exists
Return type:: Optional[SqlOperator]

property second_argument: SqlExpression#

Get the second argument of the predicate.

Returns:: The argument
Return type:: SqlExpression

class postbound.qal.CaseExpression(cases: Sequence[tuple[SqlExpression, SqlExpression]], *, simple_expr: SqlExpression | None = None, else_expr: SqlExpression | None = None)#

Represents a case expression in SQL.

Parameters:#

casesSequence[tuple[SqlExpression, SqlExpression]]: A sequence of tuples representing the cases in the case expression. The cases are passed as a sequence rather than a dictionary, because the evaluation order of the cases is important. The first case that evaluates to true determines the result of the entire case statement.
simple_expr: Optional[SqlExpression], optional: The expression to evaluate against the cases. This “simple form” compares the expression directly against each of the values in cases, similar to a switch statement.
else_exprOptional[SqlExpression], optional: The expression to be evaluated if none of the cases match. If no case matches and no else expression is provided, the entire case expression should evaluate to NULL.

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

property cases: Sequence[tuple[SqlExpression, SqlExpression]]#

Get the different cases.

Returns:: The cases. At least one case will be present.
Return type:: Sequence[tuple[SqlExpression]]

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

property else_expression: SqlExpression | None#

Get the expression to use if none of the cases match.

Returns:: The expression. Can be None, in which case the case expression evaluates to NULL.
Return type:: Optional[SqlExpression]

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

property simple_expression: SqlExpression | None#

Get the expression to evaluate against the cases.

This is only set for the “simple form” of the case expression, where the expression is compared directly against the values in the cases. In this form, each case has to be a plain value instead of a full predicate, similar to a switch statement:

SELECT  CASE R.a
            WHEN 1 THEN 'one'
            WHEN 2 THEN 'two'
            ELSE 'other'
        END AS foo
FROM R;

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

Parameters:

cases (Sequence[tuple[SqlExpression, SqlExpression]])
simple_expr (Optional[SqlExpression])
else_expr (Optional[SqlExpression])

class postbound.qal.CastExpression(expression: SqlExpression, target_type: str, *, type_params: Sequence[SqlExpression] | None = None, array_type: bool = False)#

An expression that casts the type of another nested expression.

Note that PostBOUND itself does not know about the semantics of the actual types or casts. Eventual errors due to illegal casts are only caught at runtime by the actual database system.

Parameters:

expression (SqlExpression) – The expression that is casted to a different type.
target_type (str) – The type to which the expression should be converted to. This cannot be empty.
type_params (Optional[Sequence[SqlExpression]], optional) – Additional arguments to parameterize the type, such as in NUMERIC(4, 2) or VARCHAR(255). For example, when casting to VARCHAR(255), the 255 would be an additional parameter, represented as a single static value expression. When casting to NUMERIC(4, 2), the 4 and 2 would be the additional parameters (in that order).
array_type (bool, optional) – Whether the target type is an array type.

Raises:

ValueError – If the target_type is empty.

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

property array_type: bool#: Get whether the target type is an array type.

property casted_expression: SqlExpression#

Get the expression that is being casted.

Returns:: The expression
Return type:: SqlExpression

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

property target_type: str#

Get the type to which to cast to.

Returns:: The desired type. This is never empty.
Return type:: str

property type_params: Sequence[SqlExpression]#

Get additional arguments that parameterize the type.

For example, when casting to VARCHAR(255), the 255 would be an additional parameter, represented as a single static value expression. When casting to NUMERIC(4, 2), the 4 and 2 would be the additional parameters (in that order).

Returns:: The type parameters or an empty sequence if there are none.
Return type:: Sequence[SqlExpression]

class postbound.qal.ClauseVisitor#

Basic visitor to operate on arbitrary clause lists.

See also

BaseClause

References

class postbound.qal.ColumnExpression(column: ColumnReference)#

A column expression wraps the reference to a column.

This is a leaf expression, i.e. a column expression cannot have any more child expressions. It corresponds directly to an access to the values of the wrapped column with no modifications.

Parameters:: column (ColumnReference) – The column being wrapped

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

property column: ColumnReference#

Get the column that is wrapped by this expression.

Returns:: The column
Return type:: ColumnReference

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

static of(column_name: str, *, table: str | TableReference) → ColumnExpression#

Shortcut method to create a new column reference + expression.

Parameters:

column_name (str)
table (str | TableReference)

Return type:

ColumnExpression

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.CommonTableExpression(with_queries: Iterable[WithQuery], *, recursive: bool = False)#

The WITH clause of a query, consisting of at least one CTE query.

Parameters:

with_queries (Iterable[WithQuery]) – The common table expressions that form the WITH clause.
recursive (bool, optional) – Whether the WITH clause is recursive or not. Defaults to False.

Raises:

ValueError – If with_queries does not contain any CTE

Warning

The tables() method provides all tables that are referenced as part of the CTEs as well as their aliases. The referenced_tables() method does not include the CTE aliases. Likewise, the aliases() method provides all the aliases, but not the tables that are referenced within the CTEs.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

aliases() → set[TableReference]#

Provides all aliases that are used in the CTEs.

Returns:: The aliases.
Return type:: set[TableReference]

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static create_for(query: SqlQuery, target_name: str) → CommonTableExpression#

Shortcut method to create a CTE clause with a single query.

Parameters:

query (SqlQuery)
target_name (str)

Return type:

CommonTableExpression

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property queries: Sequence[WithQuery]#

Get CTEs that form the WITH clause

Returns:: The CTEs in the order in which they were originally specified.
Return type:: Sequence[WithQuery]

property recursive: bool#

Check whether the WITH clause is recursive or not.

Returns:: Whether the WITH clause is recursive
Return type:: bool

referenced_tables() → set[TableReference]#

Provides all tables that are referenced in the CTEs. This does not include the CTE aliases.

Returns:: The tables.
Return type:: set[TableReference]

tables() → set[TableReference]#

Provides all tables that are referenced in the clause.

Returns:: All tables. This includes virtual tables if such tables are present in the clause
Return type:: set[TableReference]

class postbound.qal.CompoundOperator(*values)#: The supported compound operators.

class postbound.qal.CompoundPredicate(operation: CompoundOperator, children: AbstractPredicate | Sequence[AbstractPredicate])#

A compound predicate creates a composite hierarchical structure of other predicates.

Currently, PostBOUND supports 3 kinds of compound predicates: negations, conjunctions and disjunctions. Depending on the specific compound operator, a diferent number of child predicates is allowed.

Parameters:

operation (LogicalSqlCompoundOperators) – The operation that glues together the individual child predicates.
children (AbstractPredicate | Sequence[AbstractPredicate]) – The predicates that are combined by this composite. For conjunctions and disjunctions, at least two children are required. For negations, exactly one child is permitted (either directly or in the sequence).

Raises:

ValueError – If operation is a negation and a number of children unequal to 1 is passed
ValueError – If operation is a conjunction or a disjunction and less than 2 children are passed
.. deprecated: – 0.20.2: CompoundPredicate will only handle AND/OR predicates in the future. NOT predicates will be represented by a proper NotPredicate class.

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

base_predicates() → Iterable[AbstractPredicate]#

Provides all base predicates that form this predicate.

This allows to iterate over all leaves of a compound predicate, for base predicates it simply returns the predicate itself.

Returns:: The base predicates, in an arbitrary order. If the predicate is a base predicate already, it will be the only item in the iterable.
Return type:: Iterable[AbstractPredicate]

property children: Sequence[AbstractPredicate] | AbstractPredicate#

Get the child predicates that are combined in this compound predicate.

For conjunctions and disjunctions this will be a sequence of children with at least two children. For negations the child predicate will be returned directly (i.e. without being wrapped in a sequence).

Returns:: The sequence of child predicates for AND and OR predicates, or the negated predicate for NOT predicates.
Return type:: Sequence[AbstractPredicate] | AbstractPredicate

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

static create(operation: CompoundOperator, parts: Sequence[AbstractPredicate]) → AbstractPredicate#

Creates an arbitrary compound predicate for a number of child predicates.

If just a single child predicate is provided, but the operation requires multiple children, that child is returned directly instead of the compound predicate.

Parameters:

operation (LogicalSqlCompoundOperators) – The logical operator to combine the child predicates.
parts (Sequence[AbstractPredicate]) – The child predicates

Returns:

A composite predicate of the given parts, if parts contains the appropriate number of items. Otherwise the supplied child predicate.

Return type:

AbstractPredicate

Raises:

ValueError – If a negation predicate should be created but a number child predicates unequal to one are supplied. Likewise, if a conjunction or disjunction is requested, but no child predicates are supplied.

static create_and(parts: Collection[AbstractPredicate]) → AbstractPredicate#

Creates an AND predicate, combining a number of child predicates.

If just a single child predicate is provided, that child is returned directly instead of wrapping it in an AND predicate.

Parameters:: parts (Collection[AbstractPredicate]) – The children that should be combined
Returns:: A conjunctive predicate of the given parts, if parts contains at least two items. Otherwise the only passed predicate is returned.
Return type:: AbstractPredicate
Raises:: ValueError – If parts is empty

static create_not(predicate: AbstractPredicate) → CompoundPredicate#

Builds a NOT predicate, wrapping a specific child predicate.

Parameters:: predicate (AbstractPredicate) – The predicate that should be negated
Returns:: The negated predicate. No logic checks or simplifications are performed. For example, it is possible to negate a negation and this will still be represented in the predicate hierarchy.
Return type:: CompoundPredicate

static create_or(parts: Collection[AbstractPredicate]) → AbstractPredicate#

Creates an OR predicate, combining a number of child predicates.

If just a single child predicate is provided, that child is returned directly instead of wrapping it in an OR predicate.

Parameters:: parts (Collection[AbstractPredicate]) – The children that should be combined
Returns:: A disjunctive predicate of the given parts, if parts contains at least two items. Otherwise the only passed predicate is returned.
Return type:: AbstractPredicate
Raises:: ValueError – If parts is empty

is_compound() → bool#

Checks, whether this predicate combines the evaluation of other predicates to compute the overall evaluation.

Returns:: Whethter this predicate is a composite of other predicates.
Return type:: bool

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

is_negation() → bool#

Checks whether this is a NOT predicate.

Return type:: bool

iterchildren() → Sequence[AbstractPredicate]#

Provides all children contained in this predicate.

In contrast to the children property, this method always returns an iterable, even for NOT predicates. In the latter case the iterable contains just a single item.

Returns:: The children
Return type:: Sequence[AbstractPredicate]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

property operation: CompoundOperator#

Get the operation used to combine the individual evaluations of the child predicates.

Returns:: The operation
Return type:: LogicalSqlCompoundOperators

class postbound.qal.DirectTableSource(table: TableReference)#

Models a plain table that is directly referenced in a FROM clause, e.g. R in SELECT * FROM R, S.

Parameters:: table (TableReference) – The table that is sourced

columns() → set[ColumnReference]#

Provides all column sthat are referenced in the source.

For plain table sources this will be empty, but for more complicate structures such as subquery source, this will include all columns that are referenced in the subquery.

Returns:: The columns
Return type:: set[ColumnReference]

static create_for(table_name: str, alias: str = '') → DirectTableSource#

Shorthand to create a direct table source from a table name and an optional alias.

Parameters:

table_name (str)
alias (str)

Return type:

DirectTableSource

identifier() → str#

Get the table’s identifier.

Return type:: str

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in the source.

For plain table sources this will be empty, but for subquery sources, etc. all expressions are returned. In contrast to the columns method, duplicates are returned multiple times, i.e. if a column is referenced n times in this source, it will also be returned n times by this method. Furthermore, the order in which columns are provided by the iterable matches the order in which they appear in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in the source.

For plain table sources this will be empty, but for subquery sources, etc. all expressions are returned. Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

predicates() → QueryPredicates | None#

Provides all predicates that are contained in the source.

For plain table sources this will be None, but for subquery sources, etc. all predicates are returned.

Returns:: The predicates or None if the source does not allow predicates or simply does not contain any.
Return type:: QueryPredicates | None

property table: TableReference#

Get the table that is sourced.

This can be a virtual table (e.g. for CTEs), but will most commonly be an actual table.

Returns:: The table.
Return type:: TableReference

tables() → set[TableReference]#

Provides all tables that are referenced in the source.

For plain table sources this will just be the actual table. For more complicated structures, such as subquery sources, this will include all tables of the subquery as well.

Returns:: The tables.
Return type:: set[TableReference]

class postbound.qal.ExceptClause(left_query: SqlQuery | SetQuery, right_query: SqlQuery | SetQuery)#

The EXCEPT clause of a query.

Parameters:

left_query (SelectStatement) – The left query that is part of the EXCEPT operation. This is the result set from which tuples are removed.
right_query (SelectStatement) – The right query that is part of the EXCEPT operation. This is the result set of the tuples that should be removed.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property left_query: SqlQuery | SetQuery#

Get the left query that is part of the EXCEPT operation.

The left query provides the result set from which tuples are removed.

Returns:: The left query.
Return type:: SelectStatement

property right_query: SqlQuery | SetQuery#

Get the right query that is part of the EXCEPT operation.

The right query provides the result set of the tuples that should be removed.

Returns:: The right query.
Return type:: SelectStatement

tables() → set[TableReference]#

Provides all tables that are referenced in the clause.

Returns:: All tables. This includes virtual tables if such tables are present in the clause
Return type:: set[TableReference]

class postbound.qal.Explain(analyze: bool = False, target_format: str | None = None)#

EXPLAIN block of a query.

EXPLAIN queries change the execution mode of a query. Instead of focusing on the actual query result, an EXPLAIN query produces information about the internal processes of the database system. Typically, this includes which execution plan the DBS would choose for the query. Additionally, EXPLAIN ANALYZE (as for example supported by Postgres) provides the query plan and executes the actual query. The returned plan is then annotated by how the optimizer predictions match reality. Furthermore, such ANALYZE plans typically also contain some runtime statistics such as runtime of certain operators.

Notice that there is no EXPLAIN keyword in the SQL standard, but all major database systems provide this functionality. Nevertheless, the precise syntax and semantic of an EXPLAIN statement depends on the actual DBS. The Explain clause object is modeled after Postgres and needs to adapted accordingly for different systems (see db.HintService). Especially the EXPLAIN ANALYZE variant is not supported by all systems.

Parameters:

analyze (bool, optional) – Whether the query should not only be executed as an EXPLAIN query, but rather as an EXPLAIN ANALYZE query. Defaults to False which runs the query as a pure EXPLAIN query.
target_format (Optional[str], optional) – The desired output format of the query plan, if this is supported by the database system. Defaults to None which normally forces the default output format.

See also

postbound.db.db.HintService.format_query

References

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

property analyze: bool#

Check, whether the query should be executed as EXPLAIN ANALYZE rather than just plain EXPLAIN.

Usually, EXPLAIN ANALYZE executes the query and gathers extensive runtime statistics (e.g. comparing estimated vs. true cardinalities for intermediate nodes).

Returns:: Whether ANALYZE mode is enabled
Return type:: bool

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static explain_analyze(format_type: str = 'JSON') → Explain#

Constructs an EXPLAIN ANALYZE clause with the specified output format.

Parameters:: format_type (str, optional) – The output format, by default JSON
Returns:: The explain clause
Return type:: Explain

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

static plan(format_type: str = 'JSON') → Explain#

Constructs a pure EXPLAIN clause (i.e. without ANALYZE) with the specified output format.

Parameters:: format_type (str, optional) – The output format, by default JSON
Returns:: The explain clause
Return type:: Explain

property target_format: str | None#

Get the target format in which the EXPLAIN plan should be provided.

Returns:: The output format, or None if this is not specified. This is never an empty string.
Return type:: Optional[str]

class postbound.qal.ExplicitFromClause(joins: JoinTableSource | Iterable[JoinTableSource])#

Represents a special kind of FROM clause that requires all tables to be joined using the JOIN ON syntax.

Parameters:: joins (JoinTableSource | Iterable[JoinTableSource]) – The joins that should be performed

base_table() → TableReference#

Get the table that is farthest to the left in the join chain.

For subqueries or VALUES clauses, this will return the alias of the expression, i.e. the name of the virtual table that is created for the subquery or VALUES clause.

Returns:: The table
Return type:: TableReference

iterpredicates() → Iterable[AbstractPredicate]#

Provides all join conditions that are contained in the FROM clause.

Returns:: The join conditions.
Return type:: Iterable[AbstractPredicate]

property root: JoinTableSource#

Get the root join of the FROM clause.

Returns:: The root join
Return type:: JoinTableSource

An explicit query restricts the constructs that may appear in the FROM clause.

For explicit queries, the FROM clause must utilize the JOIN ON syntax for all tables. The join conditions should be put into the ON blocks. Notice however, that PostBOUND does not perform any sanity checks here. Therefore, it is possible to put mix joins and filters in the ON blocks, move all joins to the WHERE clause or scatter the join conditions between the two clauses. Whether this is good style is up for debate, but at least PostBOUND does allow it. In contrast to the implicit query, subqueries are also allowed as table sources.

Notice that each explicit query must join at least two tables in its FROM clause.

The attributes and parameters for this query type are the same as for SqlQuery, only the type of the From clause is restricted.

Examples

The following queries are considered as explicit queries:

SELECT *
FROM R
    JOIN S ON R.a = S.b
    JOIN T ON S.b = T.c
WHERE R.a < 42

SELECT *
FROM R
    JOIN S ON R.a = S.b AND R.a = (SELECT MIN(R.a) FROM R)
    JOIN T ON S.b = T.c

Parameters:

select_clause (Select)
from_clause (Optional[ExplicitFromClause])
where_clause (Optional[Where])
groupby_clause (Optional[GroupBy])
having_clause (Optional[Having])
orderby_clause (Optional[OrderBy])
limit_clause (Optional[Limit])
cte_clause (Optional[CommonTableExpression])
explain_clause (Optional[Explain])
hints (Optional[Hint])

property from_clause: ExplicitFromClause | None#

Get the FROM clause of the query.

Returns:: The FROM clause if it was specified, or None otherwise.
Return type:: Optional[From]

is_explicit() → bool#

Checks, whether this query has an explicit FROM clause.

The explicit FROM clause exclusively makes use of the JOIN ON syntax to denote both the tables that should be scanned, and the predicates that should be used to join the tables together. Therefore, the WHERE clause should only consist of filter predicates on the base tables. However, this is not enforced and the contents of the ON conditions as well as the WHERE clause can be arbitrary predicates.

Returns:: Whether the query is explicit
Return type:: bool

See also

ExplicitSqlQuery

is_implicit() → bool#

Checks, whether this query has an implicit FROM clause.

The implicit FROM clause only consists of the source tables that should be scanned for the query. No subqueries or joins are contained in the clause. All join predicates must be part of the WHERE clause.

Returns:: Whether the query is implicit
Return type:: bool

See also

ImplicitSqlQuery

class postbound.qal.ExpressionCollector(matcher: Callable[[SqlExpression], bool], *, continue_after_match: bool = False)#

Utility to traverse an arbitrarily deep expression hierarchy in order to collect specific expressions.

Parameters:

matcher (Callable[[SqlExpression], bool]) – Function to determine whether a specific expression matches the collection predicate. Should return True for matches and False otherwise.
continue_after_match (bool, optional) – Whether the traversal of the current expression element should be continued if the current element matches the collection predicate. By default, traversal is stopped for the current element (but other branches in the expression tree could still produce more matches).

class postbound.qal.From(items: TableSource | Iterable[TableSource])#

The FROM clause models which tables should be selected and potentially how they are combined.

A FROM clause permits arbitrary source items and does not enforce a specific structure or semantic on them. This puts the user in charge to generate a valid and meaningful structure. For example, the model allows for the first item to be a JoinTableSource, even though this is not valid SQL. Likewise, no duplicate checks are performed.

To represent FROM clauses with a bit more structure, the ImplicitFromClause and ExplicitFromClause subclasses exist and should generally be preffered over direct usage of the raw From clause class.

Parameters:: items (TableSource | Iterable[TableSource]) – The tables that should be sourced in the FROM clause
Raises:: ValueError – If no items are specified

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static create_for(items: TableReference | Iterable[TableReference]) → ImplicitFromClause#

Shorthand method to create a FROM clause for a set of table references.

Parameters:: items (TableReference | Iterable[TableReference])
Return type:: ImplicitFromClause

property items: Sequence[TableType]#

Get the tables that are sourced in the FROM clause

Returns:: The sources in exactly the sequence in which they were specified
Return type:: Sequence[TableSource]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

tables() → set[TableReference]#

Provides all tables that are referenced in the clause.

Returns:: All tables. This includes virtual tables if such tables are present in the clause
Return type:: set[TableReference]

class postbound.qal.FunctionExpression(function: str, arguments: Sequence[SqlExpression] | None = None, *, keyword_args: Mapping[str, SqlExpression] | None = None, distinct: bool = False, filter_where: AbstractPredicate | None = None)#

The function expression indicates a call to an arbitrary function.

The actual function might be one of the standard SQL functions, an aggregation function or a user-defined one. PostBOUND treats them all the same and it is up to the user to differentiate e.g. between UDFs and aggregations if this distinction is important. This can be easily achieved by introducing additional subclasses of the function expression and updating the queries to use the new function expressions where appropriate. The transform module provides utilities to make such updates easy.

Parameters:

function (str) – The name of the function that should be called. Cannot be empty.
arguments (Optional[Sequence[SqlExpression]], optional) – The parameters that should be passed to the function. Can be None if the function does not take or does not need any arguments (e.g. CURRENT_TIME())
distinct (bool, optional) – Whether the (aggregation) function should only operate on distinct column values and hence a duplicate elimination needs to be performed before passing the argument values (e.g. COUNT(DISTINCT *)). Defaults to False
filter_where (Optional[AbstractPredicate], optional) – An optional filter expression that restricts the values included in an aggregation function.
keyword_args (Optional[Mapping[str, SqlExpression]])

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

property arguments: Sequence[SqlExpression]#

Get all arguments that are supplied to the function.

Returns:: The arguments. Can be empty if no arguments are passed (but will never be None).
Return type:: Sequence[SqlExpression]

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

static create_count(column: ~postbound._core.ColumnReference | ~postbound.qal._qal.SqlExpression | ~collections.abc.Iterable[~postbound.qal._qal.SqlExpression] = *, *, distinct: bool = False) → FunctionExpression#

Shortcut method to create a new COUNT() expression over (one or multiple) columns.

If no column is given, a COUNT(*) is created.

Parameters:

column (ColumnReference | SqlExpression | Iterable[SqlExpression])
distinct (bool)

Return type:

FunctionExpression

static create_max(column: ColumnReference | SqlExpression | Iterable[SqlExpression]) → FunctionExpression#

Shortcut method to create a new MAX() expression over (one or multiple) columns.

Parameters:: column (ColumnReference | SqlExpression | Iterable[SqlExpression])
Return type:: FunctionExpression

static create_min(column: ColumnReference | SqlExpression | Iterable[SqlExpression]) → FunctionExpression#

Shortcut method to create a new MIN() expression over (one or multiple) columns.

Parameters:: column (ColumnReference | SqlExpression | Iterable[SqlExpression])
Return type:: FunctionExpression

static create_sum(column: ColumnReference | SqlExpression | Iterable[SqlExpression]) → FunctionExpression#

Shortcut method to create a new SUM() expression over (one or multiple) columns.

Parameters:: column (ColumnReference | SqlExpression | Iterable[SqlExpression])
Return type:: FunctionExpression

property distinct: bool#

Get whether the function should only operate on distinct values.

Whether this makes any sense for the function at hand is entirely dependend on the specific function and not enfored by PostBOUND. The runtime DBS has to check this.

Generally speaking, this argument is intended for aggregation functions.

Returns:: Whether a duplicate elimination has to be performed on the function arguments
Return type:: bool

property filter_where: AbstractPredicate | None#

Get the filter expression for an aggregate function.

Filters restrict the values that are actually included in the aggregate.

Returns:: The filter expression or None if no filter is applied (or the function is not an aggregate).
Return type:: Optional[AbstractPredicate]

property function: str#

Get the function name.

Returns:: The function name. Will never be empty
Return type:: str

is_aggregate() → bool#

Checks, whether the function is a well-known SQL aggregation function.

Only standard functions are considered (e.g. no CORR for computing correlations).

Returns:: Whether the function is a known aggregate function.
Return type:: bool

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

property keyword_args: Mapping[str, SqlExpression]#

Get all keyword arguments that are supplied to the function.

The order in the mapping matches the order of occurrence in the function call.

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.FunctionTableSource(function: FunctionExpression, *, alias: str | TableReference = '')#

Models a table that is constructed from a function call, e.g. as in SELECT * FROM my_udf(42).

Parameters:

function (FunctionExpression) – The function that computes the temporary relation.
alias (str | TableReference, optional) – The name under which the virtual table can be accessed in the actual query. If this is empty, an anonymous table is created.

alias() → str#

Get the name under which the virtual table can be accessed in the actual query.

Return type:: str

columns() → set[ColumnReference]#

Provides all column sthat are referenced in the source.

For plain table sources this will be empty, but for more complicate structures such as subquery source, this will include all columns that are referenced in the subquery.

Returns:: The columns
Return type:: set[ColumnReference]

property function: FunctionExpression#: Get the function that computes the temporary relation.

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in the source.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in the source.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

predicates() → QueryPredicates | None#

Provides all predicates that are contained in the source.

For plain table sources this will be None, but for subquery sources, etc. all predicates are returned.

Returns:: The predicates or None if the source does not allow predicates or simply does not contain any.
Return type:: QueryPredicates | None

tables() → set[TableReference]#

Provides all tables that are referenced in the source.

For plain table sources this will just be the actual table. For more complicated structures, such as subquery sources, this will include all tables of the subquery as well.

Returns:: The tables.
Return type:: set[TableReference]

property target_name: str#

Get the name under which the virtual table can be accessed in the actual query.

For anonymous tables, this is an empty string.

property target_table: TableReference | None#

Get the virtual table that contains the tuples.

This will always be a virtual table, or None for anonymous tables.

This property differs from target_name only by its return type.

class postbound.qal.GroupBy(group_columns: Sequence[SqlExpression], distinct: bool = False)#

The GROUP BY clause combines rows that match a grouping criterion to enable aggregation on these groups.

Despite their names, all grouped columns can be arbitrary SqlExpression instances, rules and restrictions of the SQL standard are not enforced by PostBOUND.

Parameters:

group_columns (Sequence[SqlExpression]) – The expressions that should be used to perform the grouping
distinct (bool, optional) – Whether the grouping should perform duplicate elimination, by default False

Raises:

ValueError – If group_columns is empty.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static create_for(column: ColumnReference | Iterable[ColumnReference], *more_cols) → GroupBy#

Shortcut method to create a GROUP BY clause from column references.

Parameters:: column (ColumnReference | Iterable[ColumnReference])
Return type:: GroupBy

property distinct: bool#

Get whether the grouping should eliminate duplicates.

Returns:: Whether duplicate removal is performed.
Return type:: bool

property group_columns: Sequence[SqlExpression]#

Get all expressions that should be used to determine the grouping.

Returns:: The grouping expressions in exactly the sequence in which they were specified.
Return type:: Sequence[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

class postbound.qal.Having(condition: AbstractPredicate)#

The HAVING clause enables filtering of the groups constructed by a GROUP BY clause.

All conditions are collected in a (potentially conjunctive or disjunctive) predicate object. See AbstractPredicate for details.

The structure of this clause is similar to the Where clause, but its scope is different (even though PostBOUND does no semantic validation to enforce this): predicates of the HAVING clause are only checked on entire groups of values and have to be valid their, instead of on individual tuples.

Parameters:: condition (AbstractPredicate) – The root predicate that contains all actual conditions

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

property condition: AbstractPredicate#

Get the root predicate that is used to form the HAVING clause.

Returns:: The condition
Return type:: AbstractPredicate

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

class postbound.qal.Hint(preparatory_statements: str = '', query_hints: str = '')#

Hint block of a clause.

Depending on the SQL dialect, these hints will be placed at different points in the query. Furthermore, the precise contents (i.e. syntax and semantic) vary from database system to system.

Hints are differentiated in two parts:

preparatory statements can be executed as valid commands on the database system, e.g. optimizer settings, etc.
query hints are the actual hints. Typically, these will be inserted as comments at some place in the query.

These two parts are set as parameters in the __init__ method and are available as read-only properties afterwards.

Parameters:

preparatory_statements (str, optional) – Statements that configure the optimizer and have to be run before the actual query is executed. Such settings often configure the optimizer for the entire session and can thus influence other queries as well. Defaults to an empty string, which indicates that there are no such settings.
query_hints (str, optional) – Hints that configure the optimizer, often for an individual join. These hints are executed as part of the actual query.

Examples

A hint clause for MySQL could look like this:

This enforces the join between tables R and S to be executed as a hash join (due to the query hint) and disables usage of the block nested-loop join for the entire query (which in this case only affects the join between tables S and T) due to the preparatory SET optimizer_switch statement.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property preparatory_statements: str#

Get the string of preparatory statements. Can be empty.

Returns:: The preparatory statements. If these are multiple statements, they are concatenated into a single string with appropriate separator characters between them.
Return type:: str

property query_hints: str#

Get the query hint text. Can be empty.

Returns:: The hints. The string has to be understood as-is by the target database system. If multiple hints are used, they have to be concatenated into a single string with appropriate separator characters between them. Correspondingly, if the hint blocks requires a specific prefix/suffix (e.g. comment syntax), this has to be part of the string as well.
Return type:: str

class postbound.qal.ImplicitFromClause(tables: DirectTableSource | Iterable[DirectTableSource])#

Represents a special case of FROM clause that only allows for pure tables to be selected.

Specifically, this means that subqueries or explicit joins using the JOIN ON syntax are not allowed. Just plain old SELECT ... FROM R, S, T WHERE ... queries.

As a special case, all FROM clauses that consist of a single (non-subquery) table can be represented as implicit clauses.

Parameters:: tables (DirectTableSource | Iterable[DirectTableSource]) – The tables that should be selected

static create_for(tables: TableReference | Iterable[TableReference]) → ImplicitFromClause#

Shorthand method to create a FROM clause for a set of table references.

This saves the user from creating the DirectTableSource instances before instantiating a implicit FROM clause.

Parameters:: tables (TableReference | Iterable[TableReference]) – The tables that should be sourced
Returns:: The FROM clause
Return type:: ImplicitFromClause

itertables() → Sequence[TableReference]#

Provides all tables in the FROM clause exactly in the sequence in which they were specified.

This utility saves the user from unwrapping all the DirectTableSource objects by herself.

Returns:: The tables.
Return type:: Sequence[TableReference]

An implicit query restricts the constructs that may appear in the FROM clause.

For implicit queries, the FROM clause may only consist of simple table sources. All join conditions have to be put in the WHERE clause. Notice that this does not restrict the structure of other clauses. For example, the WHERE clause can still contain subqueries. As a special case, queries without a FROM clause are also considered implicit.

The attributes and parameters for this query type are the same as for SqlQuery, only the type of the From clause is restricted.

Examples

The following queries are considered as implicit queries:

SELECT *
FROM R, S, T
WHERE R.a = S.b
    AND S.b = T.c
    AND R.a < 42

SELECT *
FROM R, S, T
WHERE R.a = S.b
    AND S.b = T.c
    AND R.a = (SELECT MIN(R.a) FROM R)

Parameters:

select_clause (Select)
from_clause (Optional[ImplicitFromClause])
where_clause (Optional[Where])
groupby_clause (Optional[GroupBy])
having_clause (Optional[Having])
orderby_clause (Optional[OrderBy])
limit_clause (Optional[Limit])
cte_clause (Optional[CommonTableExpression])
explain_clause (Optional[Explain])
hints (Optional[Hint])

property from_clause: ImplicitFromClause | None#

Get the FROM clause of the query.

Returns:: The FROM clause if it was specified, or None otherwise.
Return type:: Optional[From]

is_explicit() → bool#

Checks, whether this query has an explicit FROM clause.

Returns:: Whether the query is explicit
Return type:: bool

See also

ExplicitSqlQuery

is_implicit() → bool#

Checks, whether this query has an implicit FROM clause.

Returns:: Whether the query is implicit
Return type:: bool

See also

ImplicitSqlQuery

class postbound.qal.InPredicate(column: SqlExpression, values: Sequence[SqlExpression])#

An IN predicate lists the allowed values for a column.

In most cases, such a predicate with n allowed values can be transformed into a disjunction of n equality predicates, i.e. R.a IN (1, 2, 3) is equivalent to R.a = 1 OR R.a = 2 OR R.a = 3. Depending on the allowed values, an IN predicate can denote a join, e.g. R.a IN (S.b, S.c). An important special case arises if the allowed values are produced by a subquery, e.g. R.a IN (SELECT S.b FROM S). This does no longer allow for a transformation into binary predicates since it is unclear how many rows the subquery will produce.

Parameters:

column (SqlExpression) – The value that is checked by the predicate
values (Sequence[SqlExpression]) – The allowed column values. The individual expressions are not limited to StaticValueExpression instances, but can also include subqueries, columns or complicated mathematical expressions.

Raises:

ValueError – If values is empty.

Notes

Some systems even allow the column part to be an arbitrary expression. For example, in Postgres this is a valid query:

SELECT *
FROM R JOIN S ON R.a = S.b
WHERE 42 IN (R.c, 40 * S.d)

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

property column: SqlExpression#

Get the expression that is tested (R.a in SELECT * FROM R WHERE R.a IN (1, 2, 3)).

Returns:: The expression
Return type:: SqlExpression

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

is_subquery_predicate() → bool#

Checks, if this is a subquery-based IN predicate, i.e. a predicate of the form R.a IN (SELECT S.b FROM S).

Returns:: Whether this predicate is based on a subquery
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

static subquery(column: SqlExpression, subquery: SubqueryExpression | SqlQuery) → InPredicate#

Generates an IN predicate that is based on a subquery.

Such a predicate is of the form R.a IN (SELECT S.b FROM S).

Parameters:

column (SqlExpression) – The column that should be checked for being contained by the subquerie’s result set.
subquery (SubqueryExpression | SqlQuery) – The subquery to produce the allowed values.

Returns:

The predicate

Return type:

InPredicate

property values: Sequence[SqlExpression]#

Get the allowed values of the tested expression.

Returns:: The allowed values. This sequence always contains at least one entry.
Return type:: Sequence[SqlExpression]

class postbound.qal.IntersectClause(left_query: SqlQuery | SetQuery, right_query: SqlQuery | SetQuery)#

The INTERSECT clause of a query.

Parameters:

left_query (SelectStatement) – The left query that is part of the INTERSECT operation. Since set intersection is commutative, the assignment of left and right does not really matter.
right_query (SelectStatement) – The right query that is part of the INTERSECT. Since set intersection is commutative, the assignment of left and right does not really matter.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

input_queries() → set[SqlQuery | SetQuery]#

Get the two input queries that are part of the INTERSECT operation.

Returns:: The left and right queries. Since set intersection is commutative, the assignment of left and right does not really matter.
Return type:: set[SelectStatement]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property left_query: SqlQuery | SetQuery#

Get the left query that is part of the INTERSECT operation.

Returns:: The left query. Since set intersection is commutative, the assignment of left and right does not really matter.
Return type:: SelectStatement

See also

property right_query: SqlQuery | SetQuery#

Get the right query that is part of the INTERSECT operation.

Returns:: The right query. Since set intersection is commutative, the assignment of left and right does not really matter.
Return type:: SelectStatement

See also

tables() → set[TableReference]#

Provides all tables that are referenced in the clause.

Returns:: All tables. This includes virtual tables if such tables are present in the clause
Return type:: set[TableReference]

class postbound.qal.JoinTableSource(left: TableSource, right: TableSource, *, join_condition: AbstractPredicate | None = None, join_type: JoinType = JOIN)#

Models a table that is referenced in a FROM clause using the explicit JOIN syntax.

Such a table source consists of three parts:

the left-hand side table source
the right-hand side table source being joined to the left-hand side
an (optional) join condition that specifies how the two tables are joined, if the specific join requires such a predicate

Parameters:

left (TableSource) – The left-hand side of the join
right (TableSource) – The right-hand side of the join
join_condition (Optional[AbstractPredicate], optional) – The predicate that is used to join the specified table with the other tables of the FROM clause. For most joins this is a required argument in order to create a valid SQL query (e.g. LEFT JOIN or INNER JOIN), but there are some joins without a condition (e.g. CROSS JOIN and NATURAL JOIN).
join_type (JoinType, optional) – The specific join that should be performed. Defaults to JoinType.InnerJoin.

base_table() → TableReference#

Provide the table that is farthest to the left in the join chain.

For subqueries or VALUES clauses, this will return the alias of the expression, i.e. the name of the virtual table that is created for the subquery or VALUES clause.

Returns:: The table
Return type:: TableReference

columns() → set[ColumnReference]#

Provides all column sthat are referenced in the source.

For plain table sources this will be empty, but for more complicate structures such as subquery source, this will include all columns that are referenced in the subquery.

Returns:: The columns
Return type:: set[ColumnReference]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in the source.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in the source.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property join_condition: AbstractPredicate | None#

Get the predicate that is used to determine matching tuples from the table.

This can be None if the specific join_type does not require or allow a join condition (e.g. NATURAL JOIN).

Returns:: The condition if it is specified, None otherwise.
Return type:: Optional[AbstractPredicate]

property join_type: JoinType#

Get the kind of join that should be performed.

Returns:: The join type
Return type:: JoinType

property joined_table: JoinTableSource#

Get the nested join statements contained in this join.

Deprecated since version 0.10.0: Use right instead. This is an artifact of the old mosql-based query representation

A nested join is a JOIN statement within a JOIN statement, as in SELECT * FROM R JOIN (S JOIN T ON a = b) ON a = c.

Returns:: The nested joins, can be empty if there are no such joins.
Return type:: JoinTableSource

property left: TableSource#

Get the left-hand side of the join.

Returns:: The join partner. Can be anything from a plain base table, to a subquery, to another join.
Return type:: TableSource

predicates() → QueryPredicates | None#

Provides all predicates that are contained in the source.

For plain table sources this will be None, but for subquery sources, etc. all predicates are returned.

Returns:: The predicates or None if the source does not allow predicates or simply does not contain any.
Return type:: QueryPredicates | None

property right: TableSource#

Get the right-hand side of the join.

Returns:: The join partner. Can be anything from a plain base table, to a subquery, to another join.
Return type:: TableSource

property source: TableSource#

Get the actual table being joined. This can be a proper table or a subquery.

Deprecated since version 0.10.0: Use left instead. This is an artifact of the old mosql-based query representation

Returns:: The table
Return type:: TableSource

tables() → set[TableReference]#

Provides all tables that are referenced in the source.

For plain table sources this will just be the actual table. For more complicated structures, such as subquery sources, this will include all tables of the subquery as well.

Returns:: The tables.
Return type:: set[TableReference]

class postbound.qal.JoinType(*values)#

Indicates the type of a join using the explicit JOIN syntax, e.g. OUTER JOIN or NATURAL JOIN.

The names of the individual values should be pretty self-explanatory and correspond entirely to the names in the SQL standard.

class postbound.qal.Limit(*, limit: int | None = None, offset: int | None = None, fetch_direction: Literal['first', 'next', 'prior', 'last'] = 'first')#

The FETCH FIRST or LIMIT clause restricts the number of output rows returned by the database system.

Each clause can specify an offset (which is probably only meaningful if there is also an ORDER BY clause) and the actual limit. Notice that although many database systems use a non-standard syntax for this clause, our implementation is modelled after the actual SQL standard version (i.e. it produces a FETCH … string output).

Parameters:

limit (Optional[int], optional) – The maximum number of tuples to put in the result set. Defaults to None which indicates that all tuples should be returned.
offset (Optional[int], optional) – The number of tuples that should be skipped from the beginning of the result set. If no OrderBy clause is defined, this makes the result set’s contents non-deterministic (at least in theory). Defaults to None which indicates that no tuples should be skipped.
fetch_direction (FetchDirection)

Raises:

ValueError – If neither a limit, nor an offset are specified

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

property fetch_direction: Literal['first', 'next', 'prior', 'last']#: Get the direction of the limit (e.g. first or prior).

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property limit: int | None#

Get the maximum number of rows in the result set.

Returns:: The limit or None, if all rows should be returned.
Return type:: Optional[int]

property offset: int | None#

Get the offset within the result set (i.e. number of first rows to skip).

Returns:: The offset or None if no rows should be skipped.
Return type:: Optional[int]

class postbound.qal.LogicalOperator(*values)#: The supported unary and binary operators.

class postbound.qal.MathExpression(operator: MathOperator, first_argument: SqlExpression, second_argument: SqlExpression | Sequence[SqlExpression] | None = None)#

A mathematical expression computes a result value based on some formula.

The formula is based on an arbitrary expression, an operator and potentially a number of additional expressions/arguments.

If it is necessary to represent boolean expressions outside of the WHERE and HAVING clauses, a AbstractPredicate should be used instead of a mathematical expression.

Parameters:

operator (MathOperator) – The operator that is used to combine the arguments.
first_argument (SqlExpression) – The first argument. For unary expressions, this can also be the only argument
second_argument (SqlExpression | Sequence[SqlExpression] | None, optional) –
Additional arguments. For the most common case of a binary expression, this will be exactly one argument. Defaults to None to accomodate for unary expressions.

Deprecated since version 0.21.1: Passing a sequence of expressions is deprecated and will be removed in v0.22.0. To encode operations involving more than two operands, a hierarchy of binary expressions should be used instead.

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

static create_binary(lhs: Any, operator: MathOperator | str, rhs: Any) → MathExpression#

Shortcut method to create a new math expression.

Lhs and rhs are processed depending on their actual types:

SqlExpression instances are left as-is
ColumnReference instances are mapped to `ColumnExpression`s
Anything else is wrapped in a StaticValueExpression

Parameters:

lhs (Any)
operator (MathOperator | str)
rhs (Any)

Return type:

MathExpression

property first_arg: SqlExpression#

Get the first argument to the operator. This is always specified.

Returns:: The argument
Return type:: SqlExpression

is_unary() → bool#

Checks, whether the expression is a unary one (e.g. a negation as in -42).

Return type:: bool

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

property operator: MathOperator#

Get the operation to combine the input value(s).

Returns:: The operator
Return type:: MathOperator

property second_arg: SqlExpression | Sequence[SqlExpression] | None#

Get the second argument to the operator.

Depending on the operator, this can be a single expression (the most common case), but also a sequence of expressions (e.g. sum of multiple values) or no value at all (e.g. negation).

Returns:: The argument(s)
Return type:: SqlExpression | Sequence[SqlExpression] | None

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.MathOperator(*values)#: The supported mathematical operators.

A mixed query allows for both the explicit as well as the implicit syntax to be used within the same FROM clause.

The mixed query complements ImplicitSqlQuery and ExplicitSqlQuery by removing the “purity” restriction: the tables that appear in the FROM clause can be described using either plain references or subqueries and they are free to use the JOIN ON syntax. The only thing that is not allowed as a FROM clause is an instance of ImplicitFromClause or an instance of ExplicitFromClause, since those cases are already covered by their respective query classes.

Notice however, that we currently do not enforce the From clause to not be a valid explicit or implicit clause. All checks happen on a type level. If the contents of a general From clause just happen to also be a valid ImplicitFromClause, this is fine.

The attributes and parameters for this query type are the same as for SqlQuery, only the type of the From clause is restricted.

Raises:

ValueError – If the given from_clause is either an implicit FROM clause or an explicit one.

Parameters:

select_clause (Select)
from_clause (Optional[From])
where_clause (Optional[Where])
groupby_clause (Optional[GroupBy])
having_clause (Optional[Having])
orderby_clause (Optional[OrderBy])
limit_clause (Optional[Limit])
cte_clause (Optional[CommonTableExpression])
explain_clause (Optional[Explain])
hints (Optional[Hint])

is_explicit() → bool#

Checks, whether this query has an explicit FROM clause.

Returns:: Whether the query is explicit
Return type:: bool

See also

ExplicitSqlQuery

is_implicit() → bool#

Checks, whether this query has an implicit FROM clause.

Returns:: Whether the query is implicit
Return type:: bool

See also

ImplicitSqlQuery

exception postbound.qal.NoFilterPredicateError(predicate: AbstractPredicate | None = None)#

Error to indicate that a join predicate was supplied at a place where a filter predicate was expected.

Parameters:: predicate (AbstractPredicate | None, optional) – The predicate that caused the error, defaults to None.
Return type:: None

exception postbound.qal.NoJoinPredicateError(predicate: AbstractPredicate | None = None)#

Error to indicate that a filter predicate was supplied at a place where a join predicate was expected.

Parameters:: predicate (AbstractPredicate | None, optional) – The predicate that caused the error, defaults to None
Return type:: None

class postbound.qal.OrderBy(expressions: Iterable[OrderByExpression] | OrderByExpression)#

The ORDER BY clause specifies how result rows should be sorted.

This clause has a similar structure like a Select clause and simply consists of an arbitrary number of OrderByExpression objects.

Parameters:: expressions (Iterable[OrderByExpression] | OrderByExpression) – The terms that should be used to determine the ordering. At least one expression is required
Raises:: ValueError – If no expressions are provided

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static create_for(*columns: ColumnReference | Iterable[ColumnReference], ascending: bool | None = None, nulls_first: bool | None = None) → OrderBy#

Shorthand method to create an ORDER BY clause for a set of column references.

Additional ordering parameters are assigned to all columns.

Parameters:

columns (ColumnReference | Iterable[ColumnReference])
ascending (bool | None)
nulls_first (bool | None)

Return type:

OrderBy

property expressions: Sequence[OrderByExpression]#

Get the expressions that form this ORDER BY clause.

Returns:: The individual terms that make up the ordering in exactly the sequence in which they were specified (which is the only valid sequence since all other orders could change the ordering of the result set).
Return type:: Sequence[OrderByExpression]

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

class postbound.qal.OrderByExpression(column: SqlExpression, ascending: bool | None = None, nulls_first: bool | None = None)#

The OrderByExpression is the fundamental building block for an ORDER BY clause.

Each expression consists of the actual column (which might be an arbitrary SqlExpression, rules and restrictions by the SQL standard are not enforced here) as well as information regarding the ordering of the column. Setting this information to None falls back to the default interpretation by the target database system.

Parameters:

column (SqlExpression) – The column that should be used for ordering
ascending (Optional[bool], optional) – Whether the column values should be sorted in ascending order. Defaults to None, which indicates that the system-default ordering should be used.
nulls_first (Optional[bool], optional) – Whether NULL values should be placed at beginning or at the end of the sorted list. Defaults to None, which indicates that the system-default behaviour should be used.

property ascending: bool | None#

Get the desired ordering of the output rows.

Returns:: Whether to sort in ascending order. None indicates that the default behaviour of the system should be used.
Return type:: Optional[bool]

property column: SqlExpression#

Get the expression used to specify the current grouping.

In the simplest case this can just be a ColumnExpression which sorts directly by the column values. More complicated constructs like mathematical expressions over the column values are also possible.

Returns:: The expression
Return type:: SqlExpression

static create_for(column: ColumnReference, *, ascending: bool | None = None, nulls_first: bool | None = None) → OrderByExpression#

Shorthand method to create an OrderByExpression for a specific column reference.

Parameters:

column (ColumnReference)
ascending (bool | None)
nulls_first (bool | None)

Return type:

OrderByExpression

property nulls_first: bool | None#

Get where to place NULL values in the result set.

Returns:: Whether to put NULL values at the beginning of the result set (or at the end). None indicates that the default behaviour of the system should be used.
Return type:: Optional[bool]

class postbound.qal.PredicateVisitor#

Basic visitor to operator on arbitrary predicate trees.

As a modification to a strict vanilla interpretation of the design pattern, we provide dedicated matching methods for the different composite operators (i.e. for AND, OR and NOT predicates), rather than just matching on CompoundPredicate.

See also

AbstractPredicate

References

class postbound.qal.QuantifierExpression(expression: SqlExpression, *, quantifier: QuantifierOperator)#

An ANY/ALL expression.

For a predicate such as R.a > ALL (SELECT b FROM S) this expression is used to represent the right-hand side of the predicate. It typically appears as a child expression of a BinaryPredicate.

Parameters:

expression (SqlExpression) – The expression being quantified. Typically this is a SubqueryExpression, but some database systems also allow comparing against arrays or other collection types.
quantifier (QuantifierOperator) – The quantifier operator (_ANY_ or _ALL_).

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

static all(expression: SqlExpression | SqlQuery) → QuantifierExpression#

Create an ALL expression.

Parameters:: expression (SqlExpression | SqlQuery)
Return type:: QuantifierExpression

static any(expression: SqlExpression | SqlQuery) → QuantifierExpression#

Create an ANY expression.

Parameters:: expression (SqlExpression | SqlQuery)
Return type:: QuantifierExpression

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

property expression: SqlExpression#

Get the expression that is being compared.

Typically, this will be a SubqueryExpression that produces a relation with a single column. Some database systems also allow comparing against arrays or other collection types.

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

property quantifier: QuantifierOperator#: Get the actual quantifier.

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.QuantifierOperator(*values)#: The supported quantifier operators.

class postbound.qal.QueryPredicates(root: AbstractPredicate | None)#

The query predicates provide high-level access to all the different predicates in a query.

Generally speaking, this class provides the most user-friendly access into the predicate hierarchy and should be sufficient for most use-cases. The provided methods revolve around identifying filter and join predicates easily, as well finding the predicates that are specified on specific tables.

Parameters:: root (Optional[AbstractPredicate]) – The root predicate of the predicate hierarchy that should be represented by the QueryPredicates. Typically, this is a conjunction of the actual predicates.

all_simple() → bool#

Checks, whether all predicates in the hierarchy can be represented as simplified views.

See also

SimpleFilter: The simplified representation of predicates
SimpleJoin: The simplified representation of join predicates

Return type:: bool

and_(other_predicate: QueryPredicates | AbstractPredicate) → QueryPredicates#

Combines the current predicates with additional predicates, creating a conjunction of the two predicates.

The input predicates, as well as the current predicates object are not modified. All changes are applied to the resulting predicates object.

Parameters:: other_predicate (QueryPredicates | AbstractPredicate) – The predicates to combine. Can also be an AbstractPredicate, in which case this predicate is used as the root for the other predicates instance.
Returns:: The merged predicates wrapper. Its root is roughly equivalent to self.root AND other_predicate.root.
Return type:: QueryPredicates

static empty_predicate() → QueryPredicates#

Constructs a new predicates instance without any actual content.

Returns:: The predicates wrapper
Return type:: QueryPredicates

filters() → Collection[AbstractPredicate]#

Provides all filter predicates that are contained in the predicate hierarchy.

By default, the distinction between filters and joins that is defined in AbstractPredicate.is_join is used. However, this behaviour can be changed by subclasses.

This method handles compound predicates as follows:

conjunctions are un-nested, i.e. all predicates that form an AND predicate are collected individually
OR predicates are included with exactly those predicates from their children that are filters. If this is only true for a single predicate, that predicate will be returned directly.
NOT predicates are included if their child predicate is a filter.

Returns:: The filter predicates.
Return type:: Collection[AbstractPredicate]

filters_for(table: TableReference) → AbstractPredicate | None#

Provides all filter predicates that reference a specific table.

If multiple individual filter predicates are specified in the query, they will be combined in one large conjunction.

The determination of matching filter predicates is the same as for the filters() method.

Parameters:: table (TableReference) – The table to retrieve the filters for
Returns:: A (conjunction of) the filter predicates of the table, or None if the table is unfiltered.
Return type:: Optional[AbstractPredicate]

is_empty() → bool#

Checks, whether this predicate wrapper contains any actual predicates.

Returns:: Whether at least one predicate was specified.
Return type:: bool

join_graph(*, merge_aliases: bool = False) → Graph#

Provides the join graph for the predicates.

A join graph is an undirected graph, where each node corresponds to a base table and each edge corresponds to a join predicate between two base tables. In addition, each node is annotated by a predicate key which is a conjunction of all filter predicates on that table (or None if the table is unfiltered). Likewise, each edge is annotated by a predicate key that corresponds to the join predicate (which can never be None).

Parameters:: merge_aliases (bool, optional) – Whether to merge aliases of the same base table into one node. Setting this to True transforms all predicates to dictionaries that map the aliased tables to their filter predicates and frozensets of tables to their join predicates. Otherwise, the predicates are annotated directly.
Returns:: The join graph.
Return type:: nx.Graph

Notes

Since our implementation of a join graph is not based on a multigraph, only binary joins can be represented. The determination of edges is based on AbstractPredicate.join_partners.

joins() → Collection[AbstractPredicate]#

Provides all join predicates that are contained in the predicate hierarchy.

By default, the distinction between filters and joins that is defined in AbstractPredicate.is_join is used. However, this behaviour can be changed by subclasses.

This method handles compound predicates as follows:

conjunctions are un-nested, i.e. all predicates that form an AND predicate are collected individually
OR predicates are included with exactly those predicates from their children that are joins. If this is only true for a single predicate, that predicate will be returned directly.
NOT predicates are included if their child predicate is a join.

Returns:: The join predicates
Return type:: Collection[AbstractPredicate]

joins_between(first_table: TableReference | Iterable[TableReference], second_table: TableReference | Iterable[TableReference], *, _computation: Literal['legacy', 'graph', 'map'] = 'map') → AbstractPredicate | None#

Provides the (conjunctive) join predicate that joins specific tables.

The precise behaviour of this method depends on the provided parameters: If first_table or second_table contain multiple tables, all join predicates between tables from the different sets are returned (but joins from tables within first_table or from tables within second_table are not).

Notice that the returned predicate might also include other tables, if they are part of a join predicate that also joins the given two tables.

The performance of this method can be crucial for some applications, since the join check is often part of a very hot loop. Therefore, a number of different calculation strategies are implemented. If profiling shows that the application is slowed down heavily by the current strategy, it might be a good idea to switch to another algorithm. This can be achieved using the _computation parameter.

Parameters:

first_table (TableReference | Iterable[TableReference]) – The (set of) tables to join
second_table (TableReference | Iterable[TableReference]) – The (set of) join partners for first_table.
_computation (Literal["legacy", "graph", "map"], optional) –
The specific algorithm to use for determining the join partners. The algorithms have very different performance and memory charactersistics. The default map setting is usually the fastest. It should only really be changed if there are very good reasons for it. The following settings exist:
- legacy: uses a recursive strategy in case first_table or second_table contain multiple references. This is pretty slow, but easy to debug
- graph: builds an internal join graph and merges the involved nodes from first_table and second_table to extract the overall join predicate directly.
- map: stores a mapping between tables and their join predicates and merges these mappings to determine the overall join predicate

Returns:

A conjunction of all the individual join predicates between the two sets of candidate tables. If there is no join predicate between any of the tables, None is returned.

Return type:

Optional[AbstractPredicate]

Raises:

ValueError – If the _computation strategy is none of the allowed values

joins_for(table: TableReference) → Collection[AbstractPredicate]#

Provides all join predicates that reference a specific table.

Each entry in the resulting collection is a join predicate between the given table and a (set of) partner tables, such that the partner tables in different entries in the collection are also different. If multiple join predicates are specified between the given table and a specific (set of) partner tables, these predicates are aggregated into one large conjunction.

The determination of matching join predicates is the same as for the joins() method.

Parameters:: table (TableReference) – The table to retrieve the joins for
Returns:: The join predicates with table. If there are no such predicates, the collection is empty.
Return type:: Collection[AbstractPredicate]

joins_tables(tables: TableReference | Iterable[TableReference], *more_tables: TableReference) → bool#

Checks, whether specific tables are all joined with each other.

This does not mean that there has to be a join predicate between each pair of tables, but rather that all pairs of tables must at least be connected through a sequence of other tables. From a graph-theory centric point of view this means that the join (sub-) graph induced by the given tables is connected.

Parameters:

tables (TableReference | Iterable[TableReference]) – The tables to check.
*more_tables – Additional tables that also should be included in the check. This parameter is mainly for convenience usage in interactive scenarios.

Returns:

Whether the given tables can be joined without any cross product

Return type:

bool

Raises:

ValueError – If tables and more_tables is both empty

Examples

The following calls are exactly equivalent:

>>> predicates.joins_tables([table1, table2, table3])
>>> predicates.joins_tables(table1, table2, table3)
>>> predicates.joins_table([table1, table2], table3)

property root: AbstractPredicate#

Get the root predicate that represents the entire predicate hierarchy.

Typically, this is a conjunction of the actual predicates. This conjunction can be used to start a custom traversal of the predicate hierarchy.

Returns:: The root predicate
Return type:: AbstractPredicate
Raises:: StateError – If the predicates warpper is empty and there is no root predicate.

simplify() → Sequence[SimpleFilter | SimpleJoin]#

Converts all predicates in the hierarchy into their simplified counterparts.

See also

SimpleFilter: The simplified representation of predicates
SimpleJoin: The simplified representation of join predicates

Return type:: Sequence[SimpleFilter | SimpleJoin]

class postbound.qal.Select(targets: BaseProjection | SqlExpression | Sequence[BaseProjection | SqlExpression], *, distinct: Iterable[SqlExpression] | bool = False)#

The SELECT clause of a query.

This is the only required part of a query. Everything else is optional and can be left out. (Notice that PostBOUND is focused on SPJ-queries, hence there are no INSERT, UPDATE, or DELETE queries)

A SELECT clause simply consists of a number of individual projections (see BaseProjection), its targets.

Parameters:

targets (BaseProjection | Sequence[BaseProjection]) – The individual projection(s) that form the SELECT clause
distinct (Iterable[SqlExpression] | bool, optional) – Whether a duplicate elimination should be performed. By default, this is False indicating no duplicate elimination. If True, rows are eliminated based on all columns. Alternatively, a DISTINCT ON clause can be created by specifying the columns that should be used for duplicate elimination.

Raises:

ValueError – If the targets are empty.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

static count_star(*, target_name: str = '') → Select#

Shortcut method to create a SELECT COUNT(*) clause.

Parameters:: target_name (str, optional) – An optional name under which the column should be accessible (e.g., COUNT(*) AS freq).
Returns:: The clause
Return type:: Select

static create_count(column: ColumnReference | SqlExpression | None = None, *, target_name: str = '') → Select#

Shorthand to create a SELECT clause with a single COUNT() expression.

If no column is given, defaults to a COUNT(*) expression.

Parameters:

column (ColumnReference | SqlExpression | None)
target_name (str)

Return type:

static create_for(columns: ColumnReference | SqlExpression | Iterable[ColumnReference | SqlExpression], *, distinct: Iterable[SqlExpression] | bool = False) → Select#

Full factory method to accompany star and count_star factory methods.

This is basically the same as calling the __init__ method directly.

Parameters:

columns (ColumnReference | Iterable[ColumnReference]) – The columns that should form the projection
distinct (Iterable[SqlExpression] | bool, optional) – Whether a duplicate elimination should be performed. By default, this is False indicating no duplicate elimination. If True, rows are eliminated based on all columns. Alternatively, a DISTINCT ON clause can be created by specifying the columns that should be used for duplicate elimination.

Returns:

The clause

Return type:

Select

static create_max(column: ColumnReference | SqlExpression, *, target_name: str = '') → Select#

Shorthand to create a SELECT clause with a single MAX() expression.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

static create_min(column: ColumnReference | SqlExpression, *, target_name: str = '') → Select#

Shorthand to create a SELECT clause with a single MIN() expression.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

static create_sum(column: ColumnReference | SqlExpression, *, target_name: str = '') → Select#

Shorthand to create a SELECT clause with a single SUM() expression.

Parameters:

column (ColumnReference | SqlExpression)
target_name (str)

Return type:

distinct_specifier() → Iterable[SqlExpression] | bool#

Provides a precise description of the distinct qualifier.

Returns:

Output should be interpreted as follows:

True: plain DISTINCT over all columns
False: no DISTINCT qualifier
Iterable[SqlExpression]: DISTINCT ON clause over the given expressions

Return type:

Iterable[SqlExpression] | bool

is_distinct() → bool#

Checks, whether this is a SELECT DISTINCT clause (including a DISTINCT ON).

Return type:: bool

is_star() → bool#

Checks, whether the clause is simply SELECT *.

Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

local_columns() → set[str]#

Get all column names that are defined by the SELECT clause and not provided by the underlying tables.

For example, consider a query SELECT R.a, R.b AS foo FROM R. This query defines the local column foo, while R.a is provided by the underlying table R.

Return type:: set[str]

output_names() → dict[str, ColumnReference]#

Output names map the alias of each column to the actual column.

For example, consider a query SELECT R.a AS foo, R.b AS bar FROM R. Calling output_names on this query provides the dictionary {'foo': R.a, 'bar': R.b}.

Currently, this method only works for 1:1 mappings and other aliases are ignored. For example, consider a query SELECT my_udf(R.a, R.b) AS c FROM R. Here, a user-defined function is used to combine the values of R.a and R.b to form an output column c. Such a projection is ignored by output_names.

Returns:: A mapping from the column target name to the original column.
Return type:: dict[str, ColumnReference]

static star(*, distinct: Iterable[SqlExpression] | bool = False) → Select#

Shortcut to create a SELECT * clause.

Parameters:: distinct (Iterable[SqlExpression] | bool) – Whether a duplicate elimination should be performed. By default, this is False indicating no duplicate elimination. If True, rows are eliminated based on all columns. Alternatively, a DISTINCT ON clause can be created by specifying the columns that should be used for duplicate elimination.
Returns:: The clause
Return type:: Select

star_expressions() → Iterable[BaseProjection]#

Provides all * and R.* expressions.

Returns:: The star expressions. Can be empty if no star expressions are used.
Return type:: Iterable[BaseProjection]

tables() → set[TableReference]#

Provides all tables that are referenced in the clause.

Returns:: All tables. This includes virtual tables if such tables are present in the clause
Return type:: set[TableReference]

property targets: Sequence[BaseProjection]#

Get all projections.

Returns:: The projections in the order in which they were originally specified
Return type:: Sequence[BaseProjection]

class postbound.qal.SetOperator(*values)#: The supported set operators.

A set query combines the result sets of two queries using one of the set operations.

Set operations include UNION, UNION ALL, INTERSECT, and EXCEPT. We represent set queries as a different type than “plain” SELECT queries because these allow for a different interface (e.g. providing access to predicates or the SELECT block). See the documentation of SqlQuery for more details on the distinction and the reasoning behind it.

Still, the SetQuery provides exactly the same high-level interface. In case a specific method or property is not applicable for set queries (e.g. calling query.predicates()), a QueryTypeError will be raised. This is motivated by entirely pragmatic reasons: oftentimes a client will not care whether it receive a SqlQuery or a SetQuery because it is only interested in the common denominator between the two (e.g. calling str or retrieving its tables). Therefore, we want to be set queries applicable in the same places. At the same time, set queries are a much more recent addition to PostBOUND, and we do not want to force the client to update its code base if this is not really necessary.

Notice that set queries provide some clauses are supported by both plain SQL queries as well as set queries.

Parameters:

left_query (SelectStatement) – The left-hand side of the set operation
right_query (SelectStatement) – The right-hand side of the set operation
set_operation (SetOperator) – The actual operation to combine the two result sets.
cte_clause (Optional[CommonTableExpression], optional) – The WITH part of the query, by default None
orderby_clause (Optional[OrderBy], optional) – The ORDER BY part of the query, by default None
limit_clause (Optional[Limit], optional) – The LIMIT and OFFSET part of the query. In standard SQL, this is designated using the FETCH FIRST syntax. Defaults to None.
hints (Optional[Hint], optional) – The hint block of the query. Hints are not part of standard SQL and follow a completely system-specific syntax. Even their placement in within the query varies from system to system and from extension to extension. Defaults to None.
explain_clause (Optional[Explain], optional) – The EXPLAIN part of the query. Like hints, this is not part of standard SQL. However, most systems provide EXPLAIN functionality. The specific features and syntax are quite similar, but still system specific. Defaults to None.

See also

SqlQuery

accept_visitor(clause_visitor: ClauseVisitor, *args, **kwargs) → dict[BaseClause, VisitorResult]#

Applies a visitor over all clauses in the current query.

Notice that since the visitor is applied to all clauses, it returns the results for each of them.

Parameters:: clause_visitor (ClauseVisitor) – The visitor algorithm to use.
Return type:: dict[BaseClause, VisitorResult]

ast() → str#

Provides a human-readable representation of the abstract syntax tree for this query.

The AST is a textual representation of the query that shows the structure of the query in a tree-like manner.

Returns:: The abstract syntax tree of this query
Return type:: str

bound_tables() → set[TableReference]#

Provides all tables that can be assigned to a physical or virtual table reference in this query.

Bound tables are those tables, that are selected in the FROM clause of the query, or a subquery. Conversely, unbound tables are those that have to be “injected” by an outer query, as is the case for dependent subqueries.

For example, the query SELECT * FROM R, S WHERE R.a = S.b has two bound tables: R and S. On the other hand, the query SELECT * FROM R WHERE R.a = S.b has only bound R, whereas S has to be bound in a surrounding query.

Returns:: All tables that are bound (i.e. listed in any FROM clause or a CTE) of the query.
Return type:: set[TableReference]

clauses(*, skip: ~typing.Type | ~collections.abc.Iterable[~typing.Type] | None = <class 'NoneType'>) → Sequence[BaseClause]#

Provides all the clauses that are defined (i.e. not None) in this query.

Parameters:: skip (Optional[Type | Iterable[Type]], optional) – The clause types that should be skipped in the output. This can be a single type or an iterable of types.
Returns:: The clauses. The current order of the clauses is as follows: hints, explain, cte, set operation, orderby, limit. Notice however, that this order is not strictly standardized and may change in the future. All clauses that are not specified on the query will be skipped.
Return type:: Sequence[BaseClause]

columns() → set[ColumnReference]#

Provides all columns that are referenced at any point in the query.

This includes columns from all clauses and does not account for renamed columns from subqueries. For example, consider the query SELECT R.a, my_sq.b FROM R JOIN (SELECT b FROM S) my_sq ON R.a < my_sq.b. columns would return the following set: {R.a, S.b, my_sq.b}, even though my_sq.b can be considered as just an alias for S.b.

Returns:: All columns that are referenced in the query.
Return type:: set[ColumnReference]

contains_cross_product() → bool#

Checks, whether this query has at least one cross product.

Returns:: Whether this query has cross products.
Return type:: bool

property cte_clause: CommonTableExpression | None#

Get the WITH clause of the query.

Returns:: The WITH clause if it was specified, or None otherwise.
Return type:: Optional[CommonTableExpression]

property explain: Explain | None#

Get the EXPLAIN block of the query.

Returns:: The EXPLAIN settings if specified, or None otherwise.
Return type:: Optional[Explain]

property from_clause: From | None#: Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

property groupby_clause: GroupBy | None#: Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

property having_clause: Having | None#: Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

property hints: Hint | None#

Get the hint block of the query.

The hints can specify preparatory statements that have to be executed before the actual query is run in addition to the hints themselves.

Returns:: The hint block if it was specified, or None otherwise.
Return type:: Optional[Hint]

is_dependent() → bool#

Checks, whether all columns that are referenced in this query are provided by the tables from this query.

In order for this check to work, all columns have to be bound to actual tables, i.e. the tables attribute of all column references have to be set to a valid object.

Returns:: Whether all columns belong to tables that are bound by this query
Return type:: bool

is_explain() → bool#

Checks, whether this query is an EXPLAIN query rather than a normal SQL query.

An EXPLAIN query is not executed like a normal SELECT query. Instead of actually calculating a result set, the database system only provides a query plan. This plan is the execution plan that would be used, had the query been entered as a normal SQL query.

Returns:: Whether this query should be explained, rather than executed.
Return type:: bool

is_explicit() → bool#

Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

Return type:: bool

is_implicit() → bool#

Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

Return type:: bool

is_ordered() → bool#

Checks, whether this query produces its result tuples in order.

Returns:: Whether a valid ORDER BY clause was specified on the query.
Return type:: bool

is_scalar() → bool#

Checks, whether the query is guaranteed to provide a single scalar value as a result.

Scalar results can only be calculated by queries with a single projection in the SELECT clause and if that projection is an aggregate function, e.g. SELECT min(R.a) FROM R. However, there are other queries which could also be scalar “by chance”, e.g. SELECT R.b FROM R WHERE R.a = 1 if R.a is the primary key of R. Notice that such cases are not recognized by this method.

Returns:: Whether the query will always return a single scalar value
Return type:: bool

is_set_query() → bool#

Checks, whether this query is a set query.

A set query is a query that combines the results of two or more queries into a single result set. This can be done by combining the tuples from both sets using a UNION clause (which removes duplicates), or a UNION ALL clause (which retains duplicates). Alternatively, only tuples that are present in both sets can be retained using an INTERSECT clause. Finally, all tuples from the first result set that are not part of the second result set can be computed using an EXCEPT clause.

Notice that only one of the set operators can be used at a time, but the input query of one set operation can itself use another set operation.

Returns:: Whether this query is a set query
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this query.

In contrast to the columns method, duplicates are returned multiple times, i.e. if a column is referenced n times in this query, it will also be returned n times by this method. Furthermore, the order in which columns are provided by the iterable matches the order in which they appear in this query.

Returns:: The columns
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this query.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property left_query: SqlQuery | SetQuery#

Get the left-hand side of the set operation.

Returns:: The left-hand side of the set operation
Return type:: SelectStatement

property limit_clause: Limit | None#

Get the combined LIMIT and OFFSET clauses of the query.

According to the SQL standard, these clauses should use the FETCH FIRST syntax. However, many systems use OFFSET and LIMIT instead.

Returns:: The FETCH FIRST clause if it was specified, or None otherwise.
Return type:: Optional[Limit]

property orderby_clause: OrderBy | None#

Get the ORDER BY clause of the query.

Returns:: The ORDER BY clause if it was specified, or None otherwise.
Return type:: Optional[OrderBy]

output_columns() → Sequence[ColumnReference]#

Provides the columns that form the result relation of this query.

Columns are ordered according to their appearance in the SELECT clause and will not have a bound table associated with them. This is because the query result is “anonymous” and does not have a relation name associated with it. The columns are named according to the following rules:

If the expression has an alias, this name is used
If the expression is a simple column reference, the column name is used
Otherwise, a generic name is used, e.g. “column_1”, “column_2”, etc.

Additionally, to resolve naming conflicts between the left-hand side and the right-hand side of the set operation, names from the left-hand side overwrite names from the right-hand side. Names from the right-hand side are only used if the left-hand side does not provide a name. If both sides do not specify a name, a new generic name is used.

Returns:: The columns. Their order matches the order in which they appear in the SELECT clause of the query.
Return type:: Sequence[ColumnReference]

predicates() → QueryPredicates#

Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

Return type:: QueryPredicates

property right_query: SqlQuery | SetQuery#

Get the right-hand side of the set operation.

Returns:: The right-hand side of the set operation
Return type:: SelectStatement

property select_clause: Select#: Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

property set_clause: UnionClause | ExceptClause | IntersectClause#: Get the set clause of the query.

property set_operation: SetOperator#

Get the set operation that is used to combine the two queries.

Returns:: The set operation
Return type:: SetOperator

stringify(*, trailing_delimiter: bool = True) → str#

Provides a string representation of this query.

The only difference to calling str directly, is that the stringify method provides control over whether a trailing delimiter should be appended to the query.

Parameters:: trailing_delimiter (bool, optional) – Whether a delimiter should be appended to the query. Defaults to True.
Returns:: A string representation of this query
Return type:: str

subqueries() → Collection[SqlQuery]#

Provides all subqueries that are referenced in this query.

Notice that CTEs are ignored by this method, since they can be accessed directly via the cte_clause property.

Returns:: All subqueries that appear in any of the “inner” clauses of the query
Return type:: Collection[SqlQuery]

tables() → set[TableReference]#

Provides all tables that are referenced at any point in the query.

This includes tables from all clauses. Virtual tables will be included and tables that are only scanned within subqueries are included as well. Notice however, that some database systems might not support subqueries to be put at arbitrary positions in the query (e.g. GROUP BY clause).

Returns:: All tables that are referenced in the query.
Return type:: set[TableReference]

unbound_tables() → set[TableReference]#

Provides all tables that are referenced in this query but not bound.

While tables() provides all tables that are referenced in this query in any way, bound_tables restricts these tables. This method provides the complementary set to bound_tables i.e. tables = bound_tables ⊕ unbound_tables.

Returns:: The unbound tables that have to be supplied as part of an outer query
Return type:: set[TableReference]

property where_clause: Where | None#: Placeholder method to ensure compatibility with the SqlQuery interface. Raises a QueryTypeError.

class postbound.qal.SimpleFilter(column: ColumnReference | AbstractPredicate, operation: LogicalOperator | None = None, value: Any = None, *, predicate: AbstractPredicate | None = None)#

A simple filter provides more streamlined and direct access to filter predicates.

A simple filter is a read-only predicate, i.e. it cannot be created on its own and has to be derived from a base predicate (either a binary predicate, a BETWEEN predicate, or an IN predicate). Afterward, it provides read-only access to the predicate being filtered, the filter operation, as well as the values used to restrict the allowed column instances. All these attributes are made directly available instead of via hierarchies of SqlExpression instances.

Note that not all base predicates can be represented as a simplified view. In order for the view to work, both the column as well as the filter values cannot be modified by other expressions such as function calls or mathematical expressions. However, cast expressions are tolerated and will simply be dropped. As a rule of thumb, if an expression modifies a value (such as a function call), this cannot be unwrapped. Therefore, a filter approximately has to be of the form <column reference> <operator> <static values> in order for the representation to work.

The static methods attempt_wrap, wrap, can_wrap and wrap_all serve as high-level access points into the view. The components of the view are accessible via properties. In general, using attempt_wrap and wrap_all should be sufficient for most purposes.

Parameters:

column (ColumnReference | AbstractPredicate) –
The column being filtered.

For legacy purposes, this can (counterintuitively) also be a full predicate. This is treated as follows: if a predicate is given, all other parameters need to be omitted (they are ignored either way). The predicate must be of simple form (see explanation above) and its components will be automatically unwrapped and used as attribute values for the simplified version. The introduction of the attempt_wrap method has made this mode unnecessary. It is only available to keep backwards compatibility.
operation (LogicalOperator | None) – The filter operation (e.g. IN or <>). This cannot be EXISTS or MISSING, since subqueries cannot be represented in simplified form. Passing None only works in legacy mode.
value (Any) – The filter value. Passing None is interpreted as SQL NULL.
predicate (AbstractPredicate | None) – The original predicate that is represented by this simplified view. Passing None only works in legacy mode.

See also

SimpleJoin: A similar view dedicated to join predicates

Examples

The best way to construct simplified views is to start with the QueryPredicates and extract the filter predicates, e.g., by using views = SimpleFilter.wrap_all(query.predicates()) or filters = SimpleFilter.wrap_all(query.predicates().joins()). Notice that especially the first conversion can be “lossy”: all join predicates are dropped. Likewise, all filters that are more complex such as disjunctions are ignored. Alternatively, the QueryPredicates also provides a simplify() method that can be used to convert all predicates (filters and joins) into their simplified counterparts.

If a single predicate should be simplified, the attempt_wrap(predicate) usually works best.

The following predicates can be represented as a simplified view: R.a = 42, R.b BETWEEN 1 AND 2 or R.c IN (11, 22, 33). On the other hand, the following predicates cannot be represented b/c they involve advanced operations: R.a + 10 = 42 (contains a mathematical expression) and some_udf(R.a) < 11 % 2 (contains a function call, a predicate expressoin, and a mathematical expression).

Notes

Simple filters can be used in match statements and provide (column, operation, value) as arguments.

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

static attempt_wrap(predicate: AbstractPredicate) → SimpleFilter | None#

Transforms a predicate into a simplified view. Returns None if that is not possible.

Parameters:: predicate (AbstractPredicate)
Return type:: SimpleFilter | None

base_predicates() → Iterable[AbstractPredicate]#

Provides all base predicates that form this predicate.

This allows to iterate over all leaves of a compound predicate, for base predicates it simply returns the predicate itself.

Returns:: The base predicates, in an arbitrary order. If the predicate is a base predicate already, it will be the only item in the iterable.
Return type:: Iterable[AbstractPredicate]

static can_wrap(predicate: AbstractPredicate) → bool#

Checks, whether a specific predicate can be represented as a simplified view.

Parameters:: predicate (AbstractPredicate) – The predicate to check
Returns:: Whether a representation as a simplified view is possible.
Return type:: bool

property column: ColumnReference#

Get the filtered column.

Returns:: The column
Return type:: ColumnReference

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

is_compound() → bool#

Checks, whether this predicate combines the evaluation of other predicates to compute the overall evaluation.

Returns:: Whethter this predicate is a composite of other predicates.
Return type:: bool

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

property operation: LogicalOperator#

Get the SQL operation that is used for the filter (e.g. IN or <>).

Returns:: the operator. This cannot be EXISTS or MISSING, since subqueries cannot be represented in simplified views.
Return type:: LogicalSqlOperators

unwrap() → AbstractPredicate#

Get the original predicate that is represented by this view.

Returns:: The original predicate
Return type:: AbstractPredicate

property value: object | tuple[object] | Sequence[object]#

Get the filter value.

Returns:: The value. For a binary predicate, this is just the value itself. For a BETWEEN predicate, this is tuple of the form (lower, upper) and for an IN predicate, this is a sequence of the allowed values.
Return type:: object | tuple[object] | Sequence[object]

static wrap(predicate: AbstractPredicate) → SimpleFilter#

Transforms a predicate into a simplified view. Raises an error if that is not possible.

Parameters:: predicate (AbstractPredicate) – The predicate to represent as a simplified view
Returns:: A simplified view wrapping the given predicate
Return type:: SimpleFilter
Raises:: ValueError – If the predicate cannot be represented as a simplified view.

static wrap_all(predicates: Iterable[AbstractPredicate]) → Sequence[SimpleFilter]#

Transforms specific predicates into simplified views.

If individual predicates cannot be represented as views, they are ignored.

Parameters:: predicates (Iterable[AbstractPredicate]) – The predicates to represent as views. These can be arbitrary predicates, i.e. including joins and complex filters.
Returns:: The simplified views for all predicates that can be represented this way. The sequence of the views matches the sequence in the predicates. If the representation fails for individual predicates, they simply do not appear in the result. Therefore, this sequence may be empty if none of the predicates are valid simplified views.
Return type:: Sequence[SimpleFilter]

class postbound.qal.SimpleJoin(lhs: ColumnReference | AbstractPredicate, rhs: ColumnReference | None = None, *, predicate: AbstractPredicate | None = None)#

A simple join provides a more streamlined and direct access to join predicates.

A simple join is a read-only predicate, i.e. it cannot be created on its own and has to be derived from a binary equi-join predicate. Afterward, it provides read-only access to the partner columns that are joined.

Note that not all binary joins can be represented in a simplified view. In order for the view to work, the join must be an equi-join, i.e. using LogicalOperator.Equal. Furthermore, both sides of the join have to cannot be modified by other expressions such as function calls or mathematical expressions. However, cast expressions are tolerated and will simply be dropped. As a rule of thumb, if an expression modifies a value (such as a function call), this cannot be unwrapped. Therefore, a join approximately has to be of the form <first col> = <second col> in order for the representation to work.

Parameters:

lhs (ColumnReference | AbstractPredicate) – The left-hand side of the join. For legacy purposes, this can (counterintuitively) also be a full predicate. This is treated as follows: if a predicate is given, all other parameters need to be omitted (they are ignored either way). The predicate must be of simple form (see explanation above) and its components will be automatically unwrapped and used as attribute values for the simplified version. The introduction of the attempt_wrap method has made this mode unnecessary. It is only available to keep backwards compatibility.
rhs (ColumnReference | None) – The right-hand side of the join. Passing None only works in legacy mode.
predicate (AbstractPredicate | None) – The original predicate that is represented by this simplified view. Passing None only works in legacy mode.

See also

SimpleFilter: A similar view dedicated to filter predicates

Examples

The best way to construct simplified views is to start with the QueryPredicates and extract the joins, e.g. by using views = SimpleJoin.wrap_all(query.predicates()) or joins = SimpleJoin.wrap_all(query.predicates().joins()). Notice that especially the first conversion can be “lossy”: all filter predicates are dropped. Likewise, all joins that are not equi-joins are ignored. Alternatively, the QueryPredicates also provides a simplify() method that can be used to convert all predicates (filters and joins) into their simplified counterparts.

Notes

Simple joins can be used in match statements and provide the lhs and rhs properties.

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

static attempt_wrap(predicate: AbstractPredicate) → SimpleJoin | None#

Transforms a predicate into a simplified view. Returns None if that is not possible.

Parameters:: predicate (AbstractPredicate)
Return type:: SimpleJoin | None

base_predicates() → Iterable[AbstractPredicate]#

Provides all base predicates that form this predicate.

This allows to iterate over all leaves of a compound predicate, for base predicates it simply returns the predicate itself.

Returns:: The base predicates, in an arbitrary order. If the predicate is a base predicate already, it will be the only item in the iterable.
Return type:: Iterable[AbstractPredicate]

static can_wrap(predicate: AbstractPredicate) → bool#

Checks, whether a specific predicate can be represented as a simplified view.

Parameters:: predicate (AbstractPredicate) – The predicate to check
Returns:: Whether a representation as a simplified view is possible.
Return type:: bool

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

is_compound() → bool#

Checks, whether this predicate combines the evaluation of other predicates to compute the overall evaluation.

Returns:: Whethter this predicate is a composite of other predicates.
Return type:: bool

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

joins(partner: TableReference | ColumnReference) → bool#

Checks, whether this predicate joins the given column or table.

Parameters:: partner (TableReference | ColumnReference)
Return type:: bool

property lhs: ColumnReference#: Get the left-hand side of the join.

partner_of(other: TableReference | ColumnReference) → TableReference | ColumnReference | None#

Provides the join partner of the given column or table.

If the column or table is not joined, None is returned.

Parameters:: other (TableReference | ColumnReference)
Return type:: TableReference | ColumnReference | None

property rhs: ColumnReference#: Get the right-hand side of the join.

unwrap() → AbstractPredicate#

Get the original predicate that is represented by this view.

Return type:: AbstractPredicate

static wrap(predicate: AbstractPredicate) → SimpleJoin#

Transforms a specific predicate into a simplified view. Raises an error if that is not possible.

Parameters:: predicate (AbstractPredicate) – The predicate to represent as a simplified view
Returns:: A simplified view wrapping the given predicate
Return type:: SimpleJoin
Raises:: ValueError – If the predicate cannot be represented as a simplified view.

static wrap_all(predicates: Iterable[AbstractPredicate]) → Sequence[SimpleJoin]#

Transforms specific predicates into simplified views.

If individual predicates cannot be represented as views, they are ignored.

Parameters:: predicates (Iterable[AbstractPredicate]) – The predicates to represent as views. These can be arbitrary predicates, i.e. including filters and complex joins.
Returns:: The simplified views for all predicates that can be represented this way. The sequence of the views matches the sequence in the predicates. If the representation fails for individual predicates, they simply do not appear in the result. Therefore, this sequence may be empty if none of the predicates are valid simplified views.
Return type:: Sequence[SimpleJoin]

class postbound.qal.SqlExpression(hash_val: int)#

Base class for all expressions.

Expressions form one of the central building blocks of representing a SQL query in the QAL. They specify how values from different columns are modified and combined, thereby forming larger (hierarchical) structures.

Expressions can be inserted in many different places in a SQL query. For example, a SELECT clause produces columns such as in SELECT R.a FROM R, but it can also modify the column values slightly, such as in SELECT R.a + 42 FROM R. To account for all these different situations, the SqlExpression is intended to form hierarchical trees and chains of expressions. In the first case, a ColumnExpression is used, whereas a MathExpression can model the second case. Whereas column expressions represent leaves in the expression tree, mathematical expressions are intermediate nodes.

As a more advanced example, a complicated expressions such as my_udf(R.a::interval + 42) which consists of a user-defined function, a value cast and a mathematical operation is represented the following way: FunctionExpression(MathExpression(CastExpression(ColumnExpression), StaticValueExpression)). The methods provided by all expression instances enable a more convenient use and access to the expression hierarchies.

The different kinds of expressions are represented using different subclasses of the SqlExpression interface. This really is an abstract interface, not a usable expression. All inheriting expression have to provide their own __eq__ method and re-use the __hash__ method provided by the base expression. Remember to explicitly set this up! The concrete hash value is constant since the clause itself is immutable. It is up to the implementing class to make sure that the equality/hash consistency is enforced.

Parameters:: hash_val (int) – The hash of the concrete expression object

abstractmethod accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

abstractmethod columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

abstractmethod iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

abstractmethod itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

abstractmethod tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.SqlExpressionVisitor#

Basic visitor to operator on arbitrary expression trees.

See also

SqlExpression

References

class postbound.qal.StarExpression(*, from_table: TableReference | None = None)#

A special expression that is only used in SELECT clauses to select all columns.

Parameters:: from_table (Optional[TableReference], optional) – The table from which to select all columns. Defaults to None, in which case all columns of all tables are being selected.

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

property from_table: TableReference | None#

Get the table from which to select all columns.

If no such table was selected, all columns of all tables are being selected.

Returns:: The table, or None if all columns are selected
Return type:: Optional[TableReference]

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.StaticValueExpression(value: T)#

An expression that wraps a literal/static value.

This is one of the leaf expressions that does not contain any further child expressions.

NULL values are represented by None, you can use StaticValueExpression.null() to create such an expression and is_null() to check whether the value is NULL.

Parameters:: value (T) – The value that is wrapped by the expression

Examples

Consider the following SQL query: SELECT * FROM R WHERE R.a = 42. In this case the comparison value of 42 will be represented as a static value expression. The reference to the column R.a cannot be a static value since its values depend on the actual column values. Hence, a ColumnExpression is used for it.

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

is_null() → bool#

Checks, whether the value is NULL.

Return type:: bool

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

static null() → StaticValueExpression[None]#

Create a static value expression that represents a NULL value.

Return type:: StaticValueExpression[None]

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

property value: T#

Get the value.

Returns:: The value, duh!
Return type:: T

class postbound.qal.SubqueryExpression(subquery: SqlQuery)#

A subquery expression wraps an arbitrary subquery.

This expression can be used in two different contexts: as table source to produce a virtual temporary table for reference in the query (see the clauses module), or as a part of a predicate. In the latter scenario the subqueries’ results are transient for the rest of the query. Therefore, this expression only represents the subquery part but no name under which the query result can be accessed. This is added by the different parts of the clauses module (e.g. WithQuery or SubqueryTableSource).

This is a leaf expression, i.e. a subquery expression cannot have any more child expressions. However, the subquery itself likely consists of additional expressions.

Parameters:: subquery (SqlQuery) – The subquery that forms this expression

accept_visitor(visitor: SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current expression by an expression visitor.

Parameters:

visitor (SqlExpressionVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced by this expression.

Returns:: The columns
Return type:: set[ColumnReference]

iterchildren() → Iterable[SqlExpression]#

Provides unified access to all child expressions of the concrete expression type.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this expression.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

property query: SqlQuery#

The (sub)query that is wrapped by this expression.

Returns:: The query
Return type:: SqlQuery

tables() → set[TableReference]#

Provides all tables that are accessed by this expression.

Returns:: All tables. This includes virtual tables if such tables are present in the expression.
Return type:: set[TableReference]

class postbound.qal.SubqueryTableSource(query: SqlQuery | SubqueryExpression, target_name: str | TableReference = '', *, lateral: bool = False)#

Models subquery that is referenced as a virtual table in the FROM clause.

Consider the example query SELECT * FROM R, (SELECT * FROM S, T WHERE S.a = T.b) AS s_t WHERE R.c = s_t.a. In this query, the subquery s_t would be represented as a subquery table source.

Parameters:

query (SqlQuery | SubqueryExpression) – The query that is sourced as a subquery
target_name (str | TableReference, optional) – The name under which the subquery should be made available. Can empty for an anonymous subquery.
lateral (bool, optional) – Whether the subquery should be executed as a lateral join. Defaults to False.

Raises:

ValueError – If the target_name is empty

columns() → set[ColumnReference]#

Provides all column sthat are referenced in the source.

For plain table sources this will be empty, but for more complicate structures such as subquery source, this will include all columns that are referenced in the subquery.

Returns:: The columns
Return type:: set[ColumnReference]

property expression: SubqueryExpression#

Get the query that is used to construct the virtual table, as a subquery expression.

Returns:: The subquery.
Return type:: SubqueryExpression

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in the source.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in the source.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property lateral: bool#

Get, whether this is a lateral join.

Returns:: Whether this is a lateral join
Return type:: bool

output_columns() → Sequence[ColumnReference]#

Provides the columns that form the result relation of this subquery.

The columns are named according to the following rules:

If the column has an alias, this name is used
If the column is a simple column reference, the column name is used
Otherwise, a generic name is used, e.g. “column_1”, “column_2”, etc.

Returns:: The columns. Their order matches the order in which they appear in the SELECT clause of the query.
Return type:: Sequence[ColumnReference]

predicates() → QueryPredicates | None#

Provides all predicates that are contained in the source.

For plain table sources this will be None, but for subquery sources, etc. all predicates are returned.

Returns:: The predicates or None if the source does not allow predicates or simply does not contain any.
Return type:: QueryPredicates | None

property query: SqlQuery#

Get the query that is sourced as a virtual table.

Returns:: The query
Return type:: SqlQuery

tables() → set[TableReference]#

Provides all tables that are referenced in the source.

For plain table sources this will just be the actual table. For more complicated structures, such as subquery sources, this will include all tables of the subquery as well.

Returns:: The tables.
Return type:: set[TableReference]

property target_name: str#

Get the name under which the virtual table can be accessed in the actual query.

Returns:: The name. Can be empty for an anonymous subquery.
Return type:: str

property target_table: TableReference | None#

Get the name under which the virtual table can be accessed in the actual query.

The only difference to target_name this return type: this property provides the name as a proper table reference, rather than a string.

Returns:: The table. This will always be a virtual table. Can be None for an anonymous subquery.
Return type:: Optional[TableReference]

class postbound.qal.TableSource#

A table source models a relation that can be scanned by the database system, filtered, joined, …

This is what is commonly reffered to as a table or a relation and forms the basic item of a FROM clause. In an SQL query the items of the FROM clause can originate from a number of different concepts. In the simplest case, this is just a physical table (e.g. SELECT * FROM R, S, T WHERE ...), but other queries might reference subqueries or common table expressions in the FROM clause (e.g. SELECT * FROM R, (SELECT * FROM S, T WHERE ...) WHERE ...). This class models the similarities between these concepts. Specific sub-classes implement them for the concrete kind of source (e.g. DirectTableSource or SubqueryTableSource).

abstractmethod columns() → set[ColumnReference]#

Provides all column sthat are referenced in the source.

For plain table sources this will be empty, but for more complicate structures such as subquery source, this will include all columns that are referenced in the subquery.

Returns:: The columns
Return type:: set[ColumnReference]

abstractmethod itercolumns() → Iterable[ColumnReference]#

Provides access to all column in the source.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

abstractmethod iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in the source.

Returns:: The expressions
Return type:: Iterable[SqlExpression]

abstractmethod predicates() → QueryPredicates | None#

Provides all predicates that are contained in the source.

For plain table sources this will be None, but for subquery sources, etc. all predicates are returned.

Returns:: The predicates or None if the source does not allow predicates or simply does not contain any.
Return type:: QueryPredicates | None

abstractmethod tables() → set[TableReference]#

Provides all tables that are referenced in the source.

For plain table sources this will just be the actual table. For more complicated structures, such as subquery sources, this will include all tables of the subquery as well.

Returns:: The tables.
Return type:: set[TableReference]

class postbound.qal.UnaryPredicate(column: SqlExpression, operation: MathOperator | LogicalOperator | CompoundOperator | None = None)#

A unary predicate is applied directly to an expression, evaluating to TRUE or FALSE.

Examples of such predicates include R.a IS NOT NULL, EXISTS (SELECT S.b FROM S WHERE R.a = S.b), or my_udf(R.a). In the last case, my_udf has to produce a boolean return value.

Parameters:

column (SqlExpression) – The expression that is tested. This can also be a user-defined function that produces a boolen return value.
operation (Optional[SqlOperator], optional) – The operation that is used to generate the unary predicate. Only a small subset of operators can actually be used in this context (e.g. EXISTS or MISSING). If the predicate does not require an operator (e.g. in the case of filtering UDFs), the operation can be None. Notice however, that PostBOUND has no knowledge of the semantics of UDFs and can therefore not enforce, whether UDFs is actually valid in this context. This has to be done at runtime by the actual database system.

Raises:

ValueError – If the given operation is not a valid unary operator

accept_visitor(visitor: PredicateVisitor[VisitorResult] | SqlExpressionVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current predicate by a predicate visitor.

Parameters:

visitor (PredicateVisitor[VisitorResult]) – The visitor
args – Additional arguments to pass to the visitor
kwargs – Additional keyword arguments to pass to the visitor

Return type:

VisitorResult

property column: SqlExpression#

The column that is checked by this predicate

Returns:: The expression
Return type:: SqlExpression

columns() → set[ColumnReference]#

Provides all columns that are referenced by this predicate.

Returns:: The columns. If the predicate contains a subquery, all columns of that query are included.
Return type:: set[ColumnReference]

static exists(subquery: SqlQuery | SubqueryExpression) → UnaryPredicate#

Creates an EXISTS predicate for a subquery.

Parameters:: subquery (SqlQuery | SubqueryExpression) – The subquery. Will be wrapped in a SubqueryExpression if it is not already one.
Returns:: The EXISTS predicate
Return type:: UnaryPredicate

is_exists() → bool#

Checks, whether this predicate is an EXISTS predicate.

Returns:: Whether this predicate is an EXISTS predicate
Return type:: bool

is_join() → bool#

Checks, whether this predicate encodes a join between two tables.

PostBOUND uses the following criteria to determine, whether a predicate is join or not:

all predicates of the form <col 1> <operator> <col 2> where <col 1> and <col 2> come from different tables are joins. The columns can optionally be modified by value casts or static expressions, e.g. R.a::integer + 7
all functions that access columns from multiple tables are joins, e.g. my_udf(R.a, S.b)
all subqueries are treated as filters, no matter whether they are dependent subqueries or not. This means that both R.a = (SELECT MAX(S.b) FROM S) and R.a = (SELECT MAX(S.b) FROM S WHERE R.c = S.d) are treated as filters and not as joins, even though the second subquery will require some sort of the join in the query plan.
BETWEEN and IN predicates are treated according to rule 1 since they can be emulated via base predicates (subqueries in IN predicates are evaluated according to rule 3.)

Returns:: Whether the predicate is a join of different relations.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides all columns that are referenced by this predicate.

If a column is referenced multiple times, it is also returned multiple times.

Returns:: All columns in exactly the order in which they are used.
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all expressions that are directly contained in this predicate.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

join_partners() → set[tuple[ColumnReference, ColumnReference]]#

Provides all pairs of columns that are joined within this predicate.

Returns:: The pairs of joined columns. These pairs are normalized, such that two predicates which join the same columns provide the join partners in the same order.
Return type:: set[tuple[ColumnReference, ColumnReference]]
Raises:: NoJoinPredicateError –

class postbound.qal.UnionClause(left_query: SqlQuery | SetQuery, right_query: SqlQuery | SetQuery, *, union_all: bool = False)#

The UNION or UNION ALL clause of a query.

Parameters:

left_query (SelectStatement) – The left input to the UNION operation. Since UNIONs are commutative, the assignment of left and right does not really matter.
right_query (SelectStatement) – The right input to the UNION operation. Since UNIONs are commutative, the assignment of left and right does not really matter.
union_all (bool, optional) – Whether the UNION operation should eliminate duplicates or not. Defaults to False which indicates that duplicates should be eliminated.

accept_visitor(visitor: ClauseVisitor[VisitorResult], *args, **kwargs) → VisitorResult#

Enables processing of the current clause by a visitor.

Parameters:

visitor (ClauseVisitor[VisitorResult]) – The visitor
args – Additional arguments that are passed to the visitor
kwargs – Additional keyword arguments that are passed to the visitor

Return type:

VisitorResult

columns() → set[ColumnReference]#

Provides all columns that are referenced in the clause.

Returns:: The columns
Return type:: set[ColumnReference]

input_queries() → set[SqlQuery | SetQuery]#

Get the two input queries that are part of the UNION operation.

Returns:: The left and right queries. Since UNIONs are commutative, the assignment of left and right does not really matter.
Return type:: set[SelectStatement]

is_union_all() → bool#

Whether this is a UNION or UNION ALL clause.

Returns:: True if duplicates are eliminated, False otherwise.
Return type:: bool

itercolumns() → Iterable[ColumnReference]#

Provides access to all column in this clause.

Returns:: All columns exactly in the order in which they are used
Return type:: Iterable[ColumnReference]

iterexpressions() → Iterable[SqlExpression]#

Provides access to all directly contained expressions in this clause.

Nested expressions can be accessed from these expressions in a recursive manner (see the SqlExpression interface for details).

Returns:: The expressions
Return type:: Iterable[SqlExpression]

property left_query: SqlQuery | SetQuery#

Get the left query that is part of the UNION operation.

Returns:: The left query. Since UNIONs are commutative, the assignment of left and right does not really matter.
Return type:: SelectStatement

See also

property right_query: SqlQuery | SetQuery#

Get the right query that is part of the UNION operation.

Returns:: The right query. Since UNIONs are commutative, the assignment of left and right does not really matter.
Return type:: SelectStatement

See also