Cookbook
========

The cookbook demonstrates how to perform certain, frequently used patterns.
Throughout the examples, we use the following setup:

.. ipython:: python
    :okwarning:

    import postbound as pb
    pg_instance = pb.postgres.connect(config_file=".psycopg_connection")
    stats = pb.workloads.stats()


.. _cookbook-cardinality-estimation:

Cardinality estimation
----------------------

PostBOUND provides two main :doc:`optimization pipelines <core/optimization>` to implement new cardinality estimators, the
:class:`~postbound.TextBookOptimizationPipeline` and the :class:`~postbound.MultiStageOptimizationPipeline`. At a first glance,
the :class:`~postbound.TextBookOptimizationPipeline` might seem to be the better suited one. However, this pipeline does not
allow to "fill the gaps" as well, since it relies on a (simulated) plan enumerator for performance reasons. Therefore, plans
generated by a textbook pipeline which does not specify its own enumerator can differ slightly from the plans that the native
query optimizer would select (see the details of the :ref:`default plan enumerator selection <default-enumerator>`).
In contrast, the :class:`~postbound.MultiStageOptimizationPipeline` allows to use only generated cardinality estimates as part
of the :class:`~postbound.PlanParameterization` stage, even without specifying a join order or physical operators.
This allows the "fill the gaps" approach to use the native query optimizer's join ordering and operator selection strategies to
take over.

**Therefore, the recommended way to implement a new cardinality estimator is to use the**
:class:`~postbound.CardinalityEstimator` **within a** :class:`~postbound.MultiStageOptimizationPipeline`. If you should need to
switch to a textbook pipeline later, you can do so without changing your estimator implementation. This is because the
estimator class already provides default implementations to satisfy both interfaces.


.. _cookbook-partial-hinting:

Manual hinting
--------------

You can easily generate hinted queries and execution plans using any of the fundamental
:ref:`optimizer data structures <optimizer-data-structures>` by talking directly to the :class:`~postbound.db.HintService` of
your target :class:`~postbound.Database`:

.. ipython:: python

    query = stats["q-10"]
    print(pb.qal.format_quick(query))
    operators = pb.PhysicalOperatorAssignment()
    operators.add(pb.JoinOperator.HashJoin, query.tables())
    operators

    hinted_query = pg_instance.hinting().generate_hints(query, physical_operators=operators)
    hinted_query

    print(pg_instance.optimizer().query_plan(hinted_query).inspect())

Combined with the :mod:`query transformation tools <postbound.transform>` this is a powerful mechanism to obtain (partial)
plans for arbitrary subqueries:

.. ipython:: python
    :okwarning:

    subquery = pb.transform.extract_query_fragment(query, pb.TableReference("posts", "p"))
    print(pb.qal.format_quick(subquery))
    cards = pb.PlanParameterization()
    cards.add_cardinality(subquery.tables(), 42)
    hinted_subquery = pg_instance.hinting().generate_hints(subquery, plan_parameters=cards)


.. _cookbook-postgres-plans:

Postgres Query Plans
--------------------

When working with Postgres, there are three basic ways to access query plans:

1. You can retrieve the raw plan JSON using a plain :meth:`~postbound.postgres.PostgresInterface.execute_query`
2. You can parse a raw plan into a :class:`~postbound.postgres.PostgresExplainPlan`, which is pretty
   much a 1:1 model of the raw plan with more expressive attribute access and some high-level access methods
3. You can convert an explain into a proper normalized :class:`~postbound.QueryPlan` object

The conversion between the different formats works as follows:

.. ipython:: python

    query = stats["q-10"]
    explain_query = pb.transform.as_explain(query)
    raw_plan = pg_instance.execute_query(explain_query)
    raw_plan
    postgres_plan = pb.postgres.PostgresExplainPlan(raw_plan)
    print(postgres_plan.inspect())
    qep = postgres_plan.as_qep()
    print(qep.inspect())


.. _jsonize:

JSON export
-----------

To export arbitrary objets to JSON, PostBOUND provides a *jsonize* protocol. Essentially, all you need to do is a add a
``__json__`` method to your class. This class can emit arbitrary objects that can either be JSON-serialized by Python's
standard JSON dump logic, or that provide a ``__json__`` method themselves.
To make sure that this method works, use the :func:`~postbound.util.to_json` or
:func:`~postbound.util.to_json_dump` for the export. All of PostBOUND's built-in JSON export does this automatically.


Miscellaneous utilities
-----------------------

There are some general utilities that might make your life a little easier, mostly when it comes to working with one or
multiple instances of some class.

Use :func:`~postbound.util.collections.enlist` if you need to make sure that you are always working with a list of objects.
This function will wrap single objects into a list, but leave iterables as they are:

.. ipython:: python

    pb.util.enlist(42)
    pb.util.enlist([1, 2, 3])
    pb.util.enlist("hello world")  # strings are treated as single objects

:func:`~postbound.util.collections.simplify` can be thought of as the inverse. For iterables that only contain a single
element, it provides that element. Otherwise, it leaves the iterable as it is. Therefore, this function should mostly be used
in situations where you already know that you are dealing with a singleton list. The function is overloaded to provide type
hints for precisely this situation.

.. ipython:: python

    pb.util.simplify([42])
    pb.util.simplify([1, 2, 3])

Use :func:`~postbound.util.collections.flatten` to flatten nested lists or iterables:

.. ipython:: python

    pb.util.flatten([[1, 2], [3, 4]])

Likewise, :func:`~postbound.util.collections.set_union` performs a union over multiple sets, thereby removing duplicates:

.. ipython:: python

    pb.util.set_union([{1, 2, 3}, {2, 3, 4}, {4, 5}])