.. _jensens: Jensen's Bound as potential starting bound ========================================== .. warning:: **Jensen's bound is a valid lower bound for a minimization problem when the recourse value function is convex in the random parameters** Necessary conditions for a valid *outer* bound: 1. The second-stage problem is an LP — no integer recourse. 2. Random parameters appear only in objective coefficients and in the right-hand side, not in ways that break convexity of the recourse value in those parameters. 3. Two-stage structure. ``mpi-sppy`` checks (1) automatically on the outer-bound path and refuses to compute the outer bound when non-nonant integer/binary Vars exist in the recourse variables while (2) and (3) are the user's responsibility. **Inner bounds (xhatters) are valid regardless of the above.** The xhat path uses the average scenario only as a *source of a candidate first-stage solution*; that candidate is then honestly evaluated across the real scenarios, so Jensen's convexity assumption never enters the validity argument. .. note:: The Jensen's xhat path **silently skips** a candidate that is infeasible to pin in some real scenario; that is the right behavior for an opportunistic inner-bound spoke. Downstream consumers that pin a candidate first-stage and solve a per-scenario subproblem with no fallback (e.g. pin-dual rho-setting algorithms) need a strictly stronger contract -- a candidate guaranteed to be feasible to pin in *every* real scenario. See :ref:`feasible_xhat` for the ``feasible_xhat_creator`` convention and the ``---try-feasible-xhat-first`` flags. Per xhat spoke, ``--*-try-jensens-first`` and ``--*-try-feasible-xhat-first`` are mutually exclusive. What the options do ------------------- Two-stage only (for now). Each supported spoke gains one boolean flag: Outer-bound (lower-bounder) spokes * ``--lagrangian-try-jensens-first`` * ``--lagranger-try-jensens-first`` * ``--subgradient-try-jensens-first`` * ``--reduced-costs-try-jensens-first`` Inner-bound (xhat) spokes * ``--xhatshuffle-try-jensens-first`` * ``--xhatxbar-try-jensens-first`` * ``--xhatlooper-try-jensens-first`` * ``--xhatspecific-try-jensens-first`` When the flag is set, before the spoke enters its usual iterations it builds a single *average scenario* (via the module's ``average_scenario_creator``; see below), solves it with the spoke's configured solver, and then: * on the outer-bound path, sends the solver's best dual bound on the average scenario as its first outer bound; * on the xhat path, takes the average scenario's first-stage solution, evaluates it across all scenarios, and reports the expected cost as its first inner bound if it is feasible for all scenarios. .. admonition:: Solver options and what is extracted as the bound :class: note The average-scenario solve uses ``iterk_solver_options`` (the spoke's "production" tolerances), not ``iter0_solver_options``. It is a one-shot deterministic solve, not a first-iteration subproblem solve. On the outer-bound path the value sent to the hub is the **solver's best dual bound** on the average scenario (Pyomo's ``results.problem.lower_bound`` for minimize, ``upper_bound`` for maximize), not the incumbent objective value. With the dual bound, a non-zero MIP gap on the average-scenario solve does **not** invalidate Jensen's outer bound — the dual bound is a valid lower bound on the average-scenario optimum regardless of gap, and Jensen's inequality transitively makes it a valid lower bound on the expected recourse cost. (For an LP the dual bound and the incumbent coincide, so this distinction collapses.) On the xhat path the dual bound is ignored; only the first-stage nonant values are used, and they are honestly re-evaluated across the real scenarios via ``Xhat_Eval.evaluate``. A loose MIP gap on the average-scenario solve there just means a possibly-worse candidate xhat, never an invalid bound. The spoke then continues its normal loop. There is **no** "iteration -1" in PH, APH, L-shaped, or any hub — the Jensen's work happens entirely inside a spoke. **What the average is computed over.** The average-scenario data must be the sample mean of the data for every scenario in the run (*not* the expectation of any underlying continuous distribution). This matches how ``mpi-sppy`` treats the outer problem: the sample-average approximation (SAA) is what is actually being solved, and Jensen's bound applies directly to that problem. Authoring ``average_scenario_creator`` -------------------------------------- A scenario module that wants to participate must define: .. code-block:: python def average_scenario_creator(scenario_name, **kwargs): """Return a Pyomo model with the same shape as scenario_creator(...), but built from sample-averaged random data. Two-stage: _mpisppy_probability must be 1.0 (a single deterministic scenario, not a member of a probabilistic ensemble). """ .. admonition:: Under the Hood :class: note Discovery is via ``getattr(module, "average_scenario_creator", None)``; if a flag is set but the module does not define the function, ``cfg_vanilla`` raises a clear error at spoke-setup time. The recommended authoring pattern — which ``examples/farmer/farmer.py`` now demonstrates — is to split the work into two underscore helpers that are shared between ``scenario_creator`` and ``average_scenario_creator``: .. code-block:: python def _scenario_data(scenario_name, **kwargs): """Pure-Python random data as a plain dict. No Pyomo.""" ... def _build_model(scenario_name, data, *, probability, **kwargs): """Build Pyomo model from the data dict. Shared build path.""" ... def scenario_creator(scenario_name, **kwargs): data = _scenario_data(scenario_name, **kwargs) prob = 1.0 / kwargs["num_scens"] if kwargs.get("num_scens") else "uniform" return _build_model(scenario_name, data, probability=prob, **kwargs) def average_scenario_creator(scenario_name, **kwargs): # NOTE: could be multi-threaded for large num_scens. snames = scenario_names_creator(kwargs["num_scens"]) datas = [_scenario_data(s, **kwargs) for s in snames] avg = _average_scenario_data(datas) return _build_model(scenario_name, avg, probability=1.0, **kwargs) Benefits of this split: * One source of truth for "what does the Pyomo model look like." If ``scenario_creator`` and ``average_scenario_creator`` each build the model inline, they will eventually drift. * Separating data from model makes averaging trivial: average the dict, not the Pyomo components. * Every rank independently calls ``average_scenario_creator`` and gets an identical model. No collective communication needed. Data-only averaging: the model is the model ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The model returned by ``average_scenario_creator`` must be the same model that ``scenario_creator`` returns; only the *data* that varies across scenarios may change. Variable domains — including integrality and bounds — are part of the model and must not be relaxed in the average scenario. Constraint forms must not be edited either. The only thing that legitimately moves between the two creators is the values of whatever data the original ``scenario_creator`` would have read from disk or generated randomly. If averaging the per-scenario data produces a model that is infeasible — for example because averaging a binary stochastic parameter yields a fractional value that conflicts with an integer recourse constraint — then the problem is not amenable to Jensen-style averaging, and the right answer is **not** to ship an ``average_scenario_creator`` at all. Working around the infeasibility by relaxing a Var domain in the average scenario would produce a *different* model whose solution would be solving a different problem — at best an LP-relaxation heuristic that happens to share the same shape, not a Jensen's bound. Even when ``average_scenario_creator`` is not directly usable, the underscore helpers above (``_scenario_data``, ``_average_scenario_data``, ``_build_model``) may still be worth shipping. A ``feasible_xhat_creator`` (see :ref:`feasible_xhat`) is free to build an averaged-data model internally — even one with relaxed Var domains — to derive a starting first-stage point, because its output is then projected and verified feasible in every real scenario rather than handed off as a Jensen's bound. The data-only-averaging principle does not bind ``feasible_xhat_creator``; it only binds ``average_scenario_creator``. When the model isn't amenable: sslp ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``examples/sslp/sslp.py`` deliberately does **not** ship an ``average_scenario_creator``. Its only stochastic data is the binary indicator ``ClientPresent[i]`` (one per client, in :math:`\{0, 1\}` per scenario). The empirical mean across scenarios is a fraction in :math:`[0, 1]`, which makes the model constraint .. code-block:: python sum(Allocation[i, j] for j in Servers) == ClientPresent[i] infeasible: ``Allocation`` is Binary in the model and cannot be fractional. Relaxing ``Allocation`` to continuous in the average scenario would make the LP feasible but would violate the data-only-averaging principle above, so sslp opts out. A user who flips ``--*-try-jensens-first`` against a model module with no ``average_scenario_creator`` function will get a clear error from ``cfg_vanilla`` saying the function is missing. Seed management --------------- This is the single most common source of bugs in user-authored ``average_scenario_creator`` functions. Please read this section before writing your own. **Use a local RNG per call, not a module-level global.** The ``farmer`` example deliberately avoids the ``farmerstream = np.random.RandomState()`` module-level pattern and instead does: .. code-block:: python def _scenario_data(scenario_name, ..., seedoffset=0): scennum = sputils.extract_num(scenario_name) rng = np.random.RandomState(scennum + seedoffset) # local, fresh ... The module-level pattern is thread-unsafe: two threads calling the same function race on ``.seed()`` and ``.rand()`` and produce garbage. With a local RNG, ``average_scenario_creator`` can safely be parallelized later (e.g. via ``concurrent.futures.ThreadPoolExecutor``). The on-disk results are byte-for-byte identical to the shared-stream pattern for a given ``(scennum, seedoffset)``. **Draw order must be deterministic and explicit.** Do not rely on Pyomo's component-build order to sequence your random draws. Build the data dict in a plain Python loop first; then hand that dict to ``_build_model``. The underscore-helper pattern does this automatically. **``seedoffset`` must be threaded through both creators.** If your module already accepts ``seedoffset`` in ``scenario_creator`` (as the confidence-interval code expects), make sure ``average_scenario_creator`` accepts and forwards it too. Otherwise, a confidence-interval run with a non-zero ``seedoffset`` would get a Jensen bound computed from a scenario set that does not match the one the run is actually solving. **Thread-safety.** The ``average_scenario_creator`` you ship should be safe to call from multiple threads concurrently. With local RNGs (above), it is — and the farmer example's docstring sketches how to wrap the per-scenario loop in a ``ThreadPoolExecutor``. With a shared module-level RNG, it is not. Worked example: farmer ---------------------- See ``examples/farmer/farmer.py`` for the pedagogical reference implementation. The refactor split the original monolithic ``scenario_creator`` into ``_scenario_data`` (pure-Python yield generation using a local ``RandomState``) and ``_build_model`` (Pyomo construction). ``average_scenario_creator`` averages the yield dicts from every would-be scenario and hands the average to ``_build_model`` with ``probability=1.0``. To see it in action: .. code-block:: bash mpiexec -np 3 python -m mpi4py mpisppy/generic_cylinders.py \ --module-name farmer --num-scens 3 --default-rho 1 \ --solver-name cplex --lagrangian --xhatshuffle \ --lagrangian-try-jensens-first --xhatshuffle-try-jensens-first \ --max-iterations 2 The lagrangian spoke will send its Jensen's outer bound before running iteration 0, and the xhatshuffle spoke will evaluate the average scenario's first-stage solution as its first xhat. (A variant that loads scenario data from ``.dat`` files rather than generating it with an RNG can be seen in ``examples/sizes/sizes.py``, which uses a Pyomo AbstractModel.) Interaction with each spoke's own iteration 0 --------------------------------------------- * **lagrangian**: its normal iteration 0 trivial bound (Lagrange multipliers :math:`W = 0`) is itself wait-and-see and therefore already a valid outer bound. The value of this flag is that the average-scenario bound arrives *before* iteration 0 finishes, which can matter a lot when there are many scenarios and iteration 0 is slow. The flag does not replace or disable iteration 0. For some problems Jensen's bound is tighter than the trivial iteration 0 bound. * **subgradient**: iteration 0 solves all scenarios. The average- scenario bound arrives earlier. * **reduced_costs**: inherits the lagrangian behavior. * **lagranger**: like lagrangian, but takes nonants from the hub and computes its own W rather than receiving W. Iteration 0 is still the trivial W=0 wait-and-see bound, so the same reasoning applies as for lagrangian. (note that this bounder seldom works well) Convexity limitations, in more detail ------------------------------------- The integer-recourse check is *necessary but not sufficient*. A continuous-recourse two-stage LP can still violate convexity in :math:`\xi` if, for example, random parameters enter the recourse problem's technology matrix in a way that makes the recourse value non-convex in those parameters. ``mpi-sppy`` does not detect this automatically — if you know your problem has matrix-randomness, verify convexity before enabling the outer-bound flag. The xhat path has no such restriction: the candidate solution is honestly evaluated, so any two-stage recourse structure (including integer recourse and non-convex recourse) is acceptable there.