Jensen’s Bound as potential starting bound

Warning

Jensen’s bound is a valid lower bound for a minimization problem when the recourse value function is convex in the random parameters

Necessary conditions for a valid outer bound:

The second-stage problem is an LP — no integer recourse.
Random parameters appear only in objective coefficients and in the right-hand side, not in ways that break convexity of the recourse value in those parameters.
Two-stage structure.

mpi-sppy checks (1) automatically on the outer-bound path and refuses to compute the outer bound when non-nonant integer/binary Vars exist in the recourse variables while (2) and (3) are the user’s responsibility.

Inner bounds (xhatters) are valid regardless of the above. The xhat path uses the average scenario only as a source of a candidate first-stage solution; that candidate is then honestly evaluated across the real scenarios, so Jensen’s convexity assumption never enters the validity argument.

Note

The Jensen’s xhat path silently skips a candidate that is infeasible to pin in some real scenario; that is the right behavior for an opportunistic inner-bound spoke. Downstream consumers that pin a candidate first-stage and solve a per-scenario subproblem with no fallback (e.g. pin-dual rho-setting algorithms) need a strictly stronger contract – a candidate guaranteed to be feasible to pin in every real scenario. See Per-scenario-feasible candidate xhat for the feasible_xhat_creator convention and the --<xhatter>-try-feasible-xhat-first flags. Per xhat spoke, --*-try-jensens-first and --*-try-feasible-xhat-first are mutually exclusive.

What the options do

Two-stage only (for now). Each supported spoke gains one boolean flag:

Outer-bound (lower-bounder) spokes

--lagrangian-try-jensens-first
--lagranger-try-jensens-first
--subgradient-try-jensens-first
--reduced-costs-try-jensens-first

Inner-bound (xhat) spokes

--xhatshuffle-try-jensens-first
--xhatxbar-try-jensens-first
--xhatlooper-try-jensens-first
--xhatspecific-try-jensens-first

When the flag is set, before the spoke enters its usual iterations it builds a single average scenario (via the module’s average_scenario_creator; see below), solves it with the spoke’s configured solver, and then:

on the outer-bound path, sends the solver’s best dual bound on the average scenario as its first outer bound;
on the xhat path, takes the average scenario’s first-stage solution, evaluates it across all scenarios, and reports the expected cost as its first inner bound if it is feasible for all scenarios.

Solver options and what is extracted as the bound

The average-scenario solve uses iterk_solver_options (the spoke’s “production” tolerances), not iter0_solver_options. It is a one-shot deterministic solve, not a first-iteration subproblem solve.

On the outer-bound path the value sent to the hub is the solver’s best dual bound on the average scenario (Pyomo’s results.problem.lower_bound for minimize, upper_bound for maximize), not the incumbent objective value. With the dual bound, a non-zero MIP gap on the average-scenario solve does not invalidate Jensen’s outer bound — the dual bound is a valid lower bound on the average-scenario optimum regardless of gap, and Jensen’s inequality transitively makes it a valid lower bound on the expected recourse cost. (For an LP the dual bound and the incumbent coincide, so this distinction collapses.)

On the xhat path the dual bound is ignored; only the first-stage nonant values are used, and they are honestly re-evaluated across the real scenarios via Xhat_Eval.evaluate. A loose MIP gap on the average-scenario solve there just means a possibly-worse candidate xhat, never an invalid bound.

The spoke then continues its normal loop. There is no “iteration -1” in PH, APH, L-shaped, or any hub — the Jensen’s work happens entirely inside a spoke.

What the average is computed over. The average-scenario data must be the sample mean of the data for every scenario in the run (not the expectation of any underlying continuous distribution). This matches how mpi-sppy treats the outer problem: the sample-average approximation (SAA) is what is actually being solved, and Jensen’s bound applies directly to that problem.

Authoring `average_scenario_creator`

A scenario module that wants to participate must define:

def average_scenario_creator(scenario_name, **kwargs):
    """Return a Pyomo model with the same shape as scenario_creator(...),
    but built from sample-averaged random data.

    Two-stage: _mpisppy_probability must be 1.0 (a single
    deterministic scenario, not a member of a probabilistic
    ensemble).
    """

Under the Hood

Discovery is via getattr(module, "average_scenario_creator", None); if a flag is set but the module does not define the function, cfg_vanilla raises a clear error at spoke-setup time.

The recommended authoring pattern — which examples/farmer/farmer.py now demonstrates — is to split the work into two underscore helpers that are shared between scenario_creator and average_scenario_creator:

def _scenario_data(scenario_name, **kwargs):
    """Pure-Python random data as a plain dict. No Pyomo."""
    ...

def _build_model(scenario_name, data, *, probability, **kwargs):
    """Build Pyomo model from the data dict. Shared build path."""
    ...

def scenario_creator(scenario_name, **kwargs):
    data = _scenario_data(scenario_name, **kwargs)
    prob = 1.0 / kwargs["num_scens"] if kwargs.get("num_scens") else "uniform"
    return _build_model(scenario_name, data, probability=prob, **kwargs)

def average_scenario_creator(scenario_name, **kwargs):
    # NOTE: could be multi-threaded for large num_scens.
    snames = scenario_names_creator(kwargs["num_scens"])
    datas  = [_scenario_data(s, **kwargs) for s in snames]
    avg    = _average_scenario_data(datas)
    return _build_model(scenario_name, avg, probability=1.0, **kwargs)

Benefits of this split:

One source of truth for “what does the Pyomo model look like.” If scenario_creator and average_scenario_creator each build the model inline, they will eventually drift.
Separating data from model makes averaging trivial: average the dict, not the Pyomo components.
Every rank independently calls average_scenario_creator and gets an identical model. No collective communication needed.

Data-only averaging: the model is the model

The model returned by average_scenario_creator must be the same model that scenario_creator returns; only the data that varies across scenarios may change. Variable domains — including integrality and bounds — are part of the model and must not be relaxed in the average scenario. Constraint forms must not be edited either. The only thing that legitimately moves between the two creators is the values of whatever data the original scenario_creator would have read from disk or generated randomly.

If averaging the per-scenario data produces a model that is infeasible — for example because averaging a binary stochastic parameter yields a fractional value that conflicts with an integer recourse constraint — then the problem is not amenable to Jensen-style averaging, and the right answer is not to ship an average_scenario_creator at all. Working around the infeasibility by relaxing a Var domain in the average scenario would produce a different model whose solution would be solving a different problem — at best an LP-relaxation heuristic that happens to share the same shape, not a Jensen’s bound.

Even when average_scenario_creator is not directly usable, the underscore helpers above (_scenario_data, _average_scenario_data, _build_model) may still be worth shipping. A feasible_xhat_creator (see Per-scenario-feasible candidate xhat) is free to build an averaged-data model internally — even one with relaxed Var domains — to derive a starting first-stage point, because its output is then projected and verified feasible in every real scenario rather than handed off as a Jensen’s bound. The data-only-averaging principle does not bind feasible_xhat_creator; it only binds average_scenario_creator.

When the model isn’t amenable: sslp

examples/sslp/sslp.py deliberately does not ship an average_scenario_creator. Its only stochastic data is the binary indicator ClientPresent[i] (one per client, in \(\{0, 1\}\) per scenario). The empirical mean across scenarios is a fraction in \([0, 1]\), which makes the model constraint

sum(Allocation[i, j] for j in Servers) == ClientPresent[i]

infeasible: Allocation is Binary in the model and cannot be fractional. Relaxing Allocation to continuous in the average scenario would make the LP feasible but would violate the data-only-averaging principle above, so sslp opts out. A user who flips --*-try-jensens-first against a model module with no average_scenario_creator function will get a clear error from cfg_vanilla saying the function is missing.

Seed management

This is the single most common source of bugs in user-authored average_scenario_creator functions. Please read this section before writing your own.

Use a local RNG per call, not a module-level global. The farmer example deliberately avoids the farmerstream = np.random.RandomState() module-level pattern and instead does:

def _scenario_data(scenario_name, ..., seedoffset=0):
    scennum = sputils.extract_num(scenario_name)
    rng = np.random.RandomState(scennum + seedoffset)   # local, fresh
    ...

The module-level pattern is thread-unsafe: two threads calling the same function race on .seed() and .rand() and produce garbage. With a local RNG, average_scenario_creator can safely be parallelized later (e.g. via concurrent.futures.ThreadPoolExecutor). The on-disk results are byte-for-byte identical to the shared-stream pattern for a given (scennum, seedoffset).

Draw order must be deterministic and explicit. Do not rely on Pyomo’s component-build order to sequence your random draws. Build the data dict in a plain Python loop first; then hand that dict to _build_model. The underscore-helper pattern does this automatically.

``seedoffset`` must be threaded through both creators. If your module already accepts seedoffset in scenario_creator (as the confidence-interval code expects), make sure average_scenario_creator accepts and forwards it too. Otherwise, a confidence-interval run with a non-zero seedoffset would get a Jensen bound computed from a scenario set that does not match the one the run is actually solving.

Thread-safety. The average_scenario_creator you ship should be safe to call from multiple threads concurrently. With local RNGs (above), it is — and the farmer example’s docstring sketches how to wrap the per-scenario loop in a ThreadPoolExecutor. With a shared module-level RNG, it is not.

Worked example: farmer

See examples/farmer/farmer.py for the pedagogical reference implementation. The refactor split the original monolithic scenario_creator into _scenario_data (pure-Python yield generation using a local RandomState) and _build_model (Pyomo construction). average_scenario_creator averages the yield dicts from every would-be scenario and hands the average to _build_model with probability=1.0.

To see it in action:

mpiexec -np 3 python -m mpi4py mpisppy/generic_cylinders.py \
    --module-name farmer --num-scens 3 --default-rho 1 \
    --solver-name cplex --lagrangian --xhatshuffle \
    --lagrangian-try-jensens-first --xhatshuffle-try-jensens-first \
    --max-iterations 2

The lagrangian spoke will send its Jensen’s outer bound before running iteration 0, and the xhatshuffle spoke will evaluate the average scenario’s first-stage solution as its first xhat.

(A variant that loads scenario data from .dat files rather than generating it with an RNG can be seen in examples/sizes/sizes.py, which uses a Pyomo AbstractModel.)

Interaction with each spoke’s own iteration 0

lagrangian: its normal iteration 0 trivial bound (Lagrange multipliers \(W = 0\)) is itself wait-and-see and therefore already a valid outer bound. The value of this flag is that the average-scenario bound arrives before iteration 0 finishes, which can matter a lot when there are many scenarios and iteration 0 is slow. The flag does not replace or disable iteration 0. For some problems Jensen’s bound is tighter than the trivial iteration 0 bound.
subgradient: iteration 0 solves all scenarios. The average- scenario bound arrives earlier.
reduced_costs: inherits the lagrangian behavior.
lagranger: like lagrangian, but takes nonants from the hub and computes its own W rather than receiving W. Iteration 0 is still the trivial W=0 wait-and-see bound, so the same reasoning applies as for lagrangian. (note that this bounder seldom works well)

Convexity limitations, in more detail

The integer-recourse check is necessary but not sufficient. A continuous-recourse two-stage LP can still violate convexity in \(\xi\) if, for example, random parameters enter the recourse problem’s technology matrix in a way that makes the recourse value non-convex in those parameters. mpi-sppy does not detect this automatically — if you know your problem has matrix-randomness, verify convexity before enabling the outer-bound flag.

The xhat path has no such restriction: the candidate solution is honestly evaluated, so any two-stage recourse structure (including integer recourse and non-convex recourse) is acceptable there.