AML Agnosticism
===============

The mpi-sppy package provides callouts so that algebraic modeling languages
(AMLs) other than Pyomo can be used. A growing number of AMLs are supported as
`guest` languages (we refer to mpi-sppy as the `host`). This is referred to as
`tight` integration with the guest. This code is in an alpha-release state;
use with extreme caution.

It is also possible to simply read scenario data from an mps file; the mps
file (and the associated json nonant file) can be created however you like.
This is referred to as `loose` integration and seems to be fairly robust.

.. _loose_integration:

Loose integration
^^^^^^^^^^^^^^^^^

You can use ``generic_cylinders.py`` with ``--mps-files-directory`` as the
first argument (the module ``mpisppy.problem_io.mps_module`` is inferred
automatically, so ``--module-name`` is not needed). Note that at the time of
this writing, the number of scenarios is obtained by counting the mps files in
the directory given. (It would require only a small amount of programming to
support lp files.)

The file ``examples.loose_agnostic.AMPL.farmer_example.bash`` has three
commands. The first runs an `AMPLpy` program that creates the scenario files.
This program is in ``examples.loose_agnostic.AMPL.farmer_writer.py`` and, apart
from the `scenario_creator` function, is pretty general for two-stage problems.
You should be able to copy the program and write a `scenario_creator` function
for your two-stage problem. The second command illustrates how to instruct
``MPI-SPPY`` to read mps/json file pairs for each scenario from a directory.
The third command runs a script that illustrates how to map column names
created by the MPS writer back to AMPL variable names.

The file ``examples.loose_agnostic.GAMS.farmer_example.bash`` has three
commands that mimic the commands for AMPL. The GAMS bash script is not part of
the automated tests because I don't want to deal with the license.

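Because the number of scenarios is obtained by counting the mps files, it can
be useful to check a directory before passing it to ``--mps-files-directory``.
The following is a minimal sketch, not part of mpi-sppy: the function name
``find_scenario_pairs`` is hypothetical, and the assumption that the json file
for ``scen0.mps`` is named ``scen0_nonants.json`` should be checked against
your own files (change ``json_suffix`` if they are named differently).

.. code-block:: python

   from pathlib import Path


   def find_scenario_pairs(directory, json_suffix="_nonants.json"):
       """Return {stem: (mps_path, json_path)} for every mps file found.

       json_suffix is an assumption about how the json companion of each
       mps file is named; it is not taken from the mpi-sppy source.
       """
       directory = Path(directory)
       pairs = {}
       for mps_path in sorted(directory.glob("*.mps")):
           json_path = directory / (mps_path.stem + json_suffix)
           if not json_path.is_file():
               raise FileNotFoundError(
                   f"no json file {json_path.name} for {mps_path.name}"
               )
           pairs[mps_path.stem] = (mps_path, json_path)
       return pairs

The length of the returned dictionary is the number of mps files, which should
be the number of scenarios you expect to be read from the directory.
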
A somewhat strange example is in the file ``examples.sizes.mps_demo.bash``,
which has two commands. The first command illustrates how to use ``MPI-SPPY``
to write mps/json file pairs (but if ``MPI-SPPY`` can get your scenarios, there
is probably no reason to write them and then read them again!). The second
illustrates how to instruct ``MPI-SPPY`` to read mps/json file pairs for each
scenario from a directory. This functionality is intended to be used by users
of other AMLs or other scenario-based stochastic programming applications.
There is low-level support for `.lp` files instead of `.mps` files.

Code for creating a Pyomo model from an mps file is in
``mpisppy.problem_io.mps_reader.py``.

JSON file format
----------------

The directory named in ``--mps-files-directory`` needs to have two files for
each scenario: an mps file and a json file. The json file needs to have
certain literal strings as well as scenario-specific data. In this
specification, scenario-specific data is named with underscores. Note that the
total number of tree nodes is given as an integer, but the file only contains
the data for the nodes of the single scenario.

.. code-block:: python

   {
       "scenarioData": {
           "name": scenario_name,
           "scenProb": scenario_probability,
       },
       "treeData": {
           "globalNodeCount": number_of_nodes_in_entire_tree,
           "nodes": {
               "ROOT": {
                   "condProb": 1.0,
                   "nonAnts": [
                       "first_root_node_nonant_name",
                       "second_root_node_nonant_name",
                       # ...
                   ]
               },
               "ROOT_i": {
                   "condProb": conditional_probability_of_second_stage_node_i,
                   "nonAnts": [
                       first_nonant_name_at_node,
                       second_node_nonant_name_at_node,
                       # ...
                   ]
               }
           }
       }
   }

Two-stage JSON example
~~~~~~~~~~~~~~~~~~~~~~

Two-stage problems are simple because there is only one node in the scenario
tree and its name must be ROOT. Here is an example (the ``# ...`` comment marks
elided entries and is not part of the file):

.. code-block:: python

   {
       "scenarioData": {
           "name": "unknown",
           "scenProb": 0.3333333333333333
       },
       "treeData": {
           "globalNodeCount": 1,
           "nodes": {
               "ROOT": {
                   "serialNumber": 0,
                   "condProb": 1.0,
                   "nonAnts": [
                       "NumProducedFirstStage(1)",
                       "NumProducedFirstStage(2)",
                       "NumProducedFirstStage(3)",
                       # ...
                       "NumUnitsCutFirstStage(10_10)",
                   ]
               }
           }
       }
   }

Naming Conventions
~~~~~~~~~~~~~~~~~~

- Scenario names should end in a serial number. Zero-based numbering is best,
  but one-based is supported.
- The root node of the scenario tree must be named ROOT.
- Other nodes must begin with the name of the parent node and end with an
  underscore followed by a zero-based serial number for the node at its stage.
- The names of the nonanticipative variables at the node are given in the
  `nonAnts` list and the names must match column names in the mps file.

Tight integration
^^^^^^^^^^^^^^^^^

From the end-user's perspective
-------------------------------

When mpi-sppy is used for a model developed in an AML for which support has
been added, the end-user runs the ``mpisppy.agnostic.agnostic_cylinders.py``
program, which serves as a driver that takes command line arguments and
launches the requested cylinders. The file ``mpisppy.agnostic.go.bash``
provides examples of a few command lines.

From the modeler's perspective
------------------------------

Assuming support has been added for the desired AML, the modeler supplies two
files:

- a model file with the model written in the guest AML (AMPL example:
  ``mpisppy.agnostic.examples.farmer.mod``)
- a thin model wrapper for the model file written in Python (AMPL example:
  ``mpisppy.agnostic.examples.farmer_ampl_model.py``). This thin python wrapper
  is model specific.

There can be a little confusion if there are error messages because both files
are sometimes referred to as the `model file`.

Most modelers will probably want to import the deterministic guest model into
their python wrapper for the model; the scenario_creator function in the
wrapper then modifies the stochastic parameters to have values that depend on
the scenario name argument to the scenario_creator function.

(An exception is when the guest is in Pyomo; then the wrapper file might as
well contain the model specification, so there typically is only one file.
However, there is no particularly good reason to use the agnostic machinery for
a Pyomo model.)

From the developer's perspective
--------------------------------

If support has not yet been added for an AML, it is almost easier to add
support than to write a guest interface for a particular model. To add support
for a language, you need to write a general guest interface in Python for it
(see, e.g., ampl_guest.py or pyomo_guest.py) and you need to add/edit a few
lines in ``mpisppy.agnostic.agnostic_cylinders.py`` to allow end-users to
access it.

Special Note for developers
^^^^^^^^^^^^^^^^^^^^^^^^^^^

The general-purpose guest interfaces might not be the fastest possible for many
guest languages because they don't use indexes from the original model when
updating the objective function. If this is an issue, you might want to write a
problem-specific module to replace the guest interface and the model wrapper
with a single module. For an example, see
``examples.farmer.agnostic.farmer_xxxx_agnostic``, where xxxx is replaced,
e.g., by ampl.

Architecture
^^^^^^^^^^^^

The following picture presents the architecture of the files.

.. image:: images/agnostic_architecture.png
   :alt: Architecture of the agnostic files
   :width: 700px
   :align: center

We denote by "xxxx" the specific problem, for instance farmer, and by "yyyy"
the guest language, for instance "ampl".

Two methods are presented.
These are either a method specific to the problem or a generic method.
Regardless of the method, the files ``agnostic.py`` and ``xxxx.yyyy`` need to
be used. ``agnostic.py`` is already implemented and must not be modified, like
all the files shown above the "developer" line. ``xxxx.yyyy`` is the model in
the guest language and must be supplied by the modeler, like all the files
under the "modeler" line.

The files ``agnostic_yyyy_cylinders.py`` and ``agnostic_cylinders.py`` are
equivalent. The file ``xxxx_yyyy_agnostic.py`` for the specific case is split
into ``yyyy_guest.py`` and ``xxxx_yyyy_model.py`` for the generic case.

It is worth noting that the scenario creator is defined in 3 files. It is first
defined in the file specific to the problem and the guest language,
``xxxx_yyyy_model.py``. At this point it may not return a scenario. It is then
wrapped in a file specific only to the language, ``yyyy_guest.py``, at which
point it returns the dictionary ``gd``, which includes the scenario. Finally,
the tree structure is attached in ``agnostic.py``.

Bundles
-------

The use of scenario bundles can dramatically improve the performance of
scenario decomposition algorithms such as PH and APH. Although mpi-sppy has
facilities for forming bundles, the mpi-sppy ``agnostic`` package assumes that
bundles will be completely handled by the guest. Bundles will be returned by
the scenario creator function as if they are a scenario. Although it seems sort
of like a trick, it is really the way bundles are intended to operate, so we
sometimes refer to `true` bundles, which are used in a non-agnostic way as
briefly described in section :ref:`Pickled-Bundles`.

Overview of Recommended Bundle Practices
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Modify the scenario creator function so that if the scenario name starts with
the string "scen" it returns a single scenario, but if the name starts with
"bundle" it returns the full extensive formulation for a group of scenarios
(i.e., a bundle).

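The dispatch on the name prefix can be sketched as follows. This is only an
illustration under stated assumptions, not the code used by mpi-sppy: it
assumes bundle names of the form ``bundle_<first>_<last>`` and scenario names
of the form ``scen<number>``, and the callables ``build_scenario`` and
``build_ef`` are hypothetical stand-ins for the guest-specific code that
creates one scenario and forms the extensive form of a list of scenarios.

.. code-block:: python

   import re


   def scenario_names_in_bundle(bundle_name):
       """Expand 'bundle_<first>_<last>' into ['scen<first>', ..., 'scen<last>']."""
       match = re.fullmatch(r"bundle_(\d+)_(\d+)", bundle_name)
       if match is None:
           raise ValueError(f"not a bundle name: {bundle_name!r}")
       first, last = int(match.group(1)), int(match.group(2))
       if last < first:
           raise ValueError(f"empty bundle: {bundle_name!r}")
       return [f"scen{i}" for i in range(first, last + 1)]


   def scenario_creator(name, build_scenario, build_ef):
       """Return one scenario for 'scen...' names and an EF for 'bundle...' names.

       build_scenario(name) and build_ef(list_of_scenarios) are hypothetical
       callables supplied by the modeler.
       """
       if name.startswith("scen"):
           return build_scenario(name)
       if name.startswith("bundle"):
           # The creator calls itself once per scenario in the bundle.
           members = [
               scenario_creator(sname, build_scenario, build_ef)
               for sname in scenario_names_in_bundle(name)
           ]
           return build_ef(members)
       raise ValueError(f"unrecognized name: {name!r}")

With these assumptions, ``scenario_creator("bundle_0_2", ...)`` builds
``scen0``, ``scen1`` and ``scen2`` and hands them to ``build_ef``, so the
caller receives the bundle as if it were a single scenario.
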
We typically number scenarios, and the scenario or bundle number is at the end
of the first positional argument for the scenario creator function (i.e., at
the end of the scenario name).

If the name starts with bundle, the scenario creator function can call itself
with the proper list of scenario names to get the scenarios to form the EF that
will be returned. We recommend names for bundles such as "bundle_xxx_yyy",
where xxx and yyy give the first and last scenario number in the bundle.

You could also pass in a dictionary that maps bundle numbers to lists of
scenario numbers as a keyword argument to the scenario_creator function, and
then append the bundle number to "bundle" and pass it as the positional
scenario name argument to the scenario creator function.

Some notes
^^^^^^^^^^

- The helper function called ``scenario_names_creator`` needs to be co-opted to
  instead create bundle names, and the code in the scenario_creator function
  then needs to create its own scenario names for bundles. At the time of this
  writing, this results in a major hack being needed in order to get bundle
  information to the names creator in the Pyomo example described below. You
  need to supply a function called ``bundle_hack`` in your python model file
  that does whatever needs to be done to alert the names creator that there
  are bundles. The function takes the config object as an argument. See
  ``mpisppy.agnostic.farmer4agnostic.py``.
- There is a heavy bias toward uniform probabilities in the examples and in the
  mpi-sppy utilities. Scenario probabilities are attached to the scenario as
  ``_mpisppy_probability``, so if your probabilities are not uniform, you will
  need to calculate them for each bundle (your EF maker code can do that for
  you). Note that even if probabilities are uniform for the scenarios, they
  won't be uniform for the bundles unless you require that the bundle size
  divides the number of scenarios.
- There is a similar bias toward two-stage problems, which is extreme for the
  agnostic package. If you have a multi-stage problem, you can make things a
  lot easier for yourself if you require that the bundles contain all scenarios
  emanating from each second-stage node (e.g., one bundle per some integer
  number of second-stage nodes). This is what is done in (non-agnostic)
  :ref:`Pickled-Bundles`. The result of this is that your multi-stage problem
  will look like a two-stage problem to mpi-sppy.

Example
^^^^^^^

The example ``mpisppy.agnostic.farmer4agnostic.py`` contains example code.

.. Note::
   In order to get information from the command line about bundles into the
   ``scenario_names_creator``, the ``bundle_hack`` function is called by the
   cylinders driver program very early. For this example, the function sets
   global variables called ``bunsize`` and ``numbuns``.

The script ``mpisppy.agnostic.examples.go.bash`` runs the example (and maybe
some other examples).

Notes about Gurobipy
--------------------

The current implementation of gurobipy assumes that nonants that are in the
objective function appear directly there (not via some other variable
constrained in some way to represent them).