
CNN #442

42 changes: 29 additions & 13 deletions docs/source/user/mip-models.rst
@@ -57,29 +57,45 @@ function. By default, the approximation guarantees a maximal error of
keyword argument when the constraint is added.


Neural Networks
===============
Sequential Neural Networks
==========================

The package currently models dense neural networks with ReLU activations. For a
given neuron, the relation between its inputs and outputs is given by:
The package supports sequential neural networks. Layers are added as building
blocks; the package creates the necessary variables and constraints and wires
them to match the network structure.

Dense layers (details)
----------------------

For dense layers with ReLU activations, each neuron applies an affine
transformation followed by a ReLU. For a neuron with weights
:math:`\beta \in \mathbb{R}^{p+1}`, inputs :math:`x`, and output :math:`y`:

.. math::

y = \max(\sum_{i=1}^p \beta_i x_i + \beta_0, 0).
y = \max\Big(\sum_{i=1}^p \beta_i x_i + \beta_0,\; 0\Big).

The relationship is formulated in the optimization model by using Gurobi
:math:`max` `general constraint
<https://www.gurobi.com/documentation/latest/refman/constraints.html#subsubsection:GeneralConstraints>`_
with:
This is modeled using Gurobi general constraints by introducing an auxiliary
variable :math:`\omega` for the affine part and then enforcing the ReLU:

.. math::

& \omega = \sum_{i=1}^p \beta_i x_i + \beta_0

& y = \max(\omega, 0)

&\omega = \sum_{i=1}^p \beta_i x_i + \beta_0,\\
&y = \max(\omega, 0).

with :math:`\omega` an auxiliary free variable. The neurons are then connected
according to the topology of the network.
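
For illustration, a single neuron of this form can be written directly in
``gurobipy`` as below; this is a minimal sketch with made-up weights, not the
package's internal code:

.. code-block:: python

   import gurobipy as gp
   from gurobipy import GRB

   beta = [0.5, -1.2, 0.7]  # hypothetical weights beta_1..beta_p
   beta0 = 0.1              # hypothetical bias beta_0

   m = gp.Model()
   x = m.addVars(len(beta), lb=-1.0, ub=1.0, name="x")  # neuron inputs
   omega = m.addVar(lb=-GRB.INFINITY, name="omega")     # free auxiliary variable
   y = m.addVar(name="y")                               # output (default lb=0 matches ReLU)

   # omega = sum_i beta_i x_i + beta_0
   m.addConstr(omega == gp.quicksum(b * x[i] for i, b in enumerate(beta)) + beta0)
   # y = max(omega, 0) using Gurobi's max general constraint
   m.addGenConstrMax(y, [omega], constant=0.0)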

Other layers (summary)
----------------------

- Conv2D and MaxPooling2D: supported with padding equivalent to ``valid`` only
  (no non-zero or ``same`` padding). Strides are supported; output sizes follow
  the usual ``valid`` formula (see the sketch after this list). Internally,
  tensors use channels-last layout (NHWC) in the optimization model.
- Flatten: converts a 4D (NHWC) tensor to 2D (batch, features).
- Dropout: accepted but ignored at inference time (treated as identity).
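
Since only ``valid`` padding is supported, each spatial dimension shrinks
according to the usual convolution arithmetic; a quick sanity check
(illustrative helper, not package code):

.. code-block:: python

   def conv_output_size(size, kernel, stride=1):
       """Spatial output size under valid padding (per dimension)."""
       return (size - kernel) // stride + 1

   assert conv_output_size(28, 3) == 26            # 3x3 Conv2D, stride 1
   assert conv_output_size(26, 2, stride=2) == 13  # 2x2 MaxPooling2D, stride 2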

Notes:

- Keras models use NHWC throughout. PyTorch models are evaluated in NCHW, but
  the package handles the necessary internal conversions so predicted values
  match the framework’s behavior.
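
To make the layout difference concrete, here is a small NumPy sketch of the
NHWC/NCHW conversion (illustrative only, not package code):

.. code-block:: python

   import numpy as np

   x_nhwc = np.zeros((2, 28, 28, 3))      # (batch, height, width, channels)
   x_nchw = x_nhwc.transpose(0, 3, 1, 2)  # -> (batch, channels, height, width)
   assert x_nchw.shape == (2, 3, 28, 28)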


Decision Tree Regression
9 changes: 9 additions & 0 deletions docs/source/user/start.rst
@@ -154,6 +154,15 @@ For a simple example of how to use the package, please refer to
in the :doc:`../auto_examples/index` section.


.. note::

Variable shapes: For tabular models (scikit-learn, tree ensembles, dense
neural nets), inputs are typically 2D MVars with shape ``(batch, features)``
and outputs are 1D or 2D (the package orients a 1D output based on the
batch size). For convolutional neural networks (Keras/PyTorch), use 4D MVars
with shape ``(batch, H, W, C)`` (channels-last).
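
As a sketch of these shapes (model and variable names here are illustrative):

.. code-block:: python

   import gurobipy as gp

   m = gp.Model()
   # Tabular model: a batch of 10 samples with 4 features each.
   x_tab = m.addMVar((10, 4), lb=-1.0, ub=1.0, name="x_tab")
   # Convolutional network: 2 images of 28x28 pixels, 1 channel (NHWC).
   x_img = m.addMVar((2, 28, 28, 1), lb=0.0, ub=1.0, name="x_img")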


.. rubric:: Footnotes

.. [#] Classification models are currently not supported (except binary logistic
53 changes: 36 additions & 17 deletions docs/source/user/supported.rst
@@ -99,27 +99,46 @@ Keras
They can be formulated in a Gurobi model with the function
:py:func:`add_keras_constr <gurobi_ml.keras.add_keras_constr>`.

Currently, only two types of layers are supported:

* `Dense layers <https://keras.io/api/layers/core_layers/dense/>`_ (possibly
with `relu` activation),
* `ReLU layers <https://keras.io/api/layers/activation_layers/relu/>`_ with
default settings.
Supported layers and notes:

- `Dense <https://keras.io/api/layers/core_layers/dense/>`_ with activation
  ``relu`` or ``linear``.
- `ReLU <https://keras.io/api/layers/activation_layers/relu/>`_ with default
  settings (no ``negative_slope``, ``threshold``, or ``max_value`` variations).
- `Conv2D <https://keras.io/api/layers/convolution_layers/convolution2d/>`_
  with activation ``relu`` or ``linear`` and padding ``valid`` only (no
  ``same`` padding). Strides are supported.
- `MaxPooling2D <https://keras.io/api/layers/pooling_layers/max_pooling2d/>`_
  with padding ``valid`` only.
- `Flatten <https://keras.io/api/layers/reshaping_layers/flatten/>`_.
- `Dropout <https://keras.io/api/layers/regularization_layers/dropout/>`_ is
  accepted but ignored at inference time (treated as identity).

Input tensors for CNNs use channels-last layout (NHWC). Flatten converts 4D
NHWC tensors to 2D (batch, features).
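
For example, a small network restricted to these layers could be embedded as
follows; this is a sketch (architecture, bounds, and names are illustrative,
and in practice the network would be trained first):

.. code-block:: python

   import gurobipy as gp
   from gurobipy import GRB
   from tensorflow import keras
   from gurobi_ml.keras import add_keras_constr

   nn = keras.Sequential([
       keras.Input((8, 8, 1)),
       keras.layers.Conv2D(4, (3, 3), activation="relu", padding="valid"),
       keras.layers.MaxPooling2D((2, 2), padding="valid"),
       keras.layers.Flatten(),
       keras.layers.Dense(1, activation="linear"),
   ])

   m = gp.Model()
   x = m.addMVar((1, 8, 8, 1), lb=0.0, ub=1.0, name="x")  # NHWC input
   y = m.addMVar((1, 1), lb=-GRB.INFINITY, name="y")
   add_keras_constr(m, nn, x, y)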

PyTorch
-------


In PyTorch, only :external+torch:py:class:`torch.nn.Sequential` objects are
supported.

They can be formulated in a Gurobi model with the function
:py:func:`add_sequential_constr <gurobi_ml.torch.sequential.add_sequential_constr>`.

Currently, only two types of layers are supported:

* :external+torch:py:class:`Linear layers <torch.nn.Linear>`,
* :external+torch:py:class:`ReLU layers <torch.nn.ReLU>`.
In PyTorch, :external+torch:py:class:`torch.nn.Sequential` models are supported
via :py:func:`add_sequential_constr <gurobi_ml.torch.sequential.add_sequential_constr>`.

Supported layers and notes:

- :external+torch:py:class:`Linear <torch.nn.Linear>`.
- :external+torch:py:class:`ReLU <torch.nn.ReLU>`.
- :external+torch:py:class:`Conv2d <torch.nn.Conv2d>` with padding equivalent
  to ``valid`` only (no non-zero padding or ``same``); strides are supported.
- :external+torch:py:class:`MaxPool2d <torch.nn.MaxPool2d>` with padding
  equivalent to ``valid`` only.
- :external+torch:py:class:`Flatten <torch.nn.Flatten>`.
- :external+torch:py:class:`Dropout <torch.nn.Dropout>` is accepted and
  ignored at inference time (treated as identity).

Input tensors for CNNs are provided as NHWC variables. Internally, inputs are
converted to NCHW for PyTorch evaluation and converted back for error checks.
The first Linear after a Flatten layer is adjusted to account for PyTorch’s
NCHW flatten order so that predictions match exactly.
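
For example, a matching sketch (architecture and bounds are illustrative, and
in practice the network would be trained first):

.. code-block:: python

   import gurobipy as gp
   from gurobipy import GRB
   import torch
   from gurobi_ml.torch.sequential import add_sequential_constr

   nn = torch.nn.Sequential(
       torch.nn.Conv2d(1, 4, kernel_size=3),  # default padding=0, i.e. "valid"
       torch.nn.ReLU(),
       torch.nn.MaxPool2d(2),
       torch.nn.Flatten(),
       torch.nn.Linear(4 * 3 * 3, 1),         # 8x8 -> conv 6x6 -> pool 3x3
   )

   m = gp.Model()
   x = m.addMVar((1, 8, 8, 1), lb=0.0, ub=1.0, name="x")  # NHWC, per the note above
   y = m.addMVar((1, 1), lb=-GRB.INFINITY, name="y")
   add_sequential_constr(m, nn, x, y)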

XGBoost
-------