Check YAML, Spec, Coefficients Settings Before Model Runs #950

andkay · 2025-06-10T03:22:29Z

This PR addresses #784 and supersedes a draft PR in the Camsys fork.

As scoped, this PR does not validate expressions, and the checker does not inspect tables at all.

Approach

Smoke test various configuration files using a new settings_checker module in abm.models:

Attempt to load YAML settings files into their relevant Pydantic data models
Attempt to load SPEC files
Attempt to load COEFFICIENTS files
Attempt to evaluate the SPEC and COEFFICIENTS files together to determine whether mismatched labels exist
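
As a rough illustration of the last step, the sketch below compares the coefficient labels used in a spec against the labels defined in a coefficients file. It is not the code in this PR, which re-uses ActivitySim's own read and evaluate functions; the column names, the CSV contents, and the find_unmatched_labels helper are all assumptions made for the example.

```python
import csv
import io

# Hypothetical, simplified SPEC and COEFFICIENTS contents (not real ActivitySim files).
SPEC_CSV = """Label,Expression,coefficient
util_income,income > 50000,coef_income
util_autos,auto_ownership == 0,coef_zero_autos
"""

COEFFICIENTS_CSV = """coefficient_name,value,constrain
coef_income,0.25,F
"""


def find_unmatched_labels(spec_text, coefficients_text):
    """Return coefficient labels used in the spec that the coefficients file does not define."""
    defined = {
        row["coefficient_name"]
        for row in csv.DictReader(io.StringIO(coefficients_text))
    }
    used = [row["coefficient"] for row in csv.DictReader(io.StringIO(spec_text))]
    # Preserve spec order so the report matches the order of rows in the file.
    return [label for label in used if label not in defined]


unmatched = find_unmatched_labels(SPEC_CSV, COEFFICIENTS_CSV)
print(unmatched)
```

Unlike the checker described here, this sketch reports every unmatched label at once rather than stopping at the first one.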

Multiple methods are included for situations involving segmented or templated SPEC/COEFFICIENT files.

The settings checker will also loop through all cascaded sub-settings (such as Preprocessor or Annotator settings objects) in a given settings object to at least check whether a SPEC file is defined there, and attempt to load it.

Logging to stdout and a file called settings_checker.log is included.
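
A minimal sketch of that logging setup, using only the standard library, might look like the following. The logger name, the format string, and the make_checker_logger helper are assumptions for illustration; only the settings_checker.log file name comes from this PR.

```python
import logging
import sys


def make_checker_logger(log_path="settings_checker.log"):
    """Build a logger that writes each record to stdout and to a log file."""
    logger = logging.getLogger("settings_checker_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if this is called twice
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    handlers = (
        logging.StreamHandler(sys.stdout),
        logging.FileHandler(log_path, mode="w"),
    )
    for handler in handlers:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


logger = make_checker_logger()
logger.info("Settings checker started")
```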

Running the Checker

The settings checker takes very little time, and so is set up to run by default.

To disable the checker, users can add the following to their settings.yaml file.

check_model_settings: False
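
The default-on behaviour can be pictured with the sketch below, in which a plain dict stands in for the parsed settings.yaml. The should_run_checker helper is hypothetical; how ActivitySim actually reads this flag is not shown in this description.

```python
def should_run_checker(settings: dict) -> bool:
    """Return True unless the settings explicitly disable the checker."""
    # A missing key means the checker runs, matching the run-by-default behaviour.
    return bool(settings.get("check_model_settings", True))


print(should_run_checker({}))
print(should_run_checker({"check_model_settings": False}))
```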

Errors

To simulate the model runtime faithfully, the checker re-uses as much code from elsewhere in the ActivitySim codebase as possible, so that errors are raised consistently. For instance, the functions that read and evaluate SPEC and COEFFICIENTS files are imported and used directly rather than reimplemented in custom code.

Errors are wrapped in a custom exception and collected into a list. If the settings checker encounters any errors, they are reported to the logs at the ERROR level and the checker raises a RuntimeError, halting the program.

Note: A limitation of allowing errors to be raised is that, at each stage of the validation routine, the settings checker collects only the first fatal exception it encounters. The checker may therefore need to be run several times to catch additional problems. This is mostly an issue when resolving coefficient labels, since a key error is raised on the first non-matching label in the SPEC.
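
The collect-then-halt pattern described above can be sketched as follows. The CheckerError class, the run_checks and report_and_halt helpers, and the example checks are all hypothetical names chosen for illustration; they are not the implementation in this PR.

```python
import logging

logger = logging.getLogger("settings_checker_sketch")  # hypothetical logger name


class CheckerError(Exception):
    """Hypothetical wrapper that records which model and step failed."""

    def __init__(self, model_name, step, original):
        super().__init__(f"{model_name} [{step}]: {original!r}")
        self.model_name = model_name
        self.step = step
        self.original = original


def run_checks(checks):
    """Run (model_name, step, callable) checks, collecting wrapped errors."""
    errors = []
    for model_name, step, check in checks:
        try:
            check()
        except Exception as exc:
            # Wrap and keep going so later checks still run.
            errors.append(CheckerError(model_name, step, exc))
    return errors


def report_and_halt(errors):
    """Log every collected error, then raise RuntimeError if there were any."""
    for error in errors:
        logger.error(str(error))
    if errors:
        raise RuntimeError(f"Settings checker found {len(errors)} error(s)")


def bad_lookup():
    # Mimics a spec that references a label missing from the coefficients.
    return {"coef_income": 0.25}["coef_zero_autos"]


errors = run_checks(
    [
        ("free_parking", "load_yaml", lambda: None),
        ("free_parking", "eval_coefficients", bad_lookup),
    ]
)
```

Calling report_and_halt(errors) here would log one error and raise RuntimeError. Note that bad_lookup stops at the first missing key, which is the limitation the note above describes.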

Missing File Paths

Due to the inherited structure of the underlying Pydantic data models, it is not possible for the settings checker to determine whether a model actually requires a COEFFICIENTS or SPEC file path to be provided. Most of the data models include these fields, but allow them to default to None if a value is not provided in the settings YAML file.

This PR handles the situation by issuing a WARNING-level log message alerting users that a file may be missing from the YAML settings, and that this should be double-checked and corrected if necessary.

Ultimately, the Pydantic models should be refactored to state explicitly when values for these fields are required (which is what the models are for). The settings checker could then catch the errors raised when reading the YAML files.
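
The warning behaviour can be sketched as below. To keep the example dependency-free, a standard-library dataclass stands in for a Pydantic settings model; ExampleSettings and warn_on_missing_paths are hypothetical names, not part of this PR.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("settings_checker_sketch")  # hypothetical logger name


@dataclass
class ExampleSettings:
    """Stand-in for a Pydantic settings model whose file fields default to None."""

    SPEC: Optional[str] = None
    COEFFICIENTS: Optional[str] = None


def warn_on_missing_paths(model_name, settings, keys=("SPEC", "COEFFICIENTS")):
    """Log a WARNING for each file field left as None and return those field names."""
    missing = [key for key in keys if getattr(settings, key, None) is None]
    for key in missing:
        logger.warning(
            "%s: no value for %s; check the YAML file if this model needs one",
            model_name,
            key,
        )
    return missing


missing = warn_on_missing_paths(
    "free_parking", ExampleSettings(SPEC="free_parking.csv")
)
print(missing)
```

If a field were declared without a default in the real Pydantic model, loading the YAML would fail validation instead, and no warning heuristic would be needed.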

Settings Definitions

In order to expose Pydantic models to the checker, they are directly imported and set up in a dictionary keyed to the step name as follows:

# import model settings
from activitysim.abm.models.accessibility import AccessibilitySettings

# Setup for checker
CHECKER_SETTINGS = {
    "compute_accessibility": {
        "settings_cls": AccessibilitySettings,
        "settings_file": "accessibility.yaml",
    },
    # ...
}

By default, the checker assumes that the fields referencing CSV files are SPEC and COEFFICIENTS. This can be overridden using "spec_coefficient_keys" in the settings dictionary. Each spec/coefficients pair is assumed to belong together (i.e. the specification is expected to contain coefficient labels defined in the paired coefficients file).

    "school_escorting": {
        "settings_cls": SchoolEscortSettings,
        "settings_file": "school_escorting.yaml",
        "spec_coefficient_keys": [
            {"spec": "OUTBOUND_SPEC", "coefs": "OUTBOUND_COEFFICIENTS"},
            {"spec": "INBOUND_SPEC", "coefs": "INBOUND_COEFFICIENTS"},
            {"spec": "OUTBOUND_COND_SPEC", "coefs": "OUTBOUND_COND_COEFFICIENTS"},
        ]
    },
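
One way to picture how a registry entry resolves to its spec/coefficients pairs is the sketch below. The spec_coefficient_pairs helper and DEFAULT_PAIRS constant are hypothetical, and strings stand in for the settings classes so the example runs on its own.

```python
# Default pair used when an entry does not override "spec_coefficient_keys".
DEFAULT_PAIRS = [{"spec": "SPEC", "coefs": "COEFFICIENTS"}]

# Strings stand in for the real settings classes in this sketch.
EXAMPLE_REGISTRY = {
    "compute_accessibility": {
        "settings_cls": "AccessibilitySettings",
        "settings_file": "accessibility.yaml",
    },
    "school_escorting": {
        "settings_cls": "SchoolEscortSettings",
        "settings_file": "school_escorting.yaml",
        "spec_coefficient_keys": [
            {"spec": "OUTBOUND_SPEC", "coefs": "OUTBOUND_COEFFICIENTS"},
            {"spec": "INBOUND_SPEC", "coefs": "INBOUND_COEFFICIENTS"},
        ],
    },
}


def spec_coefficient_pairs(entry):
    """Return the spec/coefficients field-name pairs to check for one entry."""
    return entry.get("spec_coefficient_keys", DEFAULT_PAIRS)


for step_name, entry in EXAMPLE_REGISTRY.items():
    for pair in spec_coefficient_pairs(entry):
        print(step_name, pair["spec"], pair["coefs"])
```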

Included Settings Definitions

As of this PR, settings are defined for the following. Keep in mind that some YAML files are directly read in when their parent is constructed.

'compute_accessibility'
'atwork_subtour_destination'
'atwork_subtour_frequency'
'atwork_subtour_mode_choice'
'atwork_subtour_scheduling'
'auto_ownership_simulate'
'cdap_simulate'
'compute_disaggregate_accessibility'
'free_parking'
'initialize_households'
'initialize_landuse'
'initialize_los'
'input_checker'
'joint_tour_composition'
'joint_tour_destination'
'joint_tour_frequency_composition'
'joint_tour_frequency'
'joint_tour_participation'
'joint_tour_scheduling'
'mandatory_tour_frequency'
'mandatory_tour_scheduling'
'non_mandatory_tour_destination'
'non_mandatory_tour_frequency'
'non_mandatory_tour_scheduling'
'parking_location'
'school_escorting'
'school_location'
'shadow_pricing'
'stop_frequency'
'summarize'
'telecommute_frequency'
'tour_mode_choice_simulate'
'tour_od_choice'
'tour_scheduling_probabilistic'
'transit_pass_ownership'
'transit_pass_subsidy'
'trip_departure_choice'
'trip_destination'
'trip_mode_choice'
'trip_purpose'
'trip_purpose_and_destination'
'vehicle_allocation'
'vehicle_type_choice'
'work_from_home'
'workplace_location'
'write_data_dictionary'
'write_trip_matrices'

Explicit Exclusions

Two models are not included in the registry as of this PR. They appear to be missing required configurations in the example models and were causing persistent failures in the settings checker. If additional guidance is provided, they could likely be added.

trip_scheduling_choice: The default YAML file (trip_scheduling_choice.yaml) is missing
trip_scheduling: The required trip_scheduling_coefficients file is missing

Extensions

The settings checker now supports a simple means of defining settings to check for extensions. To do so, developers should:

Define a module called settings_checker.py in their extensions module. That module should:
Import the required settings classes as well as core components, such as the State
Define a dictionary called EXTENSION_SETTING_CHECKER that maps components to settings classes and files. Its structure is identical to the core settings described in Settings Definitions above

The design directly extends the registry of model settings to validate in the core settings_checker module, rather than defining a separate checking routine in the extensions modules.
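
The registry extension can be sketched as a simple dictionary merge. The merge_checker_settings helper, the example entries, and the choice to reject name collisions are assumptions for illustration; this description does not say how the PR treats an extension entry that shares a name with a core entry.

```python
def merge_checker_settings(core, extension):
    """Return a new registry containing core entries plus extension entries."""
    overlap = sorted(set(core) & set(extension))
    if overlap:
        # Assumed policy: refuse to shadow a core entry silently.
        raise ValueError(f"Extension redefines core entries: {overlap}")
    return {**core, **extension}


# Strings stand in for settings classes; both registries are hypothetical.
core_settings = {
    "free_parking": {
        "settings_cls": "FreeParkingSettings",
        "settings_file": "free_parking.yaml",
    },
}
extension_settings = {
    "my_extension_model": {
        "settings_cls": "MyExtensionSettings",
        "settings_file": "my_extension_model.yaml",
    },
}

combined = merge_checker_settings(core_settings, extension_settings)
print(sorted(combined))
```

Because the merged result has the same shape as the core registry, the same checking routine can walk it without any extension-specific branches.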

An example implementation will be provided in the SANDAG ABM3 Example repository.


missing values. because the fields in the settings classes are inherited, it is not usually possible for the settings checker to determine if a path to an external file is actually required for a particular model component. the workaround is to issue a specific warning that a path *may* be expected and users should check the YAML file. the ultimate solution would be to define more robust Pydantic data models that ensure that fields are marked as required when appropriate.

andkay · 2025-06-10T03:41:06Z

activitysim/cli/run.py

@@ -283,7 +285,7 @@ def run(args):
        # Memory sidecar is only useful for single process runs
        # multiprocess runs log memory usage without blocking in the controlling process.
        mem_prof_log = state.get_log_file_path("memory_profile.csv")
-        from ..core.memory_sidecar import MemorySidecar
+        from activitysim.core.memory_sidecar import MemorySidecar


The relative import here caused problems for my runs - suggest changing it to absolute.

andkay added 30 commits March 19, 2025 10:03

feat: adds initial setting_checker module to abm/models

48b7696

chore: add .vscode settings to gitignore

92fd304

feat: adds first test case for prepopulating settings pydantic model (accessibility)

c7a1ca3

feat: adds initial spec checker - wip

fb7d6f1

feat: update settings checker to load spec and evaluate coefficients. add second model to validate

9053107

fix: adds missing arg to eval_coefficients

404190d

feat: adds atworksubtour_frequency to settings checker

2bda8ac

feat: moves loading spec and coefficients to independent functions

0e88b9e

refactor: renames main settings checker function. moves loading model settings to independent function

fe65f3a

refactor: moves load, spec, and coef eval checks to an independent function called from main checker. run formatter

7333c0b

feat: adds load SPEC check for preprocessor settings

138b2a4

chore: run formatter

637e98d

feat: initial checking for model setting with nested coefficients

19af41f

feat: adds additional model settings. force try_load_spec to return an empty dataframe, even if no top level spec exists

f78723c

feat: adds additional model settings to check. adds special handler for disaggregate accessibility (needs more testing)

47da4b3

feat: adds settings check for initalize_landuse

625908d

refactor: reorders checking for joint tour composition settings

269301e

feat: adds setting check for joint tour destination

e14cc9c

feat: adds settings check for joint tour frequency composition

b1ca08a

feat: adds settings check for joint tour frequency

c724fd5

feat: adds settings check for joint tour participation

172e5be

feat: adds setting check for joint tour scheduling

3e4779b

feat: adds settings checks for workplace location and school location

9dd5073

feat: adds setting check for mandatory tour frequency

8f7b84f

feat: adds settings check for non mandatory tour destination

a25706c

feat: adds settings check for parking location choice

0118406

feat: adds settings check for school escorting

15ab51f

wip: adds stop_frequencies to components dict, but requires a separate flow for nested specs

81bfbee

feat: adds setting checks for summarize

df42d08

feat: adds settings check for telecommute frequency

3a594bf

andkay added 26 commits May 12, 2025 22:21

feat: adds initial handling for SPEC_NEST configs. includes some prequisite refactors to error collection

5bbb9c1

refactor: fix logging

2f95215

docs: update one comment

570ee13

refactor: remove commented raise statements used for debugging

48a638d

chore: run black formatter

963b6ef

refactor: only ever evaluate spec/coef if both are available

3664ec2

chore: better logging in model settings load

360a5cd

feat: make setting checker optional

48349ed

feat: adds setting check for input checker

0f99aa1

feat: adds setting check for shadow pricing yaml

7f39fb5

chore: remove some outdated comments

87481f6

feat: adds setting checks for intialize los

7c81ded

refactor: changes return type from empty df -> None when spec/coefs are unavaialable

c88b3b3

feat: adds logic for checking settings with templated coefficients

b598f1e

chore: run formatter

ecc190c

chore: fix alphabetization of components registry

67f2b6f

feat: check settings for write_data_dictionary

e7d7b8d

feat: allow for detailed checks for SPEC_SEGMENTS with PTYPE against main model spec

7b1c948

feat: allow for arbitrary loading of spec files from any subsettings in model settings

605c238

feat: allow custom checks for unusual spec/coefficient pairs

6c8ba86

feat: adds custom SettingCheckerError exception for improved logging

471faf7

chore: fix spelling.

003071b

refactor: renames COMPONENTS_TO_SETTINGS -> CHECKER_SETTINGS and feeds to checker entrypoint as argument.

3f386ab

refactor: single entry for settings checker. import extension check settings as dict.

351836f

chore: run black formatter

f7de6c2

andkay commented Jun 10, 2025

refactor: remove commented code

dbc2abf

dhensle self-requested a review June 10, 2025 18:22

fix: skip errors for write_data_dictionary if optional yaml file is not loadable

66797a8
