Skip to content

make n_jobs=-1 as default #122

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

akshitasure12
Copy link
Contributor

addressing #111

This PR sets n_jobs=-1 as the default in the NetworkX config system.

Copy link
Member

@Schefflera-Arboricola Schefflera-Arboricola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in the meeting we discussed that if we are going to make n_job=-1 as default then it would also mean we should make active=True -- because we wanted the user to not get into configs to get a fast running code-- i.e. we should run all cores by default and that also means using networkx's configs by default(i.e. active=True).

Also, I've left a few comments below. Thanks!

@akshitasure12
Copy link
Contributor Author

I think in the meeting we discussed that if we are going to make n_job=-1 as default then it would also mean we should make active=True -- because we wanted the user to not get into configs to get a fast running code-- i.e. we should run all cores by default and that also means using networkx's configs by default(i.e. active=True).

I may have misinterpreted it — I'll work on addressing this. Thank you!

Copy link
Member

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a bunch of small suggestions (below). I think almost all of them are updating comments which describe which config system is being used. Feel free to change it to something other than what I suggest.

I think this is ready to go in, but I may be missing some subtle config impacts.
I will trust this more after @Schefflera-Arboricola reviews it too.

akshitasure12 and others added 3 commits July 1, 2025 21:03
Co-authored-by: Dan Schult <[email protected]>
Co-authored-by: Dan Schult <[email protected]>
Co-authored-by: Dan Schult <[email protected]>
@Schefflera-Arboricola
Copy link
Member

you can add suggestions to batch and commit -- https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/incorporating-feedback-in-your-pull-request

Copy link
Member

@Schefflera-Arboricola Schefflera-Arboricola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this change might break users' code-- they won't get any errors but they will stop getting the speedups. For example, if someone was using joblib.parallel_config(n_jobs=-1) to use all cores then after this change they would stop getting speedups because now by default networkx's config system is being used, instead of joblib's.

just wanted to point this out-- but we don't have to worry about it right now because nx-parallel is still in experimental stages and we can be a bit aggressive with API changes like this. Also, this is also why configs are not preferred by everyone-- especially the ones that changes code behaviour without notification. LMKWYT.

But, we should make sure that in our tests we don't ignore such kind of a situation.

Copy link
Member

@Schefflera-Arboricola Schefflera-Arboricola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for all your work!

To help ensure our documentation is as clear and helpful as possible—especially for new users and contributors—could you please review some material for writing narrative, user-focused docstrings? Reading the documentation multiple times from the perspective of someone new to the project can help identify areas that might need clarification and build a better flow in the document. Also, having a solid understanding of what you are documenting is essential for writing effective documentation. Here are some resources you might find helpful:

Improving the clarity and narrative style of docstrings will make the codebase more accessible and help speed-up the review process. If you have any questions or would like feedback on specific sections, please feel free to ask. Also, please convert the PR to "Ready for review" once it’s ready.

Thank you again for your contributions!

Comment on lines +83 to +86
```py
# modify number of jobs, if you want to limit the number of cores used
nx.config.backends.parallel.n_jobs = 4
```

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this example could be a bit more detailed i think-- maybe switching logging on and including the output would be a good idea. lmkwyt.

Copy link
Contributor Author

@akshitasure12 akshitasure12 Jul 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but I feel adding logging setup directly in the README might make the example harder to follow for users. I'll make sure to show the expected output :)

In `nx-parallel`, there's a `_configure_if_nx_active` decorator applied to all algorithms. This decorator checks the value of `active`(in `nx.config.backends.parallel`) and then accordingly uses the appropriate configuration system (`joblib` or `networkx`). If `active=True`, it extracts the configs from `nx.config.backends.parallel` and passes them in a `joblib.parallel_config` context manager and calls the function in this context. Otherwise, it simply calls the function.
To use [`joblib.parallel_config`](https://joblib.readthedocs.io/en/latest/generated/joblib.parallel_config.html) with `nx-parallel`, set `nx.config.backends.parallel.active = False`. This disables the default NetworkX configuration so joblib settings can take effect. This can be done globally or within a context manager.

**2.1.1 Disable NetworkX config globally**

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this section is showing how joblib configs can be globally and in a context-- so "Disable NetworkX config globally" heading seems out of place

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is to explain to users that there are two ways to disable NetworkX’s config system so that joblib’s own configuration can take effect — either globally or within a context manager. The focus is on how to disable NetworkX’s config, not on how joblib can be set globally or in a context by itself. Although, different ways of setting joblib configs has been implicitly conveyed with:

# Setting global configs for Joblib
parallel_config(n_jobs=5, verbose=50)
nx.square_clustering(H)

# Setting configs in Joblib's context
with parallel_config(n_jobs=7, verbose=0):
    nx.square_clustering(H)

let me know if you think any tweaks are needed -- but this looks clear to me.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the start of this Config.md, "This guide explains how to configure nx-parallel using both systems.", and this second section discusses how to use joblib.parallel_config -- and to use that the first step is to disable networkx's config via active parameter (the default) -- so it's just a step not the whole process. HTH.

nx.square_clustering(H)
```

**2.1.2 Disable NetworkX config in a context**

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@Schefflera-Arboricola Schefflera-Arboricola marked this pull request as draft July 12, 2025 14:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

make n_jobs=-1 as default instead of n_jobs=None?
3 participants