Improve the performance when using enumeration#8395

Merged
alexey-tikhonov merged 9 commits into SSSD:master from aplopez:enumerate
Mar 27, 2026

@aplopez
Contributor

@aplopez aplopez commented Jan 21, 2026

This PR:

  • Removes an unused function.
  • Stops logging a possibly extremely long filter.
  • Fixes a wrong condition that was invalidating an optimization.
  • Adds a test case to an existing test.

Enumeration, especially when there are 15,000+ users, is slow. This fix helps, but it doesn't work miracles.
In my test environment, enumeration time dropped from 8 minutes to about 1 minute.

It is important to know that, with so many users, many operations time out. It is necessary to increase the timeout in [nss] and for the domain, and also to set large values for ldap_enumeration_refresh_timeout and ldap_search_timeout in the domain. I used these values to avoid any timeout (YMMV):

[domain/ldap.test]
ldap_enumeration_refresh_timeout = 30000
ldap_search_timeout = 6000
timeout = 6000
...

[nss]
timeout = 6000
...
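
To double-check that all four values are really set, the fragment can be read back with Python's standard `configparser`. This is only a sketch: the inline text stands in for a real `sssd.conf`, and the helper name `read_timeouts` is hypothetical.

```python
import configparser

# Stand-in for the relevant part of sssd.conf (values from the comment above).
SSSD_CONF = """
[domain/ldap.test]
ldap_enumeration_refresh_timeout = 30000
ldap_search_timeout = 6000
timeout = 6000

[nss]
timeout = 6000
"""


def read_timeouts(text, domain="domain/ldap.test"):
    """Return the timeout-related options as integers, keyed by 'section:option'."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    wanted = [
        (domain, "ldap_enumeration_refresh_timeout"),
        (domain, "ldap_search_timeout"),
        (domain, "timeout"),
        ("nss", "timeout"),
    ]
    # getint() raises if an option is missing or not a number,
    # which is the failure we want to notice early.
    return {f"{section}:{option}": parser.getint(section, option)
            for section, option in wanted}


if __name__ == "__main__":
    for key, value in read_timeouts(SSSD_CONF).items():
        print(key, "=", value)
```

A missing option raises `configparser.NoOptionError`, so a forgotten timeout shows up immediately instead of as a silent default.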


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively improves performance by optimizing logging, removing an unused function, and correcting a condition related to enumeration. The changes are well-aligned with the stated goals of enhancing enumeration performance, especially for large user bases. The addition of a new test case for the general enumeration scenario ensures that the modified logic is adequately covered.

@alexey-tikhonov
Member

Mistype in the commit message: "We must look into de TS cache"

@aplopez
Contributor Author

aplopez commented Jan 22, 2026

Mistype in the commit message: "We must look into de TS cache"

Fixed.

@alexey-tikhonov
Member

alexey-tikhonov commented Jan 23, 2026

I think the fix is correct in the sense that it fixes a bug.

But I think the logic of sysdb_enumpwent_filter() can and should be improved in general, to avoid the case where dn_filter expands to the entire db.

In particular, if addtl_filter isn't set, then sysdb_search_ts_users(enum_filter(NULL)) is expected to return the entire db, right? And using this as an additional filter gives the same result as '*' but is extremely slow.
Or am I missing something?

@aplopez aplopez marked this pull request as ready for review February 24, 2026 13:19
@alexey-tikhonov alexey-tikhonov added the coverity (Trigger a coverity scan) label Feb 24, 2026
@alexey-tikhonov
Member

Note: Covscan is green so far.

@alexey-tikhonov alexey-tikhonov removed the coverity (Trigger a coverity scan) label Feb 24, 2026
@alexey-tikhonov
Member

Hm,
F44:

FAILED tests/test_infopipe.py::test_infopipe__list_by_name (ldap) - AssertionError: ListByName('user-*', 0) is missing element 10002
assert '/org/freedesktop/sssd/infopipe/Users/test/10002' in ['/org/freedesktop/sssd/infopipe/Users/test/10001', '/org/freedesktop/sssd/infopipe/Users/test/10003']

Looks relevant, but why f44 only... race condition?

@aplopez
Contributor Author

aplopez commented Feb 25, 2026

Looks relevant, but why f44 only... race condition?

I reran the tests and a different test failed. 😮‍💨
Locally, on my PC (Fedora 43, though) the test passes every time.

@aplopez
Contributor Author

aplopez commented Feb 26, 2026

And now all the tests passed. There is some instability in F44, but not related to this PR.

@alexey-tikhonov
Member

And now all the tests passed. There is some instability in F44, but not related to this PR.

It is very suspicious that it was test_infopipe__list_by_name, which I hadn't seen failing before.
Can there be a race condition in the test itself that is triggered by a slow runner?

@aplopez
Contributor Author

aplopez commented Feb 26, 2026

It is very suspicious that it was test_infopipe__list_by_name, which I hadn't seen failing before. Can there be a race condition in the test itself that is triggered by a slow runner?

I thought the same until I noticed this test failed once and never again. The second time a completely different test failed. The third time, the latest, none.

@aplopez aplopez force-pushed the enumerate branch 2 times, most recently from 6d46dbe to 043d60e March 27, 2026 11:10
@aplopez
Contributor Author

aplopez commented Mar 27, 2026

@alexey-tikhonov Things have changed since you approved. What do you think now?

Contributor

@sumit-bose sumit-bose left a comment


Hi,

thank you for the updates, I have no further comments, ACK.

bye,
Sumit

@alexey-tikhonov
Member

@alexey-tikhonov Things have changed since you approved. What do you think now?

Besides using '//', looks good to me.

@aplopez aplopez added the coverity (Trigger a coverity scan) label Mar 27, 2026
@aplopez
Contributor Author

aplopez commented Mar 27, 2026

Coverity is green.

@aplopez aplopez added the Accepted label and removed the coverity (Trigger a coverity scan) and Waiting for review labels Mar 27, 2026
aplopez added 9 commits March 27, 2026 18:02
Function sysdb_enumpwent() is not used.
It was replaced by sysdb_enumpwent_filter().

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
When there are too many users (17,000+) this message can be too long.
Limit it to the first 50 characters.

Resolves: SSSD#6951
Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
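
The idea behind the logging change can be sketched as follows. This is an illustration in Python, not the SSSD C code: the filter layout, the DN format, and the helper names `build_dn_filter` and `log_preview` are assumptions made for the example. Only the 50-character limit and the 17,000-user scale come from the commit message.

```python
MAX_LOGGED_CHARS = 50  # limit stated in the commit message


def build_dn_filter(user_count):
    """Build a hypothetical OR filter with one clause per user DN."""
    clauses = "".join(
        f"(dn=name=user-{i}@ldap.test,cn=users,cn=ldap.test,cn=sysdb)"
        for i in range(user_count)
    )
    return f"(|{clauses})"


def log_preview(text, limit=MAX_LOGGED_CHARS):
    """Return at most `limit` characters of `text` for a debug message."""
    return text[:limit]


if __name__ == "__main__":
    dn_filter = build_dn_filter(17000)
    print(len(dn_filter), "characters in the full filter")
    print("logged:", log_preview(dn_filter))
```

With one clause per user the full string grows linearly with the number of users, while the logged preview stays at a fixed size.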
We must look into the TS cache only when a name is provided.
Using the TS cache on an unfiltered enumeration is useless.

Resolves: SSSD#6951
Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
Added a case that was not checked before. It is the case
when `attr`, `attr_name` and `addtl_filter` are all `NULL`.

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
Create the filter to retrieve only the requested entries.

Do not create a new filter and search for matches if there are
no results from the previous search. The called functions
handle this case correctly, but why waste time calling them?

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
Function cache_req_user_by_filter_lookup() sets or skips the recent
filter depending on whether data->name.attr is set. As mentioned
in the comment, this should be decided based on whether the referenced
attribute is the name or not.

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
The message said that sysdb_enumpwent() had failed, but it was
actually sysdb_enumpwent_filter() which failed.

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
The "name" attribute was not being added to the TS cache, even though
it is part of the DN (ldb doesn't enforce it). Adding this
attribute requires incrementing the DB version so that the TS cache
is regenerated with the missing attribute.

This made the if-block in sysdb_enumpwent_filter() rather useless.

In addition, once this if-block is executed, the function leaves without
further processing.

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
Although ts_res.count is set to 0 when sysdb_search_ts_users()
returns ERR_NO_TS, before using it we make an extra check to verify
that the returned code is EOK.

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Sumit Bose <sbose@redhat.com>
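
The extra check from the last commit can be pictured with a small Python model. It is a sketch under assumptions, not the SSSD code: `search_ts_users` is a hypothetical stand-in for sysdb_search_ts_users(), `ts_results_to_use` is an invented helper, and the numeric value given to ERR_NO_TS is a placeholder.

```python
EOK = 0        # success code (assumed to be 0, as is usual in SSSD)
ERR_NO_TS = 1  # placeholder value; the real constant is defined in SSSD's C sources


def search_ts_users(ts_cache_available, matches):
    """Hypothetical stand-in for sysdb_search_ts_users(): returns (ret, results)."""
    if not ts_cache_available:
        # As in the commit message: the result is already empty in this case.
        return ERR_NO_TS, []
    return EOK, list(matches)


def ts_results_to_use(ret, results):
    """Use the TS results only when the search itself reported success."""
    if ret != EOK:
        return []
    return results


if __name__ == "__main__":
    for available in (False, True):
        ret, results = search_ts_users(available, ["user-1", "user-2"])
        print(available, ret, ts_results_to_use(ret, results))
```

The guard is redundant as long as the callee keeps zeroing the count on error, which is exactly why the commit describes it as an extra check.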
@sssd-bot
Contributor

The pull request was accepted by @aplopez with the following PR CI status:


🟢 CodeQL (success)
🟢 osh-diff-scan:fedora-rawhide-x86_64:upstream (success)
🟢 rpm-build:centos-stream-10-x86_64:upstream (success)
🟢 rpm-build:fedora-42-x86_64:upstream (success)
🟢 rpm-build:fedora-43-x86_64:upstream (success)
🟢 rpm-build:fedora-44-x86_64:upstream (success)
🟢 rpm-build:fedora-rawhide-x86_64:upstream (success)
🟢 Analyze (target) / cppcheck (success)
🟢 Build / freebsd (success)
🟢 Build / make-distcheck (success)
🟢 ci / intgcheck (centos-10) (success)
🟢 ci / intgcheck (fedora-42) (success)
🟢 ci / intgcheck (fedora-43) (success)
🟢 ci / intgcheck (fedora-44) (success)
🟢 ci / intgcheck (fedora-45) (success)
🟢 ci / prepare (success)
🟡 ci / system (centos-10) (in_progress)
🟡 ci / system (fedora-42) (in_progress)
🟡 ci / system (fedora-43) (in_progress)
🟡 ci / system (fedora-44) (in_progress)
🟡 ci / system (fedora-45) (in_progress)
➖ Coverity scan / coverity (skipped)
🟢 Static code analysis / codeql (success)
🟢 Static code analysis / pre-commit (success)
🟢 Static code analysis / python-system-tests (success)


There are unsuccessful or unfinished checks. Make sure that the failures are not related to this pull request before merging.

@alexey-tikhonov alexey-tikhonov merged commit 0a739f8 into SSSD:master Mar 27, 2026
13 of 16 checks passed

Development

Successfully merging this pull request may close these issues.

NSS enumerated passwd/group truncated output and performance regression since >=2.8.0

4 participants