Skip to content

Fix systemd unit overrides #3832

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Mar 6, 2025
Merged

Conversation

nrwahl2
Copy link
Contributor

@nrwahl2 nrwahl2 commented Mar 3, 2025

These have been getting in the way of trying to set timeouts via override files in #3818.

@nrwahl2 nrwahl2 requested a review from clumens March 3, 2025 00:54
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Mar 3, 2025

Coverity has started complaining about some non-issues, some of which are related to these changes and some of which are not. Sigh.

@clumens clumens added the review: in progress PRs that are currently being reviewed label Mar 3, 2025
@@ -994,7 +994,7 @@ static bool
str_any_of(const char *s, va_list args, uint32_t flags)
{
if (s == NULL) {
return pcmk_is_set(flags, pcmk__str_null_matches);
return false;
}

while (1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might as well make this while (true) while you're here.

Copy link
Contributor Author

@nrwahl2 nrwahl2 Mar 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoops forgot this; at least it's not important

@nrwahl2 nrwahl2 mentioned this pull request Mar 5, 2025
Copy link
Contributor

@clumens clumens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside from my one nit picky comment, I think this PR is good. I'm okay with merging this after #3833 is merged, but without waiting on #3835 as long as there's no coverity errors specific to this code.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Mar 5, 2025

as long as there's no coverity errors specific to this code.

These are specific to this code, although they're false positives.

[2025-03-03T01:14:32.824Z] Error: REVERSE_INULL (CWE-476):
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:947: deref_ptr_in_call: Dereferencing pointer "filename".
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:964: check_after_deref: Null-checking "filename" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
[2025-03-03T01:14:32.824Z] 
[2025-03-03T01:14:32.824Z] Error: REVERSE_INULL (CWE-476):
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:878: deref_ptr: Directly dereferencing pointer "filename".
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:929: check_after_deref: Null-checking "filename" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.

Similarly all those new FORWARD_NULL Coverity errors in crm_resource.c are false positives, but this PR is (SOMEHOW) what triggers Coverity to flag those.

Both of these sets of Coverity errors cause CI builds to fail. So we will have to either rework the code or suppress them. For the purpose of this PR, adding suppressions is fine (as long as no further false positives crop up).

@clumens
Copy link
Contributor

clumens commented Mar 5, 2025

These are specific to this code, although they're false positives.

[2025-03-03T01:14:32.824Z] Error: REVERSE_INULL (CWE-476):
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:947: deref_ptr_in_call: Dereferencing pointer "filename".
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:964: check_after_deref: Null-checking "filename" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
[2025-03-03T01:14:32.824Z] 
[2025-03-03T01:14:32.824Z] Error: REVERSE_INULL (CWE-476):
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:878: deref_ptr: Directly dereferencing pointer "filename".
[2025-03-03T01:14:32.824Z] /pacemaker/lib/services/systemd.c:929: check_after_deref: Null-checking "filename" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.

It looks like coverity just doesn't understand that pcmk__g_strcat will call pcmk__assert to check for NULL. I seem to remember that we frequently run into problems where some static analysis tool doesn't understand our assert macros and functions. I wonder if there's some annotation we could use to deal with this once and for all.

Similarly all those new FORWARD_NULL Coverity errors in crm_resource.c are false positives, but this PR is (SOMEHOW) what triggers Coverity to flag those.

Weird. I don't know how coverity decides what to scan. I thought it was just looking at the files changed in a PR, but I guess I'm wrong.

Both of these sets of Coverity errors cause CI builds to fail. So we will have to either rework the code or suppress them. For the purpose of this PR, adding suppressions is fine (as long as no further false positives crop up).

In the interests of hurrying the systemd stuff along and making the backporting job less daunting, I would say let's just suppress them here and deal with them for real elsewhere.

nrwahl2 added 12 commits March 5, 2025 23:24
Use crm_strdup_printf() and pcmk__str_copy() to assert on error. Also
rename to systemd_unit_name(), since we handle other unit types.

Signed-off-by: Reid Wahl <[email protected]>
This really shouldn't fail.

Previously, we warned in this case. However, we ought to make sure the
service is cluster-controlled, and a future commit will implement action
timeouts via systemd overrides. So we'll actually need to make sure the
override succeeds.

Signed-off-by: Reid Wahl <[email protected]>
I think this is clearer. The glibc doc also recommends fchmod() over
umask():
https://www.gnu.org/software/libc/manual/html_node/Setting-Permissions.html

It shouldn't matter in our case. umask() isn't thread-safe, but
Pacemaker is single-threaded. Still, setting the permissions explicitly
seems clearer.

Signed-off-by: Reid Wahl <[email protected]>
And improve the corresponding check in one of its callers.

Signed-off-by: Reid Wahl <[email protected]>
This is a static function and we never pass in pcmk__str_null_matches.

Signed-off-by: Reid Wahl <[email protected]>
Separate the path string creation logic from the directory creation
logic. This reduces some duplication.

Signed-off-by: Reid Wahl <[email protected]>
Previously, the override directory paths were valid only for systemd
resources whose type attribute has no extension and thus defaults to
".service". For example, 'class="systemd" type="httpd"' works, but
'class="systemd" type="httpd.service"' results in an override directory
/run/systemd/system/httpd.service.service.d. The same issue affected
resources for non-service units. For example, for a resource managing
unit "custom.timer", we would create an override directory
/run/systemd/system/custom.timer.service.d. In these cases, the
overrides would not be applied.

This commit fixes that. We also ensure that the [Service] section is
added only for service units.

Signed-off-by: Reid Wahl <[email protected]>
Best practice and makes Coverity happy.

Signed-off-by: Reid Wahl <[email protected]>
Best practice and makes Coverity happy.

Signed-off-by: Reid Wahl <[email protected]>
Best practice and makes Coverity happy.

Signed-off-by: Reid Wahl <[email protected]>
Best practice and makes Coverity happy.

Signed-off-by: Reid Wahl <[email protected]>
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Mar 6, 2025

It looks like coverity just doesn't understand that pcmk__g_strcat will call pcmk__assert to check for NULL

I'm not sure it's about that. get_override_dir() returns a value that it initially gets from g_string_sized_new(). Coverity may just not be understanding that CRM_CHECK() sends us to the done label. Hard to tell... the execution path doesn't mention what happens at CRM_CHECK(). It only consists of the pcmk__build_path() call and the check before freeing in the done section.

nrwahl2 added 2 commits March 6, 2025 00:17
Coverity seems not to understand what CRM_CHECK() is doing.

Signed-off-by: Reid Wahl <[email protected]>
@clumens clumens merged commit 47fa0d8 into ClusterLabs:main Mar 6, 2025
1 check passed
@nrwahl2 nrwahl2 deleted the nrwahl2-systemd branch March 6, 2025 19:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
review: in progress PRs that are currently being reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants