Skip to content

[Diagnostics] Add diagnostics of execution time and periodicity of the hardware components #2086

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
18 commits
Select commit Hold shift + click to select a range
4c9e7e0
Add diagnostics of execution time and periodicity of the hardware com…
saikishor Mar 1, 2025
b572ef1
Update the description of the parameters
saikishor Mar 1, 2025
7bd16c9
Publish periodicity stats for controllers and hardware running at dif…
saikishor Mar 1, 2025
7542544
Update the summary of hardware components
saikishor Mar 1, 2025
e0a2185
Update release notes
saikishor Mar 1, 2025
4cd545e
Update controller_manager/src/controller_manager.cpp
saikishor Mar 1, 2025
e45355c
Fix compilation
christophfroehlich Mar 2, 2025
5c77293
Merge branch 'master' into add/statistics/hardware_components
saikishor Mar 5, 2025
ca2efad
add missing false arg in newly added test
saikishor Mar 5, 2025
f9a062a
Merge branch 'master' into add/statistics/hardware_components
saikishor Mar 16, 2025
c96ee6b
Merge branch 'master' into add/statistics/hardware_components
saikishor Mar 19, 2025
1d0514c
Merge branch 'master' into add/statistics/hardware_components
saikishor Mar 19, 2025
96167a1
Merge branch 'master' into add/statistics/hardware_components
christophfroehlich Apr 4, 2025
c217baa
Fix pre-commit
christophfroehlich Apr 4, 2025
be9d02e
Merge branch 'master' into add/statistics/hardware_components
saikishor Apr 10, 2025
0faca71
Merge branch 'master' into add/statistics/hardware_components
bmagyar Apr 16, 2025
8e890fc
Merge branch 'master' into add/statistics/hardware_components
bmagyar Apr 17, 2025
91fefea
Merge branch 'master' into add/statistics/hardware_components
christophfroehlich Apr 25, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 9 additions & 3 deletions controller_manager/doc/parameters_context.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,13 @@ hardware_components_initial_state: |
- "base3"

diagnostics.threshold.controllers.periodicity: |
The ``periodicity`` diagnostics will be published only for the asynchronous controllers, because any affect to the synchronous controllers will be reflected directly in the controller manager's periodicity.
The ``periodicity`` diagnostics will be published for the asynchronous controllers, because any affect to the synchronous controllers will be reflected directly in the controller manager's periodicity. It is also published for the synchronous controllers that have a different update rate than the controller manager update rate.

diagnostics.threshold.controllers.execution_time: |
The ``execution_time`` diagnostics will be published for all controllers. The ``mean_error`` for a synchronous controller will be computed against zero, as it should be as low as possible. However, the ``mean_error`` for an asynchronous controller will be computed against the controller's desired update period, as the controller can take a maximum of the desired period cycle to execute it's update cycle.
diagnostics.threshold.controllers.execution_time.mean_error: |
The ``execution_time`` diagnostics will be published for all controllers. The ``mean_error`` for a synchronous controller will be computed against zero, as it should be as low as possible. However, the ``mean_error`` for an asynchronous controller will be computed against the controller's desired update period, as the controller can take a maximum of the desired period to execute its update cycle.

diagnostics.threshold.hardware_components.periodicity: |
The ``periodicity`` diagnostics will be published for the asynchronous hardware components, because any affect to the synchronous hardware components will be reflected directly in the controller manager's periodicity. It is also published for the synchronous hardware components that have a different read/write rate than the controller manager update rate.

diagnostics.threshold.hardware_components.execution_time.mean_error: |
The ``execution_time`` diagnostics will be published for all hardware components. The ``mean_error`` for a synchronous hardware component will be computed against zero, as it should be as low as possible. However, the ``mean_error`` for an asynchronous hardware component will be computed against its desired read/write period, as the hardware component can take a maximum of the desired period to execute the read/write cycle.
176 changes: 161 additions & 15 deletions controller_manager/src/controller_manager.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -3562,7 +3562,9 @@ void ControllerManager::controller_activity_diagnostic_callback(
const auto exec_time_stats = controllers[i].execution_time_statistics->GetStatistics();
stat.add(
controllers[i].info.name + exec_time_suffix, make_stats_string(exec_time_stats, "us"));
if (is_async)
const bool publish_periodicity_stats =
is_async || (controllers[i].c->get_update_rate() != this->get_update_rate());
if (publish_periodicity_stats)
{
stat.add(
controllers[i].info.name + periodicity_suffix,
Expand Down Expand Up @@ -3662,12 +3664,21 @@ void ControllerManager::controller_activity_diagnostic_callback(
void ControllerManager::hardware_components_diagnostic_callback(
diagnostic_updater::DiagnosticStatusWrapper & stat)
{
if (!is_resource_manager_initialized())
{
stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::ERROR, "Resource manager is not yet initialized!");
return;
}

bool all_active = true;
bool atleast_one_hw_active = false;
const std::string read_cycle_suffix = ".read_cycle";
const std::string write_cycle_suffix = ".write_cycle";
const std::string state_suffix = ".state";
const auto & hw_components_info = resource_manager_->get_components_status();
for (const auto & [component_name, component_info] : hw_components_info)
{
stat.add(component_name, component_info.state.label());
if (component_info.state.id() != lifecycle_msgs::msg::State::PRIMARY_STATE_ACTIVE)
{
all_active = false;
Expand All @@ -3677,31 +3688,166 @@ void ControllerManager::hardware_components_diagnostic_callback(
atleast_one_hw_active = true;
}
}
if (!is_resource_manager_initialized())
if (hw_components_info.empty())
{
stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::ERROR, "Resource manager is not yet initialized!");
diagnostic_msgs::msg::DiagnosticStatus::ERROR, "No hardware components are loaded!");
return;
}
else if (hw_components_info.empty())
else if (!atleast_one_hw_active)
{
stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::ERROR, "No hardware components are loaded!");
diagnostic_msgs::msg::DiagnosticStatus::WARN, "No hardware components are currently active");
return;
}
else

stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::OK,
all_active ? "All hardware components are active" : "Not all hardware components are active");

if (cm_param_listener_->is_old(*params_))
{
if (!atleast_one_hw_active)
*params_ = cm_param_listener_->get_params();
}

auto make_stats_string =
[](const auto & statistics_data, const std::string & measurement_unit) -> std::string
{
std::ostringstream oss;
oss << std::fixed << std::setprecision(2);
oss << "Avg: " << statistics_data.average << " [" << statistics_data.min << " - "
<< statistics_data.max << "] " << measurement_unit
<< ", StdDev: " << statistics_data.standard_deviation;
return oss.str();
};

// Variable to define the overall status of the controller diagnostics
auto level = diagnostic_msgs::msg::DiagnosticStatus::OK;

std::vector<std::string> high_exec_time_hw;
std::vector<std::string> bad_periodicity_async_hw;

for (const auto & [component_name, component_info] : hw_components_info)
{
stat.add(component_name + state_suffix, component_info.state.label());
if (component_info.state.id() != lifecycle_msgs::msg::State::PRIMARY_STATE_ACTIVE)
{
stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::WARN,
"No hardware components are currently active");
all_active = false;
}
else
{
stat.summary(
diagnostic_msgs::msg::DiagnosticStatus::OK, all_active
? "All hardware components are active"
: "Not all hardware components are active");
atleast_one_hw_active = true;
}
if (component_info.state.id() == lifecycle_msgs::msg::State::PRIMARY_STATE_ACTIVE)
{
auto update_stats =
[&bad_periodicity_async_hw, &high_exec_time_hw, &stat, &make_stats_string, this](
const std::string & comp_name, const auto & statistics,
const std::string & statistics_type_suffix, auto & diag_level, const auto & comp_info,
const auto & params)
{
if (!statistics)
{
return;
}
const bool is_async = comp_info.is_async;
const std::string periodicity_suffix = ".periodicity";
const std::string exec_time_suffix = ".execution_time";
const auto periodicity_stats = statistics->periodicity.get_statistics();
const auto exec_time_stats = statistics->execution_time.get_statistics();
stat.add(
comp_name + statistics_type_suffix + exec_time_suffix,
make_stats_string(exec_time_stats, "us"));
const bool publish_periodicity_stats =
is_async || (comp_info.rw_rate != this->get_update_rate());
if (publish_periodicity_stats)
{
stat.add(
comp_name + statistics_type_suffix + periodicity_suffix,
make_stats_string(periodicity_stats, "Hz") +
" -> Desired : " + std::to_string(comp_info.rw_rate) + " Hz");
const double periodicity_error =
std::abs(periodicity_stats.average - static_cast<double>(comp_info.rw_rate));
if (
periodicity_error >
params->diagnostics.threshold.hardware_components.periodicity.mean_error.error ||
periodicity_stats.standard_deviation > params->diagnostics.threshold.hardware_components
.periodicity.standard_deviation.error)
{
diag_level = diagnostic_msgs::msg::DiagnosticStatus::ERROR;
add_element_to_list(bad_periodicity_async_hw, comp_name);
}
else if (
periodicity_error >
params->diagnostics.threshold.hardware_components.periodicity.mean_error.warn ||
periodicity_stats.standard_deviation >
params->diagnostics.threshold.hardware_components.periodicity.standard_deviation.warn)
{
if (diag_level != diagnostic_msgs::msg::DiagnosticStatus::ERROR)
{
diag_level = diagnostic_msgs::msg::DiagnosticStatus::WARN;
}
add_element_to_list(bad_periodicity_async_hw, comp_name);
}
}
const double max_exp_exec_time =
is_async ? 1.e6 / static_cast<double>(comp_info.rw_rate) : 0.0;
if (
(exec_time_stats.average - max_exp_exec_time) >
params->diagnostics.threshold.hardware_components.execution_time.mean_error.error ||
exec_time_stats.standard_deviation > params->diagnostics.threshold.hardware_components
.execution_time.standard_deviation.error)
{
diag_level = diagnostic_msgs::msg::DiagnosticStatus::ERROR;
high_exec_time_hw.push_back(comp_name);
}
else if (
(exec_time_stats.average - max_exp_exec_time) >
params->diagnostics.threshold.hardware_components.execution_time.mean_error.warn ||
exec_time_stats.standard_deviation > params->diagnostics.threshold.hardware_components
.execution_time.standard_deviation.warn)
{
if (diag_level != diagnostic_msgs::msg::DiagnosticStatus::ERROR)
{
diag_level = diagnostic_msgs::msg::DiagnosticStatus::WARN;
}
high_exec_time_hw.push_back(comp_name);
}
};

// For components : {actuator, sensor and system}
update_stats(
component_name, component_info.read_statistics, read_cycle_suffix, level, component_info,
params_);
// For components : {actuator and system}
update_stats(
component_name, component_info.write_statistics, write_cycle_suffix, level, component_info,
params_);
}
}

if (!high_exec_time_hw.empty())
{
std::string high_exec_time_hw_string;
for (const auto & hw_comp : high_exec_time_hw)
{
high_exec_time_hw_string.append(hw_comp);
high_exec_time_hw_string.append(" ");
}
stat.mergeSummary(
level, "\nHigh execution jitter or mean error : [ " + high_exec_time_hw_string + "]");
}
if (!bad_periodicity_async_hw.empty())
{
std::string bad_periodicity_async_hw_string;
for (const auto & hw_comp : bad_periodicity_async_hw)
{
bad_periodicity_async_hw_string.append(hw_comp);
bad_periodicity_async_hw_string.append(" ");
}
stat.mergeSummary(
level,
"\nHigh periodicity jitter or mean error : [ " + bad_periodicity_async_hw_string + "]");
}
}

Expand Down
71 changes: 71 additions & 0 deletions controller_manager/src/controller_manager_parameters.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -134,3 +134,74 @@ controller_manager:
gt<>: 0.0,
}
}
hardware_components:
periodicity:
mean_error:
warn: {
type: double,
default_value: 5.0,
description: "The warning threshold for the mean error of the hardware component's read/write loop in Hz. If the mean error exceeds this threshold, a warning diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
error: {
type: double,
default_value: 10.0,
description: "The error threshold for the mean error of the hardware component's read/write loop in Hz. If the mean error exceeds this threshold, an error diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
standard_deviation:
warn: {
type: double,
default_value: 5.0,
description: "The warning threshold for the standard deviation of the hardware component's read/write loop in Hz. If the standard deviation exceeds this threshold, a warning diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
error: {
type: double,
default_value: 10.0,
description: "The error threshold for the standard deviation of the hardware component's read/write loop in Hz. If the standard deviation exceeds this threshold, an error diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
execution_time:
mean_error:
warn: {
type: double,
default_value: 1000.0,
description: "The warning threshold for the mean error of the hardware component's read/write cycle execution time in microseconds. If the mean error exceeds this threshold, a warning diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
error: {
type: double,
default_value: 2000.0,
description: "The error threshold for the mean error of the hardware component's read/write cycle execution time in microseconds. If the mean error exceeds this threshold, a error diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
standard_deviation:
warn: {
type: double,
default_value: 100.0,
description: "The warning threshold for the standard deviation of the hardware component's read/write cycle execution time in microseconds. If the standard deviation exceeds this threshold, a warning diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
error: {
type: double,
default_value: 200.0,
description: "The error threshold for the standard deviation of the hardware component's update cycle execution time in microseconds. If the standard deviation exceeds this threshold, an error diagnostic will be published.",
validation: {
gt<>: 0.0,
}
}
1 change: 1 addition & 0 deletions doc/release_notes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,7 @@ hardware_interface
* With (`#1567 <https://github.com/ros-controls/ros2_control/pull/1567>`_) all the Hardware components now have a fully functional asynchronous functionality, by simply adding ``is_async`` tag to the ros2_control tag in the URDF. This will allow the hardware components to run in a separate thread, and the controller manager will be able to run the controllers in parallel with the hardware components.
* The hardware components can be easily introspect the internal member variables using the macro ``REGISTER_ROS2_CONTROL_INTROSPECTION`` (`#1918 <https://github.com/ros-controls/ros2_control/pull/1918>`_)
* Added new ``get_optional`` method that returns ``std::optional`` of the templated type, and this can be used to check if the value is available or not (`#1976 <https://github.com/ros-controls/ros2_control/pull/1976>`_ and `#2061 <https://github.com/ros-controls/ros2_control/pull/2061>`_)
* Added hardware components execution time and periodicity statistics diagnostics (`#2086 <https://github.com/ros-controls/ros2_control/pull/2086>`_)

joint_limits
************
Expand Down
8 changes: 8 additions & 0 deletions hardware_interface/include/hardware_interface/actuator.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
#include "hardware_interface/handle.hpp"
#include "hardware_interface/hardware_info.hpp"
#include "hardware_interface/types/hardware_interface_return_values.hpp"
#include "hardware_interface/types/statistics_types.hpp"
#include "rclcpp/duration.hpp"
#include "rclcpp/logger.hpp"
#include "rclcpp/node_interfaces/node_clock_interface.hpp"
Expand Down Expand Up @@ -85,6 +86,10 @@ class Actuator final

const rclcpp::Time & get_last_write_time() const;

const HardwareComponentStatisticsCollector & get_read_statistics() const;

const HardwareComponentStatisticsCollector & get_write_statistics() const;

return_type read(const rclcpp::Time & time, const rclcpp::Duration & period);

return_type write(const rclcpp::Time & time, const rclcpp::Duration & period);
Expand All @@ -98,6 +103,9 @@ class Actuator final
rclcpp::Time last_read_cycle_time_;
// Last write cycle time
rclcpp::Time last_write_cycle_time_;
// Component statistics
HardwareComponentStatisticsCollector read_statistics_;
HardwareComponentStatisticsCollector write_statistics_;
};

} // namespace hardware_interface
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,14 +19,21 @@
#ifndef HARDWARE_INTERFACE__HARDWARE_COMPONENT_INFO_HPP_
#define HARDWARE_INTERFACE__HARDWARE_COMPONENT_INFO_HPP_

#include <memory>
#include <string>
#include <vector>

#include <rclcpp/time.hpp>
#include "rclcpp/time.hpp"
#include "rclcpp_lifecycle/state.hpp"

#include "hardware_interface/types/statistics_types.hpp"
namespace hardware_interface
{
struct HardwareComponentStatisticsData
{
ros2_control::MovingAverageStatisticsData execution_time;
ros2_control::MovingAverageStatisticsData periodicity;
};
/// Hardware Component Information
/**
* This struct contains information about a given hardware component.
Expand Down Expand Up @@ -59,6 +66,12 @@ struct HardwareComponentInfo

/// List of provided command interfaces by the component.
std::vector<std::string> command_interfaces;

/// Read cycle statistics of the component.
std::shared_ptr<HardwareComponentStatisticsData> read_statistics = nullptr;

/// Write cycle statistics of the component.
std::shared_ptr<HardwareComponentStatisticsData> write_statistics = nullptr;
};

} // namespace hardware_interface
Expand Down
Loading
Loading