Mon 59116 previous otel pr comments (#1407)
REFS:MON-59116
* otel_command => otel_connector
* data_point => otl_data_point
* host_list.is_allowed => host_list.contains
* otl_converter => otl_check_result_builder
* nagios_converter => nagios_check_result_builder
jean-christophe81 authored Jun 7, 2024
1 parent 0845e3d commit f442b02
Showing 34 changed files with 465 additions and 458 deletions.
32 changes: 16 additions & 16 deletions engine/doc/engine-doc.md
@@ -64,8 +64,8 @@ Engine can receive open telemetry data on a grpc server
A new module, opentelemetry, is added.
It works as follows:
* metrics are received
* extractors try to extract a host name and service description from each data_point. On success, the data_point is pushed onto a fifo indexed by host and service
* a service that uses this data wants to do a check. The cmd line identifies the otl_converter that will construct a check result from the host/service data_point fifos. If the converter manages to build a result from the metrics, it returns immediately; if it doesn't, a handler is called as soon as the needed metrics are available or the timeout expires.
* extractors try to extract a host name and service description from each otl_data_point. On success, the otl_data_point is pushed onto a fifo indexed by host and service
* a service that uses this data wants to do a check. The cmd line identifies the otl_check_result_builder that will construct a check result from the host/service otl_data_point fifos. If the builder manages to build a result from the metrics, it returns immediately; if it doesn't, a handler is called as soon as the needed metrics are available or the timeout expires.

### open telemetry request
The proto is organized as follows:
@@ -173,31 +173,31 @@ The proto is organized as follows:

### Concepts and classes
* data_point: the smallest unit of a received request; the data_point class contains the data_point protobuf object and all its parents (resource, scope, metric)
* host serv extractors: when we receive otel metrics, we must extract host and service; this is their job. They can be configured, for example, to search for the host name in a data_point attribute or in the scope. Host serv extractors also hold the list of allowed host/service pairs. This list is updated by the register_host_serv command method
* data_point fifo: a container that holds data points indexed by timestamp
* data_point fifo container: fifos indexed by host and service
* otel_command: a fake connector used to link engine and the otel module
* otl_data_point: the smallest unit of a received request; the otl_data_point class contains the otl_data_point protobuf object and all its parents (resource, scope, metric)
* host serv extractors: when we receive otel metrics, we must extract host and service; this is their job. They can be configured, for example, to search for the host name in an otl_data_point attribute or in the scope. Host serv extractors also hold the list of allowed host/service pairs. This list is updated by the register_host_serv command method
* otl_data_point fifo: a container that holds data points indexed by timestamp
* otl_data_point fifo container: fifos indexed by host and service (see the sketch after this list)
* otel_connector: a fake connector used to link engine and the otel module
* otl_server: a grpc server that accepts incoming otel collector connections
* otl_converter: this short-lived object is created each time engine wants to do a check. Its concrete class and its configuration come from the command line of the check. Its job is to create a check result from the data_point fifo container data. It is destroyed once it has built a check result or when the timeout expires.
* host_serv_list: in order to extract host and service, a host_serv extractor must know the allowed host/service pairs. As otel_command may be notified of host/service pairs through the register_host_serv method while the otel module is not yet loaded, this object, shared between otel_command and host_serv_extractor, is updated from otel_command::register_host_serv.
* otl_check_result_builder: this short-lived object is created each time engine wants to do a check. Its concrete class and its configuration come from the command line of the check. Its job is to create a check result from the otl_data_point fifo container data. It is destroyed once it has built a check result or when the timeout expires.
* host_serv_list: in order to extract host and service, a host_serv extractor must know the allowed host/service pairs. As otel_connector may be notified of host/service pairs through the register_host_serv method while the otel module is not yet loaded, this object, shared between otel_connector and host_serv_extractor, is updated from otel_connector::register_host_serv.
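
As a rough sketch of the fifo concepts above (the types and fields below are simplified assumptions, not the module's real classes):

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Simplified stand-in for otl_data_point: the real class wraps the protobuf
// data point plus its parents (resource, scope, metric).
struct data_point_sketch {
  uint64_t time_unix_nano;  // timestamp of the data point
  double value;             // measured value
};

// A fifo: data points ordered by timestamp.
using data_point_fifo = std::multimap<uint64_t, data_point_sketch>;

// The fifo container: one fifo per (host, service) pair.
using data_point_fifo_container =
    std::map<std::pair<std::string, std::string>, data_point_fifo>;
```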

### How engine accesses the otl object
In otel_interface.hh, the otel object interfaces are defined in the engine commands namespace.
Objects used by both the otel module and engine inherit from these interfaces.
Engine only knows a singleton of the open_telemetry_base interface. This singleton is initialized when the otl module is loaded.
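
A minimal sketch of this access pattern (only the interface name comes from otel_interface.hh; the instance management shown here is an assumption):

```cpp
#include <memory>

// Hypothetical sketch: engine-side code sees only this interface; the single
// shared instance is set when the otel module is loaded.
class open_telemetry_base_sketch {
 public:
  virtual ~open_telemetry_base_sketch() = default;

  static std::shared_ptr<open_telemetry_base_sketch>& instance() {
    // empty until module loading assigns the concrete implementation
    static std::shared_ptr<open_telemetry_base_sketch> _instance;
    return _instance;
  }
};
```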

### How to configure it
We use a fake connector. When the configuration is loaded, if a connector command line begins with "open_telemetry", we create an otel_command. The arguments following "open_telemetry" are used to create a host service extractor. If the otel module is loaded, we create the extractor; otherwise, the otel_command initialization is done when the otel module is loaded.
We use a fake connector. When the configuration is loaded, if a connector command line begins with "open_telemetry", we create an otel_connector. The arguments following "open_telemetry" are used to create a host service extractor. If the otel module is loaded, we create the extractor; otherwise, the otel_connector initialization is done when the otel module is loaded.
So the user has to create one connector per host serv extractor configuration.
Then commands can use these fake connectors (class otel_command) to run checks.
Then commands can use these fake connectors (class otel_connector) to run checks.
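
For illustration, such a configuration might look like the following sketch (the option after "open_telemetry" and the command_line placeholder are hypothetical; the real extractor options are defined by the module):

```
# hypothetical example: option names are illustrative only
define connector {
    connector_name  otel_connector_1
    connector_line  open_telemetry --extractor=attributes
}

define command {
    command_name    otel_check
    command_line    <check result builder configuration>
    connector       otel_connector_1
}
```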

### How a service does a check
When otel_command::run is called, it calls the check method of the open_telemetry singleton.
The check method of the open_telemetry object uses the command line passed to run to create an otl_converter object that converts metrics into a check result.
open_telemetry calls sync_build_result_from_metrics; if it can't build a result, the otl_converter is stored in a container.
When a metric for a waiting service is received, the otl_converter's async_build_result_from_metrics is called.
In the open_telemetry object, a second timer is also used to call the otl_converter's async_time_out when the timeout expires.
When otel_connector::run is called, it calls the check method of the open_telemetry singleton.
The check method of the open_telemetry object uses the command line passed to run to create an otl_check_result_builder object that converts metrics into a check result.
open_telemetry calls sync_build_result_from_metrics; if it can't build a result, the otl_check_result_builder is stored in a container.
When a metric for a waiting service is received, the otl_check_result_builder's async_build_result_from_metrics is called.
In the open_telemetry object, a second timer is also used to call the otl_check_result_builder's async_time_out when the timeout expires.
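
The sync-then-async pattern above can be pictured with this sketch (hypothetical types; only the three method names come from the text):

```cpp
#include <functional>
#include <memory>
#include <vector>

struct check_result {};  // stand-in for commands::result

// Hypothetical miniature of otl_check_result_builder.
class builder_sketch {
 public:
  // true when enough metrics are already buffered to build a result
  bool sync_build_result_from_metrics(check_result& res) {
    (void)res;
    return false;  // assume metrics are not there yet
  }
  void async_build_result_from_metrics() { /* build result, notify handler */ }
  void async_time_out() { /* notify handler with a timeout result */ }
};

// Try to build the result now; otherwise park the builder until a metric
// arrives (async_build_result_from_metrics) or the timer fires
// (async_time_out).
bool do_check(const std::shared_ptr<builder_sketch>& builder,
              std::vector<std::shared_ptr<builder_sketch>>& waiting,
              check_result& res) {
  if (builder->sync_build_result_from_metrics(res))
    return true;  // result available immediately
  waiting.push_back(builder);
  return false;
}
```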

### other configuration
Other configuration parameters are stored in a dedicated json file. The path of this file is passed as an argument in centengine.cfg.
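
A hypothetical example of such a json file (the keys shown are assumptions for illustration, not the module's documented schema):

```json
{
  "_comment": "illustrative only; not the module's documented schema",
  "otel_server": {
    "host": "0.0.0.0",
    "port": 4317,
    "encryption": false
  }
}
```
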
@@ -16,8 +16,8 @@
* For more information : [email protected]
*/

#ifndef CCE_COMMANDS_OTEL_COMMAND_HH
#define CCE_COMMANDS_OTEL_COMMAND_HH
#ifndef CCE_COMMANDS_OTEL_CONNECTOR_HH
#define CCE_COMMANDS_OTEL_CONNECTOR_HH

#include "com/centreon/engine/commands/command.hh"
#include "com/centreon/engine/commands/otel_interface.hh"
@@ -31,19 +31,19 @@ namespace com::centreon::engine::commands {
 * open telemetry request run command line configures the converter that
 * converts data_points to results
*/
class otel_command : public command,
public std::enable_shared_from_this<otel_command> {
class otel_connector : public command,
public std::enable_shared_from_this<otel_connector> {
otel::host_serv_list::pointer _host_serv_list;

public:
using otel_command_container =
absl::flat_hash_map<std::string, std::shared_ptr<otel_command>>;
using otel_connector_container =
absl::flat_hash_map<std::string, std::shared_ptr<otel_connector>>;

private:
static otel_command_container _commands;
static otel_connector_container _commands;

std::shared_ptr<otel::host_serv_extractor> _extractor;
std::shared_ptr<otel::converter_config> _conv_conf;
std::shared_ptr<otel::check_result_builder_config> _conv_conf;

std::shared_ptr<spdlog::logger> _logger;

@@ -59,18 +59,20 @@ class otel_command : public command,
static bool update(const std::string& connector_name,
const std::string& cmd_line);

static std::shared_ptr<otel_command> get_otel_command(
static std::shared_ptr<otel_connector> get_otel_connector(
const std::string& connector_name);

static void clear();

static void init_all();

static const otel_command_container& get_otel_commands() { return _commands; }
static const otel_connector_container& get_otel_connectors() {
return _commands;
}

otel_command(const std::string& connector_name,
const std::string& cmd_line,
commands::command_listener* listener);
otel_connector(const std::string& connector_name,
const std::string& cmd_line,
commands::command_listener* listener);

void update(const std::string& cmd_line);

45 changes: 22 additions & 23 deletions engine/inc/com/centreon/engine/commands/otel_interface.hh
@@ -25,7 +25,7 @@
namespace com::centreon::engine::commands::otel {

/**
* @brief struct returned by otl_converter::extract_host_serv_metric
* @brief struct returned by otl_check_result_builder::extract_host_serv_metric
* success if host not empty
* service may be empty if it's a host check
*
@@ -52,7 +52,7 @@ struct host_serv_metric {

/**
 * @brief the list of host/service pairs (service may be empty)
* This list is shared between otel_command and his extractor
* This list is shared between otel_connector and his extractor
*
*/
class host_serv_list {
@@ -64,20 +64,19 @@ class host_serv_list {

void register_host_serv(const std::string& host,
const std::string& service_description);
void unregister_host_serv(const std::string& host,
const std::string& service_description);
void remove(const std::string& host, const std::string& service_description);

bool is_allowed(const std::string& host,
const std::string& service_description) const;
bool contains(const std::string& host,
const std::string& service_description) const;

template <typename host_set, typename service_set>
host_serv_metric is_allowed(const host_set& hosts,
const service_set& services) const;
host_serv_metric match(const host_set& hosts,
const service_set& services) const;
};

template <typename host_set, typename service_set>
host_serv_metric host_serv_list::is_allowed(const host_set& hosts,
const service_set& services) const {
host_serv_metric host_serv_list::match(const host_set& hosts,
const service_set& services) const {
host_serv_metric ret;
absl::ReaderMutexLock l(&_data_m);
for (const auto& host : hosts) {
@@ -110,12 +109,11 @@ host_serv_metric host_serv_list::is_allowed(const host_set& hosts,
class host_serv_extractor {
public:
virtual ~host_serv_extractor() = default;

};

class converter_config {
class check_result_builder_config {
public:
virtual ~converter_config() = default;
virtual ~check_result_builder_config() = default;
};

using result_callback = std::function<void(const result&)>;
@@ -141,16 +139,17 @@ class open_telemetry_base
const std::string& cmdline,
const host_serv_list::pointer& host_serv_list) = 0;

virtual std::shared_ptr<converter_config> create_converter_config(
const std::string& cmd_line) = 0;

virtual bool check(const std::string& processed_cmd,
const std::shared_ptr<converter_config>& conv_conf,
uint64_t command_id,
nagios_macros& macros,
uint32_t timeout,
commands::result& res,
result_callback&& handler) = 0;
virtual std::shared_ptr<check_result_builder_config>
create_check_result_builder_config(const std::string& cmd_line) = 0;

virtual bool check(
const std::string& processed_cmd,
const std::shared_ptr<check_result_builder_config>& conv_conf,
uint64_t command_id,
nagios_macros& macros,
uint32_t timeout,
commands::result& res,
result_callback&& handler) = 0;
};

}; // namespace com::centreon::engine::commands::otel
4 changes: 2 additions & 2 deletions engine/modules/opentelemetry/CMakeLists.txt
@@ -51,12 +51,12 @@ ${SRC_DIR}/grpc_config.cc
${SRC_DIR}/host_serv_extractor.cc
${SRC_DIR}/open_telemetry.cc
${SRC_DIR}/otl_config.cc
${SRC_DIR}/otl_converter.cc
${SRC_DIR}/otl_check_result_builder.cc
${SRC_DIR}/otl_data_point.cc
${SRC_DIR}/otl_server.cc
${SRC_DIR}/main.cc
${SRC_DIR}/telegraf/conf_server.cc
${SRC_DIR}/telegraf/nagios_converter.cc
${SRC_DIR}/telegraf/nagios_check_result_builder.cc
${SRC_DIR}/opentelemetry/proto/collector/metrics/v1/metrics_service.grpc.pb.cc
)

32 changes: 16 additions & 16 deletions engine/modules/opentelemetry/doc/opentelemetry.md
@@ -4,8 +4,8 @@ Engine can receive open telemetry data on a grpc server
A new module, opentelemetry, is added.
It works as follows:
* metrics are received
* extractors try to extract a host name and service description from each data_point. On success, the data_point is pushed onto a fifo indexed by host and service
* a service that uses this data wants to do a check. The cmd line identifies the otl_converter that will construct a check result from the host/service data_point fifos. If the converter manages to build a result from the metrics, it returns immediately; if it doesn't, a handler is called as soon as the needed metrics are available or the timeout expires.
* extractors try to extract a host name and service description from each otl_data_point. On success, the otl_data_point is pushed onto a fifo indexed by host and service
* a service that uses this data wants to do a check. The cmd line identifies the otl_check_result_builder that will construct a check result from the host/service otl_data_point fifos. If the builder manages to build a result from the metrics, it returns immediately; if it doesn't, a handler is called as soon as the needed metrics are available or the timeout expires.

### open telemetry request
The proto is organized as follows:
@@ -113,31 +113,31 @@ The proto is organized as follows:

### Concepts and classes
* data_point: the smallest unit of a received request; the data_point class contains the data_point protobuf object and all its parents (resource, scope, metric)
* host serv extractors: when we receive otel metrics, we must extract host and service; this is their job. They can be configured, for example, to search for the host name in a data_point attribute or in the scope. Host serv extractors also hold the list of allowed host/service pairs. This list is updated by the register_host_serv command method
* data_point fifo: a container that holds data points indexed by timestamp
* data_point fifo container: fifos indexed by host and service
* otel_command: a fake connector used to link engine and the otel module
* otl_data_point: the smallest unit of a received request; the otl_data_point class contains the otl_data_point protobuf object and all its parents (resource, scope, metric)
* host serv extractors: when we receive otel metrics, we must extract host and service; this is their job. They can be configured, for example, to search for the host name in an otl_data_point attribute or in the scope. Host serv extractors also hold the list of allowed host/service pairs. This list is updated by the register_host_serv command method
* otl_data_point fifo: a container that holds data points indexed by timestamp
* otl_data_point fifo container: fifos indexed by host and service
* otel_connector: a fake connector used to link engine and the otel module
* otl_server: a grpc server that accepts incoming otel collector connections
* otl_converter: this short-lived object is created each time engine wants to do a check. Its concrete class and its configuration come from the command line of the check. Its job is to create a check result from the data_point fifo container data. It is destroyed once it has built a check result or when the timeout expires.
* host_serv_list: in order to extract host and service, a host_serv extractor must know the allowed host/service pairs. As otel_command may be notified of host/service pairs through the register_host_serv method while the otel module is not yet loaded, this object, shared between otel_command and host_serv_extractor, is updated from otel_command::register_host_serv.
* otl_check_result_builder: this short-lived object is created each time engine wants to do a check. Its concrete class and its configuration come from the command line of the check. Its job is to create a check result from the otl_data_point fifo container data. It is destroyed once it has built a check result or when the timeout expires.
* host_serv_list: in order to extract host and service, a host_serv extractor must know the allowed host/service pairs. As otel_connector may be notified of host/service pairs through the register_host_serv method while the otel module is not yet loaded, this object, shared between otel_connector and host_serv_extractor, is updated from otel_connector::register_host_serv (see the sketch after this list).
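
A small sketch of how this list gates data points (the class below is a stub; only the method names register_host_serv and contains come from otel_interface.hh above):

```cpp
#include <set>
#include <string>
#include <utility>

// Stub standing in for otel::host_serv_list (the real class is shared by
// pointer between otel_connector and the extractor, with its own locking).
class host_serv_list_stub {
  std::set<std::pair<std::string, std::string>> _allowed;

 public:
  void register_host_serv(const std::string& host, const std::string& serv) {
    _allowed.emplace(host, serv);
  }
  bool contains(const std::string& host, const std::string& serv) const {
    return _allowed.count({host, serv}) > 0;
  }
};

// An extractor keeps a data point only if the extracted pair was registered
// beforehand via otel_connector::register_host_serv.
bool keep_data_point(const host_serv_list_stub& allowed,
                     const std::string& host,
                     const std::string& serv) {
  return allowed.contains(host, serv);
}
```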

### How engine accesses the otl object
In otel_interface.hh, the otel object interfaces are defined in the engine commands namespace.
Objects used by both the otel module and engine inherit from these interfaces.
Engine only knows a singleton of the open_telemetry_base interface. This singleton is initialized when the otl module is loaded.

### How to configure it
We use a fake connector. When the configuration is loaded, if a connector command line begins with "open_telemetry", we create an otel_command. The arguments following "open_telemetry" are used to create a host service extractor. If the otel module is loaded, we create the extractor; otherwise, the otel_command initialization is done when the otel module is loaded.
We use a fake connector. When the configuration is loaded, if a connector command line begins with "open_telemetry", we create an otel_connector. The arguments following "open_telemetry" are used to create a host service extractor. If the otel module is loaded, we create the extractor; otherwise, the otel_connector initialization is done when the otel module is loaded.
So the user has to create one connector per host serv extractor configuration.
Then commands can use these fake connectors (class otel_command) to run checks.
Then commands can use these fake connectors (class otel_connector) to run checks.

### How a service does a check
When otel_command::run is called, it calls the check method of the open_telemetry singleton.
The check method of the open_telemetry object uses the command line passed to run to create an otl_converter object that converts metrics into a check result.
open_telemetry calls sync_build_result_from_metrics; if it can't build a result, the otl_converter is stored in a container.
When a metric for a waiting service is received, the otl_converter's async_build_result_from_metrics is called.
In the open_telemetry object, a second timer is also used to call the otl_converter's async_time_out when the timeout expires.
When otel_connector::run is called, it calls the check method of the open_telemetry singleton.
The check method of the open_telemetry object uses the command line passed to run to create an otl_check_result_builder object that converts metrics into a check result.
open_telemetry calls sync_build_result_from_metrics; if it can't build a result, the otl_check_result_builder is stored in a container.
When a metric for a waiting service is received, the otl_check_result_builder's async_build_result_from_metrics is called.
In the open_telemetry object, a second timer is also used to call the otl_check_result_builder's async_time_out when the timeout expires.
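
The timeout side might be pictured like this (a sketch under assumed types; the real timer plumbing in the module differs):

```cpp
#include <chrono>
#include <functional>
#include <map>

using time_point = std::chrono::system_clock::time_point;

// Waiting builders indexed by their absolute deadline; the timer only has
// to watch the earliest entry.
struct waiting_builder {
  std::function<void()> async_time_out;  // called when the deadline passes
};

// Fire every builder whose deadline has passed, oldest first.
void expire_timeouts(std::multimap<time_point, waiting_builder>& waiting,
                     time_point now) {
  for (auto it = waiting.begin(); it != waiting.end() && it->first <= now;) {
    it->second.async_time_out();
    it = waiting.erase(it);
  }
}
```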

### other configuration
Other configuration parameters are stored in a dedicated json file. The path of this file is passed as an argument in centengine.cfg.