[Question] Optimization suggestion: Implementation of refined JMX monitoring metrics #2908
Comments
Thanks, @doveLin0818. I think this is a good idea. Could you consider optimizing both the Kafka client and Kafka JMX monitoring? For example, please list which indicators you plan to monitor.
+1, I think it's a good idea 👍. I recommend describing your custom design here before implementing it.
@tomsun28 @zhangshenghang OK, I will try to balance practicality with HertzBeat's design principles.
Background: The generic JMX collection method has limited scalability, so a customized JMX collection solution is needed.
Modification process: To retain HertzBeat's design philosophy while supporting JMX customization, we first check, before collection, whether the metrics being collected are registered for customization. Take Kafka's objectName=kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec,topic=* as an example of a metric that needs customized monitoring. The process involves registering Kafka as an app in the CustomizedJmxFactory (the JMX monitoring customization factory) and then creating a KafkaJmxValidator that specifically handles Kafka's customization. The details are as follows:
In summary, these components determine whether the current monitoring needs to be customized. The logic is as follows: if the current monitoring scenario is not registered, the normal generic process is followed.
This modification addresses the scalability issue of the generic JMX protocol while preserving the design principle that lets users customize their own protocols. The new functionality is abstracted into a factory, which makes it more generic. If the community agrees with this approach, I will continue to improve it. @tomsun28 @zhangshenghang
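The registration check described above could be sketched roughly as follows. This is only a minimal illustration, not the code from the proposal: the class names CustomizedJmxFactory and KafkaJmxValidator come from the comment, while the JmxValidator interface and the register, find, and supports methods are hypothetical names chosen for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical hook: decides whether an objectName needs customized collection. */
interface JmxValidator {
    boolean supports(ObjectName pattern);
}

/** Hypothetical Kafka validator: claims per-topic BrokerTopicMetrics patterns only. */
class KafkaJmxValidator implements JmxValidator {
    @Override
    public boolean supports(ObjectName pattern) {
        return "kafka.server".equals(pattern.getDomain())
                && "BrokerTopicMetrics".equals(pattern.getKeyProperty("type"))
                // Guard first: isPropertyValuePattern throws if the key is absent.
                && pattern.getKeyProperty("topic") != null
                && pattern.isPropertyValuePattern("topic");
    }
}

/** Hypothetical registry keyed by app name; an empty result means "use the generic path". */
final class CustomizedJmxFactory {
    private static final Map<String, JmxValidator> REGISTRY = new ConcurrentHashMap<>();

    static {
        register("kafka", new KafkaJmxValidator());
    }

    private CustomizedJmxFactory() {}

    static void register(String app, JmxValidator validator) {
        REGISTRY.put(app, validator);
    }

    /** Returns the validator only if the app is registered and it claims this objectName. */
    static Optional<JmxValidator> find(String app, String objectName) {
        JmxValidator validator = REGISTRY.get(app);
        if (validator == null) {
            return Optional.empty();
        }
        try {
            return validator.supports(new ObjectName(objectName))
                    ? Optional.of(validator)
                    : Optional.empty();
        } catch (MalformedObjectNameException e) {
            // An unparsable name is left to the generic collector to report.
            return Optional.empty();
        }
    }
}
```

A collector would call CustomizedJmxFactory.find(app, objectName) before collecting and fall through to the existing generic JMX logic whenever the result is empty, so templates that are not registered behave exactly as before.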
@doveLin0818 Thanks. Does Kafka's JMX monitoring support all versions? The differences between versions also need to be considered.
@tomsun28 Hi Tom, this is the parameter passed from upstream; see the general JMX code:
That is an important point. In theory, JMX works with all Kafka versions, but I will take this into account, for example by using the generic JMX code as fallback logic.
Yes, please confirm this so that we avoid mismatched JMX keys. I have seen this problem on many services: the JMX key formats differ between versions.
Of course, your design is also general enough to cover other custom metrics that cannot be configured through the protocol. We recommend supporting custom configuration in the template first and hard-coding only after that.
Hi Tom, @tomsun28. When I tried to merge the ReplicaManager monitoring information through aliasFields today, I also ran into some incompatibility issues. When designing the template for the MBean kafka.server:type=ReplicaManager,name=*, I found that aliasFields could not merge the metrics, because the attribute of every one of these MBeans is named "Value". ReplicaManager actually has 12 indicators, and HertzBeat currently shows only two. If users want to see all 12, they need to write 12 redundant copies in the template. So to make full use of JMX's capabilities, I currently see no good way to do this through templates, and I prefer customized development: it does not stop users from customizing monitoring templates themselves, and it only increases the amount of code to write.
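One way a customized collector might merge such MBeans into a single row is to query the wildcard pattern once and key each value by the name property of its ObjectName. The sketch below uses only standard javax.management calls; the class JmxRowMerger and the method mergeByName are hypothetical names for illustration and are not part of HertzBeat.

```java
import java.util.Map;
import java.util.TreeMap;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

/** Hypothetical helper that folds many single-attribute MBeans into one row. */
final class JmxRowMerger {
    private JmxRowMerger() {}

    /**
     * Reads the same attribute from every MBean matching the pattern and
     * returns a map from the MBean's "name" key property to that attribute's value.
     */
    static Map<String, Object> mergeByName(MBeanServerConnection conn,
                                           String pattern,
                                           String attribute) throws Exception {
        Map<String, Object> row = new TreeMap<>();
        for (ObjectName mbean : conn.queryNames(new ObjectName(pattern), null)) {
            String field = mbean.getKeyProperty("name");
            if (field != null) {
                // The key property distinguishes the metrics even though
                // every MBean exposes the same attribute name.
                row.put(field, conn.getAttribute(mbean, attribute));
            }
        }
        return row;
    }
}
```

Against a Kafka broker the call would be mergeByName(conn, "kafka.server:type=ReplicaManager,name=*", "Value"), which yields one entry per matching MBean without repeating the template block for each indicator.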
Question
Current Situation Analysis (Using JMX Monitoring of Kafka as an Example):
Currently, many components' monitoring information is obtained through the JMX protocol, and the backend uses a generic method to retrieve monitoring metrics for all JMX-based monitoring logic. This leads to poor scalability of monitoring metrics and a suboptimal user experience. For example, consider the following scenarios:
First: Some metrics on the page can be merged for display. Otherwise, they may appear unfriendly and unattractive, as shown in the diagram below. Of course, users could implement this through a custom protocol, but having users define these metrics introduces a learning curve and negatively impacts the user experience. If handled by the backend, it would become easier and more visually appealing.
Second: The generic JMX protocol-based method for retrieving metrics does not fully leverage JMX's capabilities. Since the backend must consider universality, it inevitably struggles to accommodate custom needs. For instance, the following diagram shows aggregated information for all topics in a Kafka broker. However, in daily use or within organizations, there is often more focus on the consumption status of individual topics. Let’s assume I am a HertzBeat user. To view these metrics, I first need to learn the JMX protocol for monitoring Kafka's metrics and then modify the template, as shown in the diagram below.
However, HertzBeat users can only see the metrics for each topic, without knowing which specific topic the data corresponds to, as the JMX protocol does not provide a topicName metric. Displaying this requires backend implementation.
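On the backend, the topic can be recovered from the ObjectName itself, because per-topic MBeans carry it as the topic key property. A minimal sketch, assuming the names come back from a wildcard query such as topic=*; TopicLabels and topicOf are hypothetical names for illustration.

```java
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Hypothetical helper that derives a topic label from a per-topic MBean name. */
final class TopicLabels {
    private TopicLabels() {}

    /** Returns the topic key property, or null when the name has no topic key. */
    static String topicOf(String objectName) {
        try {
            String raw = new ObjectName(objectName).getKeyProperty("topic");
            if (raw == null) {
                return null;
            }
            // Values containing special characters are stored in quoted form.
            return raw.startsWith("\"") ? ObjectName.unquote(raw) : raw;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Invalid ObjectName: " + objectName, e);
        }
    }
}
```

The collector could then add the returned value as an extra column next to each per-topic metric row.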
Third: Companies and enterprises are also concerned about Kafka rebalance issues. This is difficult to achieve with the generic JMX monitoring method or user-customized monitoring templates.
Beyond these, many other metrics cannot be implemented through the generic protocol.
In short: the generic JMX method brings scalability issues, and custom development for components such as Kafka is needed. On the one hand, requiring users to learn JMX protocol metrics is a cost, which may cause HertzBeat to lose some of its competitive edge. On the other hand, the generic JMX monitoring method cannot fully harness JMX's capabilities.
If the community deems it necessary, I can attempt to customize JMX-based monitoring metrics for Kafka as a preliminary exploration. I will try to make Kafka's metrics more comprehensive and universal while maintaining HertzBeat's design principles.