Logstash API spec - first pass #16546

Draft: wants to merge 13 commits into base: main

docs/static/spec/openapi/starter.yaml
252 changes: 252 additions & 0 deletions
@@ -0,0 +1,252 @@

openapi: 3.1.0
info:
  title: Logstash APIs
  description: |
    When you run Logstash, it automatically captures runtime metrics that you can use to monitor the health and performance of your Logstash deployment.
    The metrics collected by Logstash include:
    - Logstash node info, like pipeline settings, OS info, and JVM info.
    - Plugin info, including a list of installed plugins.
    - Node stats, like JVM stats, process stats, event-related stats, and pipeline runtime stats.
    - Hot threads.
    - Health report.
    The APIs that retrieve these metrics are available by default without requiring any extra configuration.
    ## Documentation source and versions
    This documentation is derived from the `main` branch of the [logstash](https://github.com/elastic/logstash) repository.
    It is provided under license [Attribution-NonCommercial-NoDerivatives 4.0 International](https://creativecommons.org/licenses/by-nc-nd/4.0/).
  version: '1.0'
  x-doc-license:
    name: Attribution-NonCommercial-NoDerivatives 4.0 International
    url: https://creativecommons.org/licenses/by-nc-nd/4.0/
  x-feedbackLink:
    label: Feedback
    url: https://github.com/elastic/docs-content/issues/new?assignees=&labels=feedback%2Ccommunity&projects=&template=api-feedback.yaml&title=%5BFeedback%5D%3A+
servers:
  - url: /
security:
  - apiKeyAuth: []
paths:
  /_node/<types>:
    get:
      summary: Get node info
      description: >
        Get information about Logstash nodes, where `<types>` (optional) specifies the types of node info you want returned.
        You can limit the info that is returned by combining any of these types in a comma-separated list:
        - `pipelines`.
        - `os`.
        - `jvm`.

Contributor Author

What's the trick to getting items on their own lines? (I tried \n, but they rendered literally.)

Contributor Author

Node Info section has subsections for pipelines, os, and jvm, each with their own examples.
ToDo: Format nested content

Contributor

> What's the trick to getting items on their own lines? (I tried \n, but they rendered literally.)

I think this is now fixed (by using `|`).

      operationId: nodeInfo
      parameters:
        - name: pretty
          in: query
          schema:
            type: boolean
          description: >
            If you append `?pretty=true` to the request, the JSON returned will be pretty formatted. Use it for debugging only!
      responses:
        '200':
          description: Indicates a successful call
          content:
            application/json:
              examples:
                nodeInfoExample1:
                  pipelines:
                    - test:

Contributor Author

Trick to keep null from appearing for entries without values.

Contributor

This looks like an indentation issue.

Contributor

I've created 81b6294 and it matches the original docs now
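
For reference, a minimal sketch of how indentation alone produces the `null` value discussed above. The `pipelines_flat` and `pipelines_nested` keys are hypothetical names used only to contrast the two layouts; they are not part of the spec.

```yaml
# Keys in the same column as `test` are its siblings,
# so `test` itself has no value and parses as null.
pipelines_flat:
  - test:
    workers: 1
    batch_size: 1

# Keys indented deeper than `test` become its children,
# so `test` parses as a mapping with two entries.
pipelines_nested:
  - test:
      workers: 1
      batch_size: 1
```

In the first layout, each list item is one mapping with three keys (`test`, `workers`, `batch_size`). In the second, each list item has a single key, `test`, that holds the settings.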

                        workers: 1
                        batch_size: 1
                        batch_delay: 5
                        config_reload_automatic: false
                        config_reload_interval: 3
                    - test2:
                        workers: 8
                        batch_size: 125
                        batch_delay: 5
                        config_reload_automatic: false
                        config_reload_interval: 3

  /_node/plugins:
    get:
      summary: Get plugin info
      description: >
        Get information about all Logstash plugins that are currently installed.
        This API returns the same output you get by running the `bin/logstash-plugin list --verbose` command.
      operationId: nodePlugins
      parameters:
        - name: pretty
          in: query
          schema:
            type: boolean
          description: >
            If you append `?pretty=true` to the request, the JSON returned will be pretty formatted. Use it for debugging only!
      responses:
        '200':
          description: Indicates a successful call
          content:
            application/json:
              examples:
                nodePluginsExample1:
                  total: 1
                  plugins:
                    - name: logstash-codec-cef
                      version: 6.2.8
                    - name: logstash-codec-collectd
                      version: 3.0.3
                    - name: logstash-codec-dots
                      version: 3.0.2
                    - name: logstash-codec-edn
                      version: 3.0.2

  /_node/stats:
    get:
      summary: Get node stats
      description: >
        Get runtime stats for Logstash, where `<types>` (optional) specifies the types of stats you want to return.
        You can limit the info that is returned by combining any of these types in a comma-separated list:
        - `jvm` gets JVM stats, including stats about threads, memory usage, garbage collectors, and uptime.
        - `process` gets process stats, including stats about file descriptors, memory consumption, and CPU usage.
        - `events` gets event-related statistics for the Logstash instance (regardless of how many pipelines were created and destroyed).
        - `flow` gets flow-related statistics for the Logstash instance (regardless of how many pipelines were created and destroyed).
        - `pipelines` gets runtime stats about each Logstash pipeline.
        - `reloads` gets runtime stats about config reload successes and failures.
        - `os` gets runtime stats about cgroups when Logstash is running in a container.
        - `geoip_download_manager` gets stats for databases used with the Geoip filter plugin.
Comment on lines +153 to +160
Contributor Author

Formatting for bulleted list. Table format?

Contributor
@lcawl Jan 7, 2025


I haven't tried a table yet, but AFAIK a simple Markdown table should work. If the bulleted list is good enough for now, a table could be tested in a future update.
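
A rough sketch of what the table variant might look like, assuming the renderer accepts a plain Markdown table inside a literal (`|`) description. This has not been tested against the published docs, and only three of the types are shown.

```yaml
description: |
  Get runtime stats for Logstash. You can limit the info that is returned by combining any of these types in a comma-separated list:

  | Type | Returns |
  |------|---------|
  | `jvm` | JVM stats, including stats about threads, memory usage, garbage collectors, and uptime. |
  | `process` | Process stats, including stats about file descriptors, memory consumption, and CPU usage. |
  | `reloads` | Runtime stats about config reload successes and failures. |
```

The literal scalar matters here: a Markdown table needs each row on its own line, which a folded (`>`) scalar would not preserve.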

      operationId: nodeStats
      parameters:
        - name: pretty
          in: query
          schema:
            type: boolean
          description: >
            If you append `?pretty=true` to the request, the JSON returned will be pretty formatted (use it for debugging only!).
      responses:
        '200':
          description: Indicates a successful call
          content:
            application/json:
              examples:
                nodeStatsExample1:
                  jvm:
                    threads:
                      count: 49
                      peak_count: 50
                    mem:
                      heap_used_percent: 14
                      heap_committed_in_bytes: 309866496
                      heap_max_in_bytes: 1037959168
                      heap_used_in_bytes: 151686096
                      non_heap_used_in_bytes: 122486176
                      non_heap_committed_in_bytes: 133222400
                      pools:
                        survivor:
                          peak_used_in_bytes: 8912896
                          used_in_bytes: 288776
                          peak_max_in_bytes: 35782656
                          max_in_bytes: 35782656
                          committed_in_bytes: 8912896

Contributor Author

Example is only partially reproduced.

Contributor

I copied the rest of the info into the yaml file in 558a2b6


  /_node/hot_threads:
    get:
      summary: Get hot threads
      description: >
        Get information about current hot threads for Logstash.
        A hot thread is a Java thread that has high CPU usage and takes longer than normal to execute.
      operationId: nodeHot_threads
      parameters:
        - name: threads
          in: query
          schema:
            type: integer
          description: >
            The number of hot threads to return. The default is 10.
        - name: stacktrace_size
          in: query
          schema:
            type: integer
          description: >
            The depth of the stack trace to report for each thread. The default is 50.
        - name: ignore_idle_threads
          in: query
          schema:
            type: boolean
          description: >
            If true, does not return idle threads. The default is `true`.
        - name: pretty
          in: query
          schema:
            type: boolean
          description: >
            If you append `?pretty=true` to the request, the JSON returned will be pretty formatted. Use it for debugging only!
        - name: human
          in: query
          schema:
            type: boolean
          description: >
            If you append `?human=true` to the request, the JSON returned will be in a human-readable format.
      responses:
        '200':
          description: Indicates a successful call
          content:
            application/json:
              examples:
                nodeHotThreadsExample1:
                  hot_threads:
                    - time: 2025-01-06T18:25:28-07:00
                      busiest_threads: 3
                      threads:
                        name: Ruby-0-Thread-7
                        percent_of_cpu_time: 0.0
                        state: timed_waiting
                        path: /path/to/logstash-8.17.0/vendor/bundle/jruby/1.9/gems/puma-2.16.0-java/lib/puma/thread_pool.rb:187
                        traces:
                          - "java.lang.Object.wait(Native Method)"
                          - "org.jruby.RubyThread.sleep(RubyThread.java:1002)"
                          - "org.jruby.RubyKernel.sleep(RubyKernel.java:803)"

Contributor Author

What's the ordering principle for entries?
Note that busiest_threads is rendering last but it's not entered there.


Contributor

Interesting, I've never noticed that before. I tried adding the schema definitions to see whether their absence was affecting the order (64db183), but it's still the same. I don't think the order matters, so it's not a big issue. If we want to know for future reference, we'd need to ask the folks at Bump.sh.


  /_health_report:

Contributor Author

Health_report appears last in the spec, but is rendering first. (Not sure it matters, but I'd like to know what's going on.)

Contributor

This relates to https://docs.bump.sh/help/customization-options/operations-navigation/ and will be affected by the tags we added.
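
A minimal sketch of how tags could drive the grouping, assuming the navigation follows the order of the top-level `tags` list as the linked page describes. The tag names `node info` and `health` are hypothetical; the tags actually added in this PR are not shown in this excerpt, and the fragment omits `responses` for brevity.

```yaml
# Top-level tag list: assumed to set the order of groups in the rendered docs.
tags:
  - name: node info
  - name: health
paths:
  /_node/plugins:
    get:
      tags:
        - node info   # groups this operation under "node info"
      summary: Get plugin info
  /_health_report:
    get:
      tags:
        - health      # groups this operation under "health"
      summary: Get health status
```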

    get:
      summary: Get health status
      description: >
        The health API returns a report with the health status of Logstash and the pipelines that are running inside of it.
        The report contains a list of indicators that compose Logstash functionality.
        Each indicator has a health status of: green, unknown, yellow, or red.
        The indicator provides an explanation and metadata describing the reason for its current health status.
        The top-level status is controlled by the worst indicator status.
        If an indicator status is non-green, the indicator result may include a list of impacts that detail the functionality negatively affected by the health issue.
        Each impact carries a severity level, the area of the system that is affected, and a simple description of the impact on the system.
        Some health indicators can determine the root cause of a health problem and prescribe a set of steps to improve the health of the system.
        The root cause and remediation steps are encapsulated in a diagnosis.
        A diagnosis contains a cause detailing a root cause analysis, an action containing a brief description of the steps to take to fix the problem, and the URL for detailed troubleshooting help.
        NOTE: The health indicators perform root cause analysis of non-green health statuses.
        This can be computationally expensive when called frequently.
Comment on lines +312 to +328
Contributor Author

This content needs line breaks. I tried \n, but they render literally.

Member

I believe this is just pure YAML. The folded scalar (`>`) on line 212 (`description: >`) folds line breaks into spaces. If we use a literal scalar (`|`) instead, it should preserve line breaks exactly as written (`description: |`). https://github.com/elastic/logstash/pull/16546/files#r1899659257
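
A minimal sketch of the difference between the two block scalar styles. The `folded` and `literal` keys are hypothetical names used only for illustration. A typed `\n` shows up literally because block scalars do not process escape sequences; those are only interpreted inside double-quoted scalars.

```yaml
# Folded scalar (>): line breaks between equally indented lines become spaces.
folded: >
  Limit the info returned by combining these types:
  - `pipelines`
  - `os`
# Parsed value:
# "Limit the info returned by combining these types: - `pipelines` - `os`\n"

# Literal scalar (|): line breaks are kept exactly as written.
literal: |
  Limit the info returned by combining these types:
  - `pipelines`
  - `os`
# Parsed value:
# "Limit the info returned by combining these types:\n- `pipelines`\n- `os`\n"
```

Only the literal form hands the Markdown renderer one list item per line.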

      operationId: healthStatus
      parameters:
        - name: pretty
          in: query
          schema:
            type: boolean
          description: >
            If you append `?pretty=true` to the request, the JSON returned will be pretty formatted. Use it for debugging only!
      responses:
        '200':
          description: Indicates a successful call
          content:
            application/json:
              examples:
                healthStatusExample1:
                  status:
                    - green: Logstash is healthy.
                      unknown: Logstash health could not be determined.
                      yellow: The functionality of Logstash is in a degraded state and may need remediation to avoid the health becoming red.
                      red: Logstash is experiencing an outage or certain features are unavailable for use.
                  indicators: Information about the health of Logstash indicators.

Contributor Author

The Health report API docs show hierarchical descriptions rather than examples. What's the best way to represent this type of content? Also, research nested format.
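
One possible approach, sketched under assumptions: move the field descriptions into a response `schema`, where each property carries its own `description`, and keep `examples` for concrete payloads. The property types below (`status` as a string enum, `indicators` as an object) are assumptions for illustration, not taken from this PR.

```yaml
responses:
  '200':
    description: Indicates a successful call
    content:
      application/json:
        schema:
          type: object
          properties:
            status:
              type: string
              # The four status values listed in the description above.
              enum: [green, unknown, yellow, red]
              description: >
                Health status of Logstash. `green` means Logstash is healthy;
                `red` means an outage or unavailable features.
            indicators:
              type: object
              description: Information about the health of Logstash indicators.
```

Nested fields can be described the same way by adding `properties` under `indicators`.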

