[META] Add PPL eval CASE functionality support #3008

Open
YANG-DB opened this issue Sep 10, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request PPL Piped processing language

Comments

@YANG-DB
Member

YANG-DB commented Sep 10, 2024

Is your feature request related to a problem?
The CASE function syntax is currently available only in SQL queries. It is important for real Observability use cases and, more generally, wherever eval with CASE adds useful functionality to a query.

Examples:

In the following example, the case function evaluates the HTTP codes stored in the error field:

 ... | eval error_msg = case(error == 404, "Not found", error == 500, "Internal Server Error", error == 200, "OK")

In the following example, the case function maps the value stored in the wd field to a numeric sort_field:

 ... | eval sort_field=case(wd=="SUPPORT",1, wd=="APPLICATION",2, wd=="STORAGE",3)
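To make the intended semantics of the two examples above concrete, here is a minimal Python sketch; it is not PPL engine code. The helper `case_eval` is hypothetical, and returning `None` when no condition matches is an assumption, since the issue does not say what an unmatched row should produce.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

Row = dict
Branch = Tuple[Callable[[Row], bool], Any]


def case_eval(row: Row, branches: Sequence[Branch]) -> Optional[Any]:
    """Return the value of the first branch whose condition holds for the row."""
    for condition, value in branches:
        if condition(row):
            return value
    # Assumption: a row that matches no branch yields None (null).
    return None


# Mirrors: case(error == 404, "Not found", error == 500, ..., error == 200, "OK")
error_branches = [
    (lambda r: r["error"] == 404, "Not found"),
    (lambda r: r["error"] == 500, "Internal Server Error"),
    (lambda r: r["error"] == 200, "OK"),
]

rows = [{"error": 404}, {"error": 200}, {"error": 302}]
error_msgs = [case_eval(r, error_branches) for r in rows]
```

Under these assumptions, the row with `error` 302 matches no branch and gets a null `error_msg`, which is the kind of gap a default value would close.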

Existing SQL syntax support has the following definition:

specificFunction
   : CASE expression caseFuncAlternative+ (ELSE elseArg = functionArg)? END     # caseFunctionCall
   | CASE caseFuncAlternative+ (ELSE elseArg = functionArg)? END                # caseFunctionCall
   | CAST '(' expression AS convertedDataType ')'                               # dataTypeFunctionCall
   ;

caseFuncAlternative
   : WHEN condition = functionArg THEN consequent = functionArg
   ;

We need the same support for PPL as shown in the above example.

Here is an example based on the existing SQL case syntax:

source = my_index
| eval status_category = CASE
    WHEN status_code >= 200 AND status_code < 300 THEN 'Success'
    WHEN status_code >= 300 AND status_code < 400 THEN 'Redirection'
    WHEN status_code >= 400 AND status_code < 500 THEN 'Client Error'
    WHEN status_code >= 500 THEN 'Server Error'
    ELSE 'Unknown'
  END
| stats count() by status_category

What solution would you like?

We need support for PPL eval CASE functionality for both:

- OpenSearch based PPL engine

- Spark based PPL engine

Do you have any additional context?

YANG-DB added the enhancement (New feature or request), untriaged, and PPL (Piped processing language) labels Sep 10, 2024
YANG-DB self-assigned this Sep 10, 2024
YANG-DB changed the title from "[META]Add PPL eval CASE command support" to "[META]Add PPL eval CASE functionality support" Sep 10, 2024
@YANG-DB
Member Author

YANG-DB commented Sep 11, 2024

@LantaoJin I'd be happy to get your feedback here.

@LantaoJin
Member

The CASE function is very useful in both SQL and PPL, and I think we need it in PPL.
Simplicity is one of PPL's important design principles, so copying the SQL syntax wholesale into PPL would be redundant. Instead of adding the WHEN, THEN, ELSE, and END keywords, I would prefer to replace the branches with a list of predicate-value pairs. My proposal for the CASE function is:

CASE(<predicate1>, <value1>, [<predicate2>, <value2>, ...] [ELSE] <default>)

ELSE could be optional; omitting it saves an additional keyword in PPL and makes CASE read more like a function than a clause.
The example in the description could be changed to:

source = my_index
| eval status_category =
    case(status_code >= 200 AND status_code < 300, 'Success',
         status_code >= 300 AND status_code < 400, 'Redirection',
         status_code >= 400 AND status_code < 500, 'Client Error',
         status_code >= 500, 'Server Error',
         'Unknown')
| stats count() by status_category
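One way to read the proposed signature without an ELSE keyword is to infer the default from the argument count. The Python sketch below illustrates that idea only; it is not PPL engine code. The helpers `split_case_args`, `case`, and `status_category` are hypothetical, and treating an odd argument count as "the last argument is the default" is an assumption drawn from the example above.

```python
from typing import Any, Callable, List, Optional, Tuple

Predicate = Callable[[dict], bool]


def split_case_args(args: List[Any]) -> Tuple[List[Tuple[Predicate, Any]], Optional[Any]]:
    """Split a flat argument list into (predicate, value) pairs and an optional default."""
    # Assumption: an odd number of arguments means the last one is the default.
    has_default = len(args) % 2 == 1
    default = args[-1] if has_default else None
    paired = args[:-1] if has_default else args
    if not paired:
        raise ValueError("case() needs at least one predicate-value pair")
    return list(zip(paired[0::2], paired[1::2])), default


def case(row: dict, *args: Any) -> Any:
    """Return the value of the first matching predicate, else the default."""
    pairs, default = split_case_args(list(args))
    for predicate, value in pairs:
        if predicate(row):
            return value
    return default


def status_category(row: dict) -> str:
    # Mirrors the status_code example, with a trailing default and no ELSE.
    return case(
        row,
        lambda r: 200 <= r["status_code"] < 300, "Success",
        lambda r: 300 <= r["status_code"] < 400, "Redirection",
        lambda r: 400 <= r["status_code"] < 500, "Client Error",
        lambda r: r["status_code"] >= 500, "Server Error",
        "Unknown",
    )
```

A consequence of this reading is that a default can only be told apart from a predicate by position, so a parser would likely need to reject argument lists where a non-boolean expression appears in a predicate slot.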

@YANG-DB
Member Author

YANG-DB commented Sep 11, 2024

@LantaoJin thanks for your comments. I agree this is more pipeline-oriented and consistent with other PPL commands.
@penghuo @dai-chen let me know what you think.
