Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of case function. #695

Merged
merged 2 commits into from
Sep 27, 2024
Merged

Conversation

lukasz-soszynski-eliatra
Copy link
Contributor

Description

Implementation of the case function

Issues Resolved

#645

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@@ -504,7 +506,63 @@ class FlintSparkPPLEvalITSuite
comparePlans(logicalPlan, expectedPlan, checkAnalysis = false)
}

test("eval case function") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also add more complex tests use case including composition of additional commands such as where conditions and stats ?
thanks

@@ -348,4 +348,38 @@ class FlintSparkPPLFiltersITSuite
assert(compareByString(expectedPlan) === compareByString(logicalPlan))
}

test("case function used as filter") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u also add here some additional tests use cases of more complicated composition ppl commands ?

@@ -265,6 +266,15 @@ Assumptions: `a`, `b`, `c` are existing fields in `table`
- `source = table | eval f = ispresent(a)`
- `source = table | eval r = coalesce(a, b, c) | fields r`
- `source = table | eval e = isempty(a) | fields e`
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plz add more examples for using case in other use cases that include where clause / stats commands

@@ -171,6 +171,8 @@ public T visitIsEmpty(IsEmpty node, C context) {
return visitChildren(node, context);
}

// TODO add case
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is it TODO ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. I need to remove this TODO.

Signed-off-by: Lukasz Soszynski <[email protected]>
@YANG-DB
Copy link
Member

YANG-DB commented Sep 26, 2024

@lukasz-soszynski-eliatra please ping me when this is ready for review
thanks

@lukasz-soszynski-eliatra
Copy link
Contributor Author

I'll do my best to make corrections today.

@lukasz-soszynski-eliatra
Copy link
Contributor Author

@YANG-DB
Please take a look at the commit. I added more tests and documentation.

@YANG-DB YANG-DB merged commit c0924cc into opensearch-project:main Sep 27, 2024
4 checks passed
@LantaoJin
Copy link
Member

LantaoJin commented Oct 28, 2024

@YANG-DB does this syntax match what we expected? From the issue description, the valid synatx is:

CASE(<predicate1>, <value1>, [<predicate2>, <value2>, ...] [[ELSE] <default>])

An valid example:

source = my_index
| eval status_category =
    case(status_code >= 200 AND status_code < 300, 'Success',
         status_code >= 300 AND status_code < 400, 'Redirection'
         status_code >= 400 AND status_code < 500, 'Client Error'
         status_code >= 500, 'Server Error',
         'Unknown')
| stats count() by status_category

But the implementation we merged is totally different as:

CASE(<predicate1>, <value1>[, <predicate2>, <value2>, ...] [ELSE <default>])

The example above needs rewritten to

source = my_index
| eval status_category =
    case(status_code >= 200 AND status_code < 300, 'Success',
         status_code >= 300 AND status_code < 400, 'Redirection'
         status_code >= 400 AND status_code < 500, 'Client Error'
         status_code >= 500, 'Server Error'
         ELSE 'Unknown')
| stats count() by status_category

Note, the , after 'Server Error' is removed and ELSE keyword is required for this case.
Not sure which would be better, please double confirm the current syntax is by designed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.6 Lang:PPL Pipe Processing Language support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants