local spark ppl testing documentation #902
Signed-off-by: YANGDB <[email protected]>
@YANG-DB What's the motivation for adding this doc? I think we already have a guide to local Spark PPL usage in the root README. Also, the PPL command testing somewhat duplicates the ppl-commands doc, which should be the single source of truth for each command.
## emails table
```sql
CREATE TABLE emails (name STRING, age INT, email STRING, street_address STRING, year INT, month INT) PARTITIONED BY (year, month);
INSERT INTO testTable (name, age, email, street_address, year, month) VALUES ('Alice', 30, '[email protected]', '123 Main St, Seattle', 2023, 4), ('Bob', 55, '[email protected]', '456 Elm St, Portland', 2023, 5), ('Charlie', 65, '[email protected]', '789 Pine St, San Francisco', 2023, 4), ('David', 19, '[email protected]', '101 Maple St, New York', 2023, 5), ('Eve', 21, '[email protected]', '202 Oak St, Boston', 2023, 4), ('Frank', 76, '[email protected]', '303 Cedar St, Austin', 2023, 5), ('Grace', 41, '[email protected]', '404 Birch St, Chicago', 2023, 4), ('Hank', 32, '[email protected]', '505 Spruce St, Miami', 2023, 5), ('Ivy', 9, '[email protected]', '606 Fir St, Denver', 2023, 4), ('Jack', 12, '[email protected]', '707 Ash St, Seattle', 2023, 5);
```
`testTable` should be `emails`.
# Testing PPL using local Spark

## Produce the PPL artifact
The first step would be to produce the spark-ppl artifact: `sbt clean sparkPPLCosmetic/publishM2`
This action is dangerous when a user has write credentials and remote repo settings in their environment. How about changing it to `sbt clean sparkPPLCosmetic/assembly`?
It generates the spark-ppl artifact and prints its path at the end:
[info] Built: ./opensearch-spark/sparkPPLCosmetic/target/scala-2.12/opensearch-spark-ppl-assembly-x.y.z-SNAPSHOT.jar
[info] Jar hash: 71dd9c
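Since the assembly build prints the jar location, the path can be picked up by a script instead of being typed by hand. The following is only a minimal sketch: `find_ppl_jar` is a hypothetical helper name, not part of the project, and it assumes the jar keeps the `opensearch-spark-ppl-assembly-*.jar` naming shown in the build output above.

```shell
# Hypothetical helper (not part of opensearch-spark): print the newest
# opensearch-spark-ppl assembly jar found directly under the given directory.
# Assumes the jar naming shown in the sbt output above.
find_ppl_jar() {
  build_dir="$1"
  newest=""
  for jar in "$build_dir"/opensearch-spark-ppl-assembly-*.jar; do
    # Skip the unexpanded glob pattern when nothing matches.
    [ -e "$jar" ] || continue
    if [ -z "$newest" ] || [ "$jar" -nt "$newest" ]; then
      newest="$jar"
    fi
  done
  if [ -n "$newest" ]; then
    printf '%s\n' "$newest"
  else
    return 1
  fi
}
```

From the repository root, something like `bin/spark-sql --jars "$(find_ppl_jar sparkPPLCosmetic/target/scala-2.12)"` could then pass the jar to Spark, together with whatever other options the documented command uses.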
@LantaoJin I've updated this. Please review and see if anything else is missing.
Thanks
Hi @LantaoJin
Hi @qianheng-aws - thanks for the feedback
… queries Signed-off-by: YANGDB <[email protected]>
## Start Spark with the plugin
Once installed, run spark with the generated PPL artifact:
```shell
bin/spark-sql --jars "/PATH_TO_ARTIFACT/oopensearch-spark-ppl-assembly-x.y.z-SNAPSHOT.jar" \
```
Typo here: double `o` in `oopensearch`.
* add local spark ppl testing documentation and details
  Signed-off-by: YANGDB <[email protected]>
* update more sample test tables and commands
  Signed-off-by: YANGDB <[email protected]>
* update more sample test tables and commands
  Signed-off-by: YANGDB <[email protected]>
* update more sample test tables and commands
  Signed-off-by: YANGDB <[email protected]>
* update for using opensearch-spark-ppl-assembly-x.y.z-SNAPSHOT.jar
  Signed-off-by: YANGDB <[email protected]>
* update tutorial documentation on using a local spark-cluster with ppl queries
  Signed-off-by: YANGDB <[email protected]>
* typo fix
  Signed-off-by: YANGDB <[email protected]>
---------
Signed-off-by: YANGDB <[email protected]>
Description
add local spark ppl testing documentation and details
Related Issues
#896
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.