[FEATURE]Create spark ppl local test documentation instructions #896

YANG-DB · 2024-11-12T21:27:44Z

Is your feature request related to a problem?
Update the OpenSearch documentation to cover running a local Spark cluster with the OpenSearch Flint jar and testing the new PPL commands.

What solution would you like?
We need documentation for setting up a local Spark cluster: detailed instructions for configuring the cluster, including the Spark Flint jars, and possibly a docker-compose file defining these services.

Do you have any additional context?

see comment

# Produce the artifact
sbt clean sparkPPLCosmetic/publishM2

# Start Spark with the plugin
bin/spark-sql --jars "/ABSOLUTE_PATH_TO_ARTIFACT/opensearch-spark-ppl_2.12-0.6.0-SNAPSHOT.jar" \
--conf "spark.sql.extensions=org.opensearch.flint.spark.FlintPPLSparkExtensions"  \
--conf "spark.sql.catalog.dev=org.apache.spark.opensearch.catalog.OpenSearchCatalog" \
--conf "spark.hadoop.hive.cli.print.header=true"

# Insert test table and data
CREATE TABLE employees (name STRING, dept STRING, salary INT, age INT, con STRING);

INSERT INTO employees VALUES ("Lisa", "Sales------", 10000, 35, 'test');
INSERT INTO employees VALUES ("Evan", "Sales------", 32000, 38, 'test');
INSERT INTO employees VALUES ("Fred", "Engineering", 21000, 28, 'test');
INSERT INTO employees VALUES ("Alex", "Sales------", 30000, 33, 'test');
INSERT INTO employees VALUES ("Tom", "Engineering", 23000, 33, 'test');
INSERT INTO employees VALUES ("Jane", "Marketing", 29000, 28, 'test');
INSERT INTO employees VALUES ("Jeff", "Marketing", 35000, 38, 'test');
INSERT INTO employees VALUES ("Paul", "Engineering", 29000, 23, 'test');
INSERT INTO employees VALUES ("Chloe", "Engineering", 23000, 25, 'test');

# Execute WMA with basic option:

source=employees | trendline sort age wma(2, salary);

name	dept	salary	age	con	salary_trendline
Paul	Engineering	29000	23	test	NULL
Chloe	Engineering	23000	25	test	25000.0
Jane	Marketing	29000	28	test	27000.0
Fred	Engineering	21000	28	test	23666.666666666668
Alex	Sales------	30000	33	test	27000.0
Tom	Engineering	23000	33	test	25333.333333333332
Lisa	Sales------	10000	35	test	14333.333333333334
Jeff	Marketing	35000	38	test	26666.666666666668
Evan	Sales------	32000	38	test	33000.0


# Execute WMA with alias:

source=employees | trendline sort age wma(2, salary) as CUSTOM_NAME;

name	dept	salary	age	con	CUSTOM_NAME
Paul	Engineering	29000	23	test	NULL
Chloe	Engineering	23000	25	test	25000.0
Jane	Marketing	29000	28	test	27000.0
Fred	Engineering	21000	28	test	23666.666666666668
Alex	Sales------	30000	33	test	27000.0
Tom	Engineering	23000	33	test	25333.333333333332
Lisa	Sales------	10000	35	test	14333.333333333334
Jeff	Marketing	35000	38	test	26666.666666666668
Evan	Sales------	32000	38	test	33000.0


# Execute WMA with multiple calculations:

source=employees | trendline sort age wma(2, salary) as WMA_2 wma(3, salary) as WMA_3;


name	dept	salary	age	con	WMA_2	WMA_3
Paul	Engineering	29000	23	test	NULL	NULL
Chloe	Engineering	23000	25	test	25000.0	NULL
Jane	Marketing	29000	28	test	27000.0	27000.0
Fred	Engineering	21000	28	test	23666.666666666668	24000.0
Alex	Sales------	30000	33	test	27000.0	26833.333333333332
Tom	Engineering	23000	33	test	25333.333333333332	25000.0
Lisa	Sales------	10000	35	test	14333.333333333334	17666.666666666668
Jeff	Marketing	35000	38	test	26666.666666666668	24666.666666666668
Evan	Sales------	32000	38	test	33000.0	29333.333333333332
Time taken: 0.466 seconds, Fetched 9 row(s)
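
The values in the three result tables are consistent with a linearly weighted moving average: within each window of n rows the newest row gets weight n, the oldest gets weight 1, and the weighted sum is divided by n(n+1)/2. The Python sketch below cross-checks the tables under that assumption. It is not taken from the plugin's implementation, and the helper name `wma` is hypothetical.

```python
from typing import List, Optional


def wma(values: List[float], n: int) -> List[Optional[float]]:
    """Weighted moving average over the last n values, weights 1..n (newest heaviest)."""
    denominator = n * (n + 1) / 2
    result: List[Optional[float]] = []
    for i in range(len(values)):
        if i < n - 1:
            # Not enough preceding rows; the tables above show NULL here.
            result.append(None)
            continue
        window = values[i - n + 1 : i + 1]
        weighted_sum = sum(weight * value for weight, value in enumerate(window, start=1))
        result.append(weighted_sum / denominator)
    return result


# Salaries in the row order of the result tables above (sorted by age).
salaries = [29000, 23000, 29000, 21000, 30000, 23000, 10000, 35000, 32000]

wma_2 = wma(salaries, 2)
wma_3 = wma(salaries, 3)
print(wma_2[:3])  # [None, 25000.0, 27000.0]
print(wma_3[:4])  # [None, None, 27000.0, 24000.0]
```

Comparing `wma_2` and `wma_3` against the WMA_2 and WMA_3 columns row by row gives a quick check that a local build returns the expected numbers.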


YANG-DB · 2024-11-13T04:52:27Z

In addition, we need to produce an HTML report similar to this one for sanity tests.
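
As a rough illustration of what such a report generator could look like, the Python sketch below renders a list of query outcomes as a small HTML table. The function name `render_report`, the page layout, and the pass/fail values are hypothetical placeholders; the format of the referenced report is not reproduced here.

```python
from html import escape
from typing import List, Tuple


def render_report(results: List[Tuple[str, bool]]) -> str:
    """Render (query, passed) pairs as a minimal HTML table."""
    rows = "\n".join(
        f"<tr><td><code>{escape(query)}</code></td><td>{'PASS' if passed else 'FAIL'}</td></tr>"
        for query, passed in results
    )
    return (
        "<html><body>\n<h1>PPL sanity tests</h1>\n"
        "<table>\n<tr><th>Query</th><th>Result</th></tr>\n"
        f"{rows}\n</table>\n</body></html>"
    )


# Placeholder outcomes for illustration only; a real run would fill these in
# by comparing each query's output with its expected rows.
results = [
    ("source=employees | trendline sort age wma(2, salary)", True),
    ("source=employees | trendline sort age wma(2, salary) as CUSTOM_NAME", True),
]
report = render_report(results)
```

Writing `report` to a file would give a page that can be attached to a test run or opened in a browser.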

YANG-DB added the enhancement (New feature or request), untriaged, and Lang:PPL (Pipe Processing Language support) labels Nov 12, 2024

anasalkouz removed the untriaged label Nov 12, 2024

YANG-DB mentioned this issue Nov 13, 2024

local spark ppl testing documentation #902

Merged

YANG-DB closed this as completed Dec 3, 2024