Commit

Updates after review
Joseph Newman authored and Joseph Newman committed Nov 3, 2023
1 parent ca3e1ac commit 0ee9472
Showing 1 changed file with 28 additions and 20 deletions.
modules/data-loading/pages/spark-connection-via-jdbc-driver.adoc
@@ -27,31 +27,21 @@
This limits the concurrent JDBC connections to 40.
====
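For reference, a standard way to enforce such a cap in stock Spark is the JDBC `numPartitions` option: before a JDBC write, Spark coalesces the DataFrame down to at most that many partitions, and it opens one connection per partition. A minimal sketch, assuming an existing `DataFrame` named `df` (this is generic Spark behavior, not necessarily the exact mechanism used elsewhere on this page):

[source, scala]
// Sketch: cap concurrent JDBC connections at 40 for this write.
// Spark calls coalesce(40) on the DataFrame if it has more partitions,
// so at most 40 connections are opened (one per partition).
df.write
  .format("jdbc")
  .option("numPartitions", "40")
  // ... url, driver, and credential options omitted here ...
  .save()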

== Load From a Data Lake via Spark

. Use the `write()` function of the `DataFrame` to build a Spark `DataFrameWriter` (a combined sketch follows this list).
+
.. Specify the `mode("overwrite")` to set the save mode.
.. Specify `format("jdbc")` to leverage the https://spark.apache.org/docs/latest/sql-data-sources-jdbc.html[JDBC Data Source].
.. Specify the JDBC connection properties in configuration options.
. Create a GSQL xref:gsql-ref:ddl-and-loading:creating-a-loading-job.adoc[Loading Job], where you map each source column index to a target vertex/edge attribute.
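
Taken together, the writer side of these steps looks roughly like the sketch below. This is a minimal illustration, not this page's exact example: the driver class, URL, credentials, and the TigerGraph-specific options (`graph`, `dbtable`, `filename`, `sep`, `eol`) are placeholders patterned after the TigerGraph JDBC driver's Spark examples, so check them against the driver's documentation.

[source, scala]
// Minimal sketch; every option value below is a placeholder.
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder.appName("LoadToTigerGraph").getOrCreate()
import spark.implicits._
// A toy DataFrame whose column order matches the loading job in Step 2.
val df = Seq(("Tom", 40, "male"), ("Jenny", 33, "female")).toDF("name", "age", "gender")
df.write
  .mode("overwrite")                                // save mode
  .format("jdbc")                                   // Spark JDBC Data Source
  .option("driver", "com.tigergraph.jdbc.Driver")   // assumed TigerGraph JDBC driver class
  .option("url", "jdbc:tg:http://host:14240")       // placeholder host and port
  .option("username", "tigergraph")                 // placeholder credentials
  .option("password", "tigergraph")
  .option("graph", "Social")                        // target graph
  .option("dbtable", "job load_Social")             // assumed syntax for invoking the loading job
  .option("filename", "file1")                      // FILENAME defined in the loading job
  .option("sep", ",")                               // field separator for the generated rows
  .option("eol", "\n")                              // line separator
  .save()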

=== 1) Write a Spark `DataFrameWriter`

Write a Spark `DataFrameWriter` that writes the data to CSV files, following the example below.

NOTE: You need to choose names for a GSQL loading job and its data files that you will be using in Step 2.

.Example: `DataFrameWriter` as "df"
[source, gsql]
@@ -91,10 +81,28 @@
Jerry,45,male
Jenny,33,female
Lizzy,19,female

=== 2) Create a Loading Job

Write a GSQL loading job, using the job and file names that you chose in Step 1, to map data from the CSV file(s) to TigerGraph vertices and edges.

.Example:
[source, gsql]
CREATE LOADING JOB load_Social FOR GRAPH Social {
DEFINE FILENAME file1;
DEFINE FILENAME file2;
LOAD file1 TO VERTEX Person VALUES ($0, $1, $2);
LOAD file2 TO EDGE Friendship VALUES ($0, $1);
}

The loading job above, `load_Social`, loads the 1st, 2nd, and 3rd columns of the source file `file1` to the 1st, 2nd, and 3rd attributes of the vertex `Person`.

//Alternatively, loading jobs can be run as POST requests.
//.Example: Post Request to TigerGraph
//[source, gsql]
//http://host:port/restpp/ddl/Social?tag=load_Social&filename=file1
//--data <delimited_data>

See the pages xref:gsql-ref:ddl-and-loading:creating-a-loading-job.adoc[], xref:gsql-ref:ddl-and-loading:running-a-loading-job.adoc[], and xref:tigergraph-server:API:built-in-endpoints.adoc#_loading_jobs[Loading Jobs as a REST Endpoint] for more information about loading jobs in TigerGraph.

== Advanced Usages with Spark