tigergraph · parkererickson-tg · Jun 30, 2023 · Aug 31, 2023 · Aug 31, 2023 · Sep 15, 2023
diff --git a/antora.yml b/antora.yml
@@ -1,7 +1,7 @@
 name: pytigergraph
 title: pyTigerGraph
-version: 1.4
-display_version: "1.4"
+version: 1.5
+display_version: "1.5"
 start_page: intro:index.adoc
 
 nav:
@@ -12,5 +12,6 @@ nav:
 - modules/gds/nav.adoc
 - modules/datasets/nav.adoc
 - modules/visualization/nav.adoc
+- modules/object_oriented_schema/nav.adoc
 - modules/contributing/nav.adoc
 - modules/release-notes/nav.adoc
diff --git a/generator/docstring2asciidoc_cfg_template.json b/generator/docstring2asciidoc_cfg_template.json
@@ -13,6 +13,7 @@
     "pyTigerGraph/pyTigerGraphUDT.py": "modules/core-functions/pages/udt.adoc",
     "pyTigerGraph/pyTigerGraphUtils.py": "modules/core-functions/pages/utils.adoc",
     "pyTigerGraph/pyTigerGraphVertex.py": "modules/core-functions/pages/vertex.adoc",
+    "pyTigerGraph/schema.py": "modules/core-functions/pages/schema-def.adoc",
     "pyTigerGraph/gds/gds.py": "modules/gds/pages/gds.adoc",
     "pyTigerGraph/gds/dataloaders.py": "modules/gds/pages/dataloaders.adoc",
     "pyTigerGraph/gds/featurizer.py": "modules/gds/pages/featurizer.adoc",

diff --git a/modules/core-functions/pages/base.adoc b/modules/core-functions/pages/base.adoc
@@ -18,14 +18,12 @@ a self-signed certificate will be used.
 * `graphname`: The default graph for running queries.
 * `gsqlSecret`: The secret key for GSQL.  +
 See https://docs.tigergraph.com/tigergraph-server/current/user-access/managing-credentials#_secrets.[this] for more details.
-Required for GSQL authentication on TigerGraph Cloud instances created after
-July 5, 2022.
 * `username`: The username on the TigerGraph server.
 * `password`: The password for that user.
 * `tgCloud`: Set to `True` if using TigerGraph Cloud. If your hostname contains `tgcloud`, then
 this is automatically set to `True`, and you do not need to set this argument.
 * `restppPort`: The port for REST++ queries.
-* `gsPort`: The port of all other queries.
+* `gsPort`: The port for the GSQL server.
 * `gsqlVersion`: The version of the GSQL client to be used. Effectively the version of the database
 being connected to.
 * `version`: DEPRECATED; use `gsqlVersion`.

diff --git a/modules/datasets/pages/dataset_object.adoc b/modules/datasets/pages/dataset_object.adoc
@@ -11,7 +11,10 @@ Stock datasets.
 
 Please see https://tigergraph-public-data.s3.us-west-1.amazonaws.com/inventory.json[this link]
 for datasets that are currently available. The files for the dataset with `name` will be
-downloaded to local `tmp_dir` automatically when this class is instantiated.
+downloaded to local `tmp_dir` automatically when this class is instantiated. 
+For offline environments, download the desired tar file manually from the inventory page and extract it in the desired location.
+Then set the `tmp_dir` parameter to the location where the extracted directory resides.
+
 
 [discrete]
 ==== Parameters:
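
Note on the offline workflow added to `dataset_object.adoc` above: the manual extraction step could look roughly like the sketch below. It assumes the archive has already been downloaded from the inventory page. The helper name `extract_dataset_archive` and all paths are hypothetical, only the standard-library `tarfile` module is used, and the pyTigerGraph call is left as a comment because its behavior with a pre-populated `tmp_dir` should be checked against the installed version.

```python
import tarfile
from pathlib import Path


def extract_dataset_archive(archive_path: str, tmp_dir: str) -> list:
    """Extract a manually downloaded dataset tar file into tmp_dir.

    Hypothetical helper, not part of pyTigerGraph. Returns the sorted names
    of the top-level entries found in the archive.
    """
    target = Path(tmp_dir)
    target.mkdir(parents=True, exist_ok=True)
    top_level = set()
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            parts = Path(member.name).parts
            if parts:
                top_level.add(parts[0])
        # Use the safer "data" extraction filter where the interpreter offers it.
        if hasattr(tarfile, "data_filter"):
            archive.extractall(target, filter="data")
        else:
            archive.extractall(target)
    return sorted(top_level)


# Assumed usage (dataset name and paths are placeholders):
# extract_dataset_archive("/downloads/Cora.tar.gz", "/data/tg_datasets")
# from pyTigerGraph.datasets import Datasets
# dataset = Datasets(name="Cora", tmp_dir="/data/tg_datasets")
```

The return value is only a convenience for confirming that the archive unpacked into the directory name the dataset is expected to use.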

diff --git a/modules/gds/pages/gds.adoc b/modules/gds/pages/gds.adoc
@@ -81,7 +81,7 @@ Whether to add a topic for each epoch. Defaults to False.
 
 
 == neighborLoader()
-`neighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> NeighborLoader`
+`neighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, v_seed_types: Union[str, list] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> NeighborLoader`
 
 Returns a `NeighborLoader` instance.
 A `NeighborLoader` instance performs neighbor sampling from vertices in the graph in batches in the following manner:
@@ -138,6 +138,9 @@ certain attribute doesn't exist in all vertex types. If it is a dict, keys of th
 dict are vertex types to be selected, and values are lists of attributes to be 
 selected for each vertex type. 
 Numeric, boolean and string attributes are allowed. Defaults to None.
+* `v_seed_types (str or list, optional)`: The vertex types to use as seeds. If not specified, the vertex types
+used in `filter_by` are used as seeds; if `filter_by` does not name any, all vertex types are used.
+Defaults to None.
 * `e_in_feats (list or dict, optional)`: Edge attributes to be used as input features. 
 If it is a list, then the attributes
 in the list from all edge types will be selected. An error will be thrown if
@@ -414,7 +417,7 @@ can be included as seeds. If a dictionary is provided, must be in the form of:
 {"vertex_type": "attribute"}. If a list, must contain multiple filters and an 
 unique loader will be returned for each list element. Defaults to None.
 * `output_format (str, optional)`: Format of the output data of the loader.
-Only "PyG", "DGL", "spektral", and "dataframe" are supported. Defaults to "dataframe".
+Only "PyG", "DGL", "spektral", and "dataframe" are supported. Defaults to "PyG".
 * `add_self_loop (bool, optional)`: Whether to add self-loops to the graph. Defaults to False.
 * `loader_id (str, optional)`: An identifier of the loader which can be any string. It is
 also used as the Kafka topic name if Kafka topic is not given. If `None`, a random string will be generated
@@ -435,7 +438,7 @@ for examples.
 
 
 == edgeNeighborLoader()
-`edgeNeighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> EdgeNeighborLoader`
+`edgeNeighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, e_seed_types: Union[str, list] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> EdgeNeighborLoader`
 
 Returns an `EdgeNeighborLoader` instance.
 An `EdgeNeighborLoader` instance performs neighbor sampling from all edges in the graph in batches in the following manner:
@@ -512,6 +515,9 @@ selected. An error will be thrown if certain attribute doesn't exist in all
 edge types. If it is a dict, keys of the dict are edge types to be selected, 
 and values are lists of attributes to be selected for each edge type.
 Numeric, boolean and string attributes are allowed. Defaults to None.
+* `e_seed_types (str or list, optional)`: The edge types to use as seeds. If not specified, the edge types
+used in `filter_by` are used as seeds; if `filter_by` does not name any, all edge types are used.
+Defaults to None.
 * `batch_size (int, optional)`: Number of vertices as seeds in each batch.
 Defaults to None.
 * `num_batches (int, optional)`: Number of batches to split the vertices into as seeds.
@@ -621,7 +627,7 @@ for examples.
 
 
 == hgtLoader()
-`hgtLoader(num_neighbors: dict, v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> HGTLoader`
+`hgtLoader(num_neighbors: dict, v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, v_seed_types: Union[str, list] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> HGTLoader`
 
 Returns a `HGTLoader` instance.
 A `HGTLoader` instance performs stratified neighbor sampling from vertices in the graph in batches in the following manner:
@@ -677,6 +683,9 @@ certain attribute doesn't exist in all vertex types. If it is a dict, keys of th
 dict are vertex types to be selected, and values are lists of attributes to be 
 selected for each vertex type. 
 Numeric, boolean and string attributes are allowed. Defaults to None.
+* `v_seed_types (str or list, optional)`: The vertex types to use as seeds. If not specified, the vertex types
+used in `filter_by` are used as seeds; if `filter_by` does not name any, all vertex types are used.
+Defaults to None.
 * `e_in_feats (list or dict, optional)`: Edge attributes to be used as input features. 
 If it is a list, then the attributes
 in the list from all edge types will be selected. An error will be thrown if
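
Note on the `v_seed_types` / `e_seed_types` parameters documented in `gds.adoc` above: the fallback order they describe can be sketched in plain Python as follows. This is an illustration of the documented precedence only, not pyTigerGraph's implementation. The function name `resolve_seed_types` and the `all_types` argument are hypothetical, and treating a plain-string `filter_by` as "applies to every type" is an assumption.

```python
from typing import List, Optional, Union


def resolve_seed_types(
    seed_types: Optional[Union[str, List[str]]],
    filter_by: Optional[Union[str, dict]],
    all_types: List[str],
) -> List[str]:
    """Illustrative fallback order only; not pyTigerGraph's implementation."""
    if seed_types is not None:
        # Explicit seed types win; a single name is treated as a one-item list.
        return [seed_types] if isinstance(seed_types, str) else list(seed_types)
    if isinstance(filter_by, dict) and filter_by:
        # A per-type filter such as {"Paper": "is_train"} names its own types.
        return list(filter_by)
    # No explicit seed types and no per-type filter: every type is a seed.
    return list(all_types)
```

For example, with graph types `["Paper", "Author"]`, passing `"Paper"` as the seed type selects only `Paper`, while passing nothing and a filter of `{"Author": "is_train"}` selects only `Author`.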

diff --git a/modules/object_oriented_schema/nav.adoc b/modules/object_oriented_schema/nav.adoc
@@ -0,0 +1 @@
+* xref:schema-def.adoc[Object-Oriented Schema]
diff --git a/modules/object_oriented_schema/pages/schema-def.adoc b/modules/object_oriented_schema/pages/schema-def.adoc
@@ -0,0 +1,272 @@
+= Object-Oriented Schema
+
+The Object-Oriented Schema functionality lets you manipulate schema elements in the database from Python using an object-oriented approach.
+
+To add an `AccountHolder` vertex type and a `HOLDS_ACCOUNT` edge type to the Ethereum graph, define them as dataclasses, add them to the graph object, and commit the changes:
+
+```py
+from pyTigerGraph import TigerGraphConnection
+from pyTigerGraph.schema import Graph, Vertex, Edge
+
+from datetime import datetime
+from typing import List, Dict, Optional, Union
+from dataclasses import dataclass, fields
+
+conn = TigerGraphConnection(host="http://YOUR_HOSTNAME_HERE", graphname="Ethereum")
+
+g = Graph(conn)
+
+
+@dataclass
+class AccountHolder(Vertex):
+    name: str
+    address: str
+    accounts: List[str]
+    dob: datetime
+    some_map: Dict[str, int]
+    some_double: "DOUBLE"
+    primary_id: str = "name"  # always of type string, corresponds to the desired primary ID attribute.
+    primary_id_as_attribute: bool = True
+
+@dataclass
+class HOLDS_ACCOUNT(Edge):
+    opened_on: datetime
+    from_vertex: Union[AccountHolder, g.vertex_types["Account"]]
+    to_vertex: g.vertex_types["Account"]
+    is_directed: bool = True
+    reverse_edge: str = "ACCOUNT_HELD_BY"
+    discriminator: str = "opened_on"
+
+g.add_vertex_type(AccountHolder)
+
+g.add_edge_type(HOLDS_ACCOUNT)
+
+g.commit_changes()
+```
+
+You can also define the entire graph schema using this approach. For example, for the Cora dataset, the schema would look something like this:
+
+```py
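+from dataclasses import dataclass
+from typing import List
+from pyTigerGraph import TigerGraphConnection
+from pyTigerGraph.schema import Graph, Vertex, Edge
+
+conn = TigerGraphConnection(host="http://YOUR_HOSTNAME_HERE", graphname="Cora")
+g = Graph()  # no connection here; it is passed to commit_changes() below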
+@dataclass
+class Paper(Vertex):
+    id: int
+    y: int
+    x: List[int]
+    primary_id: str = "id"
+    primary_id_as_attribute: bool = True
+
+@dataclass
+class CITES(Edge):
+    from_vertex: Paper
+    to_vertex: Paper
+    is_directed: bool = True
+    reverse_edge: str = "R_CITES"
+
+g.add_vertex_type(Paper)
+g.add_edge_type(CITES)
+
+g.commit_changes(conn)
+```
+
+== Vertex
+
+Abstract parent class from which vertex types inherit.
+Contains class methods to edit the attributes associated with the vertex type.
+
+When defining new vertex types, make sure to include the `primary_id` and `primary_id_as_attribute` class attributes, as these are necessary to define the vertex in TigerGraph.
+
+For example, to define an AccountHolder vertex type, use:
+
+
+```py
+@dataclass
+class AccountHolder(Vertex):
+    name: str
+    address: str
+    accounts: List[str]
+    dob: datetime
+    some_map: Dict[str, int]
+    some_double: "DOUBLE"
+    primary_id: str = "name"
+    primary_id_as_attribute: bool = True
+```
+
+
+
+=== add_attribute()
+`add_attribute(attribute_name: str, attribute_type, default_value = None)`
+
+Function to add an attribute to the given vertex type.
+
+[discrete]
+==== Parameters:
+* `attribute_name (str)`: The name of the attribute to add.
+* `attribute_type (Python type)`: The Python type of the attribute to add.
+For types that are not supported in Python but are in GSQL, wrap them in quotes, e.g. "DOUBLE".
+* `default_value (type of attribute, default None)`: The desired default value of the attribute. Defaults to None.
+
+
+=== remove_attribute()
+`remove_attribute(attribute_name)`
+
+Function to remove an attribute from the given vertex type.
+
+[discrete]
+==== Parameter:
+* `attribute_name (str)`: The name of the attribute to remove from the vertex.
+
+
+=== attributes()
+`attributes()`
+
+Class attribute to view the attributes and types of the vertex.
+
+
+== Edge
+
+Abstract parent class from which edge types inherit.
+Contains class methods to edit the attributes associated with the edge type.
+
+When defining new edge types, make sure to include the required `from_vertex`, `to_vertex`, `reverse_edge`, and `is_directed` attributes, and optionally the `discriminator` attribute, as these are necessary to define the edge in TigerGraph.
+
+For example, to define a `HOLDS_ACCOUNT` edge type, use:
+
+
+```py
+@dataclass
+class HOLDS_ACCOUNT(Edge):
+    opened_on: datetime
+    from_vertex: Union[AccountHolder, g.vertex_types["Account"]]
+    to_vertex: g.vertex_types["Account"]
+    is_directed: bool = True
+    reverse_edge: str = "ACCOUNT_HELD_BY"
+    discriminator: str = "opened_on"
+```
+
+
+
+=== add_attribute()
+`add_attribute(attribute_name, attribute_type, default_value = None)`
+
+Function to add an attribute to the given edge type.
+
+[discrete]
+==== Parameters:
+* `attribute_name (str)`: The name of the attribute to add.
+* `attribute_type (Python type)`: The Python type of the attribute to add.
+For types that are not supported in Python but are in GSQL, wrap them in quotes, e.g. "DOUBLE".
+* `default_value (type of attribute, default None)`: The desired default value of the attribute. Defaults to None.
+
+
+=== remove_attribute()
+`remove_attribute(attribute_name)`
+
+Function to remove an attribute from the given edge type.
+
+[discrete]
+==== Parameter:
+* `attribute_name (str)`: The name of the attribute to remove from the edge.
+
+
+=== attributes()
+`attributes()`
+
+Class attribute to view the attributes and types of the edge.
+
+
+== Graph
+
+The graph object can be used in conjunction with a `TigerGraphConnection` to retrieve the schema of the connected graph.
+It also collects the Vertex and Edge type definitions that are to be added to or removed from the schema.
+
+To instantiate the graph object with a connection to an existing graph, use:
+
+```py
+from pyTigerGraph.schema import Graph
+
+g = Graph(conn)
+```
+
+
+
+=== \__init__()
+`__init__(conn: TigerGraphConnection = None)`
+
+Graph class for schema representation.
+
+[discrete]
+==== Parameter:
+* `conn (TigerGraphConnection, optional)`: Connection to a TigerGraph database. Defaults to None.
+
+
+=== add_vertex_type()
+`add_vertex_type(vertex: Vertex, outdegree_stats = True)`
+
+Add a vertex type to the list of changes to commit to the graph.
+
+[discrete]
+==== Parameters:
+* `vertex (Vertex)`: The vertex type definition to add to the addition cache.
+* `outdegree_stats (bool, optional)`: Whether or not to include "WITH OUTDEGREE_STATS=TRUE" in the schema definition.
+Used for caching outdegree. Defaults to True.
+
+
+=== add_edge_type()
+`add_edge_type(edge: Edge)`
+
+Add an edge type to the list of changes to commit to the graph.
+
+[discrete]
+==== Parameter:
+* `edge (Edge)`: The edge type definition to add to the addition cache.
+
+
+=== remove_vertex_type()
+`remove_vertex_type(vertex: Vertex)`
+
+Add a vertex type to the list of vertex types to remove from the graph.
+
+[discrete]
+==== Parameter:
+* `vertex (Vertex)`: The vertex type definition to add to the removal cache.
+
+
+=== remove_edge_type()
+`remove_edge_type(edge: Edge)`
+
+Add an edge type to the list of edge types to remove from the graph.
+
+[discrete]
+==== Parameter:
+* `edge (Edge)`: The edge type definition to add to the removal cache.
+
+
+=== commit_changes()
+`commit_changes(conn: TigerGraphConnection = None)`
+
+Commit schema changes to the graph.
+
+[discrete]
+==== Parameter:
+* `conn (TigerGraphConnection, optional)`: Connection to the database whose schema is to be edited.
+Not required if the Graph was instantiated with a connection object.
+
+=== vertex_types()
+`vertex_types()`
+
+Vertex types property.
+
+
+=== edge_types()
+`edge_types()`
+
+Edge types property.
+
+
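
Note on the attribute annotations used in `schema-def.adoc` above: the page says that Python types are used where possible and that GSQL-only types are written as quoted strings such as "DOUBLE". The sketch below shows one way such annotations could be read from a dataclass and turned into GSQL type names. It is an illustration under stated assumptions, not the library's code. The mapping table, the function `gsql_type`, and the class `AccountHolderSketch` are all hypothetical, and the sketch deliberately does not inherit from `Vertex` so that it runs without pyTigerGraph installed.

```python
from dataclasses import dataclass, fields
from datetime import datetime
from typing import Dict, List, get_args, get_origin

# Assumed Python-to-GSQL mapping, for illustration only.
_PRIMITIVES = {
    str: "STRING",
    int: "INT",
    bool: "BOOL",
    float: "FLOAT",
    datetime: "DATETIME",
}


def gsql_type(annotation) -> str:
    """Translate one field annotation into a GSQL type name (hypothetical)."""
    if isinstance(annotation, str):
        # Quoted annotations such as "DOUBLE" are passed through unchanged.
        return annotation
    if annotation in _PRIMITIVES:
        return _PRIMITIVES[annotation]
    origin, args = get_origin(annotation), get_args(annotation)
    if origin is list:
        return f"LIST<{gsql_type(args[0])}>"
    if origin is dict:
        return f"MAP<{gsql_type(args[0])}, {gsql_type(args[1])}>"
    raise TypeError(f"Unsupported annotation: {annotation!r}")


@dataclass
class AccountHolderSketch:
    name: str
    accounts: List[str]
    dob: datetime
    some_map: Dict[str, int]
    some_double: "DOUBLE"


# dataclasses.fields() keeps the raw annotation, so "DOUBLE" stays a string.
attribute_types = {f.name: gsql_type(f.type) for f in fields(AccountHolderSketch)}
```

Reading `f.type` from `dataclasses.fields()` rather than calling `typing.get_type_hints()` is a deliberate choice here: the latter would try to evaluate the string "DOUBLE" as a Python name and fail.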