V1.5 #4

Open
wants to merge 18 commits into
base: v1.4
Changes from 5 commits
5 changes: 3 additions & 2 deletions antora.yml
@@ -1,7 +1,7 @@
name: pytigergraph
title: pyTigerGraph
version: 1.4
display_version: "1.4"
version: 1.5
display_version: "1.5"
start_page: intro:index.adoc

nav:
@@ -12,5 +12,6 @@ nav:
- modules/gds/nav.adoc
- modules/datasets/nav.adoc
- modules/visualization/nav.adoc
- modules/object_oriented_schema/nav.adoc
- modules/contributing/nav.adoc
- modules/release-notes/nav.adoc
1 change: 1 addition & 0 deletions generator/docstring2asciidoc_cfg_template.json
@@ -13,6 +13,7 @@
"pyTigerGraph/pyTigerGraphUDT.py": "modules/core-functions/pages/udt.adoc",
"pyTigerGraph/pyTigerGraphUtils.py": "modules/core-functions/pages/utils.adoc",
"pyTigerGraph/pyTigerGraphVertex.py": "modules/core-functions/pages/vertex.adoc",
"pyTigerGraph/schema.py": "modules/core-functions/pages/schema-def.adoc",
"pyTigerGraph/gds/gds.py": "modules/gds/pages/gds.adoc",
"pyTigerGraph/gds/dataloaders.py": "modules/gds/pages/dataloaders.adoc",
"pyTigerGraph/gds/featurizer.py": "modules/gds/pages/featurizer.adoc",
4 changes: 1 addition & 3 deletions modules/core-functions/pages/base.adoc
@@ -18,14 +18,12 @@ a self-signed certificate will be used.
* `graphname`: The default graph for running queries.
* `gsqlSecret`: The secret key for GSQL. +
See https://docs.tigergraph.com/tigergraph-server/current/user-access/managing-credentials#_secrets.[this] for more details.
Required for GSQL authentication on TigerGraph Cloud instances created after
July 5, 2022.
* `username`: The username on the TigerGraph server.
* `password`: The password for that user.
* `tgCloud`: Set to `True` if using TigerGraph Cloud. If your hostname contains `tgcloud`, then
this is automatically set to `True`, and you do not need to set this argument.
* `restppPort`: The port for REST++ queries.
* `gsPort`: The port of all other queries.
* `gsPort`: The port for the GSQL server.
* `gsqlVersion`: The version of the GSQL client to be used. Effectively the version of the database
being connected to.
* `version`: DEPRECATED; use `gsqlVersion`.
5 changes: 4 additions & 1 deletion modules/datasets/pages/dataset_object.adoc
@@ -11,7 +11,10 @@ Stock datasets.

Please see https://tigergraph-public-data.s3.us-west-1.amazonaws.com/inventory.json[this link]
for datasets that are currently available. The files for the dataset with `name` will be
downloaded to local `tmp_dir` automatically when this class is instantiated.
downloaded to local `tmp_dir` automatically when this class is instantiated.
For offline environments, download the desired tar file manually from the inventory page and extract it to the desired location.
Specify the `tmp_dir` parameter to point to where the unzipped directory resides.
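
The offline extraction step can be sketched with the standard library alone. This is only an illustration, not pyTigerGraph's own download logic: the helper name `extract_dataset`, the archive name, and the directory paths are all placeholders, and the final `Datasets(...)` call is left as a comment because it assumes pyTigerGraph is installed.

```python
import tarfile
from pathlib import Path


def extract_dataset(archive_path: str, tmp_dir: str) -> Path:
    """Extract a manually downloaded dataset archive into tmp_dir."""
    target = Path(tmp_dir)
    target.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression (none, gz, bz2, xz).
    with tarfile.open(archive_path, "r:*") as archive:
        archive.extractall(target)
    return target


# Hypothetical usage; both paths are placeholders:
# data_dir = extract_dataset("Cora.tar.gz", "/data/tg_datasets")
# dataset = Datasets(name="Cora", tmp_dir=str(data_dir))
```

Only extract archives from a source you trust, since a tar file can contain paths that point outside the target directory.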


[discrete]
==== Parameters:
17 changes: 13 additions & 4 deletions modules/gds/pages/gds.adoc
@@ -81,7 +81,7 @@ Whether to add a topic for each epoch. Defaults to False.


== neighborLoader()
`neighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> NeighborLoader`
`neighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, v_seed_types: Union[str, list] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> NeighborLoader`

Returns a `NeighborLoader` instance.
A `NeighborLoader` instance performs neighbor sampling from vertices in the graph in batches in the following manner:
@@ -138,6 +138,9 @@ certain attribute doesn't exist in all vertex types. If it is a dict, keys of th
dict are vertex types to be selected, and values are lists of attributes to be
selected for each vertex type.
Numeric, boolean and string attributes are allowed. Defaults to None.
* `v_seed_types (str or list, optional)`: Directly specify the vertex types to use as seeds. If not specified, defaults to
the vertex types used in filter_by. If not specified there, uses all vertex types.
Defaults to None.
* `e_in_feats (list or dict, optional)`: Edge attributes to be used as input features.
If it is a list, then the attributes
in the list from all edge types will be selected. An error will be thrown if
@@ -414,7 +417,7 @@ can be included as seeds. If a dictionary is provided, must be in the form of:
{"vertex_type": "attribute"}. If a list, must contain multiple filters, and a
unique loader will be returned for each list element. Defaults to None.
* `output_format (str, optional)`: Format of the output data of the loader.
Only "PyG", "DGL", "spektral", and "dataframe" are supported. Defaults to "dataframe".
Only "PyG", "DGL", "spektral", and "dataframe" are supported. Defaults to "PyG".
* `add_self_loop (bool, optional)`: Whether to add self-loops to the graph. Defaults to False.
* `loader_id (str, optional)`: An identifier of the loader which can be any string. It is
also used as the Kafka topic name if Kafka topic is not given. If `None`, a random string will be generated
@@ -435,7 +438,7 @@ for examples.


== edgeNeighborLoader()
`edgeNeighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> EdgeNeighborLoader`
`edgeNeighborLoader(v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, e_seed_types: Union[str, list] = None, batch_size: int = None, num_batches: int = 1, num_neighbors: int = 10, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> EdgeNeighborLoader`

Returns an `EdgeNeighborLoader` instance.
An `EdgeNeighborLoader` instance performs neighbor sampling from all edges in the graph in batches in the following manner:
@@ -512,6 +515,9 @@ selected. An error will be thrown if certain attribute doesn't exist in all
edge types. If it is a dict, keys of the dict are edge types to be selected,
and values are lists of attributes to be selected for each edge type.
Numeric, boolean and string attributes are allowed. Defaults to None.
* `e_seed_types (str or list, optional)`: Directly specify the edge types to use as seeds. If not specified, defaults to
the edge types used in filter_by. If not specified there, uses all edge types.
Defaults to None.
* `batch_size (int, optional)`: Number of vertices as seeds in each batch.
Defaults to None.
* `num_batches (int, optional)`: Number of batches to split the vertices into as seeds.
@@ -621,7 +627,7 @@ for examples.


== hgtLoader()
`hgtLoader(num_neighbors: dict, v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> HGTLoader`
`hgtLoader(num_neighbors: dict, v_in_feats: Union[list, dict] = None, v_out_labels: Union[list, dict] = None, v_extra_feats: Union[list, dict] = None, v_seed_types: Union[str, list] = None, e_in_feats: Union[list, dict] = None, e_out_labels: Union[list, dict] = None, e_extra_feats: Union[list, dict] = None, batch_size: int = None, num_batches: int = 1, num_hops: int = 2, shuffle: bool = False, filter_by: str = None, output_format: str = "PyG", add_self_loop: bool = False, loader_id: str = None, buffer_size: int = 4, reverse_edge: bool = False, delimiter: str = "|", timeout: int = 300000, callback_fn: Callable = None, reinstall_query: bool = False, distributed_query: bool = False) -> HGTLoader`

Returns a `HGTLoader` instance.
A `HGTLoader` instance performs stratified neighbor sampling from vertices in the graph in batches in the following manner:
@@ -677,6 +683,9 @@ certain attribute doesn't exist in all vertex types. If it is a dict, keys of th
dict are vertex types to be selected, and values are lists of attributes to be
selected for each vertex type.
Numeric, boolean and string attributes are allowed. Defaults to None.
* `v_seed_types (str or list, optional)`: Directly specify the vertex types to use as seeds. If not specified, defaults to
the vertex types used in filter_by. If not specified there, uses all vertex types.
Defaults to None.
* `e_in_feats (list or dict, optional)`: Edge attributes to be used as input features.
If it is a list, then the attributes
in the list from all edge types will be selected. An error will be thrown if
1 change: 1 addition & 0 deletions modules/object_oriented_schema/nav.adoc
@@ -0,0 +1 @@
* xref:schema-def.adoc[Object-Oriented Schema]
272 changes: 272 additions & 0 deletions modules/object_oriented_schema/pages/schema-def.adoc
@@ -0,0 +1,272 @@
= Object-Oriented Schema

The Object-Oriented Schema functionality lets users define and modify schema elements in the database using an object-oriented approach in Python.

To add an AccountHolder vertex and a HOLDS_ACCOUNT edge to the Ethereum dataset, simply:

```py
from pyTigerGraph import TigerGraphConnection
from pyTigerGraph.schema import Graph, Vertex, Edge

from datetime import datetime
from typing import List, Dict, Optional, Union
from dataclasses import dataclass, fields

conn = TigerGraphConnection(host="http://YOUR_HOSTNAME_HERE", graphname="Ethereum")

g = Graph(conn)


@dataclass
class AccountHolder(Vertex):
    name: str
    address: str
    accounts: List[str]
    dob: datetime
    some_map: Dict[str, int]
    some_double: "DOUBLE"
    primary_id: str = "name" # always of type string, corresponds to the desired primary ID attribute.
    primary_id_as_attribute: bool = True

@dataclass
class HOLDS_ACCOUNT(Edge):
    opened_on: datetime
    from_vertex: Union[AccountHolder, g.vertex_types["Account"]]
    to_vertex: g.vertex_types["Account"]
    is_directed: bool = True
    reverse_edge: str = "ACCOUNT_HELD_BY"
    discriminator: str = "opened_on"

g.add_vertex_type(AccountHolder)

g.add_edge_type(HOLDS_ACCOUNT)

g.commit_changes()
```

One could also define the entire graph schema using this approach. For example, for the Cora dataset, the schema would look something like this:

```py
from dataclasses import dataclass
from typing import List

from pyTigerGraph import TigerGraphConnection
from pyTigerGraph.schema import Graph, Vertex, Edge

conn = TigerGraphConnection("http://YOUR_HOSTNAME_HERE", graphname="Cora")

# No connection is passed here; it is supplied to commit_changes() instead.
g = Graph()

@dataclass
class Paper(Vertex):
    id: int
    y: int
    x: List[int]
    primary_id: str = "id"
    primary_id_as_attribute: bool = True

@dataclass
class CITES(Edge):
    from_vertex: Paper
    to_vertex: Paper
    is_directed: bool = True
    reverse_edge: str = "R_CITES"

g.add_vertex_type(Paper)
g.add_edge_type(CITES)

g.commit_changes(conn)
```

== Vertex

Abstract parent class from which other vertex types inherit.
Contains class methods to edit the attributes associated with the vertex type.

When defining new vertex types, make sure to include the `primary_id` and `primary_id_as_attribute` class attributes, as these are necessary to define the vertex in TigerGraph.

For example, to define an AccountHolder vertex type, use:


```py
@dataclass
class AccountHolder(Vertex):
    name: str
    address: str
    accounts: List[str]
    dob: datetime
    some_map: Dict[str, int]
    some_double: "DOUBLE"
    primary_id: str = "name"
    primary_id_as_attribute: bool = True
```



=== add_attribute()
`add_attribute(attribute_name: str, attribute_type, default_value = None)`

Function to add an attribute to the given vertex type.

[discrete]
==== Parameters:
* `attribute_name (str)`: The name of the attribute to add.
* `attribute_type (Python type)`: The Python type of the attribute to add.
For types that are not supported in Python but are in GSQL, wrap them in quotes; e.g. "DOUBLE".
* `default_value (type of attribute, default None)`: The desired default value of the attribute. Defaults to None.
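
To make the quoting rule for `attribute_type` concrete, here is a minimal sketch of how Python annotations could be translated into GSQL type names. The `SCALARS` table and the `to_gsql_type` helper are hypothetical and are not taken from pyTigerGraph's source; the actual mapping used by the library may differ.

```python
from datetime import datetime
from typing import Dict, List, get_args, get_origin

# Illustrative mapping only; an assumption, not pyTigerGraph's own table.
SCALARS = {
    str: "STRING",
    int: "INT",
    bool: "BOOL",
    float: "FLOAT",
    datetime: "DATETIME",
}


def to_gsql_type(py_type) -> str:
    """Translate a Python annotation into a GSQL type name (sketch)."""
    if isinstance(py_type, str):
        # Quoted annotations such as "DOUBLE" are passed through unchanged.
        return py_type
    if py_type in SCALARS:
        return SCALARS[py_type]
    origin, args = get_origin(py_type), get_args(py_type)
    if origin is list:
        return f"LIST<{to_gsql_type(args[0])}>"
    if origin is dict:
        return f"MAP<{to_gsql_type(args[0])}, {to_gsql_type(args[1])}>"
    raise TypeError(f"unsupported attribute type: {py_type!r}")
```

Under these assumptions, `to_gsql_type(List[str])` yields `LIST<STRING>`, while the quoted `"DOUBLE"` is returned as written.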


=== remove_attribute()
`remove_attribute(attribute_name)`

Function to remove an attribute from the given vertex type.

[discrete]
==== Parameter:
* `attribute_name (str)`: The name of the attribute to remove from the vertex.


=== attributes()
`attributes()`

Class attribute to view the attributes and types of the vertex.
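
As a rough illustration of what viewing attributes and types involves, the sketch below reads the annotations of a plain class that stands in for a `Vertex` subclass. The helper `attribute_types` is hypothetical and does not reproduce how pyTigerGraph implements `attributes`.

```python
from typing import Dict, List


class AccountHolder:  # plain stand-in for a Vertex subclass
    name: str
    accounts: List[str]
    some_double: "DOUBLE"


def attribute_types(cls) -> Dict[str, object]:
    """Return attribute names mapped to their declared annotations."""
    # Read the raw annotations: typing.get_type_hints() would try to
    # evaluate the string "DOUBLE" as Python code and fail.
    return dict(cls.__annotations__)
```

Reading `__annotations__` directly keeps quoted GSQL-only types such as `"DOUBLE"` intact as plain strings.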


== Edge

Abstract parent class from which other edge types inherit.
Contains class methods to edit the attributes associated with the edge type.

When defining new edge types, make sure to include the required `from_vertex`, `to_vertex`, `reverse_edge`, and `is_directed` attributes and, optionally, the `discriminator` attribute, as these are necessary to define the edge in TigerGraph.

For example, to define a HOLDS_ACCOUNT edge type, use:


```py
@dataclass
class HOLDS_ACCOUNT(Edge):
    opened_on: datetime
    from_vertex: Union[AccountHolder, g.vertex_types["Account"]]
    to_vertex: g.vertex_types["Account"]
    is_directed: bool = True
    reverse_edge: str = "ACCOUNT_HELD_BY"
    discriminator: str = "opened_on"
```
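
The `Union[...]` annotation on `from_vertex` names more than one possible source vertex type. A minimal sketch of how such an annotation can be unpacked with the standard `typing` helpers follows. The functions `endpoint_types` and `vertex_pairs` are hypothetical, and treating every source/target combination as a valid pair is an assumption made for illustration, not a statement about how pyTigerGraph interprets the annotation.

```python
from itertools import product
from typing import Union, get_args, get_origin


def endpoint_types(annotation) -> tuple:
    """Return the classes named by a single type or a Union of types."""
    if get_origin(annotation) is Union:
        return get_args(annotation)
    return (annotation,)


def vertex_pairs(from_annotation, to_annotation) -> list:
    """List every (source, target) vertex-type name pair."""
    sources = endpoint_types(from_annotation)
    targets = endpoint_types(to_annotation)
    return [(s.__name__, t.__name__) for s, t in product(sources, targets)]
```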



=== add_attribute()
`add_attribute(attribute_name, attribute_type, default_value = None)`

Function to add an attribute to the given edge type.

[discrete]
==== Parameters:
* `attribute_name (str)`: The name of the attribute to add.
* `attribute_type (Python type)`: The Python type of the attribute to add.
For types that are not supported in Python but are in GSQL, wrap them in quotes; e.g. "DOUBLE".
* `default_value (type of attribute, default None)`: The desired default value of the attribute. Defaults to None.


=== remove_attribute()
`remove_attribute(attribute_name)`

Function to remove an attribute from the given edge type.

[discrete]
==== Parameter:
* `attribute_name (str)`: The name of the attribute to remove from the edge.


=== attributes()
`attributes()`

Class attribute to view the attributes and types of the edge.


== Graph

The graph object can be used in conjunction with a TigerGraphConnection to retrieve the schema of the connected graph.
It also serves as the place to collect the definitions of Vertex and Edge types.

To instantiate the graph object with a connection to an existing graph, use:

```py
from pyTigerGraph.schema import Graph

g = Graph(conn)
```



=== \__init__()
`__init__(conn: TigerGraphConnection = None)`

Graph class for schema representation.

[discrete]
==== Parameter:
* `conn (TigerGraphConnection, optional)`: Connection to a TigerGraph database. Defaults to None.


=== add_vertex_type()
`add_vertex_type(vertex: Vertex, outdegree_stats = True)`

Add a vertex type to the list of changes to commit to the graph.

[discrete]
==== Parameters:
* `vertex (Vertex)`: The vertex type definition to add to the addition cache.
* `outdegree_stats (bool, optional)`: Whether or not to include "WITH OUTDEGREE_STATS=TRUE" in the schema definition.
Used for caching outdegree. Defaults to True.


=== add_edge_type()
`add_edge_type(edge: Edge)`

Add an edge type to the list of changes to commit to the graph.

[discrete]
==== Parameter:
* `edge (Edge)`: The edge type definition to add to the addition cache.


=== remove_vertex_type()
`remove_vertex_type(vertex: Vertex)`

Add a vertex type to the list of types to remove from the graph.

[discrete]
==== Parameter:
* `vertex (Vertex)`: The vertex type definition to add to the removal cache.


=== remove_edge_type()
`remove_edge_type(edge: Edge)`

Add an edge type to the list of types to remove from the graph.

[discrete]
==== Parameter:
* `edge (Edge)`: The edge type definition to add to the removal cache.


=== commit_changes()
`commit_changes(conn: TigerGraphConnection = None)`

Commit schema changes to the graph.

[discrete]
==== Parameter:
* `conn (TigerGraphConnection, optional)`: Connection to the database to edit the schema of.
Not required if the Graph was instantiated with a connection object.
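
The add, remove, and commit methods above follow a queue-then-apply pattern: changes accumulate in an addition cache and a removal cache, and nothing is applied until the commit. The class below is a hypothetical stand-in that shows the pattern in isolation. It does not talk to a database, and the order in which it returns operations (additions first, then removals) is an arbitrary choice for the sketch.

```python
from typing import List, Tuple


class PendingSchemaChanges:
    """Hypothetical stand-in that queues schema edits until commit."""

    def __init__(self) -> None:
        self.additions: List[str] = []
        self.removals: List[str] = []

    def add_type(self, name: str) -> None:
        # Queue a type for creation; nothing is applied yet.
        self.additions.append(name)

    def remove_type(self, name: str) -> None:
        # Queue a type for removal; nothing is applied yet.
        self.removals.append(name)

    def commit(self) -> List[Tuple[str, str]]:
        """Return the queued operations and clear both queues."""
        ops = [("add", n) for n in self.additions]
        ops += [("remove", n) for n in self.removals]
        self.additions.clear()
        self.removals.clear()
        return ops
```

Batching changes this way means several type definitions can be reviewed together before anything is sent to the database.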


=== vertex_types()
`vertex_types()`

Property holding the graph's vertex types, accessible by type name (for example, `g.vertex_types["Account"]`).


=== edge_types()
`edge_types()`

Edge types property.

