Update TigerGraphX.md #245

Open: wants to merge 3 commits into base: master
156 changes: 70 additions & 86 deletions demos/guru_scripts/docker/tutorial/4.x/TigerGraphX.md
@@ -4,25 +4,44 @@ In this quick start guide, we will work with the following graph:

![Financial Graph](https://raw.githubusercontent.com/tigergraph/ecosys/master/demos/guru_scripts/docker/tutorial/4.x/FinancialGraph.jpg)


## Installation Guide

Follow this guide to install and set up **TigerGraphX** in your environment.

### Requirements

This project requires **Python 3.12** and **TigerGraph 4.2**. Ensure you meet the following prerequisites before proceeding:
This project requires **Python 3.10, 3.11 or 3.12** and **TigerGraph 4.2**. Ensure you meet the following prerequisites before proceeding:

#### **1. Python 3.12**
- Please ensure Python 3.12 is installed on your system.
#### **1. Python**
- Please ensure Python 3.10, 3.11 or 3.12 is installed on your system.
- You can download and install it from the [official Python website](https://www.python.org/downloads/).
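
One quick way to confirm that the interpreter falls in the supported range is to compare `sys.version_info` against it. The sketch below is illustrative only: the helper name `is_supported_python` is hypothetical and is not part of TigerGraphX.

```python
import sys

# Supported Python 3 minor versions, taken from the requirement stated above.
SUPPORTED_MINORS = (10, 11, 12)


def is_supported_python(version=None):
    """Return True if `version` (major, minor, ...) is Python 3.10, 3.11 or 3.12.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    major, minor = (version or sys.version_info)[:2]
    return major == 3 and minor in SUPPORTED_MINORS


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {current} is {status} by this guide")
```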

#### **2. TigerGrapsh 4.2**
#### **2. TigerGraph**

TigerGraph is required for this project and can be set up in one of the following ways:
TigerGraph 4.2 is required for this project and can be set up in one of the following ways:

- **TigerGraph DB**: Install and configure a local instance of TigerGraph.
- **TigerGraph Cloud**: Use a cloud-hosted instance of TigerGraph.
- **TigerGraph Docker**: Use a Docker container to run TigerGraph.

##### **Docker Setup Guide**

Follow the [Docker setup guide](https://github.com/tigergraph/ecosys/blob/master/demos/guru_scripts/docker/README.md) to set up your Docker environment.

> **_Note:_** To preview the vector feature, pull the `tigergraph/tigergraph:4.2.0-preview` Docker image instead:
>
> ```bash
> docker run -d -p 14240:14240 --name tigergraph --ulimit nofile=1000000:1000000 -t tigergraph/tigergraph:4.2.0-preview
> ```
>
> After setting up the Docker container, remember to apply your TigerGraph license key to the instance. You can obtain a free developer license [here](https://dl.tigergraph.com/). Once you have your license key, follow these steps:
>
> ```bash
> docker exec -it tigergraph /bin/bash
> gadmin license set <license_key>
> gadmin config apply -y
> gadmin start all
> ```

### Installation Steps

@@ -39,7 +58,7 @@ This allows you to quickly start using the library without needing the source co

After installing, verify that TigerGraphX is installed correctly by running:
```bash
python -c "import tigergraphx; print('TigerGraphX installed successfully!')"
python -c 'import tigergraphx; print("TigerGraphX installed successfully!")'
```

If the installation was successful, you will see:
@@ -152,26 +171,28 @@ G = Graph(graph_schema, connection)

```python
nodes_for_adding = [
("Scott", {"isBlocked": False}),
("Jenny", {"isBlocked": False}),
("Steven", {"isBlocked": True}),
("Paul", {"isBlocked": False}),
("Ed", {"isBlocked": False}),
("Scott", {"isBlocked": False, "emb1": [-0.017733968794345856, -0.01019224338233471, -0.016571875661611557]}),
("Jenny", {"isBlocked": False, "emb1": [-0.019265105947852135, 0.0004929182468913496, 0.006711316294968128]}),
("Steven", {"isBlocked": True, "emb1": [-0.01505514420568943, -0.016819344833493233, -0.0221870020031929]}),
("Paul", {"isBlocked": False, "emb1": [0.0011193430982530117, -0.001038988004438579, -0.017158523201942444]}),
("Ed", {"isBlocked": False, "emb1": [-0.003692442551255226, 0.010494389571249485, -0.004631792660802603]}),
]
print("Adding Account Nodes:", G.add_nodes_from(nodes_for_adding, node_type="Account"))
print("Number of Account Nodes Inserted:", G.add_nodes_from(nodes_for_adding, node_type="Account"))

nodes_for_adding = [
("718-245-5888", {"isBlocked": False}),
("650-658-9867", {"isBlocked": True}),
("352-871-8978", {"isBlocked": False}),
("718-245-5888", {"isBlocked": False, "emb1": [0.0023173028603196144, 0.018836047500371933, 0.03107452765107155]}),
("650-658-9867", {"isBlocked": True, "emb1": [0.01969221793115139, 0.018642477691173553, 0.05322211980819702]}),
("352-871-8978", {"isBlocked": False, "emb1": [-0.003442931454628706, 0.016562696546316147, 0.012876809574663639]}),
]
print("Adding Phone Nodes:", G.add_nodes_from(nodes_for_adding, node_type="Phone"))
print("Number of Phone Nodes Inserted:", G.add_nodes_from(nodes_for_adding, node_type="Phone"))

nodes_for_adding = ["New York", "Gainesville", "San Francisco"]
print("Adding City Nodes:", G.add_nodes_from(nodes_for_adding, node_type="City"))
print("Number of City Nodes Inserted:", G.add_nodes_from(nodes_for_adding, node_type="City"))
```

Adding Account Nodes: 5
Adding Phone Nodes: 3
Adding City Nodes: 3
Number of Account Nodes Inserted: 5
Number of Phone Nodes Inserted: 3
Number of City Nodes Inserted: 3
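
The `(node_id, attributes)` tuples passed to `add_nodes_from` are often assembled from separate sources, for example a table of flags and a dictionary of embeddings. The sketch below shows one way to build that list in plain Python. The helper `build_nodes` is hypothetical and is not part of TigerGraphX; it only prepares data and never touches the database.

```python
def build_nodes(is_blocked, embeddings, vector_attr="emb1"):
    """Combine per-node flags and embeddings into (id, attributes) tuples.

    Hypothetical helper: raises if a node has a flag but no embedding, so
    incomplete records are caught before anything is sent to the database.
    """
    missing = set(is_blocked) - set(embeddings)
    if missing:
        raise ValueError(f"No embedding for: {sorted(missing)}")
    return [
        (node_id, {"isBlocked": blocked, vector_attr: list(embeddings[node_id])})
        for node_id, blocked in is_blocked.items()
    ]


# Values copied from the Phone nodes in this guide.
phone_flags = {"718-245-5888": False, "650-658-9867": True, "352-871-8978": False}
phone_embeddings = {
    "718-245-5888": [0.0023173028603196144, 0.018836047500371933, 0.03107452765107155],
    "650-658-9867": [0.01969221793115139, 0.018642477691173553, 0.05322211980819702],
    "352-871-8978": [-0.003442931454628706, 0.016562696546316147, 0.012876809574663639],
}

nodes_for_adding = build_nodes(phone_flags, phone_embeddings)
print(len(nodes_for_adding))  # 3
```

The resulting list has the same shape as the `nodes_for_adding` lists above, so it could be passed to `G.add_nodes_from(nodes_for_adding, node_type="Phone")`.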


### Add Edges
@@ -185,15 +206,17 @@ ebunch_to_add = [
("Paul", "650-658-9867"),
("Ed", "352-871-8978"),
]
print("Adding hasPhone Edges:", G.add_edges_from(ebunch_to_add, "Account", "hasPhone", "Phone"))
print("Number of hasPhone Edges Inserted:", G.add_edges_from(ebunch_to_add, "Account", "hasPhone", "Phone"))

ebunch_to_add = [
("Scott", "New York"),
("Jenny", "San Francisco"),
("Steven", "San Francisco"),
("Paul", "Gainesville"),
("Ed", "Gainesville"),
]
print("Adding isLocatedIn Edges:", G.add_edges_from(ebunch_to_add, "Account", "isLocatedIn", "City"))
print("Number of isLocatedIn Edges Inserted:", G.add_edges_from(ebunch_to_add, "Account", "isLocatedIn", "City"))

ebunch_to_add = [
("Scott", "Ed", {"date": "2024-01-04", "amount": 20000}),
("Scott", "Ed", {"date": "2024-02-01", "amount": 800}),
@@ -204,65 +227,14 @@ ebunch_to_add = [
("Ed", "Paul", {"date": "2024-01-04", "amount": 1500}),
("Paul", "Steven", {"date": "2023-05-09", "amount": 20000}),
]
print("Adding transfer Edges:", G.add_edges_from(ebunch_to_add, "Account", "transfer", "Account"))
print("Number of transfer Edges Inserted:", G.add_edges_from(ebunch_to_add, "Account", "transfer", "Account"))
```

Adding hasPhone Edges: 5
Adding isLocatedIn Edges: 5
Adding transfer Edges: 8
Number of hasPhone Edges Inserted: 5
Number of isLocatedIn Edges Inserted: 5
Number of transfer Edges Inserted: 8
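
Because edges refer to nodes by ID, a typo in an endpoint is easy to miss. As a minimal sketch, the edge list can be checked against the known node IDs before calling `add_edges_from`. The helper `find_dangling_edges` is hypothetical and not part of TigerGraphX, and how TigerGraph itself treats unknown endpoints is not covered here.

```python
def find_dangling_edges(edges, source_ids, target_ids):
    """Return the edges whose source or target ID is not a known node.

    Hypothetical helper: `edges` holds tuples whose first two items are the
    source and target IDs; any trailing attribute dict is ignored.
    """
    sources, targets = set(source_ids), set(target_ids)
    return [edge for edge in edges if edge[0] not in sources or edge[1] not in targets]


# IDs and edges copied from the Account, City and isLocatedIn data in this guide.
accounts = ["Scott", "Jenny", "Steven", "Paul", "Ed"]
cities = ["New York", "Gainesville", "San Francisco"]
is_located_in = [
    ("Scott", "New York"),
    ("Jenny", "San Francisco"),
    ("Steven", "San Francisco"),
    ("Paul", "Gainesville"),
    ("Ed", "Gainesville"),
]

print(find_dangling_edges(is_located_in, accounts, cities))  # []
```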


### Inserting Embeddings into Nodes
To insert embeddings into the nodes, you can use the following data format:


```python
data = [
{
"name": "Scott",
"emb1": [-0.017733968794345856, -0.01019224338233471, -0.016571875661611557],
},
{
"name": "Jenny",
"emb1": [-0.019265105947852135, 0.0004929182468913496, 0.006711316294968128],
},
{
"name": "Steven",
"emb1": [-0.01505514420568943, -0.016819344833493233, -0.0221870020031929],
},
{
"name": "Paul",
"emb1": [0.0011193430982530117, -0.001038988004438579, -0.017158523201942444],
},
{
"name": "Ed",
"emb1": [-0.003692442551255226, 0.010494389571249485, -0.004631792660802603],
},
]
print("Number of Account Nodes Inserted:", G.upsert(data, "Account"))
data = [
{
"number": "718-245-5888",
"emb1": [0.0023173028603196144, 0.018836047500371933, 0.03107452765107155],
},
{
"number": "650-658-9867",
"emb1": [0.01969221793115139, 0.018642477691173553, 0.05322211980819702],
},
{
"number": "352-871-8978",
"emb1": [-0.003442931454628706, 0.016562696546316147, 0.012876809574663639],
},
]
print("Number of Phone Nodes Inserted:", G.upsert(data, "Phone"))
```

Number of Account Nodes Inserted: 0
Number of Phone Nodes Inserted: 0


Note: A result of 0 simply indicates that no new nodes were added, as the nodes were already inserted into TigerGraph. This does not mean the update failed.

### Display the Number of Nodes
Next, let's verify that the data has been inserted into the graph by using the following command. As expected, the number of nodes is 5.

@@ -296,6 +268,21 @@ for result in results:
{'id': 'Steven', 'distance': 0.0325563, 'name': 'Steven', 'isBlocked': True}


After the vector search, the following code fetches the embeddings of the top-k nodes by passing their IDs and the vector attribute name to `fetch_nodes`, then prints each node's ID and embedding.


```python
node_ids = {item['id'] for item in results}
nodes = G.fetch_nodes(node_ids, vector_attribute_name="emb1", node_type="Account")
for node in nodes.items():
    print(node)
```

('Paul', [0.001119343, -0.001038988, -0.01715852])
('Scott', [-0.01773397, -0.01019224, -0.01657188])
('Steven', [-0.01505514, -0.01681934, -0.022187])
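
Conceptually, a top-k vector search ranks nodes by the distance between each stored embedding and a query vector. The sketch below is a brute-force illustration in plain Python using the Account embeddings from this guide. Cosine distance is an assumption (this excerpt does not state which metric the schema configures), the helpers `cosine_distance` and `top_k` are hypothetical, and the code says nothing about how TigerGraph executes the search internally.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, embeddings, k):
    """Return the k (node_id, distance) pairs closest to `query`, nearest first."""
    scored = [(node_id, cosine_distance(query, vec)) for node_id, vec in embeddings.items()]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Account embeddings copied from the node data in this guide.
account_embeddings = {
    "Scott": [-0.017733968794345856, -0.01019224338233471, -0.016571875661611557],
    "Jenny": [-0.019265105947852135, 0.0004929182468913496, 0.006711316294968128],
    "Steven": [-0.01505514420568943, -0.016819344833493233, -0.0221870020031929],
    "Paul": [0.0011193430982530117, -0.001038988004438579, -0.017158523201942444],
    "Ed": [-0.003692442551255226, 0.010494389571249485, -0.004631792660802603],
}

# Illustrative query: reuse Steven's own embedding, so Steven ranks first.
query = account_embeddings["Steven"]
for node_id, distance in top_k(query, account_embeddings, k=2):
    print(node_id, round(distance, 6))
```

A linear scan like this is only practical for a handful of vectors; it is meant to show what the `distance` field in the search results represents, not to replace the database search.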


### Top-k Vector Search on a Set of Vertex Types' Vector Attributes
The code below performs a multi-vector attribute search across the "Account" and "Phone" node types, using the `emb1` vector attribute of each type. It retrieves the top 5 most similar nodes and fetches the `isBlocked` attribute for each result.

@@ -332,9 +319,6 @@ G.search_top_k_similar_nodes(
)
```




[{'id': 'Paul', 'distance': 0.3933879, 'name': 'Paul', 'isBlocked': False},
{'id': 'Steven', 'distance': 0.0325563, 'name': 'Steven', 'isBlocked': True}]

@@ -367,17 +351,17 @@ Let's first retrieves all "Account" nodes where the isBlocked attribute is False
```python
nodes_df = G.get_nodes(
node_type="Account",
filter_expression="s.isBlocked == False",
node_alias="s", # The alias "s" is used in filter_expression. You can remove this line since the default node alias is "s"
filter_expression='s.isBlocked == False AND s.name != "Ed"',
return_attributes=["name"],
)
print(nodes_df)
```

name
0 Paul
1 Ed
2 Scott
3 Jenny
0 Scott
1 Paul
2 Jenny


Then convert the `name` column of the retrieved DataFrame into a set of candidate IDs and perform a top-2 vector search on the "Account" node type using the "emb1" embedding attribute, restricted to those candidate IDs.