-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Bookstore example #143
Conversation
Added requests.py with search book and search user functions. Minor testing of requests, data generation. Also polished schema and the load_data script.
Moved ordered items generation into load_data.py. Minor fixes
Added search Order functionality. Added tags hierarchy, including reasoning rule. Tags requests - work in progress. Minor fixes.
Added assigning tags of sup-tags rule. Fixed searching tags with reasoning.
PR Review ChecklistDo not edit the content of this comment. The PR reviewer should simply update this comment by ticking each review item below, as they get completed. Trivial Change
Code
Architecture
|
Added README Added comments Modified load_data.py to create/re-create DB Deleted one of the request examples Changed default DB name
Added info about the new bookstore example into the top-level readme file of the TypeDB Examples repository.
|
||
|
||
# This is a list of imported files with datasets for the DB | ||
Inputs = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feel like something we should type a bit more strongly. (We're quite big on strong typing.) How about introducing an Input
class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow - It is not used anywhere outside of loading the data. It does not need to be in the schema. It's basically just a list of .csv file names and the correlated functions to load them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can define an abstract base class Input
, with subclasses BookInput
, UserInput
, RatingInput
etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have tried to do that in a separate PR here https://github.com/izmalk/typedb-examples/pull/1/files
But I feel like the result is by far worse than the initial variant. I suggest discussing in this separate PR how to proceed. Maybe it's just a problem with my implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have rebased the PR with classes on the latest version of this repository. So it's ready to be merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setup the database - by loading the schema.
If you want I can rename it to something like load_schema
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous comment was about the setup() function. Fixed it in typedb/typedb-examples@0655495
] | ||
|
||
# This is the main body of this script | ||
with TypeDB.core_client("localhost:1729") as client: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a large number of code comments here. In general, we prefer to minimise code comments, because if the code gets refactored / changed (which we do frequently), then the comments go out of date very rapidly, and people forget about them, because they're just comments - a minor thing on the side when compared against the actual source code itself.
Some of the comments are fairly clear. For example, check the db existence
on L264 is obvious from simply reading the code, so we can just delete that comment.
On the other hand, some of the comments are necessary in order to actually make it clear what the line does. We always strive to have our code self-documenting: the code itself should describe what it does, concisely, at a high level.
For example, if check_data() == 0
, is not clear about what it means. What are we checking about the data? What does the number "0" mean?
By reading deep into the implementation of check_data
, we can figure out all these questions: but the average developer reading the code would be better placed to understand it if the method itself was clear in its naming and in its return value. For example, if has_existing_data() == False
(or more tersely, if not has_existing_data()
).
We should try to make this code more readable in the sense of "gaining a conceptual understanding, just by reading this method, of how the whole program works." Moreover, we should try to apply this throughout all of our Python code. This may be a fairly major refactor task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regards to how many comments:
I feel like we can't give too many comments here. This is not the usual code, but an example. Instead of writing some additional documentation with links to lines of code, we are using comments to annotate almost any important line. Even if it's obvious. That's important because of many reasons, including:
- We will have a lot of different users onboarding. Including some of the juniors that might be unfamiliar with even obvious code patterns and functions. Some of them might struggle with English and additional comments in simplified English might help them.
- This tutorial's goal is not only to lead newcomers through the process to create a working bookstore. But also to enable them to explore the solution. So the less time and effort they spend to understand the existing solution, the more time and motivation they will have to continue exploring their own ideas and variants. They can spend these additional resources to modify the tutorial a little bit for their intended use case. Or build something of their own. Even the slightest bump on the road can reduce their opinion of our product, motivation to explore it, and/or their time/resources budget for their own exploration. So with that comments even in the most obvious places, I'm trying to do some shortcuts here and there. it's not that these comments will obstruct anything. Please tell me if you think this is the case.
- The example itself will. be mostly static. I'm thinking of adding a few addons (like debug logging), maybe even adding an automated test (to test this example with new releases of TypeDB to get a warning even if something will get broken). But in terms of refactoring, I don't think the example will change a lot over time. Even if it will change - comments are very important here and should be double-checked just like the rest of the code.
That's why, as long as these comments are not obstructing anything, I think we should stick to this comment-rich approach in examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In regards to function names - yeah, I'll try to re-read and improve what I can. I feel like there is always room for improvement. So it might be an iterative process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rereading:
- Create a Client, check if DB exists, if not, create it and load data; if it exists, connect to it.
- If DB exists, check if it has existing data. If it does, prompt user to reload it; it not, then "setup".
The narrative is better than before - I think everything is now clear except "setup". What does setup do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loads schema. I renamed the function in typedb/typedb-examples@0655495
Co-authored-by: Alex Walker <[email protected]>
Co-authored-by: Alex Walker <[email protected]>
Fixed a bug with rating calculations - prevented division by zero. Minor refactoring
Added sorting to some requests. Just to show this ability. Fixed a bug with genre tags loading too late to correctly load genre-taging relations. Minor refactoring
Renamed functions to generate queries from templates to generate query
kebab-case in the schema
Co-authored-by: Alex Walker <[email protected]>
…ples into izmalk_bookstore_init
Renamed setup() to load_schema()
Implemented show_book function. Refactored some comments.
Created show_user function. Minor comments improvements
super-tag-ownership
Added os.path to fix a problem of starting the load_data.py with any other working dir except python/ directory.
Changed Delivery address dataset to auto-generated addresses for more realistic look.
|
||
|
||
def load_data(): # Main data load function | ||
with TypeDB.core_client("localhost:1729") as client: # Establishing connection |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's store "localhost:1729"
in config.py
. This is also good practice for readers particularly as their project matures and connects to an external host.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in typedb/typedb-examples@49cbdd1
commerce/bookstore/python/loaders.py
Outdated
self.verbose = verbose | ||
|
||
|
||
class BookInput(Loader): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When using subclassing we should generally adopt a consistent naming pattern: BookInput
is not inherently related to Loader
; BookLoader
would be.
We should rename these loaders to BookLoader
, UserLoader
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in typedb/typedb-examples@f815079
commerce/bookstore/python/loaders.py
Outdated
|
||
# This is a list of classes to import data. The order of values is important for loading data order. | ||
# Classes have filenames and corresponding methods to load the parsed data into the TypeDB | ||
Input_types_list = [GenreInput, GenreHierarchyInput, BookInput, UserInput, RatingInput, OrderInput, BookGenreInput] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per comment above we should rename this list to loaders
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the file is called loaders.py
— our reference will be loaders.loaders
that way.
That's why I suggest using Loaders_list
as a variable name. Just for visual aesthetics.
loaders.Loaders_list
More unique, accurate, and descriptive than loaders.loaders
or just loaders.list
. But I can see loaders.list
as an alternative if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After we discussed it I implemented the loaders.loaders_list
variant in typedb/typedb-examples@f815079
Before we proceed to merge this PR, we should ensure that the Bookstore example is tested in Python tests. |
I have implemented the tests. I hope that's enough for now. typedb/typedb-examples@167562a |
print("Detected DB " + config.db + ". Connecting.") | ||
if not has_existing_data(): # Most likely the DB is empty and has no schema | ||
print("Attempting to load the schema and data.") | ||
if load_schema(): # Schema has been loaded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, the calls to core_client
look a lot cleaner now that we've introduced the typedb_server_addr
config - now let's also simplify the implementation!
In the future, our Docs will contain detailed guidelines on when to create Clients, Sessions and Transactions and what their lifetimes should be. Until then, it's all our internal knowledge. And basically, a Client should, ideally, be created once in the whole lifetime of an application. That's what it's conceptually meant to represent (and it holds onto a single persistent connection with the server under-the-hood; this consumes resource, and we rarely need more than one.)
Currently, we open a client at the start of the main
method, but then we open another client in load_schema
(and another in has_existing_data
, etc.)
Let's refactor all other methods in this file to take in a client
parameter. Then, main
will create one TypeDB client, and reuse it throughout the entire lifetime of the application.
I'd then do the same thing in requests.py
. In this case, create the TypeDB client in the main
method. Of course, there is the possibility that the user just exits the application and never uses the client, but this would be an edge case scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. Fixed in typedb/typedb-examples@090229b
import config | ||
from typedb.client import TypeDB, SessionType, TransactionType | ||
|
||
class LoadDataTests(TestCase): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See discussion in Discord.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed comments in typedb/typedb-examples@1d54bc4
Also, I added the numbers. Otherwise, it's hard to decipher this comment. Too many non-numbered elements in the list
@@ -0,0 +1,8 @@ | |||
# ToDo list |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file meant to be checked in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, for the future me or future generations.
I believe It's even mentioned in the readme.
What is the goal of this PR?
As a part of my onboarding, I have prepared yet another example with TypeDB. This one is in Python. Its purpose is to demonstrate how to use Python client to TypeDB.
What are the changes implemented in this PR?
Added a directory commerce/bookstore - with all files of the example. The
README.md
file contains all necessary documentation for this example.