How do I load a single file into kiara using the python API #9

Closed
caro401 opened this issue Nov 16, 2023 · 9 comments
Labels
how-to Request or outline for a how-to or tutorial type doc python API Docs about how to use kiara via Python API (in jupyter or otherwise)

Comments

@caro401
Collaborator

caro401 commented Nov 16, 2023

  • what plugin(s) do I need to have installed? Core? Onboarding? What version(s) does this explanation apply to, has anything changed recently?
  • what exactly do I write in kiara? what's the operation name, what arguments do I have to give it, what exact format are they in
    • relative paths? absolute paths? URLs? special URIs from github or something?
  • what optional things can I give to that operation? Is this a thing where aliases come to play?
  • is there anything else I need to know/think about

Yes I know some of this is in the operation docs, but I really need concrete examples of how to load a file from github and from the user's filesystem.

@caro401 caro401 added the question Further information is requested label Nov 16, 2023
@makkus
Collaborator

makkus commented Nov 16, 2023

The only 'stable' file import operation at the moment is import.local.file. It is one of the few modules included in the kiara package itself (not even in kiara_plugin.core_types), so in theory you don't need any plugin installed at all. I haven't tested running without core_types in a while, so there might be breakage; feel free to report an issue if that's the case.

The single argument is path (see kiara operation explain import.local.file). It takes either an absolute path or a path relative to the current directory. It does not support URLs or anything else non-local.
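
A minimal sketch of how one might normalise a path before handing it to import.local.file, based on the description above (absolute paths or paths relative to the current directory, nothing non-local). The helper name `to_local_import_path` is hypothetical and not part of kiara, and treating any multi-character URL scheme as "non-local" is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse


def to_local_import_path(raw: str) -> str:
    """Return an absolute path string suitable for the 'path' input.

    Hypothetical helper, not part of kiara.
    """
    # Assumption: anything with a multi-character scheme (http, https, ftp, ...)
    # is non-local. Single-letter "schemes" are let through so that Windows
    # drive letters such as C:\ are not rejected.
    scheme = urlparse(raw).scheme
    if len(scheme) > 1:
        raise ValueError(f"not a local path: {raw!r}")

    # Relative paths are resolved against the current working directory
    path = Path(raw).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    return str(path)
```

The result could then be used as the single input, for example `inputs = {"path": to_local_import_path("pyproject.toml")}`.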

what optional things can I give to that operation? Is this a thing where aliases come to play?

No, nothing.

is there anything else I need to know/think about

No, except that I'm thinking about a more generic module (in the onboarding plugin) that can take all kinds of strings and is smart enough to figure out how to retrieve the file from wherever it is. The main difficulty is finding a good module interface that works for as many cases as possible, and it might still turn out to be a bad idea in the first place.
Anyway, that module is not ready to be used, as it is expected to change quite a bit. Happy to consider any ideas anyone might have in that regard.

If you want to load files from GitHub, there is only the very alpha module in the onboarding plugin at the moment, and I wouldn't recommend using it yet. It is an area I intend to focus on in the near/medium-term future, to create a set of modules that complement each other well and are able to retrieve and import datasets from any of the potential sources we've identified.

For the type of functionality you need but that isn't ready yet, it might be a good idea to create your own plugin project and add very basic modules that do what you need. The advantage there is that with those we don't need to think too much about their interface (inputs/outputs schema) yet, and you can replace them fairly easily once there is an 'official' one. And they can be used as input to designing the 'official' one in the first place.
The other advantage is that you have full control over the module, so you don't have to worry about the module interface changing under your feet while in development.
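
As an illustration of the "do it yourself for now" approach, here is a rough sketch that fetches a remote file to a local temporary location so that its path can be given to import.local.file. It is plain Python rather than a kiara module, and the function name `fetch_to_local_file` and its parameters are hypothetical; a basic plugin module would presumably wrap something similar.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_to_local_file(url: str, target_dir: Optional[str] = None) -> str:
    """Download 'url' and return the absolute path of the local copy.

    Hypothetical helper, not part of kiara or any of its plugins.
    """
    # Keep the original file name where possible, fall back to a generic one
    file_name = Path(urlparse(url).path).name or "downloaded_file"
    directory = Path(target_dir) if target_dir else Path(tempfile.mkdtemp())
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / file_name

    # Stream the response body to disk instead of holding it all in memory
    with urlopen(url) as response, open(target, "wb") as out:
        shutil.copyfileobj(response, out)
    return str(target.resolve())
```

The returned path could then be passed as the path input of import.local.file. For a file hosted on GitHub, the URL would need to point at the raw file content rather than at the HTML page that displays it.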

@makkus
Collaborator

makkus commented Nov 16, 2023

Is this a thing where aliases come to play?

You can assign an alias to the imported file. In the CLI you'd do something like:

kiara run import.local.file path=pyproject.toml --save file=my_alias

In Python, you'd do something like:

from kiara.interfaces.python_api import KiaraAPI
from kiara.models.values.value import Value

api = KiaraAPI.instance()
inputs = {
    "path": "/home/markus/projects/kiara/kiara/pyproject.toml"
}
results = api.run_job("import.local.file", inputs=inputs)

file_result: Value = results["file"]
api.store_value(file_result, "alias_from_python")

@caro401
Collaborator Author

caro401 commented Nov 16, 2023

Can I have an example using the python API (kiara.api.KiaraAPI.instance()) rather than the CLI please?

@makkus
Collaborator

makkus commented Nov 16, 2023

Yeah, I'm about to write it up, one sec.

@makkus
Collaborator

makkus commented Nov 16, 2023

Yup, finished my comment (above). Happy to change the docstring for that function if you have any suggestions. I guess one area to write up would be the whole concept of storage, but that would probably be too much for this particular comment and needs to go into its own section in the future docs.

@makkus
Collaborator

makkus commented Nov 16, 2023

Btw, you don't need to store a value if you don't want to persist it and don't need a (human-readable) alias in the UI (often it's not necessary for temp data). The value will be available in the runtime until you restart the Python process.
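
To make the "storing is optional" point concrete, here is a small sketch wrapping the calls from the Python example earlier in this thread. The function `import_local_file` and its `alias` parameter are hypothetical; only run_job and store_value, used the same way as in that example, are taken from this thread.

```python
from typing import Any, Optional


def import_local_file(api: Any, path: str, alias: Optional[str] = None) -> Any:
    """Run 'import.local.file' and only persist the result if an alias is given.

    'api' is expected to be a KiaraAPI instance (e.g. KiaraAPI.instance()).
    Hypothetical wrapper, not part of kiara.
    """
    results = api.run_job("import.local.file", inputs={"path": path})
    file_value = results["file"]

    if alias is not None:
        # Persist the value and give it a human-readable alias
        api.store_value(file_value, alias)

    # Without an alias the value is not stored and is only available
    # until the Python process is restarted
    return file_value
```

Calling `import_local_file(api, "/some/local/file.csv")` would then leave the store untouched, which is usually enough for temporary data.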

@caro401
Collaborator Author

caro401 commented Nov 16, 2023

So the relevant things I learned here were to not use the operations in the onboarding module, and that you have to have files locally. So I've got a bunch of rewriting to do in my app prototype. I'll write this up in a how-to doc and send a PR shortly.

In the future, could you please avoid giving CLI examples? I find them really confusing and hard to follow, because CLI usage is similar to, but not quite the same as, usage via the Python API. I think the consensus from Mariella's research was that no end user wants to use the CLI, and I don't want to spend extra time documenting it.

@makkus
Collaborator

makkus commented Nov 16, 2023

Ah, and you might rather use queue_job instead of run_job if you don't want the operation to block:

import time
from kiara.interfaces.python_api import KiaraAPI

api: KiaraAPI = KiaraAPI.instance()
inputs = {
    "path": "/home/markus/projects/kiara/kiara/pyproject.toml"
}
job_id = api.queue_job("import.local.file", inputs=inputs)

job = api.get_job(job_id)

# Poll for completion. Ideally we'd have an event system of some sort, but there is none yet.
# 'finished' only holds a date once the job is done, which might be unintuitive;
# an 'is_finished()' method will be added to this object in the next version.
while job.finished is None:
    time.sleep(1)
    job = api.get_job(job_id)

results = api.get_job_result(job_id)

file_result = results["file"]
api.store_value(file_result, "alias_from_python_2")
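
The polling loop above runs forever if the job never finishes. A possible variant with a timeout is sketched below; the helper `wait_for_job` and its parameters are hypothetical and not part of kiara, and it only relies on the `finished` attribute behaving as described in the example above.

```python
import time
from typing import Any, Callable


def wait_for_job(get_job: Callable[[Any], Any], job_id: Any,
                 poll_interval: float = 1.0, timeout: float = 60.0) -> Any:
    """Poll until the job reports a finish date, or raise TimeoutError.

    'get_job' is expected to be api.get_job from the example above.
    Hypothetical helper, not part of kiara.
    """
    deadline = time.monotonic() + timeout
    job = get_job(job_id)
    # As in the example above, 'finished' stays None until the job is done
    while job.finished is None:
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)
        job = get_job(job_id)
    return job
```

It would be called as `wait_for_job(api.get_job, job_id)`, followed by `api.get_job_result(job_id)` as before. Passing the lookup function in, rather than the API object, keeps the helper easy to try out with a stand-in.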

@caro401 caro401 added python API Docs about how to use kiara via Python API (in jupyter or otherwise) how-to Request or outline for a how-to or tutorial type doc and removed question Further information is requested labels Nov 16, 2023
@makkus makkus removed their assignment Nov 28, 2023
@caro401
Collaborator Author

caro401 commented Dec 11, 2023

closed via #11

@caro401 caro401 closed this as completed Dec 11, 2023