AI: Search doesn't work in Online mode #47
Comments
Ah wait, I see what you're doing there on the result page. You sub-divided pages (by paragraph?) and are grouping things, right? Now that I'm using this on my actual space it makes more sense. OK, have you considered creating a heading per page, or does that get too large? Something like
I don't think the similarity score is actually very useful to show. I assume you order them by it already?
At least this part is because I had
This one I'm not sure on. I think I understand the problem though - when in sync mode, the client calls the embedding api and gets the embeddings to search for, and actually does do the search. The problem is the client (in sync mode) has no indexed embeddings because I limited that to the server only 😅 I'll change that to show "No results" or something more obvious, but I'm not sure what the fix is. I thought adding
No, that's ignored in sync mode, because
The best you can do indeed is detect the environment and give an error, but that's not really a proper solution. Maybe there should be a version of
Another thing I ran into is performance. With silverbulletmd/silverbullet#965 we now put a 5s limit on e.g. page fetches. On my setup with 1000+ pages indexed, this just times out. Even if we somehow disable this time limit, getting no feedback for 5s+ isn't great. Not sure how, but is there a way to stream or batch results so you avoid timeouts and get some sort of feedback?
I added a
Just to make sure, you are hitting that limit on the search page? I currently get all embeddings from the index before calculating the similarity to them. I can probably split that up and only process 100 or so at a time. I'm not sure yet how I'd show that as progress on the page, but that would make it nicer.
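For illustration, here is a minimal sketch of the "process 100 or so at a time" idea. It is not the plug's actual code: `IndexedEmbedding`, `ScoredResult`, `searchInBatches`, and the `onProgress` callback are hypothetical names, and cosine similarity is assumed as the scoring function. It only covers scoring in batches with a progress hook; a real version would also fetch from the index in pages and yield between batches.

```typescript
// Hypothetical shape of an indexed embedding; not the plug's real type.
interface IndexedEmbedding {
  page: string;
  embedding: number[];
}

interface ScoredResult {
  page: string;
  similarity: number;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score the index in fixed-size batches and report progress after each batch.
function searchInBatches(
  query: number[],
  index: IndexedEmbedding[],
  batchSize = 100,
  onProgress: (processed: number, total: number) => void = () => {},
): ScoredResult[] {
  const results: ScoredResult[] = [];
  for (let start = 0; start < index.length; start += batchSize) {
    const batch = index.slice(start, start + batchSize);
    for (const item of batch) {
      results.push({
        page: item.page,
        similarity: cosineSimilarity(query, item.embedding),
      });
    }
    // A UI could update a "processed N of M" line here.
    onProgress(Math.min(start + batchSize, index.length), index.length);
  }
  // Highest similarity first.
  return results.sort((x, y) => y.similarity - x.similarity);
}
```

The progress callback is the piece that would drive feedback on the results page; how that gets rendered is left open here.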
Yes, on the search page
@zefhemel can you give it another try when you get a chance? I had to temporarily add a sleep in the loop to test with, but I think the experience should be better now.
Now I just get nothing (this is online mode):
And then nothing. No error on the server either. Just to be sure, I added a new quick note with my name in it. On the server it suggests that it generated embeddings for it, but this search page doesn't find it.
Hmm, immediately after
Assuming you're using ollama, is it accessible from both your server and client? I think the embeddings API call is going to come from the client.
To test and make sure I didn't have something cached, I just tried these steps:
My settings:
indexPage: index
libraries:
# The "Core" library is recommended for all users
- import: "[[!silverbullet.md/Library/Core/*]]"
- import: "[[!ai.silverbullet.md/Library/AICore/*]]"
ai:
  indexEmbeddings: true
  indexEmbeddingsExcludePages:
  - passwords
  indexEmbeddingsExcludeStrings:
  - "**user**:"
  - BLOOP BLARP
  indexSummaryModelName: ollama-gemma2
  indexSummary: false
  chat:
    # bakeMessages: false
    # customEnrichFunctions:
    # - enrichWithURL
  textModels:
  - name: gpt-4o
    provider: openai
    modelName: gpt-4o
  embeddingModels:
  - name: ollama-all-minilm
    modelName: all-minilm
    provider: ollama
    baseUrl: https://ollama.lan.mydomain
    requireAuth: false
It's not accessible from the client. So let me look into enabling that first in my setup.
I just pushed an update to display an error on the results page if it can't generate the embeddings, so it should at least give feedback now.
Okay, I changed it so embeddings are always generated on the server.
So right now I have my main SB instance on a public VPS, with authentication enabled of course. I also run Ollama as a separate Docker container, but it is not exposed to the internet, only to other containers, because it doesn't offer authentication out of the box. This means I can't expose it to the internet unless I deploy some auth layer on top of Ollama that is compatible with your AI plug in some way. I'm wondering if this is something to support. It could be done by proxying LLM calls through the server. That is probably doable, except for all the streaming stuff in chat I suppose 🤔 I didn't run into this before, because so far I had only used OpenAI, which is of course perfectly reachable from the client.
I think it's worth supporting if possible. Ollama's a pretty simple solution to stand up, and I bet more people will have SB on the same server as Ollama. You can put Ollama behind nginx or Caddy and require an API key that way, but it's definitely more setup. I haven't actually verified this, but I am pretty sure the current way of running calls on the server would wait for the whole response (so no streaming). Maybe the AI plug could have a new HTTP endpoint that the client always calls, and it would just stream the response back to the client without buffering it. Do you think it'd be hard to add support for streaming responses from the new HTTP event listener stuff?
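As a rough sketch of what "stream the response back without buffering" could look like using only the web-standard `Response` API: the helper below hands the upstream body stream straight to the client-facing response instead of awaiting the full text. `passThroughStream` is a hypothetical name, and whether SilverBullet's HTTP event listeners can return a streaming body is exactly the open question above, so treat this as the general pattern rather than something the plug API supports today.

```typescript
// Hypothetical helper: wrap an upstream (e.g. Ollama) streaming response so it
// can be returned to the client without reading the whole body first.
function passThroughStream(upstream: Response): Response {
  return new Response(upstream.body, {
    status: upstream.status,
    headers: {
      // Preserve the upstream content type; fall back to a generic one.
      "Content-Type":
        upstream.headers.get("Content-Type") ?? "application/octet-stream",
      // Discourage intermediaries from caching the stream.
      "Cache-Control": "no-cache",
    },
  });
}
```

In a server handler this would be used roughly as `return passThroughStream(await fetch(upstreamUrl, init))`, where `upstreamUrl` and `init` are placeholders for the provider request.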
This is a bit weird... When I run the "AI: Search" command in Online mode I get:
editor_ui.tsx:146 Error running command Function searchCommand is not available in client
Then, when I switch to Sync mode, it runs but I get no results. But when I switch back to Online mode while on the result page, I do get the results...
Although they look a bit weirdly indented: