.Net: VectorStore: Create Talk RAG Demo #9548
@westey-m I don't see any commit or version referred to here, so I must ask: is this about the VectorStoreRAG example? Maybe I'm missing something obvious, but I would imagine an agent doing this, while that example does the data loading and also the search by hand. Are there any components that would 1. create the embeddings in some store at startup, and 2. augment an agent with that? Or must I assemble the functionality by hand, as in the VectorStoreRAG example?
@sorin-costea, agents support templates and plugins just like the VectorStoreRAG demo app does, so a similar approach can be used to augment an agent as is used in the app. I'm not sure what you mean by "assemble the functionality by hand" — do you mind sharing more details on what you are looking for? Also note that generating embeddings can take a very long time if you have a lot of data, especially where large document sets need to be indexed. It may also be an ongoing process if data is re-indexed as it changes. There are of course off-the-shelf services available for indexing data, e.g. Azure AI Search indexers: https://learn.microsoft.com/en-us/azure/search/search-indexer-overview. For enterprise scenarios where a developer needs to build their own, I would expect this to be part of a service that has active-active characteristics, queuing, failure recovery, etc., matching the requirements of the organization.
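The two components asked about above — (1) embed documents into a store at startup, (2) retrieve and inject results when the agent runs — can be sketched framework-agnostically. This is a toy illustration, not Semantic Kernel's API: the in-memory store and trigram "embedding" are stand-ins for a real vector database and embedding model.

```python
import math

def embed(text):
    """Toy 'embedding': character-trigram counts. A real app would call
    an embedding model service here instead."""
    vec = {}
    for i in range(len(text) - 2):
        tri = text[i:i + 3].lower()
        vec[tri] = vec.get(tri, 0) + 1
    return vec

def cosine(a, b):
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryVectorStore:
    """Stand-in for a real vector database (Azure AI Search, Qdrant, ...)."""
    def __init__(self):
        self.records = []

    def upsert(self, key, text):
        self.records.append((key, text, embed(text)))

    def search(self, query, top=2):
        qv = embed(query)
        ranked = sorted(self.records, key=lambda r: cosine(qv, r[2]), reverse=True)
        return [(key, text) for key, text, _ in ranked[:top]]

# 1. Index documents at startup.
store = InMemoryVectorStore()
for key, text in [("doc1", "Semantic Kernel supports vector stores."),
                  ("doc2", "Handlebars templates render agent prompts."),
                  ("doc3", "Cats sleep most of the day.")]:
    store.upsert(key, text)

# 2. Augment the agent: retrieve, then inject the hits into the prompt.
question = "How do templates work for agents?"
hits = store.search(question)
context = "\n".join(text for _, text in hits)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

In SK terms, step 1 would be an indexing job writing to a vector store, and step 2 would be a search plugin or a template parameter on the agent; only the wiring differs.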
@westey-m I'm coming from the integration side. While I use models and their parameters, I prefer to leave that tuning work to data/AI/ML engineers and focus on building the application. So for me the first-class citizen is the agent and its tools to connect with the real world (functions, plugins, you name it). Enterprise application programming, the way I experienced it, means slapping together many components and data sources, and enterprise programmers develop and deploy mostly in Java and C#, so SK was the obvious choice. Now that AutoGen should converge, maybe I can hope for some low-boilerplate components the way crew.ai (or LangChain) offer them, just in a language my team can easily use and integrate into complex solutions (solutions beyond notebook size). Just hearing the name, I thought Foundry would be it, but alas, so I'm back to SK and sifting through library code just to figure out why a prompt function ignores the vector storage... So, shortly put: 1. I hope the documentation will catch up, and 2. I hope there will be a way with less boilerplate. Thank you!
@sorin-costea, are you using the SK agent framework today? See https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent. For vector search, we have docs showing how to turn vector search into a plugin and integrate it into a template. It uses the higher-level text search abstraction, which is described in more detail in its own docs. If there are any specific areas that you feel have too much boilerplate, it would be great to get your input. It's probably worth creating a separate issue for that, though, so it can be assigned to the folks working in that area. If you want to speak to the team directly, you can also join our office hours.
@westey-m Thank you for your time and the links. My problem is that all those examples, and by that I mean really all, are code-based: I build a function, I call it in code, I have a variable with results, I pass it to the next one. All fine and dandy, I can confirm. Enter agents. Their prompt-based indirection introduces what feels to me like an unacceptable level of randomness. Example at hand: RAG search. Yes, I can have a Handlebars prompt-based agent calling the two functions (search and augmentation), but the output of the first will be passed by the agent to the second as a text approximation, killing the Handlebars variable matching. If there's a way to do RAG without that foreach loop matching the input structures, it's not obvious where it's documented. If there's a way to store the first output in a variable of the active context, again no idea where that's documented. And anyhow, if I have to pass the variables by hand between functions, that kills the whole point of having a bunch of agents, right? PS: I'm coming here after some experience with crew.ai, which seems to have the above figured out. Or has anchored me to an incompatible approach.
@sorin-costea, thanks for the additional detail. Here are some further suggestions that I'm hoping may help.
It is possible to do the RAG search in code before the agents are configured. You can, for example, use the high-level text search interface directly from code instead of from the template. See Using a vector store with text search for an example. The output from the search can be formatted however you would like using C# code, and then this same result can be passed to each agent as a template parameter, as in the Chat Completion Agent Template example. Also CC'ing @crickman, our .NET agents expert, who might have further suggestions on how best to solve this problem using agents, and would be better placed to answer your question about a single agent answering too soon.
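The pattern suggested above — search in plain code, format the results yourself, then hand the same string to every agent as a template parameter — can be sketched outside of any framework. All names here are hypothetical stand-ins; a real SK app would use the text search abstraction and agent template parameters instead.

```python
def search(query):
    """Stand-in for a text-search call. Returns structured results
    rather than a blob of free text, so formatting stays under
    the caller's control."""
    return [
        {"name": "invoices.md", "value": "Invoices are stored per tenant."},
        {"name": "limits.md", "value": "Each tenant may hold 10k invoices."},
    ]

def format_results(results):
    """Format the structured results in plain code, so no agent has to
    re-serialize them as a lossy text approximation."""
    return "\n".join(f"[{r['name']}] {r['value']}" for r in results)

def render(template, **params):
    """Trivial {placeholder} rendering; SK templates (Handlebars etc.)
    play this role in the real framework."""
    return template.format(**params)

# Run the RAG search in code, before any agent is involved.
question = "How many invoices can a tenant hold?"
context = format_results(search(question))

# The same pre-formatted context goes to each agent as a template parameter,
# so every agent sees identical grounding text.
agent_template = "Context:\n{context}\n\nUser question: {question}"
prompt = render(agent_template, context=context, question=question)
```

The design point is that the search output never round-trips through an LLM between the two steps, which is exactly the lossy hop the previous comment complains about.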
Exactly my point. I can do many things in code, and I would like to not have to do such (repeatable?) things in code. This is possibly a wrong expectation to have, which is why I'm trying to figure out the SK roadmap... :) |
- As a user, I can run a sample which shows how to perform RAG against a PDF document (SK Learn document).
- As a developer, I can modify the sample to use a different vector database with minimal code changes, and then run the sample against multiple PDF documents.