Hey guys.

I am working with the RAG AI plugin and have successfully gotten it working with AWS Bedrock. I am very new to the LLM space, and I wanted to ask the community if anyone has advice on tweaking the parameters to get better results when asking the AI certain things about the catalog.

Do you have any combinations of embedding models / prompt models that work well with specific parameters for embedding, chunking, etc.? With my current setup the prompt responses are quite useless, with a lot of hallucination and generally incorrect information.

My main use case is to reliably retrieve information from API definitions to generate some basic code snippets, or just to get information about how some entities are related to each other. Currently, when generating embeddings, I run into a lot of throttling exceptions even for a single entity in Backstage. I am unsure how to handle that particular problem; I guess far too many embedding requests are being generated for that single entity.
Here is a sample of the config I am currently running with:
```yaml
# Roadie RAG AI configuration
ai:
  supportedSources: ['catalog', 'tech-docs']
  storage:
    pgvector:
      # (Optional) The size of the chunk to flush when storing embeddings to the DB. Defaults to 500
      chunksize: 800
  embeddings:
    chunkSize: 800
    # (Optional) The overlap between adjacent chunks of embeddings. The bigger the number, the more overlap. Defaults to 200
    chunkOverlap: 700
    bedrock:
      # (Required) Name of the Bedrock model to use to create embeddings.
      modelName: 'amazon.titan-embed-text-v1'
      maxTokens: 1024
      maxRetries: 10
```
In general, the AWS Titan models are not quite up to par at the moment compared to other models. As a starting point, I would suggest enabling and picking one of the other models that Bedrock provides. To create better and more relevant embeddings, and better responses based on them, it is recommended to enhance the current pipeline implemented in the plugin sources.
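For example, a minimal config tweak along these lines (the alternative model IDs here are assumptions; check which models are actually enabled in your AWS account and region before using them):

```yaml
ai:
  embeddings:
    bedrock:
      # Assumed alternatives to amazon.titan-embed-text-v1; verify
      # availability in your Bedrock console before use.
      modelName: 'cohere.embed-english-v3' # or 'amazon.titan-embed-text-v2:0'
      maxTokens: 1024
      maxRetries: 10
```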
You'd likely want to categorize your embeddings differently and determine the correct embeddings to retrieve by routing between AugmentationRetriever implementations with a RetrievalRouter. This helps narrow down the actual items you send as context to the LLM. For API specs, for example, it makes sense to create embeddings only for API-type entities and to build retriever-selection logic based on the query the user is asking. That selection could be a small local model that assists in choosing which RetrievalRouter implementation to use, or better yet, letting the user choose which one to use, or hardcoding it if only a single type is wanted. A rough sketch of this routing idea follows below.
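This is a minimal sketch only: the RetrievalRouter and AugmentationRetriever names come from the plugin, but the interface shapes, the class name, and the keyword heuristic below are assumptions for illustration, not the plugin's actual API.

```typescript
// Sketch only: these interface shapes are assumed for illustration and may
// not match the actual RAG AI plugin API.
interface RetrievedDoc {
  content: string;
  metadata: { entityKind?: string; source: string };
}

interface AugmentationRetriever {
  id: string;
  retrieve(query: string): Promise<RetrievedDoc[]>;
}

// Hypothetical router that picks a retriever based on the user's query.
class KeywordRetrievalRouter {
  constructor(
    private readonly apiSpecRetriever: AugmentationRetriever,
    private readonly defaultRetriever: AugmentationRetriever,
  ) {}

  // Naive heuristic: route API-looking questions to a retriever that only
  // searches embeddings created from API-kind entities. A small local
  // model (or a user-facing toggle) could replace this keyword check.
  route(query: string): AugmentationRetriever {
    const apiHints = /\b(api|endpoint|openapi|swagger|request|response)\b/i;
    return apiHints.test(query) ? this.apiSpecRetriever : this.defaultRetriever;
  }

  async retrieve(query: string): Promise<RetrievedDoc[]> {
    return this.route(query).retrieve(query);
  }
}
```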
Additionally, you more than likely want to post-process the embeddings you have retrieved so they match the actual contents you want to feed to the LLM. This could mean determining the correct parts of the API spec (or the full spec if needed) instead of passing along only the cut-down snippets.
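A hedged sketch of that post-processing step, with an assumed chunk shape (adjust to however your pipeline actually structures its retrieved documents): score the retrieved chunks against the query and keep only the most relevant ones before building the prompt context.

```typescript
// Sketch only: assumes retrieved chunks expose the raw spec fragment on
// `content`; the shape and scoring are illustrative, not the plugin's API.
type RetrievedChunk = { content: string };

function postProcessApiChunks(
  query: string,
  chunks: RetrievedChunk[],
  maxChunks = 5,
): RetrievedChunk[] {
  const terms = query
    .toLowerCase()
    .split(/\W+/)
    .filter(term => term.length > 3);

  return chunks
    .map(chunk => ({
      chunk,
      // Crude relevance score: how many query terms appear in the chunk.
      score: terms.filter(term => chunk.content.toLowerCase().includes(term))
        .length,
    }))
    .filter(({ score }) => score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxChunks)
    .map(({ chunk }) => chunk);
}
```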