Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Final Deliverables #61

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

error9098x
Copy link
Collaborator

  • Added Chat History for Streamlit
  • Updated Human_Eval wrt new backend
  • Added auto eval - chainforge, script based
  • Added analysis -
    - discussion directly puts latest discussion dataset from hugging face
    - previously conducted evaluation visualisation
  • Added NextJs frontend

@error9098x
Copy link
Collaborator Author

apologies for the last-minute PR. It passed pycheck strict locally, but there are issues with the checks now. I'm currently in-flight to New Zealand for a week-long conference and will try to fix these issues as soon as possible. Let me know if any other updates are needed.

@luarss
Copy link
Collaborator

luarss commented Aug 28, 2024

Also you need to signoff your commits. See the DCO section for help.

edit: You need to add the frontend/requirements-test.txt file for .github/workflows/mypy.yaml CI to work.,

  frontend:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: 3.12
      - name: Install dependencies
        run: |
          pip install -r frontend/requirements-test.txt
      - name: Run MyPy
        run: |
          python -m mypy --strict frontend
      - name: Run pre-commit
        run: |
          pre-commit run --files frontend/*


@@ -0,0 +1,277 @@
import streamlit as st
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you rename the dataset files to be separated with _?

instead of Human Eval Dataset.csv -> Human_Eval_Dataset.csv

@@ -0,0 +1,15 @@
# Retriever Metrics Visualization
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file rename to README.md

## Dependencies
- streamlit
- pandas
- plotly
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are there ._***.py files? what are their purposes

@@ -0,0 +1,74 @@
from chainforge.providers import provider
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python files should not contain spaces

@@ -0,0 +1,110 @@
import requests
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filename change and put main logic under if __name__ == "__main__":

@@ -0,0 +1,66 @@
# OpenROAD Retriever Benchmark

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README.md

"response_mime_type": "text/plain",
}

safety_settings = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default value for this i am wondering. also for production we need to think whether to enable some blocking of harmful outputs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the default value for this i am wondering. also for production we need to think whether to enable some blocking of harmful outputs.

its default, cause there was issue with the safety settings, sometimes it would block certain questions without having any harmful content which hinders the evaluation script.

Copy link
Collaborator

@luarss luarss Aug 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine, but I was curious what is the default safety level. I get that we have to disable it for evaluation, but for production we might need it back since we didn't set it for our backend code.

I guess my question is can we make the evaluation mode as close to production mode as close as possible.

@@ -0,0 +1,36 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

any reason these src folders are nested in another folder orassistant-frontend? if not, prefer it to be under nextjs_frontend directly

@@ -0,0 +1,47 @@
# NextJS Frontend for ORAssistant
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

README.md


with st.chat_message('ai'):
message_placeholder = st.empty()

response_buffer = ''
# Option 1: Streaming effect - Send response in chunks
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be only one option...

@luarss luarss added the priority Priority item label Sep 5, 2024
@luarss luarss mentioned this pull request Oct 28, 2024
1 task
@luarss luarss closed this Jan 5, 2025
@luarss
Copy link
Collaborator

luarss commented Jan 6, 2025

@error9098x Please add the nextJS code in a separate PR. Thanks!

@luarss luarss reopened this Jan 6, 2025
@error9098x
Copy link
Collaborator Author

@error9098x Please add the nextJS code in a separate PR. Thanks!

Sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority Priority item
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants