Data Science Domain Question Answering Bot
A specialised chatbot that answers data science questions using state-of-the-art NLP models and techniques.
📂 Dataset
The bot will be trained on curated discussions from:
- Reddit: Discussions in r/datascience.
- Kaggle: Data science forum threads.
- Data Science Textbooks/Tutorials: Excerpts covering concepts, algorithms, and examples.
🔧 Tools & Libraries
Data Scraping and Processing:
- BeautifulSoup
- Requests
- Pandas
- NumPy
Machine Learning and Deep Learning:
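The scraping and cleaning step could look roughly like the sketch below. It parses an inline HTML sample rather than a live page, and the markup (the `thread`, `title`, and `comment` class names) is hypothetical: real Reddit and Kaggle pages are structured differently and would need their own selectors. In a real pipeline the HTML string would typically come from `requests.get(url, timeout=10).text`.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; actual forum pages
# use different tags and class names.
SAMPLE_HTML = """
<div class="thread">
  <h1 class="title">How do I handle class imbalance?</h1>
  <div class="comment"><p>Try <b>SMOTE</b> or class weights.</p></div>
  <div class="comment"><p>   </p></div>
  <div class="comment"><p>Stratified splits help with evaluation.</p></div>
</div>
"""


def extract_thread(html: str) -> dict:
    """Pull a question title and its non-empty answers out of one page."""
    soup = BeautifulSoup(html, "html.parser")

    title_tag = soup.select_one("h1.title")
    question = title_tag.get_text(" ", strip=True) if title_tag else ""

    answers = []
    for node in soup.select("div.comment"):
        # Join text fragments with spaces, then collapse repeated whitespace.
        text = " ".join(node.get_text(" ", strip=True).split())
        if text:  # skip comments that contain only whitespace
            answers.append(text)

    return {"question": question, "answers": answers}


thread = extract_thread(SAMPLE_HTML)
print(thread["question"])
for answer in thread["answers"]:
    print("-", answer)
```

Each returned dictionary is one question with its answers, so a list of them can be passed to `pandas.DataFrame` for deduplication and filtering before training.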
- Transformers
- spaCy
- Scikit-learn
- PyTorch
Model Architecture:
Base Models:
- BERT (ModernBERT)
Advanced Options:
- RAG (retrieval-augmented generation)
- Streamlit (web interface)
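To make the RAG option concrete, here is a minimal sketch of its retrieval step: rank stored passages against the question, then place the best match in the prompt passed to the answering model. It is not the project's implementation. It uses plain bag-of-words cosine similarity from the standard library as a stand-in for the dense embeddings a BERT-style encoder would provide, and the passages and the names `retrieve` and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for the scraped passages.
PASSAGES = [
    "Overfitting happens when a model memorises training data and generalises poorly.",
    "A p-value is the probability of data at least as extreme as observed, "
    "assuming the null hypothesis.",
    "Gradient descent updates parameters in the direction that reduces the loss.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, passages: list[str], k: int = 1) -> list[str]:
    """Return the k passages most similar to the question."""
    query = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(query, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, passages: list[str], k: int = 1) -> str:
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


top = retrieve("What is a p-value?", PASSAGES)
print(build_prompt("What is a p-value?", PASSAGES))
```

Replacing `cosine` over word counts with similarity over encoder embeddings would leave the rest of the flow unchanged, which is the main reason to keep retrieval and prompt assembly as separate functions.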
🚀 Potential Features
- Answer data science questions in natural language.
- Explain statistical concepts.
- Provide machine learning algorithm insights.
- Generate code snippets for implementation.
- Explain mathematical formulas and theories.