GGTalk is a voice-enabled chat application built with Next.js. It uses SpeechRecognition (via `react-speech-recognition`) to convert your spoken words into text, and Google Generative AI (the Gemini/PaLM API) to generate intelligent, context-aware responses. The application then speaks those responses aloud with the Web Speech Synthesis API.
## Table of Contents

- Overview
- Features
- Technologies Used
- System Requirements
- Project Structure
- Getting Started
- Configuration
- Usage
- Under the Hood
- Troubleshooting
- Deployment
- Contributing
- License
- Contact & Support
## Overview

- Name: GGTalk
- Description: Real-time speech-to-text conversation app with AI responses.
- Primary Goal: Provide an interactive, hands-free way for users to ask questions and get friendly responses.

GGTalk is ideal for quick prototyping, voice-enabled applications, or demonstrating the power of Voice Recognition combined with Generative AI.
## Features

- Voice Recognition
  - Uses `react-speech-recognition` to capture microphone input and convert it to text.
- AI-Assisted Responses
  - Integrates Google Generative AI, such as Gemini or PaLM, to understand and respond to user queries.
- Text-to-Speech
  - Leverages the built-in Web Speech Synthesis API to speak AI responses back to the user.
- Conversation Sidebar
  - A collapsible/fixed sidebar displays the chat history for easy browsing of past messages and AI responses.
- Responsive UI
  - Built with Tailwind CSS and Next.js, ensuring it works across mobile, tablet, and desktop.
- Multi-Environment
  - Supports local development and easy deployment to hosting services like Vercel, Netlify, or any Node.js-capable platform.
## Technologies Used

- Next.js (React framework)
- Tailwind CSS (utility-first CSS framework)
- React Speech Recognition (`react-speech-recognition`) for capturing voice input
- Google Generative AI (`@google/generative-ai`) for AI responses
- Web Speech API for speech synthesis (text-to-speech)
## System Requirements

- Node.js: `^14.17.0` or newer (recommended: `^16.0.0`)
- npm: `^6.0.0` or Yarn: `^1.22.0`
- Modern Browser: must support the Web Speech API (for best results, use Chrome)

(Note: Safari and Firefox have partial or experimental support for the Web Speech API. Check caniuse.com.)
## Project Structure

A typical Next.js + Tailwind + AI integration structure:

```
GGTalk/
├── components/
│   └── ConversationPage.js   # Core AI + speech logic
├── pages/
│   ├── index.js              # Renders ConversationPage (or any custom UI)
│   └── _app.js               # Next.js root App, global styles
├── public/                   # Public assets (images, etc.)
├── .env.local                # Local environment variables (gitignored)
├── tailwind.config.js        # Tailwind configuration
├── package.json
├── README.md                 # This file
└── ...                       # Other config files (optional)
```
## Getting Started

- Clone the Repository

  ```bash
  git clone https://github.com/SpandanM110/GGTalk.git
  cd GGTalk
  ```

- Install Dependencies

  ```bash
  npm install   # or: yarn install
  ```

- Set Up Environment Variables
  - You must have a `NEXT_PUBLIC_GEMINI_API_KEY` for Google Generative AI.
  - See Configuration below.

- Run in Development

  ```bash
  npm run dev   # or: yarn dev
  ```

  By default, the app is served at http://localhost:3000.

- Open in Browser
  - Go to http://localhost:3000.
  - You should see the GGTalk UI with a big microphone animation.
## Configuration

Create a `.env.local` file (automatically ignored by Git) in the root directory:

```bash
# .env.local
NEXT_PUBLIC_GEMINI_API_KEY=YOUR_GOOGLE_GENERATIVE_AI_KEY
```

- `NEXT_PUBLIC_GEMINI_API_KEY`: your public environment variable for the `@google/generative-ai` library. Note: the `NEXT_PUBLIC_` prefix is required for Next.js to expose it to the frontend.

Security Note:

- The Google Generative AI key is somewhat sensitive, but for browser-based apps, it inevitably becomes public. Consider usage quotas or domain restrictions in your Google Cloud Console to protect it from abuse.
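Reading the key once at startup and failing fast when it is missing can save debugging time. Here is a minimal sketch; the helper name and the commented-out client setup are illustrative, not the app's actual code:

```javascript
// Hypothetical helper: validate the key before constructing an AI client.
// The env object is a parameter so the check is easy to exercise in tests.
function getApiKey(env = process.env) {
  const key = env.NEXT_PUBLIC_GEMINI_API_KEY;
  if (!key) {
    throw new Error(
      "Missing NEXT_PUBLIC_GEMINI_API_KEY - add it to .env.local (see Configuration)"
    );
  }
  return key;
}

// With a valid key, client setup with @google/generative-ai looks roughly like:
//   const genAI = new GoogleGenerativeAI(getApiKey());
//   const model = genAI.getGenerativeModel({ model: "gemini-pro" });
// Check the docs of the package version pinned in package.json for the exact API.
```

Throwing early, rather than letting an undefined key reach the API call, turns a vague network error into an actionable message.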
## Usage

- Launching:
  - From the root of your project, run `npm run dev` (or `yarn dev`).
  - Navigate to http://localhost:3000.
- Start/Stop Listening:
  - Click the Start button to begin capturing microphone input.
  - Speak your query or message.
  - GGTalk detects when you stop speaking and, after a short delay, sends the transcribed text to the AI.
- Getting AI Responses:
  - The AI responds with a text message that GGTalk speaks aloud to you using the Web Speech API.
  - This text is also appended to the conversation in the sidebar.
- Show/Hide Conversations:
  - Click Show Conversations (top-left) to open the conversation sidebar.
  - Click Hide Conversations to close it again, saving screen space.
- Stop:
  - At any time, click the Stop button to end continuous listening.
- Expand/Collapse (Optional):
  - If you have “expandedAll” logic, you can view the entire conversation text or just a snippet.
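Conceptually, each turn of the conversation shown in the sidebar is just an object appended to an array held in React state. A sketch of that idea follows; the field names are illustrative, and `ConversationPage.js` may use a different shape:

```javascript
// Illustrative message-history helper. Returning a new array (rather than
// mutating the old one) is what lets a React state update re-render the sidebar.
function appendMessage(history, role, text) {
  return [...history, { role, text, at: Date.now() }];
}
```

In the component this would typically be used as `setHistory((h) => appendMessage(h, "user", transcript))`.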
## Under the Hood

- Speech to Text:
  - React Speech Recognition uses the browser’s native `SpeechRecognition` API.
  - The recognized text is stored in a `transcript` variable, and once idle for ~6 seconds, GGTalk sends it to the AI.
- AI Calls:
  - Using `@google/generative-ai`, GGTalk connects to the Gemini or PaLM API. Your `NEXT_PUBLIC_GEMINI_API_KEY` is used here.
  - The AI returns a response, which is then appended to the chat history.
- Text-to-Speech:
  - `SpeechSynthesis` is used to speak the AI’s response.
  - In the code, you can adjust `.rate`, `.voice`, and `.pitch` if you want different speaking styles.
- Sidebar:
  - A fixed or collapsible sidebar lists all message objects: user messages vs. AI messages.
  - Responsive widths for mobile (`w-full`) vs. desktop (`sm:w-72`, etc.).
- Responsive Design:
  - Tailwind classes like `sm:w-64`, `sm:text-lg`, `fixed top-4 left-4`, and more adapt the UI across screen sizes.
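The "wait until the speaker goes idle, then send" rule above can be sketched as plain logic. This is an illustrative reconstruction, not the app's actual code: timestamps are injected as arguments so the rule is easy to test, whereas the real component would drive it from timers around `react-speech-recognition`'s live `transcript`:

```javascript
// Illustrative idle detector for the ~6-second rule.
const IDLE_MS = 6000;

function createIdleSender(onSend, idleMs = IDLE_MS) {
  let lastTranscript = "";
  let lastChangeAt = 0;

  return {
    // Call whenever the transcript may have changed (e.g. on each render).
    update(transcript, nowMs) {
      if (transcript !== lastTranscript) {
        lastTranscript = transcript;
        lastChangeAt = nowMs;
      }
    },
    // Call periodically (e.g. from setInterval); fires onSend once idle.
    tick(nowMs) {
      if (lastTranscript && nowMs - lastChangeAt >= idleMs) {
        const text = lastTranscript;
        lastTranscript = ""; // reset so the same utterance is not re-sent
        onSend(text);
      }
    },
  };
}
```

Resetting the stored transcript after sending prevents the same utterance from being submitted twice while the user pauses.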
## Troubleshooting

- Mic Not Detected:
  - Ensure your browser has permission to access the microphone.
  - If on Chrome, check `chrome://settings/content/microphone`.
  - If on Safari, enable Web Speech in experimental features or use an alternative browser.
- AI Not Responding:
  - Check that your API key is correct in `.env.local`.
  - Inspect the browser console or terminal logs for error messages from the AI endpoint.
- Text-to-Speech Not Working:
  - Some browsers require user interaction (a click/tap) before speaking can be triggered.
  - Make sure your speaker volume is on.
- Styles Not Loading:
  - Verify `tailwind.config.js` is properly set up and that you have imported the Tailwind styles (e.g., in `globals.css` or `_app.js`).
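If speech output is silent even with permissions granted, out-of-range settings are one possible culprit: the Web Speech API constrains `rate` to 0.1–10, `pitch` to 0–2, and `volume` to 0–1. A defensive wrapper like the following (a sketch with illustrative names, not GGTalk's actual code) keeps settings valid and avoids crashing outside the browser:

```javascript
// Clamp TTS settings to the ranges the Web Speech API allows.
function clampTtsSettings({ rate = 1, pitch = 1, volume = 1 } = {}) {
  const clamp = (v, lo, hi) => Math.min(hi, Math.max(lo, v));
  return {
    rate: clamp(rate, 0.1, 10),
    pitch: clamp(pitch, 0, 2),
    volume: clamp(volume, 0, 1),
  };
}

// Browser-only: guard so the module can also be loaded during SSR/tests.
function speak(text, settings = {}) {
  if (typeof window === "undefined" || !window.speechSynthesis) return;
  const u = new SpeechSynthesisUtterance(text);
  Object.assign(u, clampTtsSettings(settings));
  window.speechSynthesis.speak(u);
}
```

The `typeof window` guard matters in Next.js, where components can also execute on the server during rendering.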
## Deployment

### Vercel

- Create a Vercel account (if you haven’t already).
- Import the GGTalk GitHub repo into Vercel.
- In your Vercel Project Settings, add the environment variable:
  `NEXT_PUBLIC_GEMINI_API_KEY=YOUR_KEY`
- Click Deploy. Vercel automatically handles building and hosting Next.js apps.

### Netlify

- Install the Next on Netlify plugin if needed.
- Create a new Netlify site from your GitHub repo.
- In the site settings, add environment variables under Build & Deploy → Environment.
- Netlify will build your Next.js site with the configured environment vars.

### Other Node.js Hosts

- Run `npm run build`, then `npm run start` on any VPS or Node-friendly platform.
- Just make sure to set your environment variables on the server.
## Contributing

Contributions are welcome! To contribute:

- Fork the repository.
- Create a new feature branch: `git checkout -b feature/awesome-change`
- Commit your changes: `git commit -m "Add awesome feature"`
- Push to your fork: `git push origin feature/awesome-change`
- Open a Pull Request against the `main` branch.

We appreciate fixes, new features, or documentation improvements!
## License

See the LICENSE file for details.
## Contact & Support

- Author: @SpandanM110
- GitHub: GGTalk Repo

For bugs or feature requests, please open an issue on the GitHub repo. If you need further assistance, feel free to reach out via GitHub or email (if provided).

Thank you for using GGTalk! Let your voice do the talking.