[Feature] Add Voice API support #153

Merged on Dec 31, 2023 (10 commits)

@ProgramComputer ProgramComputer commented Oct 4, 2023

As implied in #143, LWT could add support for several TTS services, especially the common ones. However, most of them require different authentication schemes, which can lead to bloat, and the service you want may still be missing.

This pull request adds a generic TTS call that users can customize for any service (Azure, Google Cloud, ...), as long as the response contains a data: URI.

Note that authorization is left to the user, who must supply any credentials in the fetch JSON.

Input is the resource to fetch, including any URL parameters. Options are the modifiers supplied to the call.

The 'lwt_text' placeholder is required in the object below.
'lwt_lang' is an optional placeholder for the language code.

Leave the text area empty to disable the Voice API; otherwise the browser's default Speech API will not run.
The format is as follows.

{
  "input": <resource to fetch>,
  "options": <fetch options>
}
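
To make the placeholder mechanism concrete, here is a minimal sketch of how such a request could be turned into arguments for fetch(). It is not LWT's actual implementation: the function names `fillPlaceholders` and `buildVoiceRequest`, the example URL, and the choice to serialize an object body with JSON.stringify are all assumptions made for illustration.

```javascript
// Hypothetical helpers, not taken from the LWT code base.

// Recursively replace the lwt_text / lwt_lang placeholders in every string value.
function fillPlaceholders(value, text, lang) {
  if (typeof value === "string") {
    return value.split("lwt_text").join(text).split("lwt_lang").join(lang);
  }
  if (Array.isArray(value)) {
    return value.map((item) => fillPlaceholders(item, text, lang));
  }
  if (value !== null && typeof value === "object") {
    const result = {};
    for (const [key, item] of Object.entries(value)) {
      result[key] = fillPlaceholders(item, text, lang);
    }
    return result;
  }
  return value; // numbers, booleans and null are left untouched
}

// Parse the saved JSON and return { input, options } ready for fetch(input, options).
function buildVoiceRequest(configJson, text, lang) {
  const config = JSON.parse(configJson);
  if (!JSON.stringify(config).includes("lwt_text")) {
    throw new Error("The request must contain the 'lwt_text' placeholder");
  }
  const filled = fillPlaceholders(config, text, lang);
  const options = { ...filled.options };
  // Assumption: fetch() needs a string body, so an object body is serialized here.
  if (options.body !== null && typeof options.body === "object") {
    options.body = JSON.stringify(options.body);
  }
  return { input: filled.input, options };
}

// Usage with a made-up endpoint (example.invalid never resolves).
const config =
  '{"input": "https://example.invalid/tts?lang=lwt_lang",' +
  ' "options": {"method": "POST", "body": {"text": "lwt_text"}}}';
const request = buildVoiceRequest(config, "こんにちは", "ja");
// fetch(request.input, request.options) would then send the call.
```

Parsing the JSON first and substituting inside the parsed values avoids breaking the JSON when the text contains quotes. Text substituted into the URL would additionally need encodeURIComponent, which this sketch leaves out.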

steps to test

In the text-to-speech settings, paste the request in JSON format into the Voice API Request text area and save.

Then, in read mode, click the audio buttons.

limitations

  • If the API has a word limit, the read button may do nothing at all, because the API returns an empty response.

  • To enable this feature, LWT TTS cookies must be allowed to be stored.

next steps

Currently only the 'lwt_text' and 'lwt_lang' placeholders are exposed for user customization, but pitch and rate could be exposed as well.

example request

Because subscription services such as Amazon Polly or Google text-to-speech require authentication, a Hugging Face space is used for the example. The following can be pasted as the text-to-speech configuration for Japanese.

{
  "input": "https://skytnt-moe-tts.hf.space/run/predict",
  "options": {
    "method": "POST",
    "body": {
      "data": [
        "lwt_text",
        "鎌倉詩桜",
        1,
        false
      ],
      "event_data": "undefined",
      "fn_index": "5"
    },
    "headers": {
      "Content-Type": "application/json"
    }
  }
}
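
Since the only requirement on the service is that its response contains a data: URI, the receiving side could look roughly like the sketch below. This is an illustration under assumptions, not the code merged in this PR: `findDataUri` and `speak` are hypothetical names, the URI is assumed to carry an explicit media type such as audio/wav, and the sample response in the last lines is made up.

```javascript
// Hypothetical helper: return the first data: URI found in a response body, or null.
function findDataUri(responseText) {
  let text = responseText;
  try {
    // Re-serializing valid JSON normalizes escaped slashes ("\/") to "/".
    text = JSON.stringify(JSON.parse(responseText));
  } catch {
    // Not JSON: search the raw text instead.
  }
  // data:<type>/<subtype> followed by ";" or ",", up to the next quote,
  // whitespace or backslash.
  const match = text.match(/data:[\w.+-]+\/[\w.+-]+[;,][^"'\s\\]*/);
  return match ? match[0] : null;
}

// Sketch of the full call. It needs a browser (fetch and Audio),
// so it is only defined here, not invoked.
async function speak(input, options) {
  const response = await fetch(input, options);
  const uri = findDataUri(await response.text());
  if (uri === null) {
    throw new Error("The response did not contain a data: URI");
  }
  await new Audio(uri).play();
}

// Made-up response body, only to show the extraction step.
const sample = '{"data": ["Success", "data:audio/wav;base64,UklGRg=="]}';
const found = findDataUri(sample);
```

Searching the whole body rather than a fixed JSON path keeps the helper independent of any one service's response layout, at the cost of picking the first data: URI if several are present.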

user contribs

If you like, reply to this thread with the parameters of a JSON request that worked for you with a particular service such as Azure, Polly, or Google text-to-speech, since authentication for some services is tedious to set up.

@ProgramComputer ProgramComputer changed the base branch from master to dev October 4, 2023 13:39
@ProgramComputer ProgramComputer changed the title Adds Voice API support and resolves #143 [Feature] Add Voice API support and resolves #143 Oct 4, 2023
@ProgramComputer ProgramComputer marked this pull request as ready for review October 4, 2023 14:43
@ProgramComputer ProgramComputer changed the title [Feature] Add Voice API support and resolves #143 [Feature] Add Voice API support Oct 4, 2023
@HugoFara HugoFara added enhancement Develop an existing feature ux User Experience could be better labels Dec 25, 2023
@HugoFara HugoFara added new-feature A new feature and removed enhancement Develop an existing feature labels Dec 27, 2023
@HugoFara HugoFara linked an issue Dec 31, 2023 that may be closed by this pull request
@HugoFara HugoFara merged commit dbbb52a into HugoFara:dev Dec 31, 2023
4 checks passed
@HugoFara
Owner

Phew, I finally managed to merge it 🥵

Instead of being saved as a cookie, the request will be saved in the database as a language entity. The automatic database update will come a bit later, but you can apply the change manually by running ALTER TABLE languages ADD COLUMN LgTTSVoiceAPI varchar(2048) NOT NULL.

I also created a discussion at #174 so that users can share their tips. Inside LWT, the documentation is minimal for now; I may expand it later (I plan on scavenging data from #174).

It's a nice feature, great job on this!

HugoFara added a commit that referenced this pull request Jan 1, 2024
ProgramComputer pushed a commit to ProgramComputer/lwt that referenced this pull request Jan 1, 2024
HugoFara added a commit that referenced this pull request Jan 2, 2024
Error documentation was also added.
HugoFara added a commit that referenced this pull request Jan 3, 2024
New database migration strategy.
Fixes feeds (#168).
Adds missing documentation to Docker (#146, #160).
Changes in PHP and JS globals.
Fixes reading position was not set.
Read text through API (#153, #155).
Fixes word was not saved/deleted.
Fixes #170 and #69.
Updates API (#175).
Adds dependency to php-xml (#178, #181).
Updates makefile (#179).
Adds MeCab support on Mac (#135).
Adds the option to hide/show word romanization (#119).
Raises URL size limit to 2048 (#144).
Labels
new-feature A new feature ux User Experience could be better
Development

Successfully merging this pull request may close these issues.

Azure TTS and Google Neural2 support for text-to-speech
2 participants