diff --git a/README.md b/README.md
index 1feb725..d886dce 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,6 @@
 # podcast-transcript-convert
 
+[![PyPI](https://img.shields.io/pypi/v/podcast-transcript-convert.svg)](https://pypi.org/project/podcast-transcript-convert/)
 [![Lint and Test](https://github.com/hbmartin/podcast-transcript-convert/actions/workflows/lint.yml/badge.svg)](https://github.com/hbmartin/podcast-transcript-tools/actions/workflows/lint.yml)
 [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
 [![Code style: black](https://img.shields.io/badge/🐧️-black-000000.svg)](https://github.com/psf/black)
@@ -9,6 +10,35 @@
 
 Convert podcast transcripts from HTML, SRT, WebVtt, Podlove etc into [PodcastIndex JSON](https://github.com/Podcastindex-org/podcast-namespace/blob/main/transcripts/transcripts.md).
 
+## Installation
+
+It is recommended to use [pipx](https://pipx.pypa.io/stable/) to install and run the CLI tool. If you wish to use the library, you can install with `pip` instead.
+
+```bash
+brew install pipx
+pipx install podcast-transcript-convert
+```
+
+## Usage
+Run the conversion app on your transcripts directory.
+
+```bash
+transcript2json transcripts/ converted/
+```
+You can then inspect the output JSON files in the `converted/` directory.
+
+## Library Usage
+```python
+from podcast_transcript_convert.convert import bulk_convert
+
+bulk_convert("transcripts_dir/", "converted_dir/")
+```
+
+Individual file type converters are in the `converters` package. You can use them directly if you know the file type.
+
+You can use `file_typing.identify_file_type(file)` to determine the file type of a transcript file.
+
+
 ## Development
 
 Pull requests are very welcome! For major changes, please open an issue first to discuss what you would like to change.