From 00dbee4ee6f16685cfe3665655edd37abae1cb0a Mon Sep 17 00:00:00 2001
From: Dimitris Spachos
Date: Mon, 13 Jan 2025 18:28:40 +0200
Subject: [PATCH] feat(slb-495): fix format

---
 apps/converter/README.md | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/apps/converter/README.md b/apps/converter/README.md
index 0829545d7..108702c7a 100644
--- a/apps/converter/README.md
+++ b/apps/converter/README.md
@@ -1,15 +1,20 @@
 # Silverback Converter
 
-The converter is a Node.js application designed to convert documents from various formats (DocX, PDF, and HTML) into Markdown.
+The converter is a Node.js application designed to convert documents from
+various formats (DocX, PDF, and HTML) into Markdown.
 
-This tool is particularly useful for developers and content creators who need to transform documents into a format suitable for further processing, analysis, or integration with other systems.
+This tool is particularly useful for developers and content creators who need to
+transform documents into a format suitable for further processing, analysis, or
+integration with other systems.
 
 ## Features
 
 - **DocX to Markdown**: Convert Word documents (`.docx`) to Markdown.
 - **PDF to Markdown**: Convert PDF files to Markdown.
-- **HTML to Markdown**: Extract main content from web pages and convert it to Markdown.
-- **Jina AI Integration**: Fetch and convert content using the Jina AI API. (ATTENTION: EXPERIMENTAL, DO NOT USE THIS)
+- **HTML to Markdown**: Extract main content from web pages and convert it to
+  Markdown.
+- **Jina AI Integration**: Fetch and convert content using the Jina AI API.
+  (ATTENTION: EXPERIMENTAL, DO NOT USE THIS)
 
 ## Setup and Installation
 
@@ -21,11 +26,12 @@ This tool is particularly useful for developers and content creators who need to
 ### Installation
 
 1. **Install dependencies**:
+
    ```bash
    npm i
    ```
 
-3. **Set up environment variables** (optional):
+2. **Set up environment variables** (optional):
    - Create a `.env` file in the root directory.
    - Add your Jina AI API key if you plan to use the Jina AI integration:
      ```env
@@ -35,6 +41,7 @@ This tool is particularly useful for developers and content creators who need to
 ### Running the Application
 
 To start the application, run the following command:
+
 ```bash
 npm start
 ```
@@ -46,16 +53,19 @@ The server will start on `http://localhost:3000`.
 ### Endpoints
 
 - **Convert DocX to Markdown**:
+
   ```
   GET /convert?path=/path/to/your/document.docx
   ```
 
 - **Convert PDF to Markdown**:
+
   ```
   GET /pdf-convert?path=/path/to/your/document.pdf
   ```
 
 - **Convert HTML to Markdown**:
+
   ```
   GET /html-convert?path=https://example.com
   ```
@@ -68,16 +78,21 @@ The server will start on `http://localhost:3000`.
 ### Example
 
 To convert a Word document to Markdown, make a GET request to:
+
 ```
 http://localhost:3000/convert?path=/path/to/your/document.docx
 ```
 
-The response will include the converted Markdown content, the output directory, and any warnings generated during the conversion process.
+The response will include the converted Markdown content, the output directory,
+and any warnings generated during the conversion process.
 
 ## Configuration
 
-- **Output Directory**: By default, converted files are saved in a directory named after the input file's hash. You can customize the output directory by modifying the `outputDir` variable in the respective conversion scripts.
-- **Image Handling**: Images extracted from documents are saved in an `images` subdirectory within the output directory.
+- **Output Directory**: By default, converted files are saved in a directory
+  named after the input file's hash. You can customize the output directory by
+  modifying the `outputDir` variable in the respective conversion scripts.
+- **Image Handling**: Images extracted from documents are saved in an `images`
+  subdirectory within the output directory.
 
 ## Dependencies
 