diff --git a/AUTHORS.rst b/AUTHORS.rst
index 9517329..992df85 100644
--- a/AUTHORS.rst
+++ b/AUTHORS.rst
@@ -3,3 +3,4 @@ Contributors
 ============

 * Matthew Watkins
+* David Besslich
diff --git a/README.rst b/README.rst
index a6de362..0c829bb 100644
--- a/README.rst
+++ b/README.rst
@@ -4,21 +4,19 @@ On June 26 2024, Linux Foundation announced the merger of its financial services
 =====================================================================
-OSC Data Extractor Pre-Steps
+OSC Transformer Pre-Steps
 =====================================================================

 |osc-climate-project| |osc-climate-slack| |osc-climate-github| |pypi| |build-status| |pdm| |PyScaffold|

-OS-Climate Data Extraction Tool
-===============================
+OS-Climate Transformer Pre-Steps Tool
+=====================================

 .. _notes:

-This code provides you with an api and a streamlit app to which you
-can provide a pdf document and the output will be the text content in a json format.
-In the backend it is using a python module for extracting text from pdfs, which
-might be extended in the future to other file types.
-The json file is needed for later usage in the context of transformer models
+This code provides you with a CLI tool to extract data from
+a PDF into a JSON document and to create a training data set for later usage in the
+context of transformer models
 to extract relevant information, but it can also be used independently.

 Quick start
@@ -39,53 +37,55 @@
 We are using typer to have a nice CLI tool here. All details and help will be shown in the
 tool itself and are not described here in more detail.
-Install via Github Repository
+Developer space
+===============
+
+Use code directly without CLI via GitHub Repository
------------------------------
+---------------------------------------------------
-For a quick start with the tool install python and clone the repository to your local environment::
+First clone the repository to your local environment::

     $ git clone https://github.com/os-climate/osc-transformer-presteps

-Afterwards update your python to the requirements (possible for example
-via pdm update) and start a local api server via::
+We are using pdm to manage the packages and tox for a stable test framework.
+Hence, first install pdm (possibly in a virtual environment) via::
+
+    $ pip install pdm

-    $ python ./src/run_server.py
+Afterwards sync your system via::

-**Note**:
-  * We assume that you are located in the cloned repository.
-  * To check if it is running open "http://localhost:8000/liveness" and you should see the
-    message {"message": "OSC Transformer Pre-Steps Server is running."}.
+
+    $ pdm sync

-Finally, run the following code to start a streamlit app which opens up the possibility
-to "upload" a file and extract data from pdf to json via this UI. Note that the UI needs
-the running server so you have to open the streamlit and the server in two different
-terminals.::
+Now you have multiple demos on how to go on. See the
+`demo <demo>`_ folder.

-    $ streamlit run ./src/osc_transformer_presteps/streamlit/app.py
+pdm
+-----------------------------

-**Note**: Check also docs/demo. There you can
-find local_extraction_demo.py which will start an extraction
-without any API call and then there is post_request_demo.py
-which will send a file to the API (of course you have to start
-server as above first).
+For adding new dependencies use pdm. You could add new packages via pdm add.
+For example numpy via::
+
+    $ pdm add numpy

-Developer Notes
-===============
-For adding new dependencies use pdm. First install via pip::
+For a very detailed description check the homepage of the pdm project:

-    $ pip install pdm
+https://pdm-project.org/en/latest/

-And then you could add new packages via pdm add. For example numpy via::
+tox
+-----------------------------

-    $ pdm add numpy
-For running linting tools just to the following::
+For running linting tools we use tox, which you run outside of your virtual environment::

     $ pip install tox
     $ tox -e lint
     $ tox -e test

+This will automatically apply some checks on your code and run the provided pytests. See
+more details on tox on the homepage of the tox project:
+
+https://tox.wiki/en/4.16.0/
+
 .. |osc-climate-project| image:: https://img.shields.io/badge/OS-Climate-blue
     :alt: An OS-Climate Project
diff --git a/demo/README.rst b/demo/README.rst
new file mode 100644
index 0000000..e3f4c14
--- /dev/null
+++ b/demo/README.rst
@@ -0,0 +1,62 @@
+=====================================================================
+DEMO Scripts Overview
+=====================================================================
+
+.. _notes:
+
+In this folder you can find multiple demo scripts on how to use the python scripts in
+different ways besides the *normal* CLI tool.
+
+**Note**:
+
+* We assume that you are located in an environment where you have
+  already installed the necessary requirements (see initial readme).
+
+* The demos are not part of the tox setup and the tests. Hence, it might be that some
+  packages or code parts are outdated. These are just ideas on how to use the code and
+  not prod ready. Feel free to inform us nevertheless if you encounter issues with the demos.
+
+
+extraction_api
+....................
+
+This demo is an implementation of the code via FastAPI. In api.py the API is created and the
+extraction route is built in extract.py. To start the server run::
+
+    $ python demo/extraction_api/api.py
+
+Then the server will run and you can test in your browser that it worked at:
+
+http://localhost:8000/liveness
+
+You should see the message {"message": "OSC Transformer Pre-Steps Server is running."}.
+
+extraction
+....................
+
+This demo has two parts to extract data from the input folder to the output folder.
+
+a) The post_request_extract.py uses the API endpoint from extraction_api to send a
+file to the API via a post request and receives the output via an API response. The file
+you want to extract can be entered on the command line::
+
+    $ python demo/extraction/post_request_extract.py
+
+b) The local_extraction_demo.py runs the extraction code directly for the Test.pdf file.
+If you want to use another file you have to change that in the code.
+
+extraction_streamlit
+....................
+
+This is an example implementation of a streamlit app which opens up the possibility
+to "upload" a file and extract data from pdf to json. Note that the UI needs
+the running server from extraction_api and so you have to open the streamlit app
+and the server in two different terminals. An example file to upload can be found in
+"/demo/extraction/input". You can start the streamlit app via::
+
+    $ streamlit run ./src/osc_transformer_presteps/extraction_streamlit/app.py
+
+curation
+....................
+
+T.B.D.
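As a standard-library-only companion to the extraction_api demo above, the liveness check can be sketched as below. This is a hedged sketch, not part of the patch: `check_liveness` is a hypothetical helper name, and it assumes the demo server from demo/extraction_api is running on localhost:8000 (it degrades to a short note when the server is not reachable).

```python
# Hedged sketch: poll the /liveness route of the demo FastAPI server.
# "check_liveness" is a hypothetical helper, not part of the repository.
import json
from urllib.error import URLError
from urllib.request import urlopen


def check_liveness(url: str = "http://localhost:8000/liveness") -> str:
    """Return the server's liveness message, or a short note when unreachable."""
    try:
        with urlopen(url, timeout=2) as resp:
            # The demo server answers {"message": "OSC Transformer Pre-Steps Server is running."}
            return json.load(resp)["message"]
    except (URLError, OSError):
        return "server not reachable"


print(check_liveness())
```

The same probe works from a browser or curl; the helper above is only convenient when scripting against the demo API.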
diff --git a/docs/demo/curation/input/kpi_mapping.csv b/demo/curation/input/kpi_mapping.csv similarity index 100% rename from docs/demo/curation/input/kpi_mapping.csv rename to demo/curation/input/kpi_mapping.csv diff --git a/docs/demo/curation/input/test_annotations.xlsx b/demo/curation/input/test_annotations.xlsx similarity index 100% rename from docs/demo/curation/input/test_annotations.xlsx rename to demo/curation/input/test_annotations.xlsx diff --git a/docs/demo/curation/local_cuartion_demo.py b/demo/curation/local_cuartion_demo.py similarity index 100% rename from docs/demo/curation/local_cuartion_demo.py rename to demo/curation/local_cuartion_demo.py diff --git a/docs/demo/curation/output/Test.csv b/demo/curation/output/Test.csv similarity index 100% rename from docs/demo/curation/output/Test.csv rename to demo/curation/output/Test.csv diff --git a/docs/demo/extraction/input/Test.pdf b/demo/extraction/input/Test.pdf similarity index 100% rename from docs/demo/extraction/input/Test.pdf rename to demo/extraction/input/Test.pdf diff --git a/docs/demo/extraction/input/test-2.pdf b/demo/extraction/input/test-2.pdf similarity index 100% rename from docs/demo/extraction/input/test-2.pdf rename to demo/extraction/input/test-2.pdf diff --git a/docs/demo/extraction/local_extraction_demo.py b/demo/extraction/local_extraction_demo.py similarity index 100% rename from docs/demo/extraction/local_extraction_demo.py rename to demo/extraction/local_extraction_demo.py diff --git a/docs/demo/extraction/output/Test.json b/demo/extraction/output/Test.json similarity index 100% rename from docs/demo/extraction/output/Test.json rename to demo/extraction/output/Test.json diff --git a/docs/demo/extraction/post_request_extract.py b/demo/extraction/post_request_extract.py similarity index 90% rename from docs/demo/extraction/post_request_extract.py rename to demo/extraction/post_request_extract.py index 9faf166..5f70bde 100644 --- a/docs/demo/extraction/post_request_extract.py +++ 
b/demo/extraction/post_request_extract.py @@ -1,4 +1,8 @@ -"""Python Script for locally running extraction on FastAPI.""" +"""Python Script for locally running extraction on FastAPI. + +Note: To make the following demo work you first have to start the server in the folder demo/extraction_api! + +""" import json from pathlib import Path diff --git a/src/osc_transformer_presteps/api/__init__.py b/demo/extraction_api/__init__.py similarity index 100% rename from src/osc_transformer_presteps/api/__init__.py rename to demo/extraction_api/__init__.py diff --git a/src/osc_transformer_presteps/api/api.py b/demo/extraction_api/api.py similarity index 90% rename from src/osc_transformer_presteps/api/api.py rename to demo/extraction_api/api.py index a03c71a..5b1957f 100644 --- a/src/osc_transformer_presteps/api/api.py +++ b/demo/extraction_api/api.py @@ -5,9 +5,9 @@ import uvicorn from fastapi import APIRouter, FastAPI from starlette.responses import RedirectResponse +from server_settings import ExtractionServerSettings -from osc_transformer_presteps.api.extract import router as extraction_router -from osc_transformer_presteps.settings import ExtractionServerSettings +from extract import router as extraction_router _logger = logging.getLogger(__name__) diff --git a/src/osc_transformer_presteps/api/extract.py b/demo/extraction_api/extract.py similarity index 100% rename from src/osc_transformer_presteps/api/extract.py rename to demo/extraction_api/extract.py diff --git a/demo/extraction_api/server_settings.py b/demo/extraction_api/server_settings.py new file mode 100644 index 0000000..6a2d600 --- /dev/null +++ b/demo/extraction_api/server_settings.py @@ -0,0 +1,48 @@ +from pydantic import BaseModel +from enum import Enum +import logging + + +class LogLevel(str, Enum): + """Class for different log levels.""" + + critical = "critical" + error = "error" + warning = "warning" + info = "info" + debug = "debug" + notset = "notset" + + +_log_dict = { + "critical": logging.CRITICAL, + 
"error": logging.ERROR, + "warning": logging.WARNING, + "info": logging.INFO, + "debug": logging.DEBUG, + "notset": logging.NOTSET, +} + + +class ExtractionServerSettingsBase(BaseModel): + """Class for Extraction server settings.""" + + port: int = 8000 + host: str = "localhost" + log_type: int = 20 + log_level: LogLevel = LogLevel("info") + + +class ExtractionServerSettings(ExtractionServerSettingsBase): + """Settings for configuring the extraction server. + + This class extends `ExtractionServerSettingsBase` and adds additional + logging configuration. + """ + + def __init__(self, **data) -> None: + """Initialize the ExtractionServerSettings.""" + if "log_level" in data: + data["log_level"] = LogLevel(data["log_level"]) + super().__init__(**data) + self.log_type: int = _log_dict[self.log_level.value] diff --git a/demo/extraction_api/temp_storage/.gitkeep b/demo/extraction_api/temp_storage/.gitkeep new file mode 100644 index 0000000..e69de29 diff --git a/src/osc_transformer_presteps/streamlit/app.py b/demo/extraction_streamlit/app.py similarity index 88% rename from src/osc_transformer_presteps/streamlit/app.py rename to demo/extraction_streamlit/app.py index 34dba9c..1214a89 100644 --- a/src/osc_transformer_presteps/streamlit/app.py +++ b/demo/extraction_streamlit/app.py @@ -27,10 +27,6 @@ if st.button("Extract data"): st.info("Extraction started") file_bytes = input_file.getvalue() - liveness = requests.get( - url="http://localhost:8000/liveness", proxies={"http": "", "https": ""} - ) - st.info(f"Liveness Check: {liveness.status_code}") file_upload = requests.post( url="http://localhost:8000/extract", files={"file": (input_file.name, file_bytes)}, diff --git a/src/osc_transformer_presteps/content_extraction/extraction_factory.py b/src/osc_transformer_presteps/content_extraction/extraction_factory.py index 7eaf363..c3b805b 100644 --- a/src/osc_transformer_presteps/content_extraction/extraction_factory.py +++ 
b/src/osc_transformer_presteps/content_extraction/extraction_factory.py @@ -38,12 +38,12 @@ def get_extractor( Args: ---- - - extractor_type (str): Type of extractor to be retrieved - - settings: Settings specific to the extractor + extractor_type (str): Type of extractor to be retrieved + settings: Settings specific to the extractor Returns: ------- - - BaseExtractor: Instance of the specified extractor type + BaseExtractor: Instance of the specified extractor type """ _logger.info("The extractor type is: " + extractor_type) diff --git a/src/osc_transformer_presteps/content_extraction/extractors/base_extractor.py b/src/osc_transformer_presteps/content_extraction/extractors/base_extractor.py index 9d0bcd2..501220d 100644 --- a/src/osc_transformer_presteps/content_extraction/extractors/base_extractor.py +++ b/src/osc_transformer_presteps/content_extraction/extractors/base_extractor.py @@ -21,9 +21,9 @@ class _BaseSettings(BaseModel): min_paragraph_length (int)(Optional): Minimum alphabetic characters for paragraph, any paragraph shorter than that will be disregarded. annotation_folder (str)(Optional): path to the folder containing all annotated - excel files. If provided, just the pdfs mentioned in annotation excels are + Excel files. If provided, just the pdfs mentioned in annotation excels are extracted. Otherwise, all the pdfs in the pdf folder will be extracted. - skip_extracted_files (bool)(Optional): whether to skip extracting a file if it exist in the extraction folder. + skip_extracted_files (bool)(Optional): whether to skip extracting a file if it exists in the extraction folder. 
""" annotation_folder: Optional[str] = None @@ -59,7 +59,7 @@ def __init__(self, settings: Optional[dict] = None): self._settings: dict = settings_base def __init_subclass__(cls, **kwargs): - """Intialize the subclass.""" + """Initialize the subclass.""" super().__init_subclass__(**kwargs) if cls.extractor_name == "base": raise ValueError( @@ -142,7 +142,7 @@ def extract( raise ExtractionError( f"While doing the extraction we faced the following error:\n " f"{repr(e)}.\n Trace to the error is given by:\n {traceback_str}" - ) + ) from e @abstractmethod def _generate_extractions( diff --git a/src/osc_transformer_presteps/settings.py b/src/osc_transformer_presteps/settings.py index c3c66e4..272dfe1 100644 --- a/src/osc_transformer_presteps/settings.py +++ b/src/osc_transformer_presteps/settings.py @@ -27,30 +27,6 @@ class LogLevel(str, Enum): } -class ExtractionServerSettingsBase(BaseModel): - """Class for Extraction server settings.""" - - port: int = 8000 - host: str = "localhost" - log_type: int = 20 - log_level: LogLevel = LogLevel("info") - - -class ExtractionServerSettings(ExtractionServerSettingsBase): - """Settings for configuring the extraction server. - - This class extends `ExtractionServerSettingsBase` and adds additional - logging configuration. - """ - - def __init__(self, **data) -> None: - """Initialize the ExtractionServerSettings.""" - if "log_level" in data: - data["log_level"] = LogLevel(data["log_level"]) - super().__init__(**data) - self.log_type: int = _log_dict[self.log_level.value] - - class ExtractionSettings(BaseModel): """Settings for controlling extraction behavior. 
diff --git a/src/osc_transformer_presteps/streamlit/__init__.py b/src/osc_transformer_presteps/streamlit/__init__.py deleted file mode 100644 index 9dd09fe..0000000 --- a/src/osc_transformer_presteps/streamlit/__init__.py +++ /dev/null @@ -1 +0,0 @@ -"""Module for Streamlit app.""" diff --git a/tests/osc_transformer_presteps/content_extraction/extractors/test_base_extractor.py b/tests/osc_transformer_presteps/content_extraction/extractors/test_base_extractor.py index 3e1ee06..3caf75f 100644 --- a/tests/osc_transformer_presteps/content_extraction/extractors/test_base_extractor.py +++ b/tests/osc_transformer_presteps/content_extraction/extractors/test_base_extractor.py @@ -1,3 +1,5 @@ +"""Module to test the base_extractor.py.""" + from pathlib import Path from typing import Optional @@ -10,7 +12,7 @@ def concrete_base_extractor(name: str): - """This function replaces all abstract methods by concrete ones.""" + """Replace all abstract methods by concrete ones.""" class ConcreteBaseExtractor(BaseExtractor): extractor_name = name @@ -25,14 +27,15 @@ def _generate_extractions( class TestBaseExtractor: + """Class to collect tests for the BaseExtractor.""" + @pytest.fixture() def base_extractor(self): + """Initialize a concrete BaseExtractor element to test it.""" return concrete_base_extractor("base_test") def test_extractor_name_is_base(self): - """This function tests if we get a ValueError in case a subclass has not changed extractor_name to - something different base. 
- """ + """Tests if we get a ValueError in case a subclass has not changed extractor_name.""" with pytest.raises( ValueError, match="Subclass must define an extractor_name not equal to 'base'.", @@ -40,6 +43,7 @@ def test_extractor_name_is_base(self): concrete_base_extractor("base") def test_get_settings(self, base_extractor): + """Test if retrieving the right settings.""" settings = base_extractor.get_settings() assert settings["annotation_folder"] is None assert settings["min_paragraph_length"] == 20 @@ -47,6 +51,7 @@ def test_get_settings(self, base_extractor): assert settings["store_to_file"] is True def test_get_extractions(self, base_extractor): + """Test if we can retrieve extraction response correctly.""" base_extractor._extraction_response = ExtractionResponse( **{"dictionary": {"a": "b"}, "success": True} ) @@ -54,6 +59,7 @@ def test_get_extractions(self, base_extractor): assert base_extractor.get_extractions().success is True def test_check_for_skip_files(self, base_extractor): + """Test if files are really skipped when defined as such.""" input_file_path = Path(__file__).resolve().parent / "test.pdf" output_folder_path = Path(__file__).resolve().parent assert not base_extractor.check_for_skip_files( @@ -77,6 +83,7 @@ def test_check_for_skip_files(self, base_extractor): json_file_path.unlink(missing_ok=True) def test_save_extraction_to_file(self, base_extractor): + """Test if we can save the output.""" output_file_path = Path(__file__).resolve().parent / "output.json" er = ExtractionResponse() er.dictionary = {"key": "value"} diff --git a/tests/osc_transformer_presteps/content_extraction/extractors/test_pdf_extractor.py b/tests/osc_transformer_presteps/content_extraction/extractors/test_pdf_extractor.py index d0a45fc..9bdad4d 100644 --- a/tests/osc_transformer_presteps/content_extraction/extractors/test_pdf_extractor.py +++ b/tests/osc_transformer_presteps/content_extraction/extractors/test_pdf_extractor.py @@ -1,3 +1,5 @@ +"""Module to test the 
pdf_extractor.py.""" + import json from pathlib import Path @@ -7,8 +9,12 @@ class TestPdfExtractor: + """Class to collect tests for the PDFExtractor class.""" + def test_pdf_with_extraction_issues(self): - """In this test we try to extract the data from a pdf, where one can not extract text as it was produced via + """Test with extraction issue. + + A test where we try to extract the data from a pdf, where one can not extract text as it was produced via a "print". Check the file test_issue.pdf. """ extractor = PDFExtractor() @@ -17,7 +23,9 @@ def test_pdf_with_extraction_issues(self): assert extraction_response.dictionary == {} def test_pdf_with_no_extraction_issues(self): - """In this test we try to extract the data from a pdf, where one can not extract text as it was produced via + """Test with no extraction issue. + + In this test we try to extract the data from a pdf, where one can not extract text as it was produced via a "print". Check the file test_issue.pdf. """ extractor = PDFExtractor() diff --git a/tests/osc_transformer_presteps/content_extraction/test_extraction_factory.py b/tests/osc_transformer_presteps/content_extraction/test_extraction_factory.py index 4be8daa..7c9c8b6 100644 --- a/tests/osc_transformer_presteps/content_extraction/test_extraction_factory.py +++ b/tests/osc_transformer_presteps/content_extraction/test_extraction_factory.py @@ -1,3 +1,5 @@ +"""Module to test the extraction_factory.py.""" + import pytest from osc_transformer_presteps.content_extraction.extraction_factory import get_extractor @@ -7,10 +9,14 @@ class TestGetExtractor: + """Class to collect tests for the get_extractor function.""" + def test_get_pdf_extractor(self): + """Test if we can retrieve the pdf extractor.""" extractor = get_extractor(".pdf") assert isinstance(extractor, PDFExtractor) def test_get_non_existing_extractor(self): + """Test for an error message for an invalid extractor type.""" with pytest.raises(KeyError, match="Invalid extractor type"): 
get_extractor(".thisdoesnotexist") diff --git a/tests/osc_transformer_presteps/dataset_creation_curation/test_curator.py b/tests/osc_transformer_presteps/dataset_creation_curation/test_curator.py index ea61a40..8895aa0 100644 --- a/tests/osc_transformer_presteps/dataset_creation_curation/test_curator.py +++ b/tests/osc_transformer_presteps/dataset_creation_curation/test_curator.py @@ -1,3 +1,5 @@ +"""Module to test the curator.py.""" + import os from pathlib import Path @@ -17,6 +19,7 @@ @pytest.fixture def mock_curator_data(): + """Mimics the curator settings data.""" return { "annotation_folder": cwd / "test_annotations_sliced.xlsx", "extract_json": cwd / "Test.json", @@ -28,16 +31,18 @@ def mock_curator_data(): @pytest.fixture def curator_object(mock_curator_data): + """Fixture to create a fixed Curator object with the given mocked settings data.""" return Curator( - annotation_folder=mock_curator_data["annotation_folder"], + annotation_folder=str(mock_curator_data["annotation_folder"]), extract_json=mock_curator_data["extract_json"], - kpi_mapping_path=mock_curator_data["kpi_mapping_path"], + kpi_mapping_path=str(mock_curator_data["kpi_mapping_path"]), neg_pos_ratio=1, create_neg_samples=True, ) def annotation_to_df(filepath: Path) -> pd.Series: + """Load curation data and return the first row.""" df = pd.read_excel(filepath, sheet_name="data_ex_in_xls") df["annotation_file"] = os.path.basename(filepath) @@ -50,7 +55,10 @@ def annotation_to_df(filepath: Path) -> pd.Series: class TestAnnotationData: + """Class to collect tests for the AnnotationData class.""" + def test_annotation_data_valid_paths(self, mock_curator_data): + """A test to validate that all mentioned paths are ok.""" data = AnnotationData( annotation_folder=mock_curator_data["annotation_folder"], extract_json=mock_curator_data["extract_json"], @@ -61,6 +69,7 @@ def test_annotation_data_valid_paths(self, mock_curator_data): assert data.kpi_mapping_path == cwd / "kpi_mapping_sliced.csv" def 
test_annotation_data_invalid_paths(self): + """A test to validate that wrong paths will raise an error.""" with pytest.raises(ValidationError): AnnotationData( annotation_folder="/invalid/path", @@ -70,6 +79,8 @@ def test_annotation_data_invalid_paths(self): class TestCurator: + """Class to collect tests for the curator module.""" + @pytest.mark.parametrize( "input_text, expected_output", [ @@ -84,29 +95,35 @@ class TestCurator: ], ) def test_clean_text(self, curator_object, input_text, expected_output): + """A test where we test multiple test sentences.""" cleaned_text = curator_object.clean_text(input_text) assert cleaned_text == expected_output def test_clean_text_basic(self, curator_object): + """A test where test sentence is already clean.""" cleaned_text = curator_object.clean_text("This is a test sentence.") assert cleaned_text == "This is a test sentence." def test_clean_text_with_fancy_quotes(self, curator_object): + """A test on cleaning text with special quotes.""" text_with_fancy_quotes = "“This is a test sentence.”" cleaned_text = curator_object.clean_text(text_with_fancy_quotes) assert cleaned_text == '"This is a test sentence."' def test_clean_text_with_newlines_and_tabs(self, curator_object): + """A test on removing new lines and tabs.""" text_with_newlines_tabs = "This\nis\ta\ttest\nsentence." cleaned_text = curator_object.clean_text(text_with_newlines_tabs) assert cleaned_text == "This is a test sentence." def test_clean_text_removing_specific_terms(self, curator_object): + """A test on removing specific terms.""" text_with_boe = "This sentence contains the term BOE." cleaned_text = curator_object.clean_text(text_with_boe) assert cleaned_text == "This sentence contains the term ." 
def test_clean_text_removing_invalid_escape_sequence(self, curator_object): + """A test on removing invalid escape sequence.""" text_with_invalid_escape_sequence = ( "This sentence has an invalid escape sequence: \x9d" ) @@ -114,12 +131,14 @@ def test_clean_text_removing_invalid_escape_sequence(self, curator_object): assert cleaned_text == "This sentence has an invalid escape sequence: " def test_clean_text_removing_extra_backslashes(self, curator_object): + """A test on removing extra backslashes.""" text_with_extra_backslashes = "This\\ sentence\\ has\\ extra\\ backslashes." cleaned_text = curator_object.clean_text(text_with_extra_backslashes) assert cleaned_text == "This sentence has extra backslashes." def test_create_pos_examples_correct_samples(self, curator_object): - row = annotation_to_df(curator_object.annotation_folder) + """A test where we create positive examples via curator.""" + row = annotation_to_df(Path(curator_object.annotation_folder)) pos_example = curator_object.create_pos_examples(row) expected_pos_example = [ "We continue to work towards delivering on our Net Carbon Footprint ambition to " @@ -133,35 +152,39 @@ def test_create_pos_examples_correct_samples(self, curator_object): assert pos_example == expected_pos_example def test_create_pos_examples_json_filename_mismatch(self, mock_curator_data): + """A test for positive examples where we have a json filename mismatch.""" curator = Curator( - annotation_folder=mock_curator_data["annotation_folder"], + annotation_folder=str(mock_curator_data["annotation_folder"]), extract_json=cwd / "Test_another.json", - kpi_mapping_path=mock_curator_data["kpi_mapping_path"], + kpi_mapping_path=str(mock_curator_data["kpi_mapping_path"]), neg_pos_ratio=1, create_neg_samples=True, ) - row = annotation_to_df(curator.annotation_folder) + row = annotation_to_df(Path(curator.annotation_folder)) pos_example = curator.create_pos_examples(row) assert pos_example == [""] def 
test_create_neg_examples_correct_samples(self, curator_object): - row = annotation_to_df(curator_object.annotation_folder) + """A test where we create negative examples via curator.""" + row = annotation_to_df(Path(curator_object.annotation_folder)) neg_example = curator_object.create_neg_examples(row) assert neg_example == ["Shell 2019 Sustainability Report"] def test_create_neg_examples_json_filename_mismatch(self, mock_curator_data): + """A test for negative examples where we have a json filename mismatch.""" curator = Curator( - annotation_folder=mock_curator_data["annotation_folder"], + annotation_folder=str(mock_curator_data["annotation_folder"]), extract_json=cwd / "Test_another.json", - kpi_mapping_path=mock_curator_data["kpi_mapping_path"], + kpi_mapping_path=str(mock_curator_data["kpi_mapping_path"]), neg_pos_ratio=1, create_neg_samples=True, ) - row = annotation_to_df(curator.annotation_folder) + row = annotation_to_df(Path(curator.annotation_folder)) neg_example = curator.create_neg_examples(row) assert neg_example == [""] def test_create_curator_df(self, curator_object): + """A test to create the final dataframe output.""" actual_df = pd.read_csv(cwd / "Actual.csv") output = curator_object.create_curator_df()