Merge pull request #16 from BCG-X-Official/dev/1.0.0
BUILD: release fluxus 1.0.0
j-ittner authored Jun 19, 2024
2 parents 842f244 + 334ba97 commit e7f7798
Showing 51 changed files with 1,765 additions and 73 deletions.
62 changes: 38 additions & 24 deletions README.rst
@@ -5,10 +5,16 @@
Introduction to *fluxus*
========================

**FLUXUS** is a Python framework designed by `BCG X <https://www.bcg.com/x>`_ to
*fluxus* is a Python framework designed by `BCG X <https://www.bcg.com/x>`_ to
streamline the development of complex data processing pipelines (called *flows*),
enabling users to quickly and efficiently build, test, and deploy data workflows,
making complex operations more manageable.
enabling users to quickly and efficiently build, test, and deploy highly concurrent
workflows, making complex operations more manageable.

**FLUXUS** is inspired by the data stream paradigm and is designed to be simple,
expressive, and composable.

Introducing Flows
-----------------
@@ -55,23 +61,23 @@ With *fluxus*, we can define this flow as follows:
dict(greeting="Bonjour!"),
]
def lower(greeting: str) -> dict[str, str]:
def lower(greeting: str):
# Convert the greeting to lowercase and keep track of the case change
return dict(
yield dict(
greeting=greeting.lower(),
case="lower",
)
def upper(greeting: str) -> dict[str, str]:
def upper(greeting: str):
# Convert the greeting to uppercase and keep track of the case change
return dict(
yield dict(
greeting=greeting.upper(),
tone="upper",
case="upper",
)
def annotate(greeting: str, case: str = "original") -> dict[str, str]:
def annotate(greeting: str, case: str = "original"):
# Annotate the greeting with the case change; default to "original"
return dict(greeting=f"{greeting!r} ({case})")
yield dict(greeting=f"{greeting!r} ({case})")
flow = (
step("input", input_data) # initial producer step
@@ -123,12 +129,12 @@ This gives us the following output in :code:`result`:
[
{
'input': {'greeting': 'Hello, World!'},
'upper': {'greeting': 'HELLO, WORLD!', 'tone': 'upper'},
'upper': {'greeting': 'HELLO, WORLD!', 'case': 'upper'},
'annotate': {'greeting': "'HELLO, WORLD!' (original)"}
},
{
'input': {'greeting': 'Bonjour!'},
'upper': {'greeting': 'BONJOUR!', 'tone': 'upper'},
'upper': {'greeting': 'BONJOUR!', 'case': 'upper'},
'annotate': {'greeting': "'BONJOUR!' (original)"}
}
],
@@ -144,6 +150,11 @@ This gives us the following output in :code:`result`:
]
)
Or, as a *pandas* data frame by calling :code:`result.to_frame()`:

.. image:: sphinx/source/_images/flow-hello-world-results.png
:alt: "Hello World" flow results
:width: 600px

Here's what happened: The flow starts with a single input data item, which is then
passed along three parallel paths. Each path applies different transformations to the
@@ -158,8 +169,8 @@ The run result not only gives us the final product of the ``annotate`` step but
inputs and intermediate products of the ``lower`` and ``upper`` steps. We refer to this
extended view of the flow results as the *lineage* of the flow.
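
To make the generator-style step functions and the idea of lineage more concrete, here is a minimal plain-Python sketch. It does not use the *fluxus* API: ``run_path`` is a hypothetical helper that imitates a single path of the flow by passing each yielded dictionary to the next function as keyword arguments, and it keeps every intermediate product so the result resembles a lineage record.

```python
from typing import Callable, Iterator

Product = dict[str, str]
StepFunction = Callable[..., Iterator[Product]]


def lower(greeting: str) -> Iterator[Product]:
    # A generator step may yield zero, one, or many products per input
    yield dict(greeting=greeting.lower(), case="lower")


def annotate(greeting: str, case: str = "original") -> Iterator[Product]:
    # Falls back to "original" when no earlier step supplied a case
    yield dict(greeting=f"{greeting!r} ({case})")


def run_path(
    item: Product, steps: list[tuple[str, StepFunction]]
) -> list[dict[str, Product]]:
    # Hypothetical helper (not fluxus): feed the latest product to the next
    # step as keyword arguments and keep all earlier products as "lineage"
    lineages = [{"input": item}]
    for name, func in steps:
        next_lineages = []
        for lineage in lineages:
            latest = list(lineage.values())[-1]
            for product in func(**latest):
                next_lineages.append({**lineage, name: product})
        lineages = next_lineages
    return lineages


item = dict(greeting="Hello, World!")
via_lower = run_path(item, [("lower", lower), ("annotate", annotate)])
direct = run_path(item, [("annotate", annotate)])
```

Each entry in ``via_lower`` and ``direct`` maps step names to the product that step yielded, so the input and the intermediate products stay visible next to the final one. The real framework may pass arguments and assemble results differently; the sketch only mirrors the shape of the output.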

For a more thorough introduction to FLUXUS, please visit our `User Guide <#>`_ and
`Examples <#>`_!
For a more thorough introduction to FLUXUS, please visit our
`User Guide <https://bcg-x-official.github.io/fluxus/user_guide/index.html>`_.


Why *fluxus*?
@@ -181,10 +192,9 @@ motivations for using *fluxus* include:
- **Ease of Use**: *fluxus* provides a functional API that abstracts away the
complexities of data processing, making it accessible to developers of all levels.
More experienced users can also leverage the advanced features of its underlying
object-oriented implementation for customisation and optimisation (see
`Advanced Features <#>`_ for more details).


object-oriented implementation for additional customisation and versatility (see
`User Guide <https://bcg-x-official.github.io/fluxus/user_guide/index.html>`_ for more
details).

Concurrent Processing in *fluxus*
---------------------------------
@@ -207,12 +217,15 @@ applications.
Getting started
===============

- See the `FLUXUS Documentation <#>`_ for a comprehensive User Guide, Examples,
API reference, and more.
- See `Contributing <CONTRIBUTING.md>`_ or visit our detailed `Contributor Guide <#>`_
- See the
`FLUXUS Documentation <https://bcg-x-official.github.io/fluxus/_generated/home.html>`_
for a comprehensive User Guide, API reference, and more.
- See `Contributing <CONTRIBUTING.md>`_ or visit our detailed
`Contributor Guide <https://bcg-x-official.github.io/fluxus/contributor_guide/index.html>`_
for information on contributing.
- We have an `FAQ <#>`_ for common questions. For anything else, please reach out to
[email protected].
- We have an `FAQ <https://bcg-x-official.github.io/fluxus/faq.html>`_ for common
questions. For anything else, please reach out to
`[email protected] <mailto:[email protected]>`_.


User Installation
@@ -266,7 +279,8 @@ or ``conda``:
Contributing
------------

Contributions to ARTKIT are welcome and appreciated! Please see the `Contributing <CONTRIBUTING.md>`_ section for information.
Contributions to *fluxus* are welcome and appreciated! Please see the
`Contributing <CONTRIBUTING.md>`_ section for information.


License
48 changes: 48 additions & 0 deletions sphinx/add_copyright_notice.py
@@ -0,0 +1,48 @@
#!/usr/bin/env python3
import os

# Define the copyright notice
COPYRIGHT_NOTICE = """\
# -----------------------------------------------------------------------------
# © 2024 Boston Consulting Group. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# -----------------------------------------------------------------------------
"""


def add_copyright_notice(file_path):
    with open(file_path) as file:
        content = file.read()

    if (
        COPYRIGHT_NOTICE.strip() not in content
    ):  # Avoid adding the notice if it's already present
        with open(file_path, "w") as file:
            file.write(COPYRIGHT_NOTICE + "\n" + content)


def recursively_add_notice_to_py_files(directory):
    for root, _, files in os.walk(directory):
        for file in files:
            if file.endswith(".py"):
                file_path = os.path.join(root, file)
                add_copyright_notice(file_path)


# Specify the directory you want to start the search from
start_directory = "../src"  # Replace with your directory path

recursively_add_notice_to_py_files(start_directory)

print("Copyright notice added to all .py files.")
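
The membership check in ``add_copyright_notice`` is what makes the script safe to run more than once. The sketch below exercises that behaviour on a temporary file. It assumes a shortened placeholder notice, and the names ``NOTICE`` and ``prepend_notice`` are illustrative only and are not part of the commit.

```python
import os
import tempfile

NOTICE = "# (c) Example notice\n"  # shortened placeholder, not the real header


def prepend_notice(file_path: str) -> bool:
    """Prepend NOTICE unless it is already present; return True if written."""
    with open(file_path, encoding="utf-8") as file:
        content = file.read()
    if NOTICE.strip() in content:
        return False
    with open(file_path, "w", encoding="utf-8") as file:
        file.write(NOTICE + "\n" + content)
    return True


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "module.py")
    with open(path, "w", encoding="utf-8") as file:
        file.write("print('hello')\n")
    first = prepend_notice(path)  # writes the notice
    second = prepend_notice(path)  # no-op: the notice is already present
    with open(path, encoding="utf-8") as file:
        final_content = file.read()
```

Returning a flag is a choice made for this sketch; it makes it easy to report how many files were actually changed instead of printing an unconditional success message.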
2 changes: 2 additions & 0 deletions sphinx/make/conf_base.py
@@ -47,6 +47,8 @@ def set_config(
get_package_version(package_path=os.path.join(_dir_src, project))
)

globals_["html_show_sourcelink"] = False

if html_logo:
globals_["html_logo"] = html_logo
globals_["latex_logo"] = html_logo
10 changes: 0 additions & 10 deletions sphinx/make/make_base.py
@@ -45,7 +45,6 @@
]
assert len(PACKAGE_NAMES) == 1, "only one package per Sphinx build is supported"
PROJECT_NAME = PACKAGE_NAMES[0]
EXCLUDE_MODULES = []
DIR_DOCS = os.path.join(DIR_REPO_ROOT, "docs")
DIR_DOCS_VERSION = os.path.join(DIR_DOCS, "docs-version")
DIR_SPHINX_SOURCE = os.path.join(DIR_SPHINX_ROOT, "source")
@@ -211,15 +210,6 @@ def _run(self) -> None:
check=True,
)

# remove rst file and directory for excluded modules
for module in EXCLUDE_MODULES:
rst_path = os.path.join(
DIR_SPHINX_API_GENERATED, PROJECT_NAME, f"{PROJECT_NAME}.{module}.rst"
)
module_path = os.path.join(DIR_SPHINX_API_GENERATED, PROJECT_NAME, module)
os.remove(rst_path)
shutil.rmtree(module_path)

# Adjust the path and filename as per your project's structure
api_doc_filename = os.path.join(DIR_SPHINX_API_GENERATED, f"{packages[0]}.rst")
new_title = "API Reference"
64 changes: 58 additions & 6 deletions sphinx/source/faq.rst
@@ -10,13 +10,65 @@ FAQ
About the project
-----------------

What is FLUXUS for?
~~~~~~~~~~~~~~~~~~~
What is *fluxus* for?
~~~~~~~~~~~~~~~~~~~~~

*fluxus* is a Python framework designed by `BCG X <https://www.bcg.com/x>`_ to
streamline the development of complex data processing pipelines (called *flows*),
enabling users to quickly and efficiently build, test, and deploy highly concurrent
workflows, making complex operations more manageable.

Who developed FLUXUS, and why?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**FLUXUS** is inspired by the data stream paradigm and is designed to be simple,
expressive, and composable.

Who developed *fluxus*, and why?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*fluxus* was developed by the Responsible AI team at
`BCG X <https://www.bcg.com/x>`_, primarily to provide a scalable
and efficient way to rapidly stand up highly concurrent red teaming workloads for
GenAI testing and evaluation.

Given that other use cases for *fluxus* are likely to emerge, we decided to publish
the flow management portion of the codebase as a separate open-source project.

What is the origin of the name *fluxus*?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The name *fluxus* is derived from the Latin word for "flow" or "stream." The name
was chosen to reflect the project's focus on data streams and the flow of data
through a pipeline.

How does *fluxus* differ from other pipelining/workflow libraries?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*fluxus* is designed to be simple, expressive, and composable. It is built on top
of the `asyncio <https://docs.python.org/3/library/asyncio.html>`_ library, which
provides a powerful and flexible way to write concurrent code in Python.

*fluxus* is also designed to be highly extensible, allowing users to easily add
new components and customize existing ones. It provides both a functional API for
quickly and intuitively building flows using dictionaries as their primary data
structures, as well as a class-based API for more complex flows that require
custom data types.

Finally, *fluxus* is designed to be lightweight and efficient, managing the complexities
of concurrency and parallelism behind the scenes so that users can focus on building
their pipelines without worrying about the underlying implementation details.
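
As a rough illustration of the kind of concurrency that ``asyncio`` offers for I/O-bound work, the following standard-library sketch runs several simulated requests concurrently. It does not use *fluxus*; ``fetch``, the service names, and the delay are made up for the example.

```python
import asyncio


async def fetch(name: str, delay: float) -> dict[str, str]:
    # Stand-in for an I/O-bound call such as a network request
    await asyncio.sleep(delay)
    return dict(source=name, status="done")


async def main() -> list[dict[str, str]]:
    # Schedule all requests at once; results come back in submission order
    return await asyncio.gather(
        *(fetch(f"service-{i}", 0.05) for i in range(5))
    )


results = asyncio.run(main())
```

Because the simulated requests overlap while they wait, the total run time stays close to that of a single request rather than the sum of all five.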

What are examples of use cases for *fluxus*?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*fluxus* is designed to be a general-purpose framework for building data processing
pipelines, so it can be used in a wide variety of applications. It is particularly
powerful for building highly concurrent workflows that make heavy use of I/O-bound
operations, such as network requests, file I/O, and database queries.

Some examples of use cases for *fluxus* include:

- Real-time data processing
- ETL (Extract, Transform, Load) pipelines
- Machine learning workflows
- Data extraction
- Data aggregation and analysis
- Red teaming and security testing