Edits to documentation #1408

Merged
merged 15 commits
Dec 18, 2024
6 changes: 3 additions & 3 deletions docs/source/cli.rst
Original file line number Diff line number Diff line change
@@ -151,15 +151,15 @@ Runtime management
------------------

For complete instructions on how to build runtimes for Lithops, please
refer to ``runtime/`` folder and choose your compute backend.
refer to the ``runtime/`` folder and choose your compute backend.

``lithops runtime build <runtime-name>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Build a new runtime image. Depending on the compute backend, there must
be a Dockerfile located in the same folder where you run the command;
otherwise, use the ``-f`` parameter. Note that this command only builds the
image and puts it to a container registry. This command do not deploy
image and puts it into a container registry. This command does not deploy
the runtime to the compute backend.

+-----------------+-----------------------------------+
@@ -447,7 +447,7 @@ Deletes objects from a given bucket.
``lithops storage list <bucket>``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Deletes objects from a given bucket.
Lists objects from a given bucket.

+-----------------+---------------------------------+
| Parameter | Description |
8 changes: 4 additions & 4 deletions docs/source/compute_config/aws_ec2.md
@@ -6,9 +6,9 @@ The AWS EC2 client of Lithops can provide a truly serverless user experience on
It is assumed that you are already familiar with AWS, and that you have AUTH credentials for your account (HMAC credentials).

### Choose an operating system image for the VM
Any Virtual Machine (VM) need to define the instance’s operating system and version. Lithops support both standard operating system choices provided by the VPC or using pre-defined custom images that already contains all dependencies required by Lithops.
Any Virtual Machine (VM) needs to define the instance’s operating system and version. Lithops supports both the standard operating system choices provided by the VPC and pre-defined custom images that already contain all the dependencies required by Lithops.

- Option 1: By default, Lithops uses an Ubuntu 22.04 image. In this case, no further action is required and you can continue to the next step. Lithops will install all required dependencies in the VM by itself. Notice this can consume about 3 min to complete all installations.
- Option 1: By default, Lithops uses an Ubuntu 22.04 image. In this case, no further action is required and you can continue to the next step. Lithops will install all required dependencies in the VM by itself. Note that this can take about 3 minutes to complete all installations.

- Option 2: Alternatively, you can use a pre-built custom image that will greatly improve VM creation time for Lithops jobs. To benefit from this approach, navigate to [runtime/aws_ec2](https://github.com/lithops-cloud/lithops/tree/master/runtime/aws_ec2), and follow the instructions.

@@ -188,7 +188,7 @@ In summary, you can use one of the following settings:
|aws_ec2 | ssh_username | ubuntu |no | Username to access the VM |
|aws_ec2 | ssh_key_filename | ~/.ssh/id_rsa | no | Path to the ssh key file provided to create the VM. It will use the default path if not provided |
|aws_ec2 | worker_processes | AUTO | no | Number of parallel Lithops processes in a worker. This is used to parallelize function activations within the worker. By default it detects the amount of CPUs in the VM|
|aws_ec2 | runtime | python3 | no | Runtime name to run the functions. Can be a container image name. If not set Lithops will use the defeuv python3 interpreter of the VM |
|aws_ec2 | runtime | python3 | no | Runtime name to run the functions. Can be a container image name. If not set Lithops will use the default python3 interpreter of the VM |
|aws_ec2 | auto_dismantle | True |no | If False then the VM is not stopped automatically.|
|aws_ec2 | soft_dismantle_timeout | 300 |no| Time in seconds to stop the VM instance after a job **completed** its execution |
|aws_ec2 | hard_dismantle_timeout | 3600 | no | Time in seconds to stop the VM instance after a job **started** its execution |
@@ -211,7 +211,7 @@ lithops logs poll

## VM Management

Lithops for AWS EC2 follows a Mater-Worker architecture (1:N).
Lithops for AWS EC2 follows a Master-Worker architecture (1:N).

All the VMs, including the master VM, are automatically stopped after a configurable timeout (see hard/soft dismantle timeouts).

4 changes: 2 additions & 2 deletions docs/source/design.rst
@@ -43,7 +43,7 @@ In Lithops, each map or reduce computation is executed as a separate compute *job*

As mentioned above, the ``FunctionExecutor`` class is responsible for orchestrating the computation in Lithops. One ``FunctionExecutor`` object is instantiated prior to any use of Lithops. Its initialization includes these important steps: 1. It sets up the workers (depending on the specific compute backend), such as constructing docker images, defining IBM Cloud Functions, etc. This step may not include actually creating the workers, as this may be done automatically by the backend on-demand. 2. It defines a bucket in object storage (depending on the storage backend) in which each job will store job and call data (prior to computation) and results (when computation is complete). 3. It creates a ``FunctionInvoker`` object, which is responsible for executing a job as a set of independent per-worker calls.
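The three initialization steps can be sketched in plain Python. All class and attribute names below are illustrative stand-ins, not Lithops' actual internals:

```python
# Sketch of FunctionExecutor initialization as described above.
# Names are hypothetical; this is not the real Lithops code.

class FunctionInvoker:
    """Executes a job as a set of independent per-worker calls (step 3)."""
    def __init__(self, backend, bucket):
        self.backend = backend
        self.bucket = bucket

class FunctionExecutorSketch:
    def __init__(self, compute_backend, storage_backend):
        # Step 1: set up the workers (the backend may defer actual
        # creation until jobs arrive, on demand)
        self.backend = compute_backend
        # Step 2: define a bucket for job/call data and results
        self.bucket = f"lithops-{storage_backend}-bucket"
        # Step 3: create the invoker that will run per-worker calls
        self.invoker = FunctionInvoker(self.backend, self.bucket)

ex = FunctionExecutorSketch("ibm_cf", "ibm_cos")
print(ex.bucket)  # lithops-ibm_cos-bucket
```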

Compute jobs are created in the functions of the ``job`` module (see chart above), invoked from the respective API method of ``FunctionExecutor``. Map jobs are created in ``create_map_job()`` and reduce jobs in ``create_reduce_job()``. The flow in both functions is quite similar. First, data is partitioned, with the intention of each partition be processed by one worker. For map jobs, this is done by invoking the ``create_partitions()`` function of the ``partitioner`` module, yielding a partition map.
Compute jobs are created in the functions of the ``job`` module (see chart above), invoked from the respective API method of ``FunctionExecutor``. Map jobs are created in ``create_map_job()`` and reduce jobs in ``create_reduce_job()``. The flow in both functions is quite similar. First, data is partitioned, with the intention that each partition be processed by one worker. For map jobs, this is done by invoking the ``create_partitions()`` function of the ``partitioner`` module, yielding a partition map.

For reduce jobs, Lithops currently supports two modes: reduce per object, where each object is processed by a reduce function, and global (default) reduce, where all data is processed by a single reduce function. Respectively, data is partitioned as either one partition per storage object, or one global partition with all data. This process yields a partition map similar to map jobs. Additionally, ``create_reduce_job()`` wraps the reduce function in a special wrapper function that forces waiting for data before the actual reduce function is invoked. This is because reduce jobs follow map jobs, so the map jobs must finish and produce their output before reduce can run.
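The two reduce partitioning modes can be sketched as follows. This is a hypothetical helper for illustration, not the actual ``create_reduce_job()`` code:

```python
# Sketch of the two reduce partitioning modes described above.

def partition_for_reduce(objects, reduce_per_object=False):
    """Return a partition map: one partition per object, or one global one."""
    if reduce_per_object:
        return [[obj] for obj in objects]  # each object -> its own reduce call
    return [list(objects)]                 # default: one global partition

objs = ["bucket/a.csv", "bucket/b.csv", "bucket/c.csv"]
print(partition_for_reduce(objs, reduce_per_object=True))  # 3 partitions
print(partition_for_reduce(objs))                          # 1 global partition
```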

@@ -58,4 +58,4 @@ Completion of a computation job in Lithops is detected using one of two techniques:

**RabbitMQ**: A unique RabbitMQ topic is defined for each job, combining the executor id and job id. Each worker, once it completes a call, posts a notification message on that topic (code in ``function_handler()`` in the ``handler`` module, called from the ``entry_point`` module of the worker). The ``wait_rabbitmq()`` function from the ``wait_rabbitmq`` module, which is called from ``FunctionExecutor.wait()``, consumes a number of messages on that topic equal to ``total_calls`` and determines completion.
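The counting protocol can be sketched with a local ``queue.Queue`` standing in for the per-job topic. This is an illustration of the idea, not the actual RabbitMQ code:

```python
# Sketch: workers post one message per completed call; wait() consumes
# exactly total_calls messages to determine job completion.
import queue
import threading

topic = queue.Queue()   # stand-in for the per-job RabbitMQ topic
total_calls = 4

def worker(call_id):
    # ... the call runs here ...
    topic.put(f"call {call_id} done")  # notification on completion

threads = [threading.Thread(target=worker, args=(i,)) for i in range(total_calls)]
for t in threads:
    t.start()

# wait(): block until total_calls notifications have been consumed
done = [topic.get() for _ in range(total_calls)]
print(len(done))  # 4
```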

**Object Storage**: As explained above, each call persists its computation results in a specific object. Determining completion of a job is by the ``FunctionExecutor.wait()`` invoking the ``wait_storage()`` function from the ``wait_storage`` module. This function repeatedly, once per fixed period (controllable), polls the executor’s bucket for status objects of a subset of calls that have still not completed. This allows control of resource usage and eventual detection of all calls.
**Object Storage**: As explained above, each call persists its computation results in a specific object. Determining completion of a job is by the ``FunctionExecutor.wait()`` invoking the ``wait_storage()`` function from the ``wait_storage`` module. This function repeatedly, once per fixed period (controllable), polls the executor’s bucket for status objects of a subset of calls that have still not completed. This allows control of resource usage and eventual detection of all calls.
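The polling scheme can be sketched with a dictionary standing in for the executor's bucket; only the still-incomplete calls are polled each period. All names here are hypothetical:

```python
# Sketch of storage-based completion detection: poll the bucket for
# status objects of calls that have not yet completed.
import time

bucket = {}                                # stand-in for the executor's bucket
pending = {"call-0", "call-1", "call-2"}   # calls still not completed

def poll_until_done(poll_interval=0.01, timeout=1.0):
    deadline = time.time() + timeout
    while pending and time.time() < deadline:
        for call_id in list(pending):          # only still-incomplete calls
            if f"{call_id}.status" in bucket:  # status object written?
                pending.discard(call_id)
        time.sleep(poll_interval)              # fixed, controllable period
    return not pending

# Workers persist their status objects as they finish.
for i in range(3):
    bucket[f"call-{i}.status"] = "done"

print(poll_until_done())  # True
```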
4 changes: 2 additions & 2 deletions docs/source/worker_granularity.rst
@@ -90,7 +90,7 @@ understanding the flexibility VMs provide is essential for effectively utilizing
Unlike FaaS and CaaS platforms, when deploying Lithops on Virtual Machine backends, such as EC2, a master-worker architecture
is adopted. In this paradigm, the master node holds a work queue containing tasks for a specific job, and workers pick up and
process tasks one by one. In this sense, the chunksize parameter, which determines the number of functions allocated
to each worker for parallel processing, is not applicable in this context.Consequently, the worker granularity is inherently
to each worker for parallel processing, is not applicable in this context. Consequently, the worker granularity is inherently
determined by the number of worker processes in the VM setup. Adjusting the number of VM instances or the configuration of
each VM, such as the CPU core count, becomes crucial for optimizing performance and resource utilization in this master-worker
approach.
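The master-worker queue model described above can be sketched in plain Python, with threads standing in for worker processes. All names are illustrative:

```python
# Sketch: the master holds a work queue; each worker process pulls
# and processes tasks one by one until the queue is empty.
import queue
import threading

work_queue = queue.Queue()        # held by the master node
for task_id in range(10):
    work_queue.put(task_id)

results = []
lock = threading.Lock()

def worker_loop():
    while True:
        try:
            task = work_queue.get_nowait()
        except queue.Empty:
            return                 # no more tasks for this worker
        with lock:
            results.append(task * task)  # "process" the task

worker_processes = 2               # granularity = processes per worker VM
workers = [threading.Thread(target=worker_loop) for _ in range(worker_processes)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(len(results))  # 10
```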
@@ -99,7 +99,7 @@ In this scenario, specifying either the ``worker_instance_type`` or ``worker_processes`` parameters will determine
the desired parallelism inside worker VMs. By default, Lithops determines the total number of worker processes based on the
number of CPUs in the specified instance type. For example, an AWS EC2 instance of type ``t2.medium``, with 2 CPUs, would set
``worker_processes`` to 2. Additionally, users have the flexibility to manually adjust parallelism by setting a different
value for ``worker_processes``. Depenidng on the use case, it would be conveneint to set more ``worker_processes`` than CPUs,
value for ``worker_processes``. Depending on the use case, it may be convenient to set more ``worker_processes`` than CPUs,
or fewer ``worker_processes`` than CPUs. For example, we can use a ``t2.medium`` instance type, which has 2 CPUs, but
set ``worker_processes`` to 4:
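The configuration snippet that follows this sentence is cut off in the diff view; based on the ``worker_instance_type`` and ``worker_processes`` parameters documented above, it presumably looks like this (a sketch, not the verbatim file contents):

```yaml
aws_ec2:
  worker_instance_type: t2.medium
  worker_processes: 4
```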

4 changes: 2 additions & 2 deletions lithops/standalone/backends/aws_ec2/aws_ec2.py
@@ -1131,7 +1131,7 @@ def wait_ready(self, timeout=INSTANCE_STX_TIMEOUT):

def is_stopped(self):
"""
Checks if the VM instance is stoped
Checks if the VM instance is stopped
"""
state = self.get_instance_data()['State']
if state['Name'] == 'stopped':
@@ -1140,7 +1140,7 @@ def is_stopped(self):

def wait_stopped(self, timeout=INSTANCE_STX_TIMEOUT):
"""
Waits until the VM instance is stoped
Waits until the VM instance is stopped
"""
logger.debug(f'Waiting {self} to become stopped')

4 changes: 2 additions & 2 deletions lithops/standalone/standalone.py
Expand Up @@ -173,7 +173,7 @@ def _wait_master_service_ready(self):
"""
Waits until the master service is ready to receive http connections
"""
logger.info(f'Waiting Lithops service to become ready on {self.backend.master}')
logger.info(f'Waiting for Lithops service to become ready on {self.backend.master}')

start = time.time()
while (time.time() - start < self.start_timeout):
@@ -282,7 +282,7 @@ def create_workers(workers_to_create):
total_workers += len(new_workers)

if total_workers == 0:
raise Exception('It was not possible to create any worker')
raise Exception('It was not possible to create any workers')

logger.debug(f'ExecutorID {executor_id} | JobID {job_id} - Going to run '
f'{total_calls} activations in {total_workers} workers')
2 changes: 1 addition & 1 deletion lithops/standalone/worker.py
@@ -251,7 +251,7 @@ def run_wsgi():
# Start the consumer threads
worker_processes = standalone_config[standalone_config['backend']]['worker_processes']
worker_processes = CPU_COUNT if worker_processes == 'AUTO' else worker_processes
logger.info(f"Starting Worker - Instace type: {worker_data['instance_type']} - Runtime "
logger.info(f"Starting Worker - Instance type: {worker_data['instance_type']} - Runtime "
f"name: {standalone_config['runtime']} - Worker processes: {worker_processes}")

# Create a ThreadPoolExecutor for consuming tasks
2 changes: 1 addition & 1 deletion lithops/util/ssh_client.py
@@ -28,7 +28,7 @@ def close(self):

def create_client(self, timeout=2):
"""
Crate the SSH client connection
Create the SSH client connection
"""
try:
self.ssh_client = paramiko.SSHClient()
4 changes: 2 additions & 2 deletions runtime/aws_ec2/README.md
@@ -11,7 +11,7 @@ lithops image build -b aws_ec2
```

This command will create an image called "lithops-ubuntu-jammy-22.04-amd64-server" in the target region.
If the image already exists, and you want to updete it, use the `--overwrite` or `-o` parameter:
If the image already exists, and you want to update it, use the `--overwrite` or `-o` parameter:

```
lithops image build -b aws_ec2 --overwrite
@@ -43,7 +43,7 @@ aws_ec2:

## Option 2:

You can create a VM image manually. For example, you can create a VM in you AWS region, access the VM, install all the dependencies in the VM itself (apt-get, pip3 install, ...), stop the VM, create a VM Image, and then put the AMI ID in your lithops config, for example:
You can create a VM image manually. For example, you can create a VM in your AWS region, access the VM, install all the dependencies in the VM itself (apt-get, pip3 install, ...), stop the VM, create a VM Image, and then put the AMI ID in your lithops config, for example:

```yaml
aws_ec2: