alpine: init commit
utnapischtim committed May 24, 2024
1 parent dd23519 commit da5c2e8
Showing 3 changed files with 225 additions and 0 deletions.
111 changes: 111 additions & 0 deletions alpine/README.md
@@ -0,0 +1,111 @@
# Alpine docker base image

Quick information:
- alpine linux as base
- python>3.12
- nodejs>20
- final image around 800 MB
- build time around 10 min (mostly install python packages and build statics)

## Builder

The base image for the builder stage contains all packages needed to
install the python packages and build the frontend statics. Its size
matters too, but it is difficult to reduce further because all of
these packages are needed afterwards to create the python virtual
environment and build the statics. The largest share of the size comes
from the installed python packages and the installed nodejs packages.

## App

The base image for the app stage is designed to be as small as
possible. Only the packages needed at runtime are installed.

## Multistage build

Below is an example implementation of the multistage Dockerfile. There
is still room for optimization. The builder stage is more or less
fixed, but the file size of the app stage could be reduced a little
further. These would mostly be individual improvements.

Possible further file size optimizations (estimated at most 200 MB):
- remove pyc files before copying over
- remove all files which were used to build the statics and are therefore useless in production
- remove the app_data directory; it is only needed to initialise the database, not in production
- remove unused python packages like citation styles, ...
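
A minimal, untested sketch of the pyc and app_data points, written
against the example multistage Dockerfile further down. The placement
of the lines is an assumption, and dropping app_data assumes that the
database is initialised somewhere else:

```Dockerfile
# Sketch only: hypothetical changes to the example multistage Dockerfile.

# builder stage, after `invenio webpack buildall`:
# delete compiled bytecode so that it is not copied into the app stage
RUN find ${VIRTUAL_ENV}/lib -type f -name '*.pyc' -delete

# app stage: leave out the following line if app_data is not needed
# in production (assumption: the database is initialised elsewhere)
#   COPY --from=builder ${INVENIO_INSTANCE_PATH}/app_data ${INVENIO_INSTANCE_PATH}/app_data
```

Without pyc files python compiles each module again on import. Since
the user invenio cannot write into the virtual environment, this
happens on every start of the container, so the smaller image is paid
for with a somewhat slower startup.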

Possible further build time optimizations:
- place as much as possible into the base images
  - most of the python package dependencies
  - a precalculated node_modules directory in the builder base image
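
A rough sketch of the first sub-point, not a tested setup: the python
dependencies could be preinstalled in the builder base image. It
assumes that a Pipfile and Pipfile.lock are available in the build
context of alpine/builder/Dockerfile, which is not the case in this
commit:

```Dockerfile
# Sketch only: hypothetical lines for alpine/builder/Dockerfile,
# placed after `WORKDIR ${WORKING_DIR}/src`.
# Assumption: Pipfile and Pipfile.lock exist in the build context.
COPY Pipfile Pipfile.lock ./

# same command as in the builder stage of the example below; the
# packages end up in a layer of the base image
RUN pipenv install --deploy --system --pre
```

The builder stage of the project would then mostly find the packages
already installed. The drawback is that the base image has to be
rebuilt whenever the lock file changes. The node_modules point is not
covered by this sketch.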

```Dockerfile

# this build stage installs the python packages and builds the
# frontend static files
FROM base-image-builder AS builder

COPY Pipfile Pipfile.lock ./

# Install all the dependencies defined in the Pipfile.
RUN pipenv install --deploy --system --pre

# Temporary solution for compatibility reasons:
# importlib-metadata is no longer available under python3.12, but
# some packages still need it
RUN pip install importlib-metadata

COPY ./app_data/ ${INVENIO_INSTANCE_PATH}/app_data/
COPY ./assets/ ${INVENIO_INSTANCE_PATH}/assets/
COPY ./static/ ${INVENIO_INSTANCE_PATH}/static/
COPY ./translations ${INVENIO_INSTANCE_PATH}/translations/
COPY ./templates ${INVENIO_INSTANCE_PATH}/templates/

RUN invenio collect --verbose && \
invenio webpack create && \
invenio webpack buildall


# this build stage is the final stage, which runs the worker, the ui
# and the APIs. it contains only what is necessary, therefore only
# the python packages and the statics are copied over. node and all
# node_modules packages are not necessary to run the app.
FROM base-image-app AS app

COPY --from=builder ${VIRTUAL_ENV}/lib ${VIRTUAL_ENV}/lib
COPY --from=builder ${VIRTUAL_ENV}/bin ${VIRTUAL_ENV}/bin
COPY --from=builder ${INVENIO_INSTANCE_PATH}/app_data ${INVENIO_INSTANCE_PATH}/app_data
COPY --from=builder ${INVENIO_INSTANCE_PATH}/static ${INVENIO_INSTANCE_PATH}/static
COPY --from=builder ${INVENIO_INSTANCE_PATH}/translations ${INVENIO_INSTANCE_PATH}/translations
COPY --from=builder ${INVENIO_INSTANCE_PATH}/templates ${INVENIO_INSTANCE_PATH}/templates

WORKDIR ${WORKING_DIR}/src

# TODO:
# add further things that should be in the final container here

COPY ./saml/idp/cert/ ./saml/idp/cert/
COPY ./migrations/ ./migrations/
COPY ./wipe_recreate.sh .
COPY ./docker/uwsgi/ ${INVENIO_INSTANCE_PATH}
COPY ./invenio.cfg ${INVENIO_INSTANCE_PATH}

# so that the user invenio can execute the wipe_recreate script
RUN chmod 755 wipe_recreate.sh
RUN chown invenio:invenio wipe_recreate.sh

# this step ensures that the user invenio can write to
# ${WORKING_DIR}/src, which is necessary for the worker to
# write the celerybeat schedule file
RUN chown invenio:invenio .

# this step ensures that the docker container is not running as root.
# since virtually no file belongs to the user invenio, it can do
# almost nothing except what it is supposed to do
USER invenio

# Instruction used to configure how the container will run.
ENTRYPOINT [ "bash", "-c"]

```
48 changes: 48 additions & 0 deletions alpine/app/Dockerfile
@@ -0,0 +1,48 @@
#
# Copyright (C) 2024 Graz University of Technology.
#
# Invenio is free software; you can redistribute it and/or modify it
# under the terms of the MIT License; see LICENSE file for more
# details.
#

FROM alpine:edge

ENV LANG=en_US.UTF-8
ENV LANGUAGE=en_US:en
ENV LC_ALL=en_US.UTF-8

ENV VIRTUAL_ENV=/opt/env
ENV WORKING_DIR=/opt/invenio
ENV INVENIO_INSTANCE_PATH=${WORKING_DIR}/var/instance
ENV PATH=$VIRTUAL_ENV/bin:$PATH


# python to run the webserver
# libxslt-dev is a dependency
# xmlsec is for saml
# cairo is for image processing
# uwsgi-python3 to provide the webserver
# bash is not really necessary but a nicer terminal than sh
# no separate `apk update`: --no-cache fetches a fresh index and does
# not leave it behind in the layer
RUN apk add --update --no-cache \
"python3>3.12" \
libxslt-dev \
xmlsec \
cairo \
uwsgi-python3 \
bash --repository=https://dl-cdn.alpinelinux.org/alpine/edge/community

RUN python -m venv ${VIRTUAL_ENV}

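# no `source ${VIRTUAL_ENV}/bin/activate` step: every RUN starts a new
# shell, so the activation would not persist. the venv is used through
# the PATH set above.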

RUN mkdir -p ${INVENIO_INSTANCE_PATH}
RUN mkdir -p ${VIRTUAL_ENV}
RUN mkdir -p ${WORKING_DIR}/src/saml/idp/cert

RUN adduser invenio --no-create-home --disabled-password

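# point the python of the venv at the system interpreter. the link name
# has to be an absolute path, otherwise the link is created in /
RUN rm ${VIRTUAL_ENV}/bin/python && ln -s /usr/bin/python ${VIRTUAL_ENV}/bin/python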

ENTRYPOINT [ "bash", "-c"]
66 changes: 66 additions & 0 deletions alpine/builder/Dockerfile
@@ -0,0 +1,66 @@
#
# Copyright (C) 2024 Graz University of Technology.
#
# Invenio is free software; you can redistribute it and/or modify it
# under the terms of the MIT License; see LICENSE file for more
# details.
#

FROM alpine:edge

ENV LANG=en_US.UTF-8
ENV LANGUAGE=en_US:en
ENV LC_ALL=en_US.UTF-8

ENV PYTHONUNBUFFERED=1
ENV PIPENV_VERBOSITY=-1
ENV VIRTUAL_ENV=/opt/env
ENV WORKING_DIR=/opt/invenio
ENV INVENIO_INSTANCE_PATH=${WORKING_DIR}/var/instance
ENV PYTHONUSERBASE=$VIRTUAL_ENV
ENV PATH=$VIRTUAL_ENV/bin:$PATH
ENV PYTHONPATH=$VIRTUAL_ENV/lib/python3.12:$PATH


# xmlsec is for saml
# gcc to build wheels
# nodejs to build the statics
# cairo for some frontend things. maybe not necessary here
# python is obvious

RUN apk add --update --no-cache \
"python3>3.12" \
"python3-dev>3.12" \
"nodejs>20" \
"npm>10" \
git \
cairo \
autoconf \
automake \
bash \
build-base \
file \
gcc \
libtool \
libxml2-dev \
libxslt-dev \
linux-headers \
xmlsec-dev \
xmlsec --repository=https://dl-cdn.alpinelinux.org/alpine/edge/community

RUN python -m venv ${VIRTUAL_ENV}
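# no `source ${VIRTUAL_ENV}/bin/activate` step: every RUN starts a new
# shell, so the activation would not persist. the venv is used through
# the PATH set above.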
RUN pip install --upgrade pip setuptools pipenv

# Temporary solution:
# no longer necessary after a new release of xmlsec
# https://github.com/xmlsec/python-xmlsec/issues/316
# NOTE: --only-binary does not work! it builds but fails at runtime
RUN pip install --no-binary=xmlsec --no-binary=lxml lxml xmlsec

WORKDIR ${WORKING_DIR}/src

RUN mkdir -p ${INVENIO_INSTANCE_PATH}

ENTRYPOINT [ "bash", "-c"]
