[LEARNING-2] Update Several Docker-Related Files #23

Open · wants to merge 3 commits into master
4 changes: 2 additions & 2 deletions README.md
@@ -8,7 +8,7 @@ This repository contains the code we wrote during [Rock the JVM's Spark Essenti
- either clone the repo or download as zip
- open with IntelliJ as an SBT project
- Windows users, you need to set up some Hadoop-related configs - use [this guide](/HadoopWindowsUserSetup.md)
-- in a terminal window, navigate to the folder where you downloaded this repo and run `docker-compose up` to build and start the PostgreSQL container - we will interact with it from Spark
+- in a terminal window, navigate to the folder where you downloaded this repo and run `docker compose up` to build and start the PostgreSQL container - we will interact with it from Spark
- in another terminal window, navigate to `spark-cluster/`
- Linux/Mac users: build the Docker-based Spark cluster with
```
@@ -19,7 +19,7 @@ chmod +x build-images.sh
```
build-images.bat
```
-- when prompted to start the Spark cluster, go to the `spark-cluster` directory and run `docker-compose up --scale spark-worker=3` to spin up the Spark containers with 3 worker nodes
+- when prompted to start the Spark cluster, go to the `spark-cluster` directory and run `docker compose up --scale spark-worker=3` to spin up the Spark containers with 3 worker nodes

### Spark Cluster Troubleshooting

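Note on the change above: `docker-compose` (hyphenated) is the standalone Compose V1 binary, which stopped receiving updates in mid-2023, while `docker compose` invokes the Compose V2 plugin bundled with current Docker releases. A minimal sketch for machines that may still only have the old binary (the fallback branch is needed only on older installs):

```sh
# Prefer the Compose V2 plugin; fall back to the legacy V1 binary if absent.
if docker compose version >/dev/null 2>&1; then
  docker compose up
elif command -v docker-compose >/dev/null 2>&1; then
  docker-compose up
else
  echo "Docker Compose not found; install the Compose plugin" >&2
  exit 1
fi
```
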
2 changes: 1 addition & 1 deletion spark-cluster/README.md
@@ -51,7 +51,7 @@ This will create the following docker images:
The final step to create your test cluster will be to run the compose file:

```sh
-docker-compose up --scale spark-worker=3
+docker compose up --scale spark-worker=3
```

## Validate your cluster
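The validation steps themselves are collapsed in this view, but two quick checks are worth noting after the compose command above; a sketch, assuming the compose file publishes the master web UI (exposed on port 8080 in the Dockerfiles below) on localhost:

```sh
# One spark-master and three spark-worker containers should be running.
docker compose ps

# The master web UI lists every worker that registered with it.
curl -s http://localhost:8080 | grep -ci worker
```
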
2 changes: 1 addition & 1 deletion spark-cluster/docker/base/Dockerfile
@@ -37,4 +37,4 @@ RUN wget --no-verbose https://archive.apache.org/dist/spark/spark-${SPARK_VERSIO

# Fix the value of PYTHONHASHSEED
# Note: this is needed when you use Python 3.3 or greater
-ENV PYTHONHASHSEED 1
+ENV PYTHONHASHSEED=1
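
The whitespace form `ENV key value` that these commits replace is legacy Dockerfile syntax: it still builds, but BuildKit flags it with a `LegacyKeyValueFormat` warning, and it can only set one variable per instruction. A short illustration reusing values from this PR (the base image is chosen only for the example):

```Dockerfile
FROM debian:bookworm-slim

# Legacy form: still accepted, but BuildKit warns (LegacyKeyValueFormat),
# and only one variable can be set per instruction.
# ENV PYTHONHASHSEED 1

# Preferred '=' form: unambiguous, and several variables fit in one instruction.
ENV PYTHONHASHSEED=1 \
    SPARK_MASTER_LOG=/spark/logs
```
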
6 changes: 3 additions & 3 deletions spark-cluster/docker/spark-master/Dockerfile
@@ -4,9 +4,9 @@ RUN apt-get update && apt-get install -y dos2unix
COPY start-master.sh /
RUN dos2unix /start-master.sh && apt-get --purge remove -y dos2unix && rm -rf /var/lib/apt/lists/*

-ENV SPARK_MASTER_PORT 7077
-ENV SPARK_MASTER_WEBUI_PORT 8080
-ENV SPARK_MASTER_LOG /spark/logs
+ENV SPARK_MASTER_PORT=7077
+ENV SPARK_MASTER_WEBUI_PORT=8080
+ENV SPARK_MASTER_LOG=/spark/logs

EXPOSE 8080 7077 6066

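These `ENV` values are only image defaults, so any of them can be overridden per container without rebuilding. A hedged example (the tag `spark-master` is an assumption about what `build-images.sh` produces; check the script for the real name):

```sh
# Move the master web UI off the default 8080 for this one container;
# the ENV default above still applies to containers started without -e.
docker run -e SPARK_MASTER_WEBUI_PORT=8090 -p 8090:8090 spark-master
```
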
2 changes: 1 addition & 1 deletion spark-cluster/docker/spark-submit/Dockerfile
@@ -6,7 +6,7 @@ RUN dos2unix /spark-submit.sh && apt-get --purge remove -y dos2unix && rm -rf /v

ENV SPARK_MASTER_URL="spark://spark-master:7077"
ENV SPARK_SUBMIT_ARGS=""
-ENV SPARK_APPLICATION_ARGS ""
+ENV SPARK_APPLICATION_ARGS=""
#ENV SPARK_APPLICATION_JAR_LOCATION /opt/spark-apps/myjar.jar
#ENV SPARK_APPLICATION_MAIN_CLASS my.main.Application

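The commented-out lines hint at the intended usage: `/spark-submit.sh` presumably expands these variables into a `spark-submit` invocation. The script itself is not part of this diff, so the following is a plausible reconstruction of that common pattern rather than the repo's actual code (the `/spark/bin` path is inferred from the `/spark/logs` defaults elsewhere in this PR):

```sh
#!/bin/bash
# Assumed shape of /spark-submit.sh: assemble a spark-submit call from the
# ENV defaults above, or from values passed with `docker run -e`.
/spark/bin/spark-submit \
  --master "${SPARK_MASTER_URL}" \
  --class "${SPARK_APPLICATION_MAIN_CLASS}" \
  ${SPARK_SUBMIT_ARGS} \
  "${SPARK_APPLICATION_JAR_LOCATION}" \
  ${SPARK_APPLICATION_ARGS}
```
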
6 changes: 3 additions & 3 deletions spark-cluster/docker/spark-worker/Dockerfile
@@ -4,9 +4,9 @@ RUN apt-get update && apt-get install -y dos2unix
COPY start-worker.sh /
RUN dos2unix /start-worker.sh && apt-get --purge remove -y dos2unix && rm -rf /var/lib/apt/lists/*

-ENV SPARK_WORKER_WEBUI_PORT 8081
-ENV SPARK_WORKER_LOG /spark/logs
-ENV SPARK_MASTER "spark://spark-master:7077"
+ENV SPARK_WORKER_WEBUI_PORT=8081
+ENV SPARK_WORKER_LOG=/spark/logs
+ENV SPARK_MASTER="spark://spark-master:7077"

EXPOSE 8081

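`SPARK_MASTER` points at the hostname `spark-master` because Compose gives every service a DNS name on the shared network, so each scaled worker resolves and registers with the same master. A quick way to confirm this after starting the cluster (the quoted log text is the standard Spark worker registration message):

```sh
# Start three workers; each resolves `spark-master` via Compose service DNS.
docker compose up --scale spark-worker=3 -d

# Each worker's log should contain "Successfully registered with master".
docker compose logs spark-worker
```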