[LEARNING-2] Update Several Files Related Docker File #23

Open
wants to merge 3 commits into
base: master


tiaranrh
Contributor

Hi @daniel-ciocirlan, thank you for this course. I've been learning a lot about Spark, including how to use Docker in this project.

  • Previously I encountered an issue while executing the following commands:
chmod +x build-images.sh
./build-images.sh

Output:

3 warnings found (use docker --debug to expand):
 - LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 7)
 - LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 8)
 - LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format (line 9)

What's next:
    View a summary of image vulnerabilities and recommendations → docker scout quickview 

So I've updated several Dockerfiles to fix the legacy "ENV key value" warnings by switching to the recommended "ENV key=value" format.

  • Updated the commands in README.md to use docker compose instead of docker-compose, in line with Docker Compose v2.
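
For reference, the ENV change from the first bullet amounts to rewriting each legacy line from "ENV key value" to "ENV key=value". Below is a minimal Scala sketch of that rewrite. The object name `EnvFormat`, the helper `toKeyValue`, and the sample variable names are hypothetical and only illustrate the format; they are not taken from the project's Dockerfiles, and values that already contain double quotes are not handled.

```scala
object EnvFormat {
  // Matches the legacy single-variable form: ENV <key> <value>
  // A line such as "ENV KEY=value" does not match, because the key must be
  // followed by whitespace rather than '='.
  private val Legacy = """^(\s*)ENV\s+([A-Za-z_][A-Za-z0-9_]*)\s+(.+)$""".r

  // Hypothetical helper: rewrites one Dockerfile line if it uses the legacy form.
  def toKeyValue(line: String): String = line match {
    case Legacy(indent, key, value) =>
      val v = value.trim
      // Quote values containing whitespace so they remain a single value.
      val rendered =
        if (v.exists(_.isWhitespace) && !v.startsWith("\"")) "\"" + v + "\"" else v
      s"${indent}ENV $key=$rendered"
    case other => other // not a legacy ENV line: leave untouched
  }

  def main(args: Array[String]): Unit = {
    // Sample lines are assumptions for illustration only.
    val lines = Seq(
      "FROM some-base-image",
      "ENV EXAMPLE_VERSION 1.2.3",
      "ENV EXAMPLE_HOME=/opt/example"
    )
    lines.map(toKeyValue).foreach(println)
  }
}
```

Only the second sample line is rewritten (to `ENV EXAMPLE_VERSION=1.2.3`); the other two are already in a form that does not trigger the warning.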

I hope these updates will be helpful for others taking this course. cmiiw, thank you ^^

Comment on lines 86 to 88
carsDF
.withColumn("Year",date_format(col("Year"),"yyyy-MM-dd")) // date format conversion *before* writing to Parquet
.write
Contributor Author


Regarding this change: when I checked the Parquet file, the Year column showed a numeric value.

[image] It might come from the schema shown in my editor (I'm using the Avro viewer in IntelliJ IDEA):
 {
    "name" : "Year",
    "type" : [ "null", {
      "type" : "int",
      "logicalType" : "date"
    } ],
    "default" : null
  }

Is it valid? I think we need to convert this column to a date before writing to Parquet, because DateType is a temporal type.
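
A note on the numeric value: the schema above declares Year as an int with logicalType date. In both Parquet and Avro, a date logical type is stored as a 32-bit integer counting days since 1970-01-01, so a viewer that displays the raw physical value would show a number. The sketch below, using only java.time, illustrates that mapping. The object name `ParquetDateDemo`, its helpers, and the sample dates are assumptions for illustration, not values read from cars.parquet.

```scala
import java.time.LocalDate

object ParquetDateDemo {
  // Encode a date the way a date logical type stores it:
  // number of days since the Unix epoch (1970-01-01), as an Int.
  def toEpochDays(date: LocalDate): Int = Math.toIntExact(date.toEpochDay)

  // Decode the stored Int back into a calendar date.
  def fromEpochDays(days: Int): LocalDate = LocalDate.ofEpochDay(days.toLong)

  def main(args: Array[String]): Unit = {
    // Hypothetical sample dates, chosen only to show the mapping.
    val samples = Seq(LocalDate.of(1970, 1, 1), LocalDate.of(1971, 1, 1))
    samples.foreach { d =>
      val stored = toEpochDays(d)
      println(s"$d is stored as $stored and decodes back to ${fromEpochDays(stored)}")
    }
  }
}
```

Under that assumption, the integer in the viewer is still a valid date representation; readers that honour the logical type decode it back to a calendar date.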

Contributor


This section is meant to demonstrate that Parquet is the default file format; the dates are not very important here.

Contributor Author


Ahh, I see. Okay, I will roll back this change in the next commit. Thanks!

…writing cars.parquet json"

This reverts commit 807fb9e.
2 participants