no import change notebooks updates #830

Merged

eordentlich
Collaborator

No description provided.

Signed-off-by: Erik Ordentlich <[email protected]>
@eordentlich
Collaborator Author

build

Collaborator

rishic3 left a comment

Awesome new feature! Minor comments that are mostly stylistic.

notebooks/aws-emr/README.md (outdated)
notebooks/aws-emr/init-bootstrap-action-no-import.sh (outdated)
notebooks/aws-emr/init-configurations.json (outdated)
notebooks/README.md
notebooks/databricks/README.md (outdated)
notebooks/databricks/README.md (outdated)
notebooks/databricks/README.md (outdated)
notebooks/dataproc/README.md (outdated)
python/README.md
notebooks/aws-emr/init-bootstrap-action-no-import.sh (outdated)
Signed-off-by: Erik Ordentlich <[email protected]>
@eordentlich
Collaborator Author

Great suggestions. Will incorporate in revision.

Collaborator

lijinf2 left a comment

Looks good to me!

Signed-off-by: Erik Ordentlich <[email protected]>
@eordentlich
Collaborator Author

build

@lijinf2
Collaborator

lijinf2 commented Jan 28, 2025

@eordentlich I have a question in mind: Would it be beneficial to move "no-code-change" to an independent folder for better organization and future reference? Since this is an important feature, having a centralized location for reference or a pointer could be helpful. Do you have any thoughts on how cuDF or cuML direct users to "no-code-change"?

@eordentlich
Collaborator Author

cuDF's no-code-change mode for pandas is called cudf-pandas (https://rapids.ai/cudf-pandas/), and the code lives in cudf/pandas. The analogue in our case could be spark_rapids_ml/pyspark, which seems a little strange to me. Open to suggestions.

@lijinf2
Collaborator

lijinf2 commented Jan 28, 2025

> cuDF's no-code-change mode for pandas is called cudf-pandas (https://rapids.ai/cudf-pandas/), and the code lives in cudf/pandas. The analogue in our case could be spark_rapids_ml/pyspark, which seems a little strange to me. Open to suggestions.

Agree. Also, it seems nontrivial to maintain a repository or a folder. We can revisit this in the future when needed.

--bucket ${GCS_BUCKET} \
--enable-component-gateway \
--subnet=default \
--no-shielded-secure-boot
```
**Note**: the `properties` settings are for demonstration purposes only. Additional tuning may be required for optimal performance.
Collaborator

Somehow the properties got left out of the shell block here.

Collaborator Author

Good catch. Indenting the continuation lines to the `--` appears to fix it, and the entries still copy without spaces between them.
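
For reference, another way to keep whitespace out of the comma-separated `properties` value is to assemble it in a shell variable before calling the cluster-creation command. This is only a sketch: the `join_props` helper is hypothetical, and the two Spark settings are illustrative placeholders rather than the values from the README under review.

```shell
# Hypothetical helper: join all arguments with commas and no spaces.
# The function body runs in a subshell, so the IFS change does not leak.
join_props() (
  IFS=,
  # "$*" joins the arguments using the first character of IFS
  printf '%s\n' "$*"
)

# Illustrative entries only; substitute the settings your cluster needs.
PROPERTIES=$(join_props \
  "spark:spark.executor.resource.gpu.amount=1" \
  "spark:spark.task.resource.gpu.amount=0.25")

echo "$PROPERTIES"
```

The resulting string could then be passed as `--properties "${PROPERTIES}"`, which avoids depending on how the multi-line block is indented or copied.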

- In the [AWS EMR console](https://console.aws.amazon.com/emr/), click "Clusters", you can find the cluster id of the created cluster. Wait until all the instances have the Status turned to "Running".
- In the [AWS EMR console](https://console.aws.amazon.com/emr/), click "Workspace(Notebooks)", then create a workspace. Wait until the status becomes ready and a JupyterLab webpage will pop up.
- In the [AWS EMR console](https://console.aws.amazon.com/emr/), click "Clusters", you can find the cluster id of the created cluster. Wait until the cluster has the "Waiting" status.
- To use notebooks on EMR you will need an EMR Studio and an associated Workspace. If you don't already have these, in the [AWS EMR console](https://console.aws.amazon.com/emr/), on the left, in the "EMR Studio" section, click the respective "Studio" and "Workspace (Notebooks)" links and follow instructions. Please check EMR documentation for further instructions. Note that the Studio VPC should match the VPC of the subnet used for the cluster. Select "\*Default\*" for all security group prompts and drop downs.
Collaborator

I believe the user must set up an internet gateway for the VPC along with a route table. Do we want to make those steps explicit?

Collaborator Author

I will clarify that the subnet should have internet access. We actually need it to install things in the init scripts, even for the EMR cluster VPC. Adding a few other clarifications.
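
As a rough illustration of the internet-access requirement, an init script could fail fast when the subnet has no outbound connectivity. The sketch below is an assumption, not part of this PR: the `check_internet` function name and the default probe URL are hypothetical choices, and it presumes `curl` is available on the node.

```shell
# Hypothetical pre-flight check for a bootstrap/init script.
# $1: URL to probe (the default is an assumed package index endpoint).
check_internet() {
  url=${1:-https://pypi.org/simple/}
  # --head requests headers only, --fail returns non-zero on HTTP errors,
  # --max-time bounds the wait when the subnet has no route out.
  if curl --silent --head --fail --max-time 10 "$url" >/dev/null 2>&1; then
    echo "internet access OK"
  else
    echo "no internet access from this subnet; package installs will fail" >&2
    return 1
  fi
}

# Example usage near the top of an init script:
#   check_internet || exit 1
```

Failing early this way surfaces a missing internet gateway or route as a clear message instead of a later package-install error.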

@rishic3
Collaborator

rishic3 commented Jan 28, 2025

build

@eordentlich
Collaborator Author

build

Collaborator

rishic3 left a comment

LGTM

@eordentlich eordentlich merged commit e909a46 into NVIDIA:branch-25.02 Jan 29, 2025
3 checks passed
@eordentlich eordentlich deleted the eo-no-import-notebooks-updates branch January 29, 2025 17:59