Skip to content

Commit

Permalink
Merge pull request #126 from ONSdigital/development
Browse files Browse the repository at this point in the history
Release 0.3.6
  • Loading branch information
dombean authored Oct 16, 2024
2 parents ae11f04 + 44d4e3a commit 780233d
Show file tree
Hide file tree
Showing 4 changed files with 25 additions and 15 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.3.5
current_version = 0.3.6
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)
Expand Down
16 changes: 16 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,20 @@ and this project adheres to [semantic versioning](https://semver.org/spec/v2.0.0

### Removed

## [v0.3.6] - 2024-10-16

### Added

### Changed

### Deprecated

### Fixed
- Changed `cut_lineage` function inside `helpers/pyspark.py` to make it compatible
with newer PySpark versions.

### Removed

## [v0.3.5] - 2024-10-04

### Added
Expand Down Expand Up @@ -420,6 +434,8 @@ and this project adheres to [semantic versioning](https://semver.org/spec/v2.0.0
> and GitHub Releases.

- rdsa-utils v0.3.6: [GitHub Release](https://github.com/ONSdigital/rdsa-utils/releases/tag/v0.3.6) |
[PyPI](https://pypi.org/project/rdsa-utils/0.3.6/)
- rdsa-utils v0.3.5: [GitHub Release](https://github.com/ONSdigital/rdsa-utils/releases/tag/v0.3.5) |
[PyPI](https://pypi.org/project/rdsa-utils/0.3.5/)
- rdsa-utils v0.3.4: [GitHub Release](https://github.com/ONSdigital/rdsa-utils/releases/tag/v0.3.4) |
Expand Down
2 changes: 1 addition & 1 deletion rdsa_utils/__init__.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = "0.3.5"
__version__ = "0.3.6"
20 changes: 7 additions & 13 deletions rdsa_utils/helpers/pyspark.py
Original file line number Diff line number Diff line change
Expand Up @@ -573,24 +573,18 @@ def cut_lineage(df: SparkDF) -> SparkDF:
>>> new_df.count()
3
"""
logger.info("Converting SparkDF to Java RDD.")

try:
logger.info("Converting SparkDF to Java RDD.")

jrdd = df._jdf.toJavaRDD()
jschema = df._jdf.schema()
jrdd.cache()
sql_ctx = df.sql_ctx
try:
java_sql_ctx = sql_ctx._jsqlContext
except AttributeError:
java_sql_ctx = sql_ctx._ssql_ctx

logger.info("Creating new SparkDF from Java RDD.")
new_java_df = java_sql_ctx.createDataFrame(jrdd, jschema)
new_df = SparkDF(new_java_df, sql_ctx)
spark = df.sparkSession
new_java_df = spark._jsparkSession.createDataFrame(jrdd, jschema)
new_df = SparkDF(new_java_df, spark)
return new_df
except Exception:
logger.error("An error occurred during the lineage cutting process.")
except Exception as e:
logger.error(f"An error occurred during the lineage cutting process: {e}")
raise


Expand Down

0 comments on commit 780233d

Please sign in to comment.