[SPIKE] Look into solution for partitioned data encoding #255

ravjotbrar · 2021-09-28T19:50:24Z

When writing partitioned data to HDFS using Spark 3.1.1, the `=` in each partition directory name is percent-encoded as `%3D`, so the directory layout looks something like:
data/dftest.parquet/col1%3D4/part1.snappy.parquet

Instead, we need it to look like:
data/dftest.parquet/col1=4/part1.snappy.parquet

This encoding issue only happens when using Spark 3.1.1. We need to either fix the incompatibility or decode the directory layout in our connector.
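For the second option, a minimal sketch of what decoding the layout might look like is shown here in Python for illustration. The function name `restore_partition_separator` is hypothetical and not part of the connector. The sketch assumes that the only unwanted encoding is the `=` separator between partition column and value, so it restores just the first `%3D` in each path segment that has no literal `=`, and leaves any other percent-escapes untouched.

```python
import re

# Matches the percent-encoded form of "=" in either case (%3D or %3d).
_ENCODED_EQUALS = re.compile(r"%3D", re.IGNORECASE)


def restore_partition_separator(path: str) -> str:
    """Restore the column=value separator in encoded partition directories.

    Hypothetical helper: only the first encoded "=" in a segment is decoded,
    and only when the segment has no literal "=" already.
    """
    segments = []
    for segment in path.split("/"):
        if "=" not in segment:
            segment = _ENCODED_EQUALS.sub("=", segment, count=1)
        segments.append(segment)
    return "/".join(segments)


if __name__ == "__main__":
    print(restore_partition_separator(
        "data/dftest.parquet/col1%3D4/part1.snappy.parquet"
    ))
```

Decoding only the separator, rather than percent-decoding the whole path, is a deliberate choice in this sketch: a partition value may itself contain escaped characters, and decoding those would change the directory name. Whether that matches what the connector needs would have to be confirmed as part of the spike.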


alexey-temnikov · 2023-03-06T17:47:43Z

Next steps: Check behaviour on the latest Spark versions.

ravjotbrar added the High Priority label Sep 28, 2021

ravjotbrar self-assigned this Sep 28, 2021

jonathanl-bq added the size: 3 label Feb 25, 2022

jeremyprime added the enhancement (New feature or request) label Mar 3, 2023
