chispa 1.0 release #93

MrPowers · 2024-02-19T20:11:38Z

It would be nice to develop chispa so we can make a 1.0 release.

We might even want to expose a different interface. Something like this:

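from dataclasses import dataclass

# Style names applied to each part of the diff output.
@dataclass
class MyFormats:
    mismatched_rows = ["light_yellow"]
    matched_rows = ["cyan", "bold"]
    mismatched_cells = ["purple"]
    matched_cells = ["blue"]

# Chispa is the proposed entry point; it carries the formats into every assertion.
my_chispa = Chispa(formats=MyFormats())

my_chispa.assert_df_equality(actual_df, expected_df)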

The user could inject the my_chispa object in their tests as follows:

import pytest

@pytest.fixture()
def my_chispa():
    return Chispa(formats=MyFormats())

def test_shows_assert_basic_rows_equality(my_chispa):
    ...
    my_chispa.assert_basic_rows_equality(df1.collect(), df2.collect())

It's worth contemplating at least.
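To make the idea concrete, here is a minimal sketch of how an object could bind the formats once and reuse them in every assertion. `FormatsSketch`, `ChispaSketch`, and `RowsDontMatch` are hypothetical stand-ins invented for illustration, not the real chispa classes, and the sketch compares plain Python tuples so it runs without Spark.

```python
from dataclasses import dataclass, field
from itertools import zip_longest


@dataclass
class FormatsSketch:
    # Hypothetical stand-in for the MyFormats idea: each field is a list of style names.
    mismatched_rows: list = field(default_factory=lambda: ["light_yellow"])
    matched_rows: list = field(default_factory=lambda: ["cyan", "bold"])


class RowsDontMatch(AssertionError):
    """Raised by the sketch when two row lists differ."""


class ChispaSketch:
    # Hypothetical class, NOT the real chispa implementation. It only shows
    # how formats could be stored once and consulted by each assertion method.
    def __init__(self, formats):
        self.formats = formats

    def assert_basic_rows_equality(self, rows1, rows2):
        if rows1 == rows2:
            return
        styles = ",".join(self.formats.mismatched_rows)
        # zip_longest pads the shorter list with None so extra rows are reported too.
        lines = [
            f"[{styles}] {r1} | {r2}"
            for r1, r2 in zip_longest(rows1, rows2)
            if r1 != r2
        ]
        raise RowsDontMatch("\n".join(lines))
```

A test would then receive a configured instance (for example from a pytest fixture) and call `assert_basic_rows_equality` on it, so the formatting choice lives in one place rather than in every call.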


MrPowers · 2024-07-17T20:28:56Z

Let's brainstorm some of the "big issues" with chispa:

bad for wide table DataFrame comparisons
doesn't handle some column types well
probably doesn't handle some edge cases well (e.g. array columns with NaN values)
user can't customize formatting
some bad abstractions (e.g. the underline_cells argument)
users can't disable terminal formatting characters (sometimes users want to run this in a notebook and don't want any terminal formatting in the output)

Here are some project goals:

always maintain backward compatibility whenever possible
output beautiful error messages and make it easier for users to unit test their PySpark code
allow users to run unit tests in a performant manner

For chispa 1.0, it might be better to build new interfaces rather than modify the existing interfaces. But I'd rather not make chispa 1.0 backward incompatible. Let's align on vision & interfaces.
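As an illustration of the NaN edge case listed above: in Python, NaN never compares equal to itself, so a naive `==` on array values containing NaN reports a mismatch even when the data is identical. The sketch below is one possible NaN-aware comparison. `values_equal` is a hypothetical helper, not part of chispa, and whether NaN should count as equal to NaN is a design choice it simply assumes.

```python
import math


def values_equal(a, b):
    """Hypothetical helper: equality that treats NaN as equal to NaN
    and recurses into lists and tuples (e.g. collected array columns)."""
    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return True
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        # Lengths must match, then every element pair must match.
        return len(a) == len(b) and all(
            values_equal(x, y) for x, y in zip(a, b)
        )
    return a == b
```

With this helper, `values_equal([1.0, float("nan")], [1.0, float("nan")])` holds, while the plain comparison `float("nan") == float("nan")` does not.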

SemyonSinchenko · 2024-07-18T13:48:07Z

For chispa 1.0, it might be better to build new interfaces rather than modify the existing interfaces. But I'd rather not make chispa 1.0 backward incompatible. Let's align on vision & interfaces.

Why not add a new API and keep the old one, only raising DeprecationWarnings from it? Or even just create a chispa.v2 API.
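A minimal sketch of the deprecation approach, using only the standard `warnings` module. Both function names are hypothetical and stand in for an old and a new entry point; they are not chispa functions.

```python
import warnings


def assert_rows_equal_v2(rows1, rows2):
    # Hypothetical new-style entry point.
    if rows1 != rows2:
        raise AssertionError(f"{rows1} != {rows2}")


def assert_rows_equal(rows1, rows2):
    # Hypothetical old entry point, kept so existing test suites keep working.
    warnings.warn(
        "assert_rows_equal is deprecated; use assert_rows_equal_v2",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    assert_rows_equal_v2(rows1, rows2)
```

The old function keeps its behavior by delegating to the new one, so nothing breaks for existing users, and pytest surfaces the DeprecationWarning in its warnings summary.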

MrPowers · 2024-07-18T14:06:47Z

Yep, I already started building that new interface with Chispa(formats=MyFormats()). We may want to expose the public API via Chispa going forward. I think we just need to figure out exactly the public interface that we want to expose to end users. The public interface should meet all the project goals, should be flexible enough to allow for customizations, and should be easy to run with the defaults.

fpgmaas · 2024-07-19T05:25:18Z

user can't customize formatting

I already started building that new interface with Chispa(formats=MyFormats()). [...]

@MrPowers For a proposed new way to configure formatting, see #127, which would change the call for users to something like:

Chispa(
    formats=FormattingConfig(
        mismatched_rows={"color": "light_yellow"}
    )
)

fpgmaas · 2024-07-19T09:29:54Z

I think the best way to move forward is to simply create separate issues for the following topics:

bad for wide table DataFrame comparisons
doesn't handle some column types well
probably doesn't handle some edge cases well (e.g. array columns with NaN values)
user can't customize formatting
some bad abstractions (e.g. the underline_cells argument)
users can't disable terminal formatting characters (sometimes users want to run this in a notebook and don't want any terminal formatting in the output)

That way we can discuss them separately. We add them to the milestone for a 1.0 release, ship features and changes one by one by incrementing the minor version, and release 1.0 once all the desired changes and features are finished.
