-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
POC: Refactor tests that checks the string output from the info module #3564
base: main
Are you sure you want to change the base?
Conversation
with clib.Session() as lib: | ||
with lib.virtualfile_from_matrix(data) as vfile: | ||
with GMTTempFile() as outfile: | ||
lib.call_module("info", [vfile, f"->{outfile.name}"]) | ||
output = outfile.read(keep_tabs=True) | ||
bounds = "\t".join([f"<{col.min():.0f}/{col.max():.0f}>" for col in data.T]) | ||
expected = f"<matrix memory>: N = {shape[0]}\t{bounds}\n" | ||
assert output == expected | ||
lib.call_module("info", [vfile, "-C", f"->{outfile.name}"]) | ||
output = outfile.loadtxt() | ||
npt.assert_equal(output[::2], data.min(axis=0)) | ||
npt.assert_equal(output[1::2], data.max(axis=0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or we can change it to:
with clib.Session() as lib:
with (
lib.virtualfile_from_matrix(data) as vintbl,
lib.virtualfile_out(kind="dataset") as vouttbl,
):
lib.call_module("read", [vintbl, vouttbl, "-Td"])
output = lib.virtualfile_to_dataset(vfname=vouttbl, output_type="numpy")
npt.assert_equal(output, data)
Pros:
- No need to use
GMTTempFile
- Check the full data array rather than just the min/max values
- This is exactly what we are doing (virtualfile_in/virtualfile_out/call_module/virtualfile_to_dataset) when wrapping modules
Cons:
- Calls four Session functions in a single test, violating the rule that "unit test should test one thing"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems @weiji14 is against the changes in this comment. Are you OK with the changes in https://github.com/GenericMappingTools/pygmt/pull/3564/files, i.e., checking the numerical min/max values rather than checking the string output from info
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For comparison, we sometimes use Session.write_data
to write the dataset into a temporary file and load it to compare the full data array.
pygmt/pygmt/tests/test_clib_put_vector.py
Lines 46 to 56 in 82b0c73
with GMTTempFile() as tmp_file: | |
lib.write_data( | |
"GMT_IS_VECTOR", | |
"GMT_IS_POINT", | |
"GMT_WRITE_SET", | |
wesn, | |
tmp_file.name, | |
dataset, | |
) | |
# Load the data and check that it's correct | |
newx, newy, newz = tmp_file.loadtxt(unpack=True, dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're missing the <matrix memory>: N
part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector <vector memory>
. Also missing the count/length of the array.
I think it's best to keep this unit test simple, we're also using .loadtxt
here instead of .read
, which means it is going through numpy.loadtxt
that introduces another layer or source of potential breakages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're missing the
<matrix memory>: N
part which checks that the array was loaded as a GMT matrix, rather than via a GMT vector<vector memory>
.
I actually don't care if it's a matrix or vector memory (it's just decided by the GMT_VIA_VECTOR
/GMT_VIA_MATRIX
modifier, as long as the data array is correct.
The current way in the main branch (checking string output from info
) only checks the min/max values in each column, which is not 100% correct, since the actual data array that GMT API gets may differ from the array we pass without affecting the min/max values. That's why checking the full data array is more robust. Please see the Session.write_data
way above.
Instead of check the string output from
info
, I feel checking the numpy array is more accurate and more readable.This is just a proof of concept. I'll work on other tests if you agree with the changes.
Please vote by 👍 or 👎 or leave comments.