Fix: PDF Scaling Bug #1382
This fixes the
There are two problems discussed in the linked issue #1082.
1. Data being transformed without explicit user settings
I was able to reproduce the unexpected data scaling, which depends on the type of image being saved. The environment and modified MRE are shown below in the code section. To get the expected behavior, uncomment the lines below the note in the MRE.
[Image: Unexpected data scaling (no explicit plot dimensions given for vector images)]
[Image: Expected behavior (explicit plot dimensions given for vector images)]
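For reference, the explicit dimensions in the MRE's note (plot_width=1920, plot_height=1440) equal matplotlib's default 6.4 x 4.8 inch figure size multiplied by 300 dpi. The sketch below only shows that arithmetic; `pixel_size` is a hypothetical helper, not part of datashader or matplotlib, and whether the values were chosen this way is an assumption.

```python
# Hypothetical helper (not a datashader or matplotlib API): convert a figure
# size in inches to whole pixels at a given dpi.
def pixel_size(figsize_inches, dpi):
    width_in, height_in = figsize_inches
    return round(width_in * dpi), round(height_in * dpi)


DEFAULT_FIGSIZE = (6.4, 4.8)  # matplotlib's default figure.figsize, in inches

print(pixel_size(DEFAULT_FIGSIZE, 100))  # (640, 480) at the default figure.dpi
print(pixel_size(DEFAULT_FIGSIZE, 300))  # (1920, 1440) at the dpi used in the MRE
```

The same figure therefore maps to very different pixel grids depending on the dpi in effect when the raster is computed.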
datashader/datashader/mpl_ext.py Lines 255 to 272 in 6802220
datashader/datashader/mpl_ext.py Lines 282 to 288 in 6802220
I need to investigate further, but I think the call to
2. Color variations due to the type of image being saved (bitmap vs vector)
Pixel color values are computed based on a combination of
You can see this behavior in the following movie (dpi-changes.webm), where I make the final image's total area progressively smaller using the supplied notebook, without the changes from this PR. This behavior is expected, but may not be documented well enough.
@nvictus and @thomas-reimonn the PR modifies default
Code
Install the environment and run the MRE.
mamba env create -f environment.yaml && mamba activate datashader-issue1082
python example.py
# environment.yaml
name: datashader-issue1082
channels:
- conda-forge
dependencies:
- python ==3.8.12
# Package managers
- pip
# Dependencies
- datashader ==0.14.0
- matplotlib ==3.2.2
- numpy ==1.21.6
- pandas ==1.1.3
# example.py
import datashader as ds
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from datashader.mpl_ext import dsshow

EXTENSIONS = ["pdf", "png", "svg"]
SIZE = 100_000
DPIS = ["300", "default"]


def main():
    # Fake data for testing
    x = np.random.normal(size=SIZE)
    y = x * 3 + np.random.normal(size=SIZE)
    df = pd.DataFrame(columns=["xs", "ys"], data=np.array([x, y]).T)
    # Save one figure per (dpi, extension) combination
    for dpi in DPIS:
        for extension in EXTENSIONS:
            fig, ax = plt.subplots()
            dsartist = dsshow(
                df=df,
                glyph=ds.Point("xs", "ys"),
                aggregator=ds.count(),
                vmin=0,
                vmax=100,
                norm="linear",
                aspect="auto",
                ax=ax,
                # NOTE: Uncomment without the PR changes to get similarly scaled images.
                # plot_width=1920,
                # plot_height=1440,
            )
            plt.title(f"{dpi} dpi {extension}")
            if dpi == "300":
                fig.savefig(f"test_{dpi}_dpi.{extension}", dpi=300)
            else:
                fig.savefig(f"test_{dpi}_dpi.{extension}")
            plt.close()


if __name__ == "__main__":
    main()
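The color variation described in point 2 can be illustrated without datashader. The sketch below is a simplified stand-in for a count aggregation, not datashader's implementation: it bins the same points onto canvases of different pixel sizes and reports the peak count per pixel. The helper `count_grid` and the plotted ranges are assumptions made for illustration.

```python
import random


def count_grid(xs, ys, width, height, x_range, y_range):
    """Count points per pixel on a width x height canvas."""
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue  # point lies outside the plotted range
        col = min(int((x - x0) / (x1 - x0) * width), width - 1)
        row = min(int((y - y0) / (y1 - y0) * height), height - 1)
        grid[row][col] += 1
    return grid


# Same kind of fake data as the MRE, with a fixed seed for repeatability.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(100_000)]
ys = [3 * x + rng.gauss(0, 1) for x in xs]

results = {}
for width, height in [(640, 480), (320, 240), (160, 120)]:
    grid = count_grid(xs, ys, width, height, (-4.0, 4.0), (-12.0, 12.0))
    peak = max(max(row) for row in grid)
    total = sum(sum(row) for row in grid)
    results[(width, height)] = (peak, total)
    print(f"{width}x{height}: peak count per pixel = {peak}, total = {total}")
```

The total number of counted points stays the same, but fewer pixels means more points per pixel. With a fixed color range such as vmin=0, vmax=100, the same data therefore maps to different colors as the canvas shrinks.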
@amaloney not sure yet if this addresses your concerns, but here is an explanation of the change: The
Notably, matplotlib has the expectation that the default value of unsampled is False. It was my mistake to set the default value to True in the original implementation of the DSArtist (I was the main implementer), as I wanted to avoid additional manipulation by matplotlib. This didn't matter in scenario 1, so the scaling bug only surfaces when using a matplotlib renderer that does not support affine transformation.
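As background, and as far as matplotlib's image code documents it: with unsampled=False the image array is resampled to the output pixel grid before drawing, while with unsampled=True the native array is returned unscaled together with an affine transform for the renderer to apply. The sketch below is a rough illustration of the resampling path only. It uses plain nearest-neighbour lookup, and `resample_nearest` is a hypothetical helper, not matplotlib's implementation.

```python
# Simplified nearest-neighbour resampling; NOT matplotlib's implementation.
def resample_nearest(image, out_width, out_height):
    """Stretch a 2D list of values to out_height rows by out_width columns."""
    in_height, in_width = len(image), len(image[0])
    return [
        [
            image[row * in_height // out_height][col * in_width // out_width]
            for col in range(out_width)
        ]
        for row in range(out_height)
    ]


native = [[1, 2],
          [3, 4]]

# The 2x2 array is stretched to fill a 4x4 output grid.
for row in resample_nearest(native, 4, 4):
    print(row)
```

If the array is instead passed through at its native size and the renderer does not apply the accompanying transform, the image can end up covering a different extent than the axes drawn around it.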
Overall, this LGTM. I don't think the datashader-image.pdf file or the notebook is necessary, since both are there only as examples showing the bug and the fix.
Thank you for the feedback on this! I removed the demo notebook and image.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1382   +/-   ##
=======================================
  Coverage   88.42%   88.42%
=======================================
  Files          93       93
  Lines       18707    18705    -2
=======================================
- Hits        16541    16540    -1
+ Misses       2166     2165    -1
Thanks for catching that!
Datashader currently has a known bug: when a figure is saved to PDF, the rasterized Datashader output is scaled differently from the vector-based elements (axes, labels, and so on). This PR fixes the issue and demonstrates the fix by rendering Datashader images correctly within a single PDF figure.