This repository has been archived by the owner on Jul 18, 2024. It is now read-only.

Huge memory usage for GPU when using textures #23

Open
terhechte opened this issue Dec 19, 2022 · 9 comments
Labels
bug Something isn't working help wanted Extra attention is needed

Comments

@terhechte

terhechte commented Dec 19, 2022

Hey, so I had some fun porting text rendering to Forma using parley:

Screenshot 2022-12-19 at 22 18 50

However, when using the GPU, the memory usage is quite high (e.g. for this demo, in release mode, ~400MB on my Mac). When I then start resizing the window, it quickly grows to ~600MB and beyond.

This doesn't happen when using a CPU runner. If I render text without emoji, I don't see this behaviour with the GPU runner either.

The way emoji are rendered is by rendering the glyph to a small image (16x16 right now) and then rendering this image with a 16x16 Path.

So only if I'm rendering emoji images on the GPU do I see huge memory consumption (and increasing memory consumption).

Do I need to perform some sort of cleanup when rendering images using Forma with the GPU renderer?

@terhechte
Author

I've uploaded my code here. When enabling the GPU renderer (examples/demo/main.rs:114), the memory usage goes through the roof.

Also, I'm getting a lot of artefacts when enabling clipping (even on the CPU): the first render is ok, but once I compose again (due to changes in the scene), everything but the first layer seems to disappear. I'm probably doing something wrong here; examples/demo/draw.rs:72 is where clipping is enabled.

@dragostis
Contributor

Thanks for reporting this. I'll take a look at the high memory consumption and report back.

@dragostis
Contributor

dragostis commented Dec 20, 2022

Part of the issue with the memory usage is the large atlas size for textures (4096 x 4096), but also the format, which is rgbafloat16. This is easy to improve upon, but there is another underlying issue.

It seems like performance is not great on macOS GPUs. If someone has more time to investigate which of the shaders is taking too long, it would be appreciated, since wgpu does not have timestamp query implemented on Metal. My intuition is that it's the sorter that's the main issue.
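
For a sense of scale on the atlas figures above, here is a rough back-of-the-envelope sketch. It is not taken from Forma's source. It assumes a single uncompressed atlas with no mipmaps and 8 bytes per texel for a 16-bit float RGBA format. The `texture_bytes` and `tiles_per_atlas` helpers and the 8-bit RGBA comparison are hypothetical, added only for illustration.

```rust
/// Bytes needed for one uncompressed 2D texture without mipmaps.
/// (Hypothetical helper for illustration; not part of Forma.)
fn texture_bytes(width: u64, height: u64, bytes_per_texel: u64) -> u64 {
    width * height * bytes_per_texel
}

/// Number of square tiles of side `tile` that fit in a `width` x `height` atlas,
/// assuming a simple grid layout with no padding (an assumption, not Forma's packer).
fn tiles_per_atlas(width: u64, height: u64, tile: u64) -> u64 {
    (width / tile) * (height / tile)
}

fn main() {
    const MIB: u64 = 1024 * 1024;

    // Atlas dimensions quoted in the comment above.
    let (w, h) = (4096, 4096);

    // RGBA with 16-bit float channels: 4 channels * 2 bytes = 8 bytes per texel.
    let rgba16f = texture_bytes(w, h, 8);
    // Assumed alternative for comparison only: RGBA with 8-bit channels.
    let rgba8 = texture_bytes(w, h, 4);

    println!("16-bit float RGBA atlas: {} MiB", rgba16f / MIB);
    println!("8-bit RGBA atlas:        {} MiB", rgba8 / MIB);
    println!("16x16 tiles per atlas:   {}", tiles_per_atlas(w, h, 16));
}
```

Under these assumptions, a single atlas accounts for 128 MiB before any emoji are drawn, and a 16x16 emoji image occupies only a tiny fraction of it. This does not by itself explain the growth seen while resizing.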

@dragostis added the "bug" and "help wanted" labels on Dec 20, 2022
@dragostis
Contributor

About the cropping, I don't think I understand the issue. If you're trying to crop the geometry that lies outside of the screen, this should be something we optimize anyway. See #25.

@terhechte
Author

terhechte commented Dec 20, 2022

Thanks for the reply!

> About the cropping, I don't think I understand the issue. If you're trying to crop the geometry that lies outside of the screen, this should be something we optimize anyway. See #25.

Yeah, that's what I was trying to do by having one "base" layer (the size of the window) and then clipping all additional layers after that. But what's happening is that this renders fine, but when the scene changes (e.g. I zoom or translate the transform) then most layers seem to disappear (at least that's how I interpret it):

clipping.mp4

However, #25 would be the much better solution anyway.

@dragostis
Contributor

I merged #29 which should improve performance when zooming in. We can slowly improve on the other issues as well.

@terhechte
Author

Thanks! I'll try it out tomorrow

@FoundationKen

@terhechte How is the parley integration coming? Did the performance improve with the PR from @dragostis?

@terhechte
Author

@FoundationKen Yeah, performance and memory usage did improve. Then I got distracted by another side project 🤷
