WebLLM always processes on Intel UHD Graphics, not on NVIDIA T1200 #609

Open
b521f771d8991e6f1d8e65ae05a8d783 opened this issue Oct 10, 2024 · 9 comments

@b521f771d8991e6f1d8e65ae05a8d783

Hi!

I could not find any other forum to post this, so I will write it here: I am trying to use WebLLM in a Chromium-based browser (I am developing an add-in for Outlook, which uses a Blink WebView under the hood). So far, WebLLM works, but it always runs on my integrated graphics chip and leaves my dedicated GPU untouched. How can this behaviour be configured?

Thank you so far for your effort!

@ReneLH

ReneLH commented Oct 24, 2024

I have the exact same question.

@tqchen
Contributor

tqchen commented Oct 24, 2024

It would be great if you could check https://webgpureport.org/ and send a screenshot. It may have to do with how we order adapters.

@Iternal-JBH4

You can force Chrome on Windows to use the more powerful GPU by going to the Display > Graphics > Apps page, adding Chrome, clicking Options, and setting it to use the dedicated GPU.

Not an ideal outcome, but that is how it works right now.

@StevenHanbyWilliams

I also ran into this on Windows 10 and 11. Chrome, Edge, and Brave all return only the low-power GPU, even if you request the high-performance powerPreference. You can verify this from the JS console by opening DevTools and running

const adapter = await navigator.gpu.requestAdapter({powerPreference: 'high-performance'})

and inspecting the result.

Note: The powerPreference IS honored on macOS.
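
To make "inspecting the result" a bit more concrete, here is a minimal sketch that reports which adapter comes back for a given preference. The helper name `describeAdapter` is made up for illustration and is not part of WebLLM or tvmjs. It takes the `gpu` object as a parameter (normally `navigator.gpu`) and assumes the browser exposes `adapter.info`, falling back to the older `requestAdapterInfo()` method where that is all that exists.

```javascript
// Sketch only: report which adapter WebGPU hands back for a power preference.
// `gpu` is expected to be `navigator.gpu`; it is passed in as a parameter so
// the helper does not depend on a global.
async function describeAdapter(gpu, powerPreference) {
  if (!gpu) {
    return { ok: false, reason: "WebGPU is not available" };
  }
  const adapter = await gpu.requestAdapter({ powerPreference });
  if (!adapter) {
    return { ok: false, reason: "no adapter returned" };
  }
  // Assumption: newer browsers expose `adapter.info`; some older Chromium
  // builds only offered the async `requestAdapterInfo()` method.
  let info = adapter.info;
  if (!info && typeof adapter.requestAdapterInfo === "function") {
    info = await adapter.requestAdapterInfo();
  }
  info = info ?? {};
  return {
    ok: true,
    requested: powerPreference,
    vendor: info.vendor ?? "",
    architecture: info.architecture ?? "",
    description: info.description ?? "",
  };
}
```

In DevTools this could be run as `console.log(await describeAdapter(navigator.gpu, 'high-performance'))`. If the vendor still reads as the integrated GPU, the preference was not honored.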

@marschr

marschr commented Nov 27, 2024

I have a hybrid Intel + NVIDIA GPU setup. Although the NVIDIA GPU is far more powerful, I tend to use the Intel one for compatibility checks, mainly for mobile devices.

With that in mind, I ask if the function that web-llm calls here:

const gpuDetectOutput = await tvmjs.detectGPUDevice();

is this one linked below?
https://github.com/apache/tvm/blob/7ae7ea836169d3cf28b05c7d0dd2cb6a2045508e/web/src/webgpu.ts#L36

I'm asking because, even though it should default to the higher-performance one, it would be nice to have the option to use the low-power GPU when requested.
What would be the direction here? Should I open a PR on the apache/tvm repo?
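
To illustrate the kind of option being asked for, here is a rough sketch of a detection helper that accepts a power preference. This is a hypothetical function, not the actual `tvmjs.detectGPUDevice` implementation or signature. The name `detectGPUDeviceWithPreference`, the `gpu` parameter, and the default value are all assumptions made for the example.

```javascript
// Hypothetical helper, NOT the real tvmjs.detectGPUDevice.
// It only sketches how a caller-supplied power preference could be
// forwarded to the standard WebGPU requestAdapter() call.
async function detectGPUDeviceWithPreference(gpu, powerPreference = "high-performance") {
  const allowed = ["low-power", "high-performance"];
  if (!allowed.includes(powerPreference)) {
    throw new Error(`Unsupported powerPreference: ${powerPreference}`);
  }
  if (!gpu) {
    return undefined; // WebGPU not available in this environment
  }
  const adapter = await gpu.requestAdapter({ powerPreference });
  if (!adapter) {
    return undefined; // no adapter matched the request
  }
  const device = await adapter.requestDevice();
  return { adapter, device, powerPreference };
}
```

A caller wanting the integrated GPU would then use something like `await detectGPUDeviceWithPreference(navigator.gpu, 'low-power')`. Whether the browser actually honors the preference is a separate question, as the comments above point out.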

@zhibisora

> I have a hybrid Intel + NVIDIA GPU setup. Although the NVIDIA GPU is far more powerful, I tend to use the Intel one for compatibility checks, mainly for mobile devices.
>
> With that in mind, I ask if the function that web-llm calls here:
>
> const gpuDetectOutput = await tvmjs.detectGPUDevice();
>
> is this one linked below?
> https://github.com/apache/tvm/blob/7ae7ea836169d3cf28b05c7d0dd2cb6a2045508e/web/src/webgpu.ts#L36
> I'm asking because, even though it should default to the higher-performance one, it would be nice to have the option to use the low-power GPU when requested. What would be the direction here? Should I open a PR on the apache/tvm repo?

The current issue is that WebLLM is unable to use the NVIDIA GPU on hybrid Intel + NVIDIA devices, so you don't need to worry about the NVIDIA GPU being used by default for now.

const adapter = await navigator.gpu.requestAdapter({powerPreference: 'high-performance'})

This code does not work as it should on a dual-GPU device.

I think the way forward should be to first make high-performance GPUs reachable through WebGPU, and then provide parameters to choose which device to use. As for the default device, I support using the integrated graphics, but more discussion is still needed.

@zhibisora

I have the same issue on an R7000p laptop.

Some relevant information can be found here.
https://developer.mozilla.org/en-US/docs/Web/API/GPU/requestAdapter

@marschr

marschr commented Dec 4, 2024

> The current issue is that WebLLM is unable to use the NVIDIA GPU on hybrid Intel + NVIDIA devices, so you don't need to worry about the NVIDIA GPU being used by default for now.
>
> const adapter = await navigator.gpu.requestAdapter({powerPreference: 'high-performance'})
>
> This code does not work as it should on a dual-GPU device.
>
> I think the way forward should be to first make high-performance GPUs reachable through WebGPU, and then provide parameters to choose which device to use. As for the default device, I support using the integrated graphics, but more discussion is still needed.

Thanks for the reply!

This is odd: both https://webgpureport.org/ (see below) and https://github.com/huggingface/transformers.js see both the Intel GPU and the NVIDIA GPU.

For context, huggingface/transformers.js seems to use ONNX under the hood to support WebGPU, and there's a flag that overrides the use of one or the other, something like env.backends.onnx.webgpu.powerPreference = 'low-power' or 'high-performance'.

I've switched to web-llm because it currently seems to offer better performance and model support.

I also support defaulting to the integrated graphics (or even an NPU/Neural Engine etc. in the future, if something like WebNN becomes a reality) and leaving it to the implementation to notify the user to switch to more powerful hardware. For now, though, a flag or an argument to select the GPU would be just fine.

I've attached the screenshot below from webgpureport.org on my system for further inspection:
Screenshot from 2024-12-04 01-18-05

I did open a PR to address device selection in TVM's repo, but I don't feel like it's going to get merged anytime soon: https://github.com/apache/tvm/pull/17545/files

@marschr

marschr commented Dec 4, 2024

After half an hour of googling around, I came across the --use-webgpu-power-preference=force-low-power command-line flag in this Chromium source, at about line 60.
This way you can force your Chromium-based browser to use the low-power GPU, in my case the Intel one.
My NVIDIA GPU still shows up on chrome://gpu, but sites cannot see it anymore. The integrated Intel GPU spikes to 100% usage when running web-llm inference (checked with intel_gpu_top), while the RTX 4060 remained idle.
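
One way to check from a page whether a flag like this took effect is to request an adapter for each preference and compare what comes back. The sketch below assumes `adapter.info` is available. The helper name `comparePowerPreferences` is invented for this example, and the comparison is only a heuristic based on the reported info strings, so two identical GPUs would look like the same adapter.

```javascript
// Sketch only: request an adapter for both power preferences and compare
// the reported info. `gpu` is expected to be `navigator.gpu`.
async function comparePowerPreferences(gpu) {
  const summarize = async (powerPreference) => {
    const adapter = await gpu.requestAdapter({ powerPreference });
    if (!adapter) {
      return null;
    }
    // Assumption: `adapter.info` exists; missing fields become empty strings.
    const info = adapter.info ?? {};
    return [info.vendor, info.architecture, info.device, info.description]
      .map((field) => field ?? "")
      .join("/");
  };
  const lowPower = await summarize("low-power");
  const highPerformance = await summarize("high-performance");
  // Heuristic: identical info strings suggest both requests resolved to the
  // same physical adapter.
  return { lowPower, highPerformance, sameAdapter: lowPower === highPerformance };
}
```

With the flag set, one would expect `sameAdapter` to be `true` and both entries to name the Intel GPU, which would match the observation above that sites can no longer see the NVIDIA card. That expectation is an assumption and was not verified here.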

7 participants