
update to latest ggml #134

Merged · 4 commits · Jan 5, 2024
Conversation

leejet (Owner) commented Jan 1, 2024

Still needs some work. When generating images larger than 512x512, it produces invalid images. This might be an internal issue within ggml; I haven't had time to pinpoint the exact cause yet.

.\bin\Release\sd.exe -m ..\..\stable-diffusion-webui\models\Stable-diffusion\v2-1_768-nonema-pruned.safetensors -p "a lovely cat"  -H 768 -W 768 -v
.\bin\Release\sd.exe -m ..\..\stable-diffusion-webui\models\Stable-diffusion\sd_xl_turbo_1.0_fp16.safetensors --vae ..\..\stable-diffusion-webui\models\VAE\sdxl_vae-fp16-fix.safetensors -p "a lovely cat" -v   -H 768 -W 768  --cfg-scale 1 --steps 1

Output:

leejet mentioned this pull request on Jan 3, 2024
FSSRepo (Contributor) commented Jan 3, 2024

I will see if I can help with anything.

FSSRepo (Contributor) commented Jan 4, 2024

@leejet Do you get these errors on CPU or CUDA? I ran tests on both backends and obtained correct results.

If you see errors with CUDA: in ggml-cuda.cu, in ggml_backend_cuda_buffer_set_tensor, move the cudaMemcpy call so it runs before the device synchronization:

static void ggml_backend_cuda_buffer_set_tensor(ggml_backend_buffer_t buffer, ggml_tensor * tensor, const void * data, size_t offset, size_t size) {
    GGML_ASSERT(tensor->backend == GGML_BACKEND_GPU);

    ggml_backend_buffer_context_cuda * ctx = (ggml_backend_buffer_context_cuda *)buffer->context;

    ggml_cuda_set_device(ctx->device);
    // Copy the host data to the device first...
    CUDA_CHECK(cudaMemcpy((char *)tensor->data + offset, data, size, cudaMemcpyHostToDevice));

    // ...then synchronize, so the copy has completed before this function returns.
    CUDA_CHECK(cudaDeviceSynchronize());
}

leejet (Owner, Author) commented Jan 5, 2024

> @leejet Do you get these errors on CPU or CUDA? I ran tests on both backends and obtained correct results.
>
> If you see errors with CUDA: in ggml-cuda.cu, in ggml_backend_cuda_buffer_set_tensor, move the cudaMemcpy call so it runs before the device synchronization: (code quoted above)

This issue only occurred on CUDA. Moving the cudaMemcpy call before the device synchronization fixed the problem.

leejet merged commit 2b6ec97 into master on Jan 5, 2024
7 checks passed