Very simple http server #367
I'm excited about this one, and was attempting to combine it with Vulkan. I'm seeing a compile-time issue (around the pingpong function) in my merge, and it seems to be in the original as well.
Ah! This function should go. I just added it at the start of development to see if I was able to connect to the server. If it's causing issues, just remove it, along with the few things that depend on it.
@theaerotoad just out of curiosity, which C++ compiler are you using? MSVC had no issue with this code (which I believe was technically incorrect).
Tested it on gcc 12.2.0-14 on Debian.
Yup, removing the pingpong endpoint allows compilation. Another thought: the default 'localhost' string didn't work on my end initially. It looks like I was able to generate an image via requests, but got a segfault immediately afterwards.
I've played around a bit (not much of a C++ coder at this point) and can't reliably track down where it's coming from, though. I'm running with batch 1 (so only one image), and the first image gets written properly, with tags, then the dreaded segfault.
Maybe you could try on the CPU backend to see if the segfault is related to the Vulkan merge or to the server itself? (Also, you should probably use a less demanding model than flux when testing.)
Right, I should have said I ran the earlier example with the CPU backend (tried with no BLAS just to confirm it wasn't my merge that caused this!). It's much faster with Vulkan. I can confirm I seem to throw a segfault with the server every time with:
For each of the above, they run fine with the main cli example (although painfully slowly on CPU).
Hmm, it doesn't happen on my machine, which is annoying to debug. I'll try running it on WSL to see if it's a Linux thing. Edit: It does happen on WSL too! So maybe I can fix it.
@theaerotoad I believe it's fixed now.
@stduhpf Yup, that fixes it. Thank you! Sure nice not to have to reload everything each time.
@stduhpf This is working pretty well; I played around with it a bit this weekend. I have a few tweaks: enabling other inputs to be specified (via HTML form inputs), returning the image as part of the POST response, and reducing CPU usage by using t.join() rather than while(1) at the end. Do you want them? I may just share them as a gist, or I can branch off your repo. What's your preference?
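The join-versus-spin point above can be illustrated in Python. The server itself is C++, so this is only an analogy under stated assumptions: `serve` is a hypothetical stand-in for the server's listening loop, and the stop event exists only so the sketch terminates.

```python
import threading
import time


def serve(stop_event: threading.Event) -> None:
    """Hypothetical stand-in for the server's listening loop."""
    while not stop_event.is_set():
        time.sleep(0.01)  # pretend to wait for and handle requests


def run_until_stopped(stop_event: threading.Event) -> bool:
    """Start the worker, then block in join() instead of spinning."""
    worker = threading.Thread(target=serve, args=(stop_event,))
    worker.start()
    # A busy-wait such as `while True: pass` here would keep a core at 100%.
    # join() parks the calling thread until the worker returns.
    worker.join()
    return not worker.is_alive()
```

Calling `run_until_stopped(event)` returns once something sets `event`; until then the main thread sleeps rather than polling.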
@theaerotoad Both options are fine with me, thanks for helping. I thought about returning the image in base64 after each generation, but I was too lazy to implement it.
I just spent hours trying to understand why the server wasn't sending the image metadata as it is supposed to. It turns out PIL automatically strips out the metadata; the server was working fine 🙃.
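One way to check what the server actually wrote, without going through an imaging library at all, is to read the PNG text chunks directly. The sketch below uses only the standard library and assumes the metadata lives in uncompressed `tEXt` chunks; the key name `parameters` used in the usage note is an assumption, not something confirmed in this thread. `make_chunk` is a helper for building a small test image.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(png_bytes: bytes) -> dict:
    """Return the key/value pairs stored in tEXt chunks of a PNG."""
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(png_bytes):
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"tEXt":
            key, _, value = data.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return texts


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Build one PNG chunk (only used here to craft a test image)."""
    crc = zlib.crc32(ctype + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)
```

Applied to the decoded bytes of a response image, `read_png_text_chunks(pngbytes).get("parameters")` would show whether the generation settings survived the trip, before any client-side library has touched the file.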
There are some differences from the AUTOMATIC1111 v1 webui API.
This info might be outdated, however. I just wanted to make my bot work with your API, so this just jumped out at me. Edit: links:
I might look into making the API compatible with other standards in the future. For now, I just use the same arguments as the stable-diffusion.cpp/stable-diffusion.h (lines 148 to 164 in 14206fd)
Speaking of, shouldn't the schedule method be specified when calling txt2img() rather than when creating the context?
I see.
I suppose.
If anyone just wants to run a command: curl -sv --json '{"prompt": "a lovely cat", "seed": -1}' 127.0.0.1:7860/txt2img | jq -r .[0].data | base64 -d - > api_result.png
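The same request can be sketched with Python's standard library only, for machines without curl or jq. It assumes what the command above implies: the endpoint accepts a JSON body and answers with a JSON list whose items carry a base64-encoded PNG in a `data` field. The function names and the timeout value are illustrative.

```python
import base64
import json
import urllib.request


def build_request(url: str, payload: dict) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decode_images(response_text: str) -> list:
    """Turn the JSON response into a list of raw PNG byte strings."""
    return [base64.b64decode(item["data"]) for item in json.loads(response_text)]


def txt2img(url: str, payload: dict, timeout: float = 600.0) -> list:
    """Send the request and return the decoded images (needs a running server)."""
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as resp:
        return decode_images(resp.read().decode("utf-8"))


# Usage, with the server from this PR listening on port 7860:
# images = txt2img("http://127.0.0.1:7860/txt2img", {"prompt": "a lovely cat", "seed": -1})
# with open("api_result.png", "wb") as f:
#     f.write(images[0])
```

Generation can take a long time on CPU, hence the generous default timeout.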
leejet/stable-diffusion.cpp#367 commit 1c599839800ed5984e72562968db7e4df5d052bd
@stduhpf thanks for your work. Currently I'm using this PR for PhotoMaker v2, but I get an error when I change the input embedding (I believe it's called "input_id_images_path"). How can I input a different face without reloading the whole SDXL model? Or is there some function to reload the face embedding?
@NNDam You can try with my latest commit. I can't test it on my end, but it should work now?
@stduhpf thanks, I tried but it still doesn't work. The main problem is that, when the model is first loaded, I also need to preload
Oh I see. Well, even if Support for PhotoMaker Version 2 was merged, I couldn't get this to work with the current architecture of the server, sorry. Have you tried with PhotoMaker v1?
Hi @bssrdf, can you help us?
I think some changes need to be made in stable-diffusion.cpp/stable-diffusion.h. Some arguments like scheduler type, vae settings, and controlnets are passed to the
@NNDam, @stduhpf, I briefly looked at the server code. There may be a simple workaround for PhotoMaker.
Could parsing of input_id_images_path be added in the above block and set
This is a very simple server that I made to be able to generate different prompts without reloading the models every time.
Edit (03/01/25):
Mostly outdated instructions
Starting the server
The syntax is pretty much the same as the cli.
How to use (example):
Using the example client script
Make sure requests and pillow are installed: pip install requests pillow
Run the client: python -i examples/server/test_client.py
Simplest setup
Install the requests module: pip install requests
Start Python: python
>>> import requests
Post your prompt directly to the /txt2img endpoint:
>>> requests.post("http://localhost:8080/txt2img","a lovely cat holding a sign says 'flux.cpp'")
Using json payloads
Install the requests module: pip install requests
Start Python: python
>>> import requests, json
>>> payload = {'prompt': """a lovely cat holding a sign says "flux.cpp" """,'height': 768, 'seed': 42, 'sample_steps': 4}
Post the payload to the /txt2img endpoint:
>>> requests.post("http://localhost:8080/txt2img", json.dumps(payload))
Decoding response using pillow
Make sure requests and pillow are installed: pip install requests pillow
Start Python: python
>>> import requests, json, base64
>>> from io import BytesIO
>>> from PIL import Image
>>> response = requests.post("http://localhost:8080/txt2img","a lovely cat holding a sign says 'flux.cpp'")
>>> parsed = json.loads(response.text)
>>> pngbytes = base64.b64decode(parsed[0]["data"])
>>> image = Image.open(BytesIO(pngbytes))
>>> image.show()
One-liner
>>> import requests, json, base64
>>> from io import BytesIO
>>> from PIL import Image
>>> [Image.open(BytesIO(base64.b64decode(img["data"]))).show() for img in json.loads(requests.post("http://localhost:8080/txt2img",json.dumps( {'seed': -1, 'batch_count':4, 'sample_steps':4, 'prompt': """a lovely cat holding a sign says "flux.cpp" """} )).text)]
If you don't want the image viewer to pause the execution of your command, you can do the following (not needed on macOS for some reason):
>>> from threading import Thread
>>> [Thread(target=Image.open(BytesIO(base64.b64decode(img["data"]))).show, args=()).start() for img in json.loads(requests.post("http://localhost:8080/txt2img",json.dumps( {'seed': -1, 'batch_count':4, 'sample_steps':4, 'prompt': """a lovely cat holding a sign says "flux.cpp" """} )).text)]
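For readability, the threaded one-liner can be unpacked into two small functions. This is a sketch of the same idea rather than a replacement: it assumes the response format used above (a JSON list of objects with a base64 `data` field), and the `show` parameter is a hypothetical hook so the viewer can be swapped out; by default it falls back to pillow's `Image.show`.

```python
import base64
import json
from io import BytesIO
from threading import Thread


def decode_images(response_text: str) -> list:
    """Decode every base64 "data" field of the JSON response into PNG bytes."""
    return [base64.b64decode(item["data"]) for item in json.loads(response_text)]


def show_in_background(png_bytes_list, show=None) -> list:
    """Display each image from its own thread so the prompt is not blocked."""
    if show is None:
        from PIL import Image  # imported lazily; requires pillow

        def show(data):
            Image.open(BytesIO(data)).show()

    threads = [Thread(target=show, args=(data,)) for data in png_bytes_list]
    for thread in threads:
        thread.start()
    return threads


# Usage, with requests installed and the server running:
# response = requests.post("http://localhost:8080/txt2img",
#                          json.dumps({"seed": -1, "batch_count": 4, "sample_steps": 4,
#                                      "prompt": "a lovely cat"}))
# show_in_background(decode_images(response.text))
```

Returning the thread list lets the caller `join()` the viewers later if it needs to wait for them.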