feat: add ideogram4 support #1609

Merged
leejet merged 5 commits into master from ideogram4 on Jun 6, 2026

leejet (Owner) commented Jun 4, 2026

Summary

  • Add ideogram4 support

Related Issue / Discussion

N/A

Additional Information

Convert weights

fp8 scale -> bf16

python .\convert_fp8_scale_to_bf16.py --input .\ideogram4_fp8.safetensors --output ideogram4_bf16.safetensors
python .\convert_fp8_scale_to_bf16.py --input .\ideogram4_uncond_fp8.safetensors --output ideogram4_uncond_bf16.safetensors
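
The conversion script itself is not shown in this PR, so the following is only a minimal sketch of what an "fp8 scale -> bf16" step could look like. It assumes (this is not confirmed here) that the weights are stored as float8 E4M3FN bytes with a single per-tensor scale factor, and it stops at ordinary Python floats rather than rounding to bf16. The names decode_e4m3fn and dequantize are hypothetical.

```python
import math


def decode_e4m3fn(byte: int) -> float:
    """Decode one float8 E4M3FN byte: 1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        # E4M3FN has no infinities; this bit pattern encodes NaN.
        return math.nan
    if exponent == 0:
        # Subnormal values have no implicit leading 1.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


def dequantize(raw: bytes, scale: float) -> list[float]:
    """Assumed layout: stored fp8 value times a per-tensor scale gives the real weight."""
    return [decode_e4m3fn(b) * scale for b in raw]


if __name__ == "__main__":
    print(dequantize(bytes([0x38, 0xB8, 0x7E]), 0.5))
```

A real converter would additionally cast the result to bf16 and write it back to a safetensors file; both steps are left out of this sketch.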

bf16 -> q8

.\bin\Release\sd-cli.exe -M convert -m ideogram4_bf16.safetensors -o ideogram4-Q8_0.gguf --tensor-type-rules "^layers.*adaln_modulation.*weight=q8_0,layers.*attention.o.*weight=q8_0,layers.*attention.qkv.*weight=q8_0,layers.*feed_forward.*weight=q8_0" -v

.\bin\Release\sd-cli.exe -M convert -m ideogram4_uncond_bf16.safetensors -o ideogram4_uncond-Q8_0.gguf --tensor-type-rules "^layers.*adaln_modulation.*weight=q8_0,layers.*attention.o.*weight=q8_0,layers.*attention.qkv.*weight=q8_0,layers.*feed_forward.*weight=q8_0" -v
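
To see which tensors the --tensor-type-rules string above would select, the rules can be read as comma-separated pattern=type pairs. The sketch below assumes that each pattern is a regular expression searched against the tensor name and that the first matching rule wins; the real matching order in sd-cli is not shown in this PR. The tensor names used at the bottom are hypothetical examples, not names taken from the model file.

```python
import re

RULES = (
    "^layers.*adaln_modulation.*weight=q8_0,"
    "layers.*attention.o.*weight=q8_0,"
    "layers.*attention.qkv.*weight=q8_0,"
    "layers.*feed_forward.*weight=q8_0"
)


def parse_rules(spec):
    """Split 'pattern=type,pattern=type' into (compiled regex, type name) pairs."""
    rules = []
    for item in spec.split(","):
        if not item:
            continue
        pattern, _, type_name = item.rpartition("=")
        rules.append((re.compile(pattern), type_name))
    return rules


def pick_type(name, rules, default=None):
    """Return the type of the first rule whose pattern is found in the tensor name."""
    for pattern, type_name in rules:
        if pattern.search(name):
            return type_name
    return default  # tensor keeps its original type


if __name__ == "__main__":
    rules = parse_rules(RULES)
    # Hypothetical tensor names, for illustration only.
    for name in ("layers.0.attention.qkv.weight",
                 "layers.0.adaln_modulation.1.bias",
                 "x_embedder.weight"):
        print(name, "->", pick_type(name, rules))
```

Note that the dots in the patterns are regex wildcards, so "attention.o" matches any single character between "attention" and "o". Under these assumptions only the large per-layer weight matrices go to q8_0, while biases and tensors outside "layers" keep the bf16 type.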

Examples

.\bin\Release\sd-cli.exe --diffusion-model ideogram4-Q8_0.gguf --uncond-diffusion-model ideogram4_uncond-Q8_0.gguf --llm ..\..\llm\Qwen3VL-8B-Instruct-Q4_K_M.gguf --vae ..\..\ComfyUI\models\vae\flux2_ae.safetensors -p '{"high_level_description":"A square 1024 x 1024 luxury fashion magazine cover featuring exactly one short chubby fluffy cat as the main model. The cat sits on a soft ivory studio floor, facing the viewer with a stylish calm expression, wearing tiny black sunglasses, a red silk scarf, and a small gold collar charm. In front of the cat on the floor is a wide horizontal luxury nameplate that clearly reads ideogram4.cpp. The whole design feels premium, fashionable, clean, and editorial.","style_description":{"aesthetics":"luxury fashion magazine cover, high-end pet couture campaign, minimalist editorial design, elegant studio photography, soft paper texture, refined typography, fashionable and polished","lighting":"Soft diffused studio lighting, gentle spotlight on the cat, subtle floor shadow, warm ivory highlights, clean separation between subject and background","photo":"high-resolution fashion editorial photography look, front-facing cat portrait, crisp fur details, glossy sunglasses, clear readable nameplate text, shallow depth of field","medium":"mixed media fashion photography and premium editorial graphic design","color_palette":["#F4EFE7","#111111","#D8B56D","#B73A3A","#FFFFFF","#8A7A6A"]},"compositional_deconstruction":{"canvas":"Square 1024 x 1024 canvas with a normal upright orientation. Do not rotate the poster or any text. Use a clean fashion magazine cover layout.","background":"Warm ivory studio backdrop with subtle paper grain, a soft spotlight gradient, faint floor shadow, and a few minimal gold editorial lines. The background is spacious, premium, and uncluttered.","layout":"Top center has a small elegant headline. Center area features one cat as the main fashion model. 
Lower foreground has a wide horizontal luxury nameplate placed on the floor in front of the cat. Bottom center has a small footer. All text is horizontal, upright, and readable left to right.","elements":[{"type":"text","desc":"Top center headline reading LOOK WHAT I FOUND in a refined high-fashion serif font. The headline is horizontal, centered, elegant, and secondary to the nameplate text."},{"type":"obj","desc":"Exactly one short chubby fluffy cat sitting in the center like a luxury fashion model. The cat has a large round head, compact body, short legs, soft detailed fur, expressive eyes, and a calm confident pose. The cat is cute and rounded, not tall, not stretched, not duplicated."},{"type":"obj","desc":"Tiny glossy black sunglasses worn naturally by the cat, slightly oversized but still showing the cat face clearly. The sunglasses add a chic fashion-editorial attitude."},{"type":"obj","desc":"A red silk scarf tied neatly around the cat neck, with soft folds and a couture feeling. The scarf must not cover the cat face or the nameplate."},{"type":"obj","desc":"A small gold collar charm or fashion accessory under the scarf, subtle and premium, adding a luxury campaign detail."},{"type":"obj","desc":"In the lower foreground, place a wide horizontal luxury nameplate on the floor in front of the cat. The nameplate is low, flat, landscape-oriented, much wider than tall, like a fashion show seat card or premium display plaque. It is centered, front-facing, level, and fully visible. It must not become vertical, tall, standing, rotated, or side-facing."},{"type":"text","desc":"Print the exact text ideogram4.cpp only on the wide horizontal nameplate. Use clean bold black lettering, perfectly spelled, lowercase, with the number 4 and .cpp extension. 
The text must fit completely inside the nameplate, stay horizontal, and be readable from left to right."},{"type":"obj","desc":"Add sparse premium editorial accents around the edges: thin gold lines, small code brackets, tiny cursor marks, subtle dots, and minimal geometric details. No extra cats, no stickers, no animal faces, no busy decorations."},{"type":"text","desc":"Bottom center footer reading tiny paws, big compile energy in a small refined monospace or editorial font. The footer is horizontal, centered, understated, and much smaller than the nameplate text."}]}}'  --diffusion-fa -v --offload-to-cpu -H 1024 -W 1024
[Image: output]

Still tweaking it, but at least it hasn’t triggered the safety filter so far.
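
The -p argument in the example above is a single JSON object, and writing it by hand is error-prone. Below is a minimal sketch that assembles a much shorter prompt with the same field names. The helper build_prompt is hypothetical, and whether every field is required by the model is not stated in this PR.

```python
import json


def build_prompt(description, style, canvas, background, layout, elements):
    """Serialize a structured prompt using the field names from the example above."""
    prompt = {
        "high_level_description": description,
        "style_description": style,
        "compositional_deconstruction": {
            "canvas": canvas,
            "background": background,
            "layout": layout,
            # In the example, each element is tagged "text" or "obj".
            "elements": [{"type": kind, "desc": desc} for kind, desc in elements],
        },
    }
    # Compact separators keep the command line short.
    return json.dumps(prompt, ensure_ascii=False, separators=(",", ":"))


prompt = build_prompt(
    description="A square poster featuring exactly one cat behind a placard that reads ideogram4.cpp.",
    style={
        "aesthetics": "high-fashion magazine editorial",
        "lighting": "soft diffused studio lighting",
        "photo": "front-facing composition, crisp fur details",
        "medium": "mixed media fashion photography",
        "color_palette": ["#F4EFE7", "#111111", "#D8B56D"],
    },
    canvas="Square 1024 x 1024 canvas. Do not rotate the poster or any text.",
    background="Warm ivory studio backdrop with subtle paper grain.",
    layout="Cat in the center, wide horizontal placard in the lower foreground.",
    elements=[
        ("obj", "Exactly one short chubby fluffy cat sitting in the center."),
        ("text", "Print the exact text ideogram4.cpp on the placard."),
    ],
)

if __name__ == "__main__":
    print(prompt)
```

The sample text contains no apostrophes, as in the original prompts, so the result can be pasted between single quotes on the command line without further escaping.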

leejet (Owner, Author) commented Jun 4, 2026

Finally, I got it.

.\bin\Release\sd-cli.exe --diffusion-model ideogram4-Q8_0.gguf --uncond-diffusion-model ideogram4_uncond-Q8_0.gguf --llm ..\..\llm\Qwen3VL-8B-Instruct-Q4_K_M.gguf --vae ..\..\ComfyUI\models\vae\flux2_ae.safetensors -p '{"high_level_description":"A square 1024 x 1024 high-fashion editorial poster featuring exactly one short chubby fluffy cat styled like a luxury fashion mascot. The cat sits confidently in the center wearing tiny stylish sunglasses and a silk scarf. In front of the cat, there is a wide low horizontal placard with the exact text ideogram4.cpp printed clearly on it. The composition is upright, elegant, centered, and not rotated.","style_description":{"aesthetics":"high-fashion magazine editorial, luxury pet fashion campaign, minimalist studio set, playful couture mascot styling, polished graphic design, soft paper texture, premium layout","lighting":"Soft diffused studio lighting with a gentle spotlight on the cat, subtle floor shadow, glossy editorial highlights, clean subject separation","photo":"high-resolution fashion editorial photography look, front-facing composition, crisp fur details, clear readable placard text, shallow depth of field, elegant studio perspective","medium":"mixed media fashion photography and editorial poster design","color_palette":["#F4EFE7","#111111","#D8B56D","#C94C4C","#FFFFFF","#7A7A7A"]},"compositional_deconstruction":{"canvas":"Square 1024 x 1024 canvas. Use a normal upright composition. Do not rotate the poster, the cat, the headline, the footer, or the placard text.","background":"A warm ivory fashion studio backdrop with subtle paper grain, soft spotlight gradient, faint runway floor shadow, and minimal abstract editorial marks. Keep the background luxurious, clean, and uncluttered. Do not add extra cats, stickers, animal faces, or busy cartoon decorations.","layout":"Top area contains a small centered headline. Middle area contains the cat as the main subject. 
Lower-middle foreground contains a wide horizontal placard. Bottom area contains a small centered footer. Everything is straight, upright, and readable left to right.","elements":[{"type":"text","desc":"At the top center, an elegant fashion-magazine headline reading LOOK WHAT I FOUND. It is horizontal, centered, upright, straight, and secondary to the placard text. Use a refined bold editorial font with a premium magazine-cover feel."},{"type":"obj","desc":"Exactly one short chubby fluffy cat sitting in the center like a tiny fashion model. The cat faces the viewer directly, with a large round head, compact rounded body, expressive eyes, soft detailed fur, and a calm stylish expression. The cat is cute, short, compact, and not stretched. No duplicate cats."},{"type":"obj","desc":"Tiny fashionable black sunglasses worn naturally by the cat, slightly oversized and glossy, giving a luxury streetwear editorial feeling. The sunglasses do not hide the whole face."},{"type":"obj","desc":"A small elegant red silk scarf or couture ribbon around the cat neck, neatly styled and softly flowing, adding a fashionable magazine-campaign look without covering the face or placard."},{"type":"obj","desc":"In the lower-middle foreground, directly in front of the cat, place a wide low horizontal placard like a fashion show nameplate or luxury display card. It is a short landscape rectangle, much wider than tall, front-facing, level, and centered. It must never become a vertical sign, tall strip, upright board, standing lightbox, clipboard, or pole sign."},{"type":"text","desc":"Print the exact text ideogram4.cpp only inside the wide horizontal placard. Use bold clean black lettering that fits completely inside the placard. 
The text is horizontal, straight, lowercase, readable from left to right, and must not rotate, wrap, float outside the placard, or overlap the cat."},{"type":"obj","desc":"Add sparse luxury editorial accents around the edges: thin layout lines, tiny code brackets, small cursor marks, gold dots, and minimal geometric details. No extra cats, no stickers, no animal faces. Keep accents elegant and secondary."},{"type":"text","desc":"At the bottom center, a small elegant footer reading tiny paws, big compile energy. The footer is horizontal, centered, upright, straight, normally readable, understated, and much smaller than the placard text."}]}}'  --diffusion-fa -v --offload-to-cpu -H 1024 -W 1024
[Image: safety]

stduhpf (Contributor) commented Jun 4, 2026

Can someone upload some gguf quants of this model? I'm running a bit low on available storage to do the conversion right now.

Edit: https://huggingface.co/stduhpf/ideogram-4-gguf/tree/main

leejet merged commit b9254dd into master on Jun 6, 2026
mudler added a commit to mudler/LocalAI that referenced this pull request Jun 6, 2026
…0201)

* feat(stablediffusion-ggml): support Ideogram4 unconditional diffusion model

Bump stable-diffusion.cpp from 1f9ee88 to b9254dd, the upstream commit that
adds Ideogram4 support (leejet/stable-diffusion.cpp#1609). Ideogram4 derives
its classifier-free guidance from a separate unconditional diffusion model,
exposed upstream through the new sd_ctx_params_t.uncond_diffusion_model_path
field.

Wire that field into the gosd wrapper via a new uncond_diffusion_model_path
option. The _path suffix is deliberate: the Go loader only resolves options
whose name contains "path" to an absolute path under the model directory, so
this keeps the option consistent with diffusion_model_path and
high_noise_diffusion_model_path.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* feat(gallery): add Ideogram4 stablediffusion-ggml models

Single-file GGUF weights for Ideogram4 are now published
(stduhpf/ideogram-4-gguf), so add the model to the gallery. Ideogram4 is a
text-to-image model with strong, accurate in-image text rendering, driven by
a Qwen3-VL-8B text encoder and real classifier-free guidance from a separate
unconditional diffusion model (the uncond_diffusion_model_path support added
in the preceding commit).

Two index entries, both built on gallery/virtual.yaml with the full config
inlined in overrides (same pattern as the other models, no dedicated template
file):
- ideogram-4-iq4nl-ggml (4-bit, ~11.6GB diffusion)
- ideogram-4-q8_0-ggml  (8-bit, ~20GB diffusion)

Each bundles the diffusion + unconditional GGUF (stduhpf), the
Qwen3-VL-8B-Instruct text encoder (unsloth), and the FLUX.2 VAE (Comfy-Org
mirror, non-gated). cfg_scale is 7 to match the upstream Ideogram4 default,
since it performs real CFG unlike the guidance-distilled Flux/Z-Image models.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
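
The "real CFG" mentioned above can be sketched with the standard classifier-free guidance formula, where the unconditional prediction comes from the separate unconditional model rather than from the main model run with an empty prompt. This is only an illustration with plain Python lists. The function names and the model call signatures are hypothetical, and the exact way stable-diffusion.cpp combines the two outputs is not shown in this commit message.

```python
def cfg_combine(cond, uncond, cfg_scale=7.0):
    """Standard CFG: start from the unconditional output and push toward the conditional one."""
    return [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]


def guided_prediction(cond_model, uncond_model, latent, prompt_embedding, cfg_scale=7.0):
    """Hypothetical wrapper: two separate models produce the two predictions."""
    cond = cond_model(latent, prompt_embedding)  # main diffusion model
    uncond = uncond_model(latent)                # separate unconditional model
    return cfg_combine(cond, uncond, cfg_scale)


if __name__ == "__main__":
    print(cfg_combine([1.0, 2.0], [0.5, 2.0], cfg_scale=7.0))
```

With cfg_scale set to 1 the result equals the conditional prediction, which is effectively how guidance-distilled models are run. A scale of 7 amplifies the difference between the two predictions.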

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
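
The naming rule described in the first commit (only options whose name contains "path" are resolved under the model directory) can be sketched as follows. The real loader is written in Go and is not shown here; this Python version assumes a "name:value" option format, uses a hypothetical resolve_options helper, and includes a made-up "sampler" option purely for contrast.

```python
import posixpath


def resolve_options(options, model_dir):
    """Resolve values of options whose name contains 'path' against model_dir."""
    resolved = {}
    for option in options:
        name, _, value = option.partition(":")
        if "path" in name and value and not posixpath.isabs(value):
            # Relative file names become absolute paths under the model directory.
            value = posixpath.join(model_dir, value)
        resolved[name] = value
    return resolved


if __name__ == "__main__":
    opts = [
        "uncond_diffusion_model_path:ideogram4_uncond-Q8_0.gguf",
        "sampler:euler",  # hypothetical option, left untouched
    ]
    print(resolve_options(opts, "/models"))
```

Under this rule an option named uncond_diffusion_model (without the suffix) would be passed through as a bare file name, which is why the commit keeps the _path suffix.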