Stable Diffusion Prompt
Builds a Stable Diffusion prompt with a positive prompt, a negative prompt, a sampler and a CFG value.
Prompt engineering for Stable Diffusion
Engineering a Stable Diffusion prompt is less about creative writing and more about weighting tokens and choosing the right sampler. Every prompt has two halves: the positive prompt (what you want to see) and the negative prompt (what you want the model to steer away from). The negative is often as influential as the positive: listing terms such as blurry, low quality, bad anatomy, extra fingers, watermark, text in the negative prompt often cleans up the output noticeably.
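The two halves described above can be kept together with the sampling settings in one small structure. The following is a minimal sketch, not any front-end's API: PromptSpec, join_terms and render are hypothetical names, and the default sampler and CFG value are only illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for one generation request (not a real library type)."""
    positive: list[str]
    negative: list[str] = field(default_factory=list)
    sampler: str = "DPM++ 2M Karras"  # illustrative default
    cfg_scale: float = 7.0            # illustrative default


def join_terms(terms: list[str]) -> str:
    """Join terms with commas, dropping blanks and case-insensitive duplicates."""
    seen: set[str] = set()
    kept: list[str] = []
    for term in terms:
        cleaned = term.strip()
        key = cleaned.lower()
        if cleaned and key not in seen:
            seen.add(key)
            kept.append(cleaned)
    return ", ".join(kept)


def render(spec: PromptSpec) -> dict[str, object]:
    """Return the two prompt strings plus the sampling settings."""
    return {
        "prompt": join_terms(spec.positive),
        "negative_prompt": join_terms(spec.negative),
        "sampler": spec.sampler,
        "cfg_scale": spec.cfg_scale,
    }


spec = PromptSpec(
    positive=["portrait of a woman", "red dress", "sharp focus"],
    negative=["blurry", "low quality", "bad anatomy", "extra fingers",
              "watermark", "text", "Blurry"],
)
result = render(spec)
print(result["prompt"])
print(result["negative_prompt"])
```

Keeping the negative terms in a list makes it easy to reuse one generic negative set across many positive prompts.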
Different front-ends interpret prompts differently. AUTOMATIC1111 parses parentheses for weighting; ComfyUI exposes the CLIP text encoder as a node in its graph, so the same prompt string can produce a slightly different embedding depending on which encoder configuration is wired in. Treat a prompt as tied to the front-end it was written on, and expect to retune it when you move it to another one.
Token weighting and structure
In AUTOMATIC1111 syntax, (red dress:1.3) raises attention on that phrase and [red dress] lowers it. Without an explicit number, each pair of parentheses multiplies the weight by 1.1 and each pair of square brackets divides it by 1.1. Each CLIP chunk has a hard 75-token limit; longer prompts are split into several chunks, and you can force a split explicitly with the BREAK keyword. Word order matters: earlier tokens tend to have more influence on the result, so put the most important concept first.
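The weighting and chunking rules above can be sketched in a few lines. This is a simplified illustration, not AUTOMATIC1111's parser: the function names are hypothetical, and whitespace-separated words stand in for real CLIP tokens, which a tokenizer would count differently.

```python
STEP = 1.1  # per-level factor: "(" multiplies by 1.1, "[" divides by 1.1


def emphasize(phrase: str, weight: float) -> str:
    """Format a phrase with an explicit weight, e.g. (red dress:1.3)."""
    return f"({phrase}:{weight:g})"


def nested_weight(parens: int = 0, brackets: int = 0) -> float:
    """Effective multiplier for a phrase wrapped in nested () and [] pairs."""
    return round(STEP ** parens / STEP ** brackets, 4)


def split_chunks(tokens: list[str], limit: int = 75) -> list[list[str]]:
    """Start a new chunk at each BREAK or whenever the current chunk is full.

    Assumption: one list item equals one token. Real CLIP tokenization
    often splits a single word into several tokens.
    """
    chunks: list[list[str]] = [[]]
    for token in tokens:
        if token == "BREAK":
            chunks.append([])
            continue
        if len(chunks[-1]) == limit:
            chunks.append([])
        chunks[-1].append(token)
    return [chunk for chunk in chunks if chunk]


print(emphasize("red dress", 1.3))
print(nested_weight(parens=2))    # ((red dress))
print(nested_weight(brackets=1))  # [red dress]
print(split_chunks("red dress BREAK blue sky".split()))
```

An explicit weight such as (red dress:1.3) is usually easier to read and tune than several levels of nested parentheses.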
CFG scale and samplers shape the prompt response
A CFG (classifier-free guidance) scale of 7 to 12 is standard. Lower values give the model creative freedom to deviate from the prompt; higher values force strict adherence but easily over-saturate and burn the image. The sampler changes how the prompt is honoured: Euler a is an ancestral sampler that adds fresh noise at every step, so it is creative and varies a lot; DPM++ 2M Karras is faithful and clean; DDIM is deterministic in its default form, so it reproduces seeds well.
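The effect of the CFG scale follows from the standard classifier-free guidance formula: the final noise prediction starts at the unconditional prediction and moves toward the prompt-conditioned one, multiplied by the scale. The sketch below applies that formula to plain lists of numbers; cfg_combine is a hypothetical helper, and real pipelines do the same arithmetic on tensors. In AUTOMATIC1111-style front-ends the negative prompt is used in place of the empty unconditional prompt, which is why it pushes the result away from the listed terms.

```python
def cfg_combine(uncond: list[float], cond: list[float], cfg_scale: float) -> list[float]:
    """Classifier-free guidance: uncond + cfg_scale * (cond - uncond), per element."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


# Toy predictions: the two branches differ only in the first element.
uncond = [0.0, 1.0]  # prediction for the negative (or empty) prompt
cond = [1.0, 1.0]    # prediction for the positive prompt

print(cfg_combine(uncond, cond, 1.0))  # scale 1 returns the conditional prediction
print(cfg_combine(uncond, cond, 7.0))  # larger scale exaggerates the difference
```

Where the two predictions agree the scale changes nothing; where they differ, a high scale amplifies the gap, which is consistent with high CFG values producing harsher images.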
Embeddings, LoRAs, wildcards and dynamic prompts
Textual Inversion embeddings are invoked in the prompt by their trained name, written as a plain word: in AUTOMATIC1111 this is the embedding's filename, with no angle brackets around it. LoRAs are loaded inline with a weight: <lora:loraname:0.8>. Wildcards like __hair_color__ randomly pick a line from a text file, and the Dynamic Prompts extension supports inline alternatives: {red|blue|green} dress. Stack a ControlNet conditioning (pose, depth, canny) on top and the prompt is combined with structural control.
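Wildcard and variant expansion can be illustrated with two regular expressions. This is a minimal sketch of the idea only: it does not reproduce the Dynamic Prompts extension's full syntax, the WILDCARDS dictionary is a stand-in for wildcard text files on disk, and expand is a hypothetical name.

```python
import random
import re

# Stand-in for wildcard files such as hair_color.txt (one option per line).
WILDCARDS = {"hair_color": ["blonde", "black", "red"]}

VARIANT = re.compile(r"\{([^{}]+)\}")           # {red|blue|green}
WILDCARD = re.compile(r"__([A-Za-z0-9_]+?)__")  # __hair_color__


def expand(template: str, rng: random.Random) -> str:
    """Resolve __name__ wildcards and {a|b|c} variants with random picks."""
    def pick_wildcard(match: re.Match) -> str:
        return rng.choice(WILDCARDS[match.group(1)])

    def pick_variant(match: re.Match) -> str:
        return rng.choice(match.group(1).split("|"))

    text = WILDCARD.sub(pick_wildcard, template)
    return VARIANT.sub(pick_variant, text)


template = "__hair_color__ hair, {red|blue|green} dress"
for seed in range(3):
    # A seeded generator makes each expansion repeatable.
    print(expand(template, random.Random(seed)))
```

Passing a seeded random.Random instead of using the global generator keeps the expansion reproducible, which matters when a good result needs to be regenerated later.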
Booru tags vs natural language and the refiner
Anime fine-tunes trained on the Danbooru dataset prefer booru tag style: 1girl, blue_eyes, standing, masterpiece. SDXL and Flux base models prefer natural language: a young woman with blue eyes standing in a meadow, masterpiece. Mixing the two styles in one prompt can confuse the encoder. Common quality boosters (masterpiece, best quality, ultra detailed, 8k, cinematic, sharp focus) help, but overloading them can bias the output toward one specific aesthetic. SDXL also offers an optional refiner model, run as a second pass that polishes details in the base output.
FAQ
Is the negative prompt mandatory? No, but it is one of the highest-leverage tools in your kit. Even a generic low quality, blurry, watermark, text meaningfully reduces artifacts.
Booru tags or natural language? Depends on the checkpoint. Check the model card on Civitai or Hugging Face — if the example prompts are tag-style, follow that pattern.
Do quality keywords always help? No. Spamming masterpiece, 8k, ultra detailed, trending on ArtStation overloads attention and biases composition toward generic ArtStation thumbnails. Use 2 or 3 quality words, not 10.
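How do I make a prompt reproducible? Fix the seed, the sampler, the step count, the CFG, the checkpoint hash and any LoRA hashes. The metadata that AUTOMATIC1111 embeds in each PNG records the prompt and these generation settings, and the PNG Info tab can read it back and send it to txt2img. Recent versions also record LoRA hashes.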
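The embedded metadata can also be read outside the WebUI. The sketch below parses a parameters string laid out the way AUTOMATIC1111 typically writes it: prompt lines, a "Negative prompt:" line, then one comma-separated settings line. The sample text, including the model hash and model name, is made up for illustration; parse_parameters is a hypothetical helper, and its simple splitter would break on setting values that themselves contain commas. Obtaining the string from a file (for example from the PNG text chunk named parameters) is left out.

```python
def parse_parameters(text: str) -> dict[str, str]:
    """Split an A1111-style parameters string into prompt, negative prompt and settings."""
    lines = text.strip().split("\n")
    settings_line = lines[-1]  # assumed: the last line holds "Key: value, Key: value"
    prompt_lines: list[str] = []
    negative_lines: list[str] = []
    target = prompt_lines
    for line in lines[:-1]:
        if line.startswith("Negative prompt:"):
            target = negative_lines
            line = line[len("Negative prompt:"):]
        target.append(line.strip())
    result = {
        "prompt": "\n".join(prompt_lines),
        "negative_prompt": "\n".join(negative_lines),
    }
    for item in settings_line.split(", "):
        key, _, value = item.partition(": ")
        result[key] = value
    return result


# Made-up sample in the assumed layout.
sample = (
    "portrait of a woman, (red dress:1.3)\n"
    "Negative prompt: blurry, low quality\n"
    "Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1234, "
    "Size: 512x512, Model hash: abcdef1234, Model: example"
)
parsed = parse_parameters(sample)
print(parsed["Seed"], parsed["Sampler"], parsed["CFG scale"])
```

Comparing the parsed settings of two images is a quick way to find which value changed when a result cannot be reproduced.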