Image-to-Image Prompting
Transform existing images through AI by providing reference inputs combined with text instructions — controlling style, content, and the degree of transformation to produce targeted visual outputs.
Introduced: Image-to-image (img2img) generation became widely accessible with the release of Stable Diffusion in 2022. The technique uses an existing image as the starting point for the diffusion process, allowing the AI to transform, restyle, or enhance images while retaining structural elements from the original. Rather than generating from pure noise, the model partially corrupts the input image by adding controlled noise, then reconstructs it guided by a text prompt. The denoising strength parameter controls how much the output deviates from the input — a concept that gave users precise, slider-based control over the balance between faithfulness and creative transformation.
Modern LLM Status: Image-to-image is a standard workflow in all major image generation platforms including Stable Diffusion, Midjourney, and DALL-E. The technique forms the foundation for more specialized approaches like inpainting (editing specific regions), outpainting (extending image boundaries), and ControlNet workflows (guiding generation with structural maps). Every serious image generation pipeline now supports img2img as a core capability, and it remains the primary method for iterative visual refinement, style transfer, and concept art pipelines.
Start from an Image, Not from Noise
Standard text-to-image generation begins with pure random noise and gradually refines it into a coherent image guided only by a text prompt. Image-to-image flips this starting point: instead of noise, the model begins with an existing image. It partially corrupts that image by adding a controlled amount of noise, then reconstructs it under the guidance of a new text prompt. The result is a transformation that preserves structural elements from the original while applying the creative direction specified in the prompt.
The critical control is the denoising strength slider. This single parameter determines how much the output can deviate from the source image. At low values (0.2–0.4), the model makes subtle adjustments — cleaning up lines, refining details, or applying minor stylistic shifts while preserving most of the original composition. At high values (0.7–0.9), the model dramatically transforms the image, keeping only the rough layout and proportions while reimagining everything else. This gives practitioners precise, predictable control over the balance between faithfulness to the original and the degree of creative transformation.
Think of it like a painter working over a pencil sketch. A light touch preserves the sketch’s lines while adding color and detail. A heavy hand covers the sketch entirely, using it only as a loose compositional guide for something entirely new.
Text-to-image generation is inherently unpredictable — the same prompt can produce wildly different compositions across runs. Image-to-image solves this by anchoring generation to a known starting point. You control the composition, the subject placement, and the overall layout through your reference image, while the prompt controls the aesthetic and thematic transformation. This combination of visual anchoring and textual direction makes img2img far more controllable than pure text-to-image generation for iterative creative workflows.
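Under the hood, the strength slider is typically implemented by noising the input image forward through only part of the diffusion schedule, then denoising from that point. A minimal sketch of that mapping, assuming a 50-step schedule; the function name is illustrative, and the step-truncation rule mirrors the approach used by common img2img pipelines rather than any one library's exact code:

```python
# Illustrative sketch: how a denoising-strength value is commonly mapped
# onto the diffusion schedule. Real pipelines differ in details; the
# step-truncation idea below mirrors widely used img2img implementations.

def img2img_schedule(strength: float, num_inference_steps: int = 50):
    """Return (noising_steps, denoising_start_index) for a given strength.

    strength = 0.0 -> no noise added, output stays close to the input
    strength = 1.0 -> input fully noised, equivalent to text-to-image
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    # Noise the input forward for a strength-proportional slice of the schedule...
    noising_steps = min(int(num_inference_steps * strength), num_inference_steps)
    # ...then denoise over only those final steps, guided by the new prompt.
    start_index = num_inference_steps - noising_steps
    return noising_steps, start_index

# Low strength re-runs only a small tail of the schedule, so most of the
# original image survives; high strength re-runs most of it.
print(img2img_schedule(0.3))   # subtle refinement
print(img2img_schedule(0.75))  # heavy transformation
```

The slider is thus roughly linear in diffusion steps: raising the strength increases the fraction of the schedule, and therefore of the image, that is regenerated from scratch.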
The Image-to-Image Process
Four stages from reference image to transformed output
Provide Reference Image
Upload the source image that will serve as the structural foundation for the transformation. This can be a rough pencil sketch, a photograph, a previous AI generation, or a digital mockup. The reference image defines the composition, spatial layout, and proportional relationships that the model will work from.
A hand-drawn pencil sketch of a two-story building with large windows, a pitched roof, and surrounding landscaping.
Write Transformation Prompt
Describe the desired output, emphasizing the changes you want from the original rather than restating what is already in the image. Focus on the target style, medium, lighting, color palette, and any specific modifications. The prompt steers the reconstruction — think of it as instructions for how the model should reinterpret the reference during the denoising phase.
“Photorealistic architectural rendering, modern glass and steel building, golden hour lighting, lush green landscaping, professional photography, 8K resolution.”
Set Denoising Strength
Control how much the output can deviate from the source image. This is the most important parameter in the entire img2img workflow. Low values (0.2–0.4) preserve most of the original’s detail and make only subtle adjustments. Medium values (0.4–0.6) allow significant stylistic changes while maintaining the overall structure. High values (0.7–0.9) dramatically transform the image, retaining only rough composition and proportions.
For sketch-to-rendering: start at 0.65 to allow enough creative freedom for the model to add photorealistic detail while keeping the building’s layout intact.
Iterate with Adjustments
Review the output and refine by adjusting the denoising strength, modifying the prompt, or using a previous output as the new input for another pass. Iteration is central to the img2img workflow — each pass can bring the result closer to your vision. You can also feed the output back as a new reference for progressive refinement, gradually building toward the final result across multiple generations.
First pass at 0.65 produces a good rendering but the windows are too small. Lower denoising to 0.3 and add “large floor-to-ceiling windows” to the prompt. Run again using the first output as the new reference.
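The two-pass adjustment above can be scripted. In this sketch, `img2img` is a hypothetical placeholder for whatever backend you actually call (a local Stable Diffusion pipeline, a hosted API), not a real library function:

```python
# Hypothetical driver for the iterate-and-adjust loop described above.
# `img2img` is a stand-in for your actual generation backend.

def img2img(image, prompt, strength):
    # Placeholder: a real backend would return the transformed image.
    return f"render({image!r}, strength={strength})"

sketch = "building_sketch.png"
base_prompt = ("Photorealistic architectural rendering, modern glass and "
               "steel building, golden hour lighting")

# Pass 1: high strength to establish the photorealistic look.
first_pass = img2img(sketch, base_prompt, strength=0.65)

# Review: the windows came out too small. Pass 2 uses the first output as
# the new reference, a low strength to protect the composition, and an
# amended prompt that targets the fix.
final = img2img(first_pass,
                base_prompt + ", large floor-to-ceiling windows",
                strength=0.3)
```

The key moves are all in the second call: the reference changes to the previous output, the strength drops to protect what already works, and the prompt gains only the requested change.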
See the Difference
How denoising strength transforms the same sketch
Low Denoising (0.3)
A rough pencil sketch of a building with large windows and surrounding trees.
The sketch’s lines are cleaned up and refined. Pencil strokes become smoother, proportions are corrected slightly, and basic shading is added. The output still clearly reads as a sketch — the medium and style are preserved while the execution is improved. Fine details from the original, including line weight and hatching patterns, remain recognizable.
High Denoising (0.75)
“Photorealistic architectural rendering, modern glass and steel building, golden hour lighting, lush landscaping, professional photography.”
A fully photorealistic architectural rendering that preserves the sketch’s overall layout and proportions — the building’s footprint, window placement, and tree positions match the original composition. But every surface is now rendered with realistic materials: glass reflecting the sky, steel beams catching golden hour light, detailed landscaping with individual leaves and grass blades. The pencil sketch is gone; only its structure remains.
Image-to-Image in Action
Three practical transformation workflows
A hand-drawn pencil sketch of a fantasy castle on a cliff, with turrets, a drawbridge, and a winding path leading to the entrance. The sketch is rough but clearly conveys the spatial arrangement and scale of the structure.
Prompt: “Detailed digital illustration, fantasy castle perched on dramatic cliffs, epic scale, volumetric lighting, atmospheric fog, rich color palette, concept art quality, matte painting style.”
Denoising Strength: 0.70
Result: The rough sketch transforms into a polished digital illustration. The castle’s position on the cliff, the turret placement, and the winding path all match the original layout. But now every surface has texture — weathered stone walls, moss-covered battlements, volumetric fog rolling through the valley below. The sketch provided the composition; the prompt and high denoising provided the finish.
A photograph of a tree-lined suburban street in full summer — green canopy, bright sunlight, green lawns, a clear blue sky. The street has parked cars, houses with front porches, and a sidewalk running along both sides.
Prompt: “Same street scene in deep winter, heavy snowfall, bare tree branches, snow-covered roofs and lawns, overcast sky, warm light glowing from house windows, fresh tire tracks in the snow.”
Denoising Strength: 0.55
Result: The street layout, house positions, car placement, and sidewalk structure remain intact from the original photograph. But the season has changed entirely: green leaves become bare branches, lawns are blanketed in snow, the sky shifts from blue to overcast grey, and warm interior light spills from the windows. The moderate denoising strength preserves the exact spatial arrangement while allowing the seasonal transformation to feel natural and complete.
A portrait photograph of a person sitting in a garden, natural lighting, the subject centered in the frame with flowering bushes and a wooden fence in the background. Standard photographic quality with sharp focus on the subject.
Prompt: “Oil painting in the style of the Impressionists, visible brushstrokes, soft edges, vibrant dappled light, rich color palette with blues and warm yellows, canvas texture, gallery-quality fine art painting.”
Denoising Strength: 0.60
Result: The photograph’s composition is preserved — the subject’s pose, the garden layout, and the spatial relationships remain the same. But the photographic medium is replaced entirely with oil painting characteristics: visible brushstrokes define the flowering bushes, the subject’s features are softened with Impressionist handling, dappled light plays across the scene with broken color technique, and the entire surface has a canvas-like texture. The moderate denoising allows the style to change completely while the composition stays anchored to the original photograph.
When to Use Image-to-Image
Best for controlled transformations with a known starting point
Perfect For
When you have an image that is close to what you need but requires refinement, style adjustment, or quality improvement — img2img lets you evolve the existing image rather than start over.
Converting rough hand-drawn sketches, wireframes, or doodles into polished digital illustrations, renderings, or photorealistic outputs while preserving the original composition.
Transforming photographs into different artistic media — oil paintings, watercolors, anime, pixel art — while maintaining the subject, composition, and spatial relationships.
Applying the same stylistic transformation across a series of source images for consistent visual output — such as converting a product catalog to a unified illustration style.
Skip It When
If you have no reference image and want the model to create a scene entirely from your text description, standard text-to-image is the correct approach.
If the source image has nothing worth preserving — no useful composition, layout, or subject placement — then img2img adds complexity without benefit over pure text-to-image.
When you need specific pixels preserved exactly as they are, img2img will always introduce some variation. For precise edits to specific regions, use inpainting instead.
Use Cases
Where image-to-image delivers the most value
Concept Art Pipeline
Artists sketch rough compositions by hand, then use img2img to rapidly explore different rendering styles, lighting conditions, and color palettes — iterating from thumbnail to finished concept in a fraction of traditional timelines.
Photo Enhancement
Improve photograph quality by using low denoising strength to refine lighting, sharpen details, reduce noise, or subtly adjust the mood of existing photos without altering their composition or subject matter.
Seasonal Marketing Variants
Transform a single product or brand image across seasons — summer to winter, day to night, spring to autumn — creating campaign-ready visual variants from one source photograph while maintaining brand-consistent composition.
Architectural Sketch to Render
Convert architectural hand-drawn sketches or simple 3D wireframes into photorealistic building renderings, allowing architects and clients to visualize designs before committing to full 3D modeling.
Design Iteration
Use each generation as the input for the next pass, progressively refining details, adjusting elements, and converging on the final design through multiple controlled iterations rather than one-shot generation.
Historical Photo Colorization
Transform black-and-white or faded historical photographs into vivid, colorized versions using low denoising strength and prompts specifying realistic color palettes appropriate to the era and subject matter.
Where Image-to-Image Fits
Image-to-image bridges pure generation and precise editing
The most effective img2img workflows use multiple passes at different denoising strengths. Start with a high-denoising pass (0.7–0.8) to establish the overall look and feel, then feed that output back as the reference for a lower-denoising pass (0.3–0.4) to refine details without losing the composition you have established. This progressive approach gives you both creative freedom and fine-grained control in a single pipeline.
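That coarse-to-fine workflow reduces to a small loop over a descending strength schedule. As before, `img2img` is a placeholder for your real generation call, and the (0.75, 0.35) default simply encodes the high-then-low pattern described above:

```python
# Sketch of the progressive multi-pass workflow: a high-strength pass to
# set the look, then a low-strength pass to refine detail without losing
# the established composition. `img2img` is a placeholder, not a real API.

def img2img(image, prompt, strength):
    return (image, strength)  # placeholder: tag the input with the strength used

def progressive_refine(image, prompt, schedule=(0.75, 0.35)):
    """Feed each pass's output back as the reference for the next pass."""
    for strength in schedule:
        image = img2img(image, prompt, strength)
    return image
```

Descending schedules are the usual pattern: each later pass gets less freedom than the one before, so it can only polish what the earlier passes established.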