ComfyUI Workflow Complete Tutorial: 8 Practical Steps for Advanced Stable Diffusion in 2026

📅 2026-05-19 11:20:22 👤 DouWen Editorial 💬 9 comments 👁 26 views

ComfyUI has been the most popular node-based workflow tool in the Stable Diffusion community for the past year or two, and by 2026 it has become the standard for experienced SD users. The problem is that newcomers are often discouraged by the wall of nodes they see when they first open it. This article follows an 8-step practical path that takes readers with no prior experience from installation to running a first advanced workflow, and explains what ComfyUI actually offers over WebUI.

What ComfyUI is and why everyone is using it

Let's get the concept clear first. ComfyUI is a node-based graphical interface for Stable Diffusion. Each operation, such as loading a model, encoding a prompt, sampling, decoding, or saving an image, is an independent node, and nodes are wired together to form a workflow.

Compared with the drop-down menus and sliders of Automatic1111 WebUI, the node approach has three clear advantages. First, every step is visible: you can follow the complete path from prompt to latent space to image, which makes errors quick to locate. Second, nodes combine freely: by wiring different nodes together you can build setups that a stock WebUI does not offer, such as ControlNet, IPAdapter, and Tile running at the same time. Third, performance is usually better: on the same hardware, ComfyUI's video memory management is typically more efficient than WebUI's, although the size of the gain varies widely with the model and configuration.

The price is a steeper learning curve. Seeing dozens of nodes for the first time is confusing, but once you understand the basic structure you will find it far more flexible than WebUI.

Step 1: Choose the right installation method

There are two recommended paths when starting from scratch. The first is one of the popular community integration packages: download it from a cloud drive, unzip it, and run it. It bundles a Python environment and common models, and suits the mainstream setup of Windows with a mid-range NVIDIA graphics card. The second is the official standalone installation: git clone the official repository, run python -m venv venv, activate the environment, and then run pip install -r requirements.txt. Users of Apple M-series Macs should take this path, since official support for MPS is relatively mature.

Graphics card requirements: SD 1.5 usually needs at least 4 GB of video memory, SDXL needs at least 8 GB, and 12 GB or more is recommended for FLUX.1. On machines with little video memory you can add the --lowvram startup parameter, which is slower but lets the model run.
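As a rough illustration of the thresholds above, the following sketch picks launch arguments from the available video memory. The function name, the family keys, and the reading of each figure as a hard cut-off are assumptions made for this example; only the --lowvram flag and the GB figures come from the text.

```python
# Minimum video memory (GB) per model family, using the figures quoted above.
# Treating them as hard cut-offs is a simplification for illustration.
MIN_VRAM_GB = {"sd15": 4, "sdxl": 8, "flux1": 12}


def suggest_launch_args(vram_gb: float, family: str) -> list[str]:
    """Return a launch command, adding --lowvram when below the threshold."""
    args = ["python", "main.py"]
    if vram_gb < MIN_VRAM_GB[family]:
        args.append("--lowvram")
    return args


# Example: a 6 GB card trying to run SDXL gets the low-memory flag.
print(" ".join(suggest_launch_args(6, "sdxl")))
```

In practice the right flag also depends on resolution, batch size, and any extra models loaded, so treat this as a starting point only.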

Model storage locations: base models go in ComfyUI/models/checkpoints, VAEs in ComfyUI/models/vae, LoRAs in ComfyUI/models/loras, and ControlNet models in ComfyUI/models/controlnet. The folder structure differs from WebUI's. When migrating models, do not copy them; create symbolic links to the corresponding paths instead.

Step 2: Understand the 7 basic nodes of the default workflow
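The linking step can be sketched in Python as follows. The ComfyUI folder names are the ones listed above; the WebUI folder names in FOLDER_MAP and the list of file suffixes are assumptions, so check them against your own installation before running anything like this.

```python
from pathlib import Path

# WebUI folder -> ComfyUI folder. The WebUI side is an assumption for
# illustration; adjust it to match the layout of your own install.
FOLDER_MAP = {
    "models/Stable-diffusion": "models/checkpoints",
    "models/VAE": "models/vae",
    "models/Lora": "models/loras",
    "models/ControlNet": "models/controlnet",
}
MODEL_SUFFIXES = {".safetensors", ".ckpt", ".pt", ".pth"}


def link_models(webui_root: Path, comfy_root: Path) -> list[Path]:
    """Symlink each model file from WebUI into the matching ComfyUI folder."""
    created = []
    for src_rel, dst_rel in FOLDER_MAP.items():
        src_dir = webui_root / src_rel
        dst_dir = comfy_root / dst_rel
        if not src_dir.is_dir():
            continue  # this WebUI folder does not exist, nothing to link
        dst_dir.mkdir(parents=True, exist_ok=True)
        for src in sorted(src_dir.iterdir()):
            if src.suffix.lower() not in MODEL_SUFFIXES:
                continue  # skip previews, notes and other non-model files
            dst = dst_dir / src.name
            if dst.exists() or dst.is_symlink():
                continue  # never overwrite an existing file or link
            dst.symlink_to(src.resolve())
            created.append(dst)
    return created
```

Calling link_models(Path("path/to/webui"), Path("path/to/ComfyUI")) returns the links it created. On Windows, creating symbolic links may require administrator rights or Developer Mode.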

The default workflow you see when you first open ComfyUI is the minimal text-to-image pipeline, with roughly 7 nodes.

Load Checkpoint loads the base model and exposes three outputs: MODEL, CLIP, and VAE. The positive CLIP Text Encode node encodes the positive prompt into a CONDITIONING tensor, and the negative CLIP Text Encode node does the same for the negative prompt. Empty Latent Image creates a blank latent tensor; the width and height are set here. KSampler is the sampler: it receives MODEL, the positive and negative CONDITIONING, and the latent, and outputs the denoised latent. VAE Decode decodes the latent into a pixel image. Save Image writes the result to the output folder.

Once you understand how the inputs and outputs of these seven nodes correspond, every advanced workflow that follows is just an extension spliced onto this base.

Step 3: The standard way to wire in LoRA and VAE
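To make the wiring concrete, here is a sketch of the same seven nodes written as a Python dictionary in ComfyUI's API ("prompt") format, where each link is a [source_node_id, output_index] pair. The checkpoint file name, prompts, and sampler settings are placeholders, and the server address assumes a local instance on the default port.

```python
import json
from urllib import request

# The seven default nodes. Output indices of the checkpoint loader:
# 0 = MODEL, 1 = CLIP, 2 = VAE. File name and prompts are placeholders.
DEFAULT_GRAPH = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "your_model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",  # positive prompt
          "inputs": {"text": "a lighthouse at sunset", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",  # negative prompt
          "inputs": {"text": "blurry, low quality", "clip": ["1", 1]}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 42, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "ComfyUI"}},
}


def dangling_links(graph: dict) -> list[str]:
    """List inputs that point at a node id missing from the graph."""
    problems = []
    for node_id, node in graph.items():
        for name, value in node["inputs"].items():
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] not in graph:
                problems.append(f"{node_id}.{name}")
    return problems


def queue_prompt(graph: dict, server: str = "http://127.0.0.1:8188") -> bytes:
    """POST the graph to a running ComfyUI instance (address is an assumption)."""
    payload = json.dumps({"prompt": graph}).encode("utf-8")
    req = request.Request(f"{server}/prompt", data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return resp.read()
```

dangling_links(DEFAULT_GRAPH) returns an empty list when every link resolves. queue_prompt only works while ComfyUI is running and the checkpoint name matches a file in models/checkpoints.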

This is the first common upgrade. The LoRA node, LoraLoader, is inserted between Load Checkpoint and CLIP Text Encode: connect the MODEL and CLIP outputs of the checkpoint to it, and use its MODEL and CLIP outputs downstream. Remember to include the LoRA's trigger word in the positive prompt.

VAE replacement: the VAE bundled with SDXL is often unsatisfactory, so sdxl-vae-fp16-fix is commonly used. Add a separate Load VAE node (VAELoader) alongside Load Checkpoint, and have VAE Decode take this external VAE instead of the one that ships with the base model. The color saturation and fine detail of the output image often improve noticeably.

Weights: a LoRA strength between roughly 0.6 and 0.9 is the usual sweet spot, while 1.0 tends to overwhelm the base model's style. Be careful when chaining multiple LoRAs, since more than two can easily conflict. LoraLoaderModelOnly applies a LoRA to the UNet only and leaves CLIP untouched.

Step 4: Add ControlNet to control composition
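The splice described in this step can be sketched as a small graph rewrite on an API-format workflow, where links are [node_id, output_index] pairs. The helper name insert_lora, the node ids, and the file names are hypothetical; the sketch assumes the checkpoint loader exposes MODEL at output 0, CLIP at output 1, and VAE at output 2.

```python
import copy


def insert_lora(graph: dict, source_id: str, lora_name: str,
                strength: float = 0.8, lora_id: str = "lora_1") -> dict:
    """Return a copy of graph with a LoraLoader spliced in after source_id.

    Inputs that read MODEL (output 0) or CLIP (output 1) from source_id are
    re-pointed at the LoRA node. VAE (output 2) links are left alone.
    """
    new_graph = copy.deepcopy(graph)
    for node in new_graph.values():
        for name, value in node["inputs"].items():
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] == source_id and value[1] in (0, 1):
                node["inputs"][name] = [lora_id, value[1]]
    new_graph[lora_id] = {
        "class_type": "LoraLoader",
        "inputs": {"model": [source_id, 0], "clip": [source_id, 1],
                   "lora_name": lora_name,
                   "strength_model": strength, "strength_clip": strength},
    }
    return new_graph


# Tiny example graph; the sampler node "9" is omitted for brevity.
graph = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "your_model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "trigger_word, portrait", "clip": ["1", 1]}},
    "3": {"class_type": "VAEDecode",
          "inputs": {"samples": ["9", 0], "vae": ["1", 2]}},
}
with_lora = insert_lora(graph, "1", "your_lora.safetensors", strength=0.8)
```

Calling insert_lora again with source_id="lora_1" and a new lora_id chains a second LoRA after the first, which mirrors how the nodes are stacked on the canvas.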

ControlNet is one of the main sources of a ComfyUI workflow's power.

Basic wiring: add a Load Image node to import the reference image, feed it into a ControlNet preprocessor node (Canny, MLSD, Depth, and so on) to produce the preprocessed image, then connect an Apply ControlNet (Advanced) node, which feeds the preprocessed image to the ControlNet model and outputs modified CONDITIONING for the conditioning inputs of KSampler.

Strength parameters: a strength between roughly 0.5 and 0.8 gives the most natural results, start_percent is generally 0, and end_percent controls how far into the sampling steps ControlNet stays active.

Chaining multiple ControlNets: OpenPose for pose, Depth for depth, and Canny for outlines can be used together. The output is extremely stable, which suits e-commerce imagery, comic storyboards, and multi-view product shots.

Step 5: High-resolution upscaling with Tile
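Chaining works because each Apply ControlNet (Advanced) node takes positive and negative conditioning in and passes modified versions out. The sketch below builds such a chain in API format; the helper name, the node ids for loaders and preprocessed images, and the chosen strengths are all hypothetical.

```python
def chain_controlnets(positive: list, negative: list,
                      units: list[dict]) -> tuple[dict, list, list]:
    """Build one ControlNetApplyAdvanced node per unit, each fed by the previous.

    positive/negative are [node_id, output_index] links from the prompt
    encoders. Each unit holds a control_net link, an image link, and
    optional strength and end_percent values.
    Returns the new nodes plus the final positive/negative links for KSampler.
    """
    nodes = {}
    for i, unit in enumerate(units):
        node_id = f"cn_apply_{i}"
        nodes[node_id] = {
            "class_type": "ControlNetApplyAdvanced",
            "inputs": {
                "positive": positive, "negative": negative,
                "control_net": unit["control_net"], "image": unit["image"],
                "strength": unit.get("strength", 0.6),
                "start_percent": 0.0,
                "end_percent": unit.get("end_percent", 1.0),
            },
        }
        # Outputs 0 and 1 carry the modified positive and negative conditioning.
        positive, negative = [node_id, 0], [node_id, 1]
    return nodes, positive, negative


# Hypothetical ids: "pose_net", "depth_net", "canny_net" are ControlNet
# loader nodes; "pose_img", "depth_img", "canny_img" are preprocessed images.
units = [
    {"control_net": ["pose_net", 0], "image": ["pose_img", 0], "strength": 0.8},
    {"control_net": ["depth_net", 0], "image": ["depth_img", 0], "strength": 0.6},
    {"control_net": ["canny_net", 0], "image": ["canny_img", 0],
     "strength": 0.5, "end_percent": 0.7},
]
nodes, final_pos, final_neg = chain_controlnets(["2", 0], ["3", 0], units)
```

final_pos and final_neg are what the KSampler's positive and negative inputs should point at once the new nodes are merged into the graph.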

Insufficient image resolution is the first pain point for beginners.

The standard approach combines an SD upscale pass with Tile ControlNet. KSampler first generates a medium-resolution image, which goes through VAE Decode, is enlarged by the Upscale Image By node, and re-enters latent space through VAE Encode. Apply ControlNet (Advanced) then brings in the Tile model at a strength of about 0.6, KSampler runs again with denoise set between 0.3 and 0.5, and a final VAE Decode produces the output.
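The second pass can be sketched as four API-format nodes. This is an illustration rather than a complete workflow: the Tile ControlNet apply node is left out and is assumed to supply the positive and negative links while reading its image from the "up" node. The helper name, node ids, and sampler settings are hypothetical.

```python
def second_pass_nodes(image: list, vae: list, model: list,
                      positive: list, negative: list,
                      scale: float = 2.0, denoise: float = 0.4) -> dict:
    """Nodes for upscale -> VAE Encode -> low-denoise KSampler -> VAE Decode.

    All arguments except scale and denoise are [node_id, output_index] links.
    positive/negative should come from the Tile ControlNet apply node.
    """
    if not 0.0 < denoise <= 1.0:
        raise ValueError("denoise must be in (0, 1]")
    return {
        "up": {"class_type": "ImageScaleBy",
               "inputs": {"image": image, "upscale_method": "bicubic",
                          "scale_by": scale}},
        "enc": {"class_type": "VAEEncode",
                "inputs": {"pixels": ["up", 0], "vae": vae}},
        "refine": {"class_type": "KSampler",
                   "inputs": {"model": model, "positive": positive,
                              "negative": negative,
                              "latent_image": ["enc", 0],
                              "seed": 42, "steps": 20, "cfg": 7.0,
                              "sampler_name": "euler", "scheduler": "normal",
                              "denoise": denoise}},
        "dec": {"class_type": "VAEDecode",
                "inputs": {"samples": ["refine", 0], "vae": vae}},
    }


# "6" is the first-pass VAE Decode, "1" the checkpoint, "tile" the Tile apply node.
nodes = second_pass_nodes(["6", 0], ["1", 2], ["1", 0], ["tile", 0], ["tile", 1])
```

Keeping denoise low is what preserves the composition: the sampler only partially re-noises the enlarged latent, so it refines detail instead of redrawing the image.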

This process preserves the composition while bringing detail close to the level of a commercial poster. If video memory is short, you can install a community plug-in such as ComfyUI-TiledDiffusion, which splits the image into tiles and denoises them one by one, so that machines with tight video memory can also handle larger sizes.

Step 6: IPAdapter for style transfer and reference images
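The idea of tile-by-tile processing can be illustrated with a generic tiling helper. This is not the algorithm used by ComfyUI-TiledDiffusion; it is a minimal sketch of splitting an image into overlapping boxes, with the tile size and overlap chosen arbitrarily.

```python
def tile_boxes(width: int, height: int, tile: int = 1024,
               overlap: int = 128) -> list[tuple[int, int, int, int]]:
    """Split an image into overlapping (left, top, right, bottom) boxes."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    stride = tile - overlap

    def starts(size: int) -> list[int]:
        if size <= tile:
            return [0]  # one tile already covers this axis
        points = list(range(0, size - tile, stride))
        points.append(size - tile)  # last tile sits flush with the edge
        return points

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


# Example: tile layout for a 2048 x 2048 image.
boxes = tile_boxes(2048, 2048)
print(len(boxes), boxes[0], boxes[-1])
```

Peak memory then scales with the tile size rather than the full image, and the overlap gives neighbouring tiles shared pixels that can be blended to hide seams.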

The advanced node most worth learning from the past year is IPAdapter, which uses a reference image to transfer a style or a character.

Basic wiring: download the ip-adapter series models and the matching image encoder, and place them in the ComfyUI/models/ipadapter and ComfyUI/models/clip_vision directories respectively. Add the IPAdapter Unified Loader node, connect it to the IPAdapter Advanced node, feed in MODEL and the reference image, and send the new MODEL output to KSampler.

Practical scenarios: to add a Ghibli style to images generated by ChatGPT or DALL-E, IPAdapter is simpler and more stable than a LoRA. To change a product's background, IPAdapter combined with an inpaint model can do it with a single reference image. For comic storyboards that need a consistent protagonist, the FaceID series can lock facial features.

Step 7: What is different about FLUX.1 workflows

FLUX.1 is a new-generation model from Black Forest Labs. By 2026 it has become one of the mainstays of high-quality image generation in the ComfyUI community.

Workflow differences: in place of the plain CLIP Text Encode, FLUX workflows typically use the CLIPTextEncodeFlux node, which feeds both the CLIP and T5 text encoders; Empty Latent Image is replaced by EmptySD3LatentImage; and KSampler is replaced by KSampler Advanced or a custom sampler chain built around BasicScheduler.

Video memory challenge: the full FLUX dev model has high video memory requirements. The FP8 quantized version brings it within reach of mid-range cards. GGUF Q4 quantization lowers the requirement further, at the cost of some image detail. For exact requirements, check the latest release notes of the community quantized versions.

Prompt differences: FLUX understands natural language far better than the SD series, so complete English sentences work better than a list of tags.
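A back-of-the-envelope calculation shows why quantization matters. The sketch below estimates the memory taken by the weights alone, assuming roughly 12 billion parameters for the FLUX.1 dev transformer and an average of 4.5 bits per weight for Q4 (an assumed figure that includes per-block scale overhead and differs between GGUF variants). Text encoders, the VAE, and activations are not counted.

```python
FLUX_DEV_PARAMS = 12e9  # approximate parameter count of the FLUX.1 dev transformer

# Bits stored per weight. The Q4 value is an assumed average, not a spec.
BITS_PER_WEIGHT = {"bf16": 16, "fp8": 8, "gguf_q4": 4.5}


def weight_gb(params: float, bits: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: {weight_gb(FLUX_DEV_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights shrink from about 24 GB at 16-bit precision to about 12 GB at FP8 and roughly 7 GB at Q4. Real usage is higher once the text encoders and working memory are added.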

Step 8: Workflow reuse and sharing

A hidden strength of ComfyUI is that workflows can be saved and shared. The complete workflow metadata is embedded automatically in every generated PNG file, and dragging the image back onto the ComfyUI canvas restores all nodes and connections.
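The embedded metadata can also be read outside ComfyUI. The sketch below uses only the standard library to walk a PNG's chunks and pull out text entries. It assumes the workflow is stored as JSON in a tEXt chunk under the keyword "workflow", which matches how ComfyUI's default Save Image node behaves to my knowledge; other save nodes or image hosts that strip metadata will give different results.

```python
import json
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text(data: bytes) -> dict[str, str]:
    """Collect keyword -> text pairs from the tEXt chunks of a PNG."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    texts = {}
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if chunk_type == b"tEXt":
            key, _, value = body.partition(b"\x00")
            texts[key.decode("latin-1")] = value.decode("latin-1")
        elif chunk_type == b"IEND":
            break
        pos += 12 + length  # 4 length + 4 type + body + 4 CRC
    return texts


def embedded_workflow(data: bytes):
    """Return the graph stored under the 'workflow' keyword, or None."""
    raw = read_png_text(data).get("workflow")
    return json.loads(raw) if raw is not None else None
```

A typical call is embedded_workflow(Path("output.png").read_bytes()) after importing Path from pathlib. A None result usually means the image was re-encoded somewhere along the way and lost its metadata.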

OpenArt platform: openart.ai/workflows hosts a large number of public workflows to browse and download. Search for keywords such as "animal portrait", "anime style", or "product photography", download workflow.json, and load it directly.

Pitfalls to avoid: if a downloaded workflow uses custom nodes you do not have, the missing nodes are shown with red boxes. With ComfyUI Manager installed, clicking Install Missing Custom Nodes usually resolves this in one step. If a model path is wrong, open the Load Checkpoint node and reselect the model. If the VAE version is wrong, plug in sdxl-vae-fp16-fix.
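Before loading a downloaded workflow, you can check which node types it needs. The sketch below assumes the saved workflow.json uses the editor format with a top-level "nodes" list whose entries carry a "type" field. The installed set is supplied by hand here, and "SomeCustomNode" is a made-up name for illustration.

```python
def missing_node_types(workflow: dict, installed: set[str]) -> list[str]:
    """Node types used by a saved workflow that are not in `installed`."""
    used = {node["type"] for node in workflow.get("nodes", [])}
    return sorted(used - installed)


# Hypothetical workflow fragment and installed-node set.
workflow = {"nodes": [
    {"id": 1, "type": "CheckpointLoaderSimple"},
    {"id": 2, "type": "KSampler"},
    {"id": 3, "type": "SomeCustomNode"},
]}
installed = {"CheckpointLoaderSimple", "KSampler", "VAEDecode"}
print(missing_node_types(workflow, installed))
```

On a running instance, the installed set can be built from the node names the server reports, for example from its /object_info endpoint. Helper nodes that exist only in the editor, such as reroutes or notes, may show up as missing even though nothing needs installing.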

A complementary no-setup option for casual mobile users

ComfyUI rewards users who are willing to learn nodes, manage GPUs, and tune LoRA or ControlNet chains. The opposite scenario also exists: a quick image needed on a phone during commute or lunch break, with no desktop machine in reach. For users in mainland China, an iOS app called Lingtu (灵图-AI画图设计) fills that gap. It is available directly on the China App Store and aggregates a Midjourney-style ambient engine, a Flux-style realistic engine, and a Nano Banana-style fast engine behind one Chinese interface, with localized prompt handling and no VPN required. Treat ComfyUI as the deep-control workflow and Lingtu as the zero-setup mobile workflow, and the two coexist comfortably. Search "灵图" on the App Store, or use this link: https://apps.apple.com/cn/app/灵图-ai画图设计/id6763914201.

FAQ

Which should I choose, ComfyUI or Stable Diffusion WebUI?

For novices who only want basic text-to-image, WebUI is friendlier and takes a few minutes to pick up. But as soon as you want multi-model ControlNet combinations, Tile upscaling, or IPAdapter style transfer, ComfyUI is almost always the first choice. WebUI either lacks these functions or needs a pile of plug-ins that tend to be unstable. Learning ComfyUI directly saves time.

What should I do if ComfyUI cannot run after installation?

There are three common causes. First, the graphics card driver is too old: update the NVIDIA driver to the manufacturer's currently recommended version. Second, the PyTorch and CUDA versions do not match: confirm the pairing when installing. Third, the model file is corrupted: re-download the safetensors file and verify its hash. Users of an integration package mostly will not run into these.
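For the third cause, a hash check takes only a few lines of standard-library Python. This sketch assumes the download page publishes a SHA-256 digest; some sites publish other hash types, in which case the algorithm name would need to change. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large models never load fully into RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the one listed on the download page."""
    return sha256_of(path) == expected_hex.strip().lower()
```

If matches_published_hash returns False for a freshly downloaded model, the file is most likely truncated or corrupted and should be downloaded again.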

Can FLUX.1 really run on a machine with low graphics memory?

Yes. Use a GGUF quantized version with the ComfyUI-GGUF custom nodes. Generation is significantly slower than with the unquantized version, and quality is slightly below FP8 but still better than SDXL. With aggressive optimization, even machines with very tight video memory can run it, at the cost of a relatively long wait.

What should I do if a workflow shows red boxes for nodes that cannot be found?

Open ComfyUI Manager and click Install Missing Custom Nodes. In most cases this installs everything in one click. If a newly released node is not yet listed in Manager, git clone the node author's GitHub repository into the ComfyUI/custom_nodes directory and restart ComfyUI.

Can ComfyUI workflows run on a Mac?

Yes. All M-series chips are supported through PyTorch's MPS backend. Speeds are often significantly lower than those of similarly priced NVIDIA desktop graphics cards, depending on the chip model and unified memory size. FLUX.1 can run on a Mac, but a quantized version is recommended. Machines with less memory have to trade longer waits against smaller model sizes.

📝 This article is from DouWen (www.douwen.me). Please credit the source when reposting.
