
Guide: Here's how to generate images on your PC for free
Want to generate images on your PC without relying on cloud services? This guide walks you through setting up free, open-source tools for complete creative control without any subscriptions.
By Rohith Bhaskar
Image generation tools have become increasingly capable of producing state-of-the-art photorealistic images. Unfortunately, most of them are locked behind online paywalls, but what if I told you there was a way to run these locally, with far more flexibility than online tools provide?
Prerequisites
- An Nvidia graphics card with a minimum of 8GB VRAM. (RTX 3060 or better recommended)
- A minimum of 16GB DDR4 system memory. (The more you have, the better)
- Windows 10/11 (64-bit OS needed)
- At least 100-150 GB of free hard drive space for setting everything up and downloading models.
- An internet connection for initial setup. (Needed only to download and install UI frontends and image models)
Note: AMD or Intel GPUs are not officially supported by most UI frontends and require workarounds to function.
The first thing you need to do is ensure you are running the latest Nvidia Studio drivers for your graphics card. If you are unsure which Nvidia GPU you have installed, right-click anywhere on the desktop and click on "NVIDIA Control Panel" in the context menu.
Now, look for "System Information" at the very bottom of the page that opens. You should see the name of your graphics card on the left, along with more information if needed. (Alternatively, running nvidia-smi in a terminal will also list your GPU.)
Open Nvidia's official driver download page, look for Nvidia Studio Drivers, and click on Download. This opens a page where you can download the latest driver. It's worth checking whether your Nvidia product is supported by the driver, which you can do by clicking on "Supported Products" just below the download button. Install the drivers and restart your PC.
Note: This will overwrite the Game Ready Drivers if you have them installed. If your system is primarily for gaming, you may see reduced performance in games.
Great! The first step is now complete. Now, we can move on to the fun stuff, like downloading UI frontends. I highly recommend downloading and installing Stability Matrix. It's an all-in-one maintenance tool that supports multiple UI frontends and automatically keeps them up to date. It even creates shared folders for models and outputs that you can view in one place.
On the GitHub page, scroll down to the readme section and look for your operating system. Stability Matrix also offers downloads for Linux and Mac. For the purposes of this guide, we will be using the Windows version.
Click on the operating system button, and it should prompt you to download a .zip file. Place this on the drive where you want Stability Matrix installed, and make sure the drive has 100-150 GB of free space. This isn't just for installing Stability Matrix but also for downloading the required models, text encoders, and other supporting files. These add up pretty quickly.
Once you have downloaded the file, unzip it and run the StabilityMatrix.exe located within the extracted folder. It should automatically download all the required files and set up the interface for you.
Awesome! We now have an interface to download and install multiple UI frontends.
Now, it's time to decide on the frontend you want to use. A frontend is a graphical user interface (GUI) that allows you to interact with image models, manipulate settings, and, most importantly, generate images.
Here's a quick list of options offered on Stability Matrix.
- Stable Diffusion WebUI Forge
- Stable Diffusion WebUI Forge - Classic
- ComfyUI (Recommended)
- Fooocus
- Fooocus - mashb1t's 1-Up Edition
- Stable Diffusion WebUI
- SwarmUI
- Cogstudio
- Stable Diffusion WebUI UX
- RuinedFooocus
- SD.Next
- SDFX
- InvokeAI
Personally, I would highly recommend using ComfyUI. It's a visual, node-based application that might seem a little intimidating at first but is surprisingly easy to get used to. Remember, Stability Matrix can manage multiple frontends, so you don't have to limit yourself to one. You can experiment and find one that suits you the best.
Use the list above and navigate to the GitHub pages for each package. Take your time and learn more about each package before making a choice.
For the purposes of this guide, I'll walk you through downloading and installing ComfyUI, along with a few useful extensions that should serve you well.
Installing and setting up ComfyUI
Open Stability Matrix and click on the "Add Package" button. Now find "ComfyUI" on the list of offered packages and click on it. On the page that opens, make sure "master" is selected in the drop-down box. The "master" version is the most stable release and recommended for most users.
Once you click Install, Stability Matrix will begin downloading the package for you. Just wait for it to finish. It may take a few minutes to download, so feel free to keep using your system at this time. If the download box closes or you accidentally press "Hide," use the downloads button at the bottom to view the status of your current download.
Once it's installed, you should see a pop-up notification over your system tray informing you that ComfyUI is ready for use.
Go back to Packages, and you will see the ComfyUI tile on the page, but don't launch it just yet. ComfyUI is set to launch in Normal VRAM mode for GPUs with 12GB VRAM or higher by default. If you have an 8GB VRAM card, now is a good time to force Comfy to launch in Low VRAM mode.
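ComfyUI's VRAM behavior is controlled by launch flags, so one way to do this is to add ComfyUI's --lowvram flag to the package's extra launch arguments in Stability Matrix (the launch options are reachable from the ComfyUI tile's menu; the exact label varies by version):

    --lowvram

ComfyUI also accepts a --novram flag for even more constrained cards, trading speed for memory.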
Next, we need to install the ComfyUI Manager. It's a critical component that allows you to install and manage various custom nodes within the application. Click on the Jigsaw icon on the right and type in "ComfyUI-Manager" in the "Available Extensions" section. Select it, and click on Install at the bottom. Once it's finished installing, you should see "(installed)" next to it. Now, we're all set to launch ComfyUI.
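As an aside, Stability Matrix handles this for you here, but if you ever run ComfyUI on its own, the usual manual route (per the ComfyUI-Manager README) is to clone it into ComfyUI's custom_nodes folder; the destination path below is a placeholder for your own install:

    git clone https://github.com/ltdrdata/ComfyUI-Manager path/to/ComfyUI/custom_nodes/ComfyUI-Manager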
Launch ComfyUI and allow it to run through the start-up process. Once it's done, it will automatically open in a new tab in your default browser. If it doesn't, head back to the Packages tab and click on WebUI on the now green-colored ComfyUI tile, or browse to ComfyUI's default address (typically http://127.0.0.1:8188).
Congratulations! You're now done with all the prerequisites. Now, let's move on to the good stuff: downloading and using image models.
Downloading and using your first image model
Before we begin downloading image models, let's run through a glossary of terms that you should be familiar with.
1) UNET/Checkpoint/Diffusion Model/Diffusers - The big boss. The brains of the operation; think of it as the artist that paints using your words.
2) Tokenizer - The timekeeper. It converts your prompt into tokens before the Text Encoders take over. Depending on the model, you will be limited to a set number of tokens (roughly, words or word fragments) you can use.
3) Text Encoders/CLIP - The heavy lifters. They convert your tokenized prompt into numerical embeddings that the UNET can understand.
4) Samplers - The master conductor. Iteratively guides the image generation process by refining the image from noise into the final output.
5) VAE - The clean-up crew. It decodes the finished latent image into actual pixels, and a good VAE noticeably improves color and fine detail.
Now, you will encounter more terms on your journey, but these are the basics that will define everything from this point forward. Don't worry too much about going into detail about them just yet.
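If you're curious how these pieces map to real software, here is a minimal, purely illustrative sketch using the Hugging Face diffusers Python library (not part of this guide's setup; the model ID is an example, and repositories occasionally move on the Hub). ComfyUI wires these same components together visually:

    import torch
    from diffusers import StableDiffusionPipeline

    # One all-in-one checkpoint bundles every component from the glossary.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    print(type(pipe.tokenizer))     # the Tokenizer
    print(type(pipe.text_encoder))  # the CLIP Text Encoder
    print(type(pipe.unet))          # the UNET that does the painting
    print(type(pipe.vae))           # the VAE decoder
    print(type(pipe.scheduler))     # the sampler/scheduler logic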
Next, open ComfyUI's built-in workflow templates (in recent builds, you'll find a "Browse Templates" entry under the Workflow menu). This will open a window featuring example workflows already set up for you. Click on "Basics" and then "Image Generation." You will immediately see an error on the screen informing you of missing models. That's because we haven't actually downloaded an image model yet. Let's go ahead and do that: click on Download.
While the file downloads, let's go over some differences between the various models you will encounter. What we're downloading right now is the Stable Diffusion 1.5 base model. In ComfyUI's canvas, you may have noticed three separate connections from the very first "Load Checkpoint" node: one each for MODEL, CLIP, and VAE.
That's because all three are included in this base checkpoint, and you don't need to use separate CLIPs and VAE for this particular model.
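You can see this for yourself by peeking at the tensor names inside the checkpoint. Here's a small sketch using the safetensors Python library; the filename is whatever you downloaded, and the prefixes shown are the customary SD 1.5 naming (an assumption, not a guarantee for other models):

    from safetensors import safe_open

    # Adjust to the checkpoint you actually downloaded.
    path = "v1-5-pruned-emaonly.safetensors"

    with safe_open(path, framework="pt", device="cpu") as f:
        keys = list(f.keys())

    # Typical SD 1.5 grouping:
    #   model.diffusion_model.*  -> the UNET
    #   cond_stage_model.*       -> the CLIP text encoder
    #   first_stage_model.*      -> the VAE
    for prefix in ("model.diffusion_model", "cond_stage_model", "first_stage_model"):
        print(prefix, sum(k.startswith(prefix) for k in keys))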
Newer image models, however, like Flux or HiDream, require you to download them separately, so keep that in mind. Thankfully, they have example templates just like this one, so you don't have to worry about setting everything up. Let's talk about that later. For now, let's generate our first image.
Now, if you click on the Run button at the bottom right, you will notice ComfyUI still gives you an error. That's because the model needs to be placed in the correct folder for Comfy to recognize it. Go to the folder where you downloaded the model in File Explorer and copy it.
You will notice the filename has a ".safetensors" extension. This is the format the file uses, and as a general rule, you should only download files with the .safetensors extension. Avoid pickle-based formats such as .ckpt or .pth: loading them can execute arbitrary code hidden in the file, whereas .safetensors stores nothing but raw tensor data. Trust me!
Open the folder where you installed Stability Matrix and look for a folder called "Models." Double-click it, find the subfolder called "StableDiffusion," and paste the .safetensors file inside it.
For future reference: models with CLIPs and VAE included should be placed in the "StableDiffusion" folder. Models with only the UNET (no CLIP or VAE) go in the "DiffusionModels" folder. Text encoders (T5, Llama, CLIP L, CLIP G) need to be placed in the "TextEncoders" folder. Finally, your VAE files are placed in the "VAE" folder.
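Put together, the layout looks roughly like this (a sketch; the four subfolders come from the rules above, and the exact tree may vary by Stability Matrix version):

    <Stability Matrix install folder>/
    └── Models/
        ├── StableDiffusion/   <- all-in-one checkpoints (UNET + CLIP + VAE)
        ├── DiffusionModels/   <- UNET-only models
        ├── TextEncoders/      <- T5, Llama, CLIP L, CLIP G
        ├── VAE/               <- standalone VAE files
        └── ...                <- Lora, ControlNet, and others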
As you can probably see, there are a lot more folders than just those four. My advice would be to figure them out when you get to them. For now, let's worry about our image.
Refresh ComfyUI's web page in your browser. Click on the dialog box in the "Load Checkpoint" node and select your downloaded file.
Since this is a template, everything is already set up for you. However, it's worth knowing the basic groundwork for your future workflows. Let's start at the left. We have already discussed the Load Checkpoint node, and next to it are two "CLIP Text Encode (Prompt)" nodes.
The top one is for positive prompts, or what you want to see in the image. Below it is the input for negative prompts, or what you don't want to see in the final output. Generating images is a balancing act between these two nodes. You create an image, check what you don't like about it, and enter that into the negative prompt. For now, these are already entered for us, so let's click on "Run."
ComfyUI is a visual interface, meaning you can actually see the process happening node by node. Once your text is encoded, it's sent to the "KSampler," which begins iterating the image. Let's quickly go over all the settings in this node.
1) Seed: Think of it as the image's address. Same positive prompt + negative prompt + same settings + same seed = the same image. Useful for recreating and iterating on an image.
2) Control after generation: Determines whether the seed will be randomized after each generation or should stay fixed.
3) Steps: The number of steps the KSampler should iterate for. Most models come with recommended steps.
4) CFG: The model's responsiveness to your prompt. Higher values = strict adherence to prompts but less creativity. Lower values = more creative outputs at the cost of prompt adherence.
5) Sampler_name: The name of the sampler you are currently using. Click on the drop-down for more options, and experiment with different settings and samplers for varied outputs.
6) Scheduler: Think of it as the second-in-command who comes up with strategies that the conductor approves; in practice, it controls how the noise is distributed across the steps. Again, experiment with different samplers and schedulers to find a combination that works for you.
7) Denoise: Determines how much noise is added at the start of the generation process, which the sampler then removes iteratively. The value can't be set above 1.00, and values below that (e.g., 0.45 to 0.65) are mainly used in image-to-image, refiner, or inpainting workflows.
The latent_image input to the left of the KSampler node determines the size of your image. In the workflow, it is connected to an "Empty Latent Image" node with a resolution of 512x512 and a batch size (the number of images generated in one run) of 1.
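To tie these settings together, here's how the same knobs look outside ComfyUI in a minimal Hugging Face diffusers sketch (illustrative only; the model ID, prompts, and values are example choices, not recommendations):

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # A fixed seed is the image's "address": same inputs, same output.
    generator = torch.Generator(device="cuda").manual_seed(42)

    image = pipe(
        prompt="a cozy cabin in a snowy forest, golden hour",  # positive prompt
        negative_prompt="blurry, low quality",                 # negative prompt
        num_inference_steps=20,    # Steps
        guidance_scale=7.0,        # CFG
        width=512, height=512,     # the Empty Latent Image resolution
        generator=generator,
    ).images[0]
    image.save("first_image.png")

Swapping pipe.scheduler for a different scheduler class is the diffusers equivalent of changing sampler_name and scheduler in the KSampler node.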
Now, would you look at that! Our very first image. If you have made it this far, congratulations! You now know the very basics to get you started on your journey in image generation. The more you experiment, the more you will discover, and this rabbit hole runs deep. So have fun.
Useful links
CivitAI: Your one-stop shop for downloading models, LoRAs, embeddings, and much more. (Caution: Includes NSFW content. Use built-in site filters.)
Monzon Media: Fantastic resource for beginner and advanced ComfyUI tutorials.
ComfyUI Wiki: For all your troubleshooting needs.
Bad ASS ComfyUI Resource List: Links for all the base models, CLIPs, and VAEs you might need in one place.
Comfy Workflows: A dedicated community to share and download workflows.