Every day, a little more

I’m back at it again tonight. Finding a style of art that looks clear enough to convey a mood, but painterly enough to leave more to the imagination. Yes, I’m working with genAI - if that sets you off, better you know now.

This has been a time-intensive process, and one that nearly led me to give up. There is so much to learn right now to achieve decent, unique-looking results in Stable Diffusion. It’s great at delivering concepts, so if you have vague ideas in mind, you might be able to fire-and-forget with some prompts. That was my process near the beginning. But as time goes by, you get closer to what you want, and details begin to matter more.

The more details matter, the more you need to learn how to achieve those details. I’ve been told by artists, “Here’s a tip: hire an artist.” I get it. That’s possible and plausible, and may still be in the cards. But my desire to create a game with a tiny team, and to explore the feasibility of genAI in a workflow process means that I’m exactly where I need to be, doing exactly what I need to be doing.

So, I wanted to write this post tonight to document some of my process at this juncture. I’m sure I’ll look back in a few months and snicker at my foolishness.

  • Step 1: Tease out themes from the book, ideas you want to explore (yes, ChatGPT was used extensively to workshop this)

  • Step 2: Create scenarios for the player to encounter based on the themes from the previous step (again, ChatGPT helped shape and reduce ideas)

  • Step 3: Consider binary decisions to offer the player in each scenario, and generate resulting scenarios from each decision

  • Step 4: Build a spreadsheet to hold these pieces of text for cohesive importing into the game engine (being created by my partner)

  • Step 5: Use Stable Diffusion to generate an overall look for the game

  • Step 6: Create individual images for each scenario and result

    • Break down the scenario/result into a prompt and generate low-fi 512x512 images

      • Choose a model that fits your needs (and a refiner, if need be)

      • 15 sampling steps (you’re looking for a draft here)

      • Choose a sampling method that seems to work for you

      • Batch count 4 for options and speed

      • Adjust CFG Scale to your needs (2 for something wildly creative, 8 for something more specific to your prompt)

      • Generate

    • Review images, and either accept one or quickly generate new ones (repeat the process until the basic composition looks like something you like)

      • It’s possible this acceptable image will still need tweaking in Photoshop or GIMP to get the elements you want

      • It’s also possible you might want to use ControlNet to create specific items or poses of characters in the image, but this, too, can eat up vast amounts of time!

    • Take the acceptable image over to Img2Img

      • Increase size to 768x768

      • Adjust Denoising Strength based on needs (lower = fewer adjustments; higher = wilder results)

      • Generate 1 image at a time for speed

    • Take the acceptable image again to Img2Img, and repeat, increasing size, adjusting denoise

      • You may want to inpaint certain areas: mangled hands, clothing details, strange elements you don’t like or want to replace with something else. **Warning:** this is where you will start sinking your time into trying to perfect images

    • As of writing this, 2048x2048 is the largest image I can generate, and the level of detail is really nice (especially if you’re going for a style that doesn’t need 4K Ultra HD!)
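If you like thinking in code, the generation loop above boils down to a pass schedule: one low-step txt2img draft, then repeated Img2Img passes at increasing sizes. Here’s a rough Python sketch of that schedule — the shrinking denoise ramp and the 256-pixel size step are illustrative, not my exact settings:

```python
def plan_passes(draft_size=512, final_size=2048, upscale_step=256,
                draft_steps=15, draft_batch=4, cfg_scale=8.0,
                denoise_start=0.6, denoise_floor=0.3):
    """One low-fi txt2img draft, then Img2Img passes at increasing sizes.
    The decreasing denoise values are illustrative: lower = fewer changes,
    higher = wilder results."""
    passes = [{
        "mode": "txt2img",
        "size": draft_size,
        "steps": draft_steps,    # 15 steps: you're looking for a draft here
        "batch": draft_batch,    # batch of 4 for options and speed
        "cfg_scale": cfg_scale,  # 2 = wildly creative, 8 = closer to the prompt
    }]
    size, denoise = draft_size + upscale_step, denoise_start
    while size <= final_size:
        passes.append({
            "mode": "img2img",
            "size": size,
            "batch": 1,  # one image at a time for speed
            "denoise": round(denoise, 2),
        })
        size += upscale_step
        denoise = max(denoise_floor, denoise - 0.05)  # trust the image more each pass
    return passes
```

Each dict maps onto the settings discussed above, so the same schedule works whether you’re clicking through a web UI or scripting the calls.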

Congrats! After all that, you have one image of many. Wash, rinse, repeat as many times as necessary.
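Backing up to Steps 3 and 4: the spreadsheet holding the scenarios is nothing exotic. Here’s a hypothetical sketch of how one scenario row with its two binary decisions might be flattened for import — the column names are placeholders, not our actual schema:

```python
import csv
import io

# One scenario, its two binary decisions (Step 3), and the resulting scenario
# IDs each decision leads to, flattened into a single row for engine import
# (Step 4). All names here are illustrative.
rows = [
    {"scenario_id": "S1",
     "scenario_text": "A stranger offers you shelter for the night.",
     "choice_a": "Accept", "result_a": "S2",
     "choice_b": "Refuse", "result_b": "S3"},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()  # hand this off for import into the game engine
```

Because every decision points at another scenario ID, the whole branching structure stays flat and readable in one sheet.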

Is there an AI that could automate more of this? Maybe, but at this point, I still maintain a great deal of creative control. I’m using the machine to quickly give me something that matches my mental images, then fine-tuning it faster than any artist could. The speed at which I’m iterating is simply unparalleled by hand — and I work with professional artists in my day-to-day. They might argue about the ethics, or the final quality, but there’s no denying the concepting speed.

The same applies to the breaking down of story concepts and quests. As a professional game writer for 15 years, I work with other professional writers. The ability to break down themes, apply them to, say, 10 different scenarios, pick the best, and start creating from there is - again - unrivaled.

I know the topic of genAI is radioactive in the creative fields at the moment, but it is hard to fathom ever going back to drastically slower methods of exploring themes, styles, code, etc.
