The Farce of Chucking Water: My Life Creating MidJourney Images for Book Cover Commissions

When authors come to me for a book cover commission, they generally approach it from one of two very different directions.

Either they’ve seen my work and say to themselves: I’ll have some of that, let this kid go wild and create something amazing for my book, I’m sure he’ll do good work whatever it is. 

Or maybe they will say: I have a vision in my head, I want James to execute it for me. 

And although I much prefer the former let-me-have-at-it approach, because they’ll generally get better results with it, I don’t mind executing the latter, this Author’s Vision.

But …

And here comes the ‘but’: we’re not going to get exactly the image they have in their head. This blog might help them understand why not, and how we need to approach their book cover, their vision.

So this blog post is for them! But even if that’s not you — you might even be somewhat in between both of these sorts of people — this blog should help you understand AI image generation a bit better.

If you don’t want to read this whole blog, and it’s a bit of a long one, and just want a TL;DR:

But if you want to know the reasons behind it all, and understand AI, stick with this blog and I’ll show you the man-behind-the-curtain, i.e. my life, trying to make your great book cover.

This blog post should dispel a lot of the myths I think authors have built up in their heads about how easy and low-effort AI is. And in dispelling these myths, and actually understanding how I do what I do, it should make my life easier.

Because I think there is a lot of confusion at the moment.

So let’s start off with a topic most authors probably think they have a good grasp of: ‘Description in Fiction’. But before we do that, there’s quite a bit to cover, so grab yourself a lovely Earl Grey tea and a croissant, get comfy, and let’s have a look at it all.

Visual Imagination’s Greatest Joke

Before we talk about the problem that AI runs into, let’s have a little chat about how description in fiction actually works. Because I think this explains one really big problem with AI in a perfectly succinct way.

All authors have an implicit feeling for how visual imagination works. When they’re writing something they’ll have a vision in their head of the scene, what the characters look like, maybe what’s happening in the background. Authors will see it in their mind’s eye and then their job is to effectively write that description down in words, so that readers will see exactly the same thing in their heads. Right?

Wrong.

Here’s the catch: readers won’t see exactly the same thing! Ever! They build their own picture in their own mind’s eye from the author’s words.

It’s literally how fiction functions. Good fiction gives you enough description and story to build the picture, whilst making sure the narrative moves forward. 

So how could you get a reader to see what you see, as the author? 

One solution would be to rely on that old adage: a picture tells a thousand words, right? 

So what will you do as the author? Are you going to write a thousand words for each scene? For each character? So the reader sees exactly the same thing as you?

No, that would just mean the prose would become clunky and full of description and the story would move along at a glacial speed. 

So what authors actually do is economise on description and give in to the fact that what they see in their own mind’s eye and what the reader sees in theirs are never going to be the same thing, but close enough that we can carry the story forward at a nice pace.

In short, as authors we give ‘enough’ detail, not ‘excruciating’ detail.

This is important to remember when it comes to AI, because MidJourney likes that ‘enough’ detail and doesn’t play well with ‘excruciating’ detail. Just like our readers.

What authors need to remember is that this is what happens:

And this is perfectly fine. In fact it has to be this way.

So why explain this?

When an author comes to me with their Author Vision for a book cover, what they’re doing is holding onto that first image. And I get caught in a hellscape of trying to match it. Our image on the left.

But if you’re following me so far, I’m sure you can see the glaringly obvious problem here.

Just like a reader, I build my own picture, with my own visual imagination, from the words the author offers to explain something visual. As a designer, I’m in exactly the same boat.

And here’s where it gets even more farcical. 

There’s a third person in this process. And that’s MidJourney. So you have three different people being told words, and making pictures in their head from those words.

So authors tell me words, to tell MidJourney, to make those images, like it’s some sort of game of telephone. It’s a bit frustrating, to say the least.

But I can see the cogs in your mind turning, as an author.

Wouldn’t it be much easier if we cut out James? Let me talk directly to MidJourney. Oh, and I know, I’ll just give it loads of description! All that excruciating detail. That’d work. Right?

Nope. 

Firstly, you might have heard of something called ‘prompt craft’ when it comes to generating images. It’s basically the way you talk to the AI so you can get the right thing out of the system in the way you want. And to put it into context, I’ve committed an hour every morning before I start work proper, every single day, over the last couple of years, to actually learning and understanding prompt craft. So good luck with that.
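To give you a flavour of what prompt craft even looks like, here’s a bare-bones sketch of a prompt’s rough anatomy (a simplified sketch of my own habits, not an official recipe; the --ar bit just sets the aspect ratio):

    /imagine prompt: a cat lazing under a tree, loose watercolour illustration, soft morning light --ar 2:3

Roughly: the subject, then the medium, then the tone, then any parameters. Easy to describe, fiendish to master.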

There are two sorts of people in this world. Those that have played with AI image generation and seen how unruly it is, and how difficult it is to get it to behave, to get what you want.

And those yet to try AI image generation. 

So that’s problem number one. It’s going to be a frustrating experience for you. 

In fact, those two images of the ‘cat lazing under the tree’ were actually me asking MidJourney to literally draw a picture of a ‘cat lazing under a tree’. It interpreted my words in two different ways because I didn’t specify more than that.

And your second thought about it might be: I know what I’ll do, I’ll just give it lots and lots of description then! Job done!

Something like this illustrative excruciating-detail description (the spirit of it, if not my exact words):
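    A fluffy black and white cat lazing on its back under an old cherry tree in full blossom, one paw curled over its chest, dappled late-afternoon sunlight, a weathered wooden fence behind, rolling green hills, petals drifting on a light breeze, a little village just visible on the horizon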

Okay, let’s try that description with MidJourney and see whether it gets us any closer to the author’s mind’s-eye image.

Here are the first four outputs from the description.

Welcome to my world! And the world of AI image generation. 

If you want a feeling for the frustration you’re going to get being me, then go have a play with one of the free image generation tools: ChatGPT, Copilot or Gemini. And you’ll soon understand.

So let’s have a chat about why this is happening and how we can fix it. 

Chucking Water

Even if an author gives me loads of excruciating detail, like in that cat example, why isn’t MidJourney giving us images that are 100% perfect and match what we’ve asked for?

The simple answer is: technically, I have no idea!

All I’ve got is a feeling. A feeling I can best describe in terms of an analogy, one I like to call ‘The Random Chucking of the Water’.

The best way to visualise it for me — and explain it to you — is to think of each individual element we ask for in that long description as a cup we want the water to land in, every time we re-roll a prompt to generate an image. And we’re chucking a big bucket of water at those cups to see what sticks.

So our prompt may look something like this, with each element being one cup we’re aiming for (an illustrative sketch, not a real client prompt):
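    a fluffy black and white cat [cup 1], lazing on its back [cup 2], under a cherry tree in blossom [cup 3], late-afternoon sunlight [cup 4], weathered wooden fence [cup 5], rolling hills [cup 6], village on the horizon [cup 7]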

And then we run the prompt, chuck the water, and hope it lands in the right descriptive places.

Let’s place some bets on whether that will work or not. 

Yeah, generally this is what happens:

What we’re ideally looking for is this!

And that’s why the chucking water analogy works so well. Because you could try a million times and it’s not going to land in all the glasses exactly as you want. It just won’t happen. Re-rolling prompts is more like some daft messy task on Taskmaster, where the studio audience is laughing at you, rather than a precise art. By the way, all the Taskmaster episodes are online here, and they’re very funny.

So it’s never going to produce a perfect image, hitting all of your descriptive elements and getting them all right. 

But another reason why I really like the glass and bucket metaphor is this: imagine we actually cut down the number of descriptive elements. What would that look like?

I’ll take some bets again on how you think that is going to turn out.

Of course. It definitely works a lot better! The water hits the right glass a lot more of the time!

But what this means is that authors who want an image to be more correct need to let go of their Author’s Vision, because, sorry, it just isn’t going to happen. There are limits to what AI can and can’t do, and to the number of descriptors it’s going to hit correctly.

There simply have to be compromises, and authors need to trust that I have their best interests at heart and that I’m trying my hardest to get things right for them, making the best images to build book covers from.

There needs to be a meeting in the middle.

But how do we achieve that?

Prioritise! Prioritise! Prioritise! 

Firstly, we’ve learnt from our ‘Chucking Water’ analogy that MidJourney is better when it has fewer cups to aim at. So much better. So we need to prioritise.

We need to trim the fat from the vision you have in your head. Make your idea a bit more amorphous so we can actually hit the really important points.

Ironically, this is what makes a better book cover anyway, hitting the three or four really important tonal, premise or plot points. Rather than an overly complex, detailed visual mess. Think more iconic album cover, rather than a movie still. 

And is it really that important that the cat is black and white? That the tree is a cherry tree? Will it make any difference to a potential reader’s experience stumbling on your book cover if these things aren’t 100% accurate?

I would say ‘no’. If you asked me I’d take eye-catching over correct detail every time. 

And maybe I’ve already taken up two of the descriptive elements I’m using in my prompt to make sure your cover is tonally correct and eye-catching. So that’s two fewer you’ve got to play with!

As we’ve learnt, if we try to give MidJourney our 300-word detailed description to explain your perfect image, we’re definitely going to get the water splashing everywhere. So I would say the limit for any image is probably five or six descriptive elements. In order! And as I said, I probably want to take two of those, as the designer, to get the tone and the medium right.

So that’s four things you can probably tell me about your image.

Simply let go of the details in your head. It’s easy. What are your top five things? The age, the race, the hair colour, what they’re doing, what they’re wearing and … nope, that’s it. We’ve already gone over our four water containers for consistent results.
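Put concretely, a within-budget prompt might look like this sketch, where the first two elements are my designer picks for medium and tone, and the rest are the author’s (all wording illustrative):

    loose watercolour illustration [mine: medium], warm nostalgic tones [mine: tone], a black and white cat [yours], lazing under a cherry tree [yours], blossom drifting [yours], village on the horizon [yours] --ar 2:3

Six cups. That’s the budget spent.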

Crazy, I know.

So you need to think of your most important story specific elements that you want on the book cover, and when I generate images just pick the coolest, eye-catching image. Because you’re not going to get everything you want. It’s the nature of the beast. 

There’s actually a really good blog post I wrote about how to extract the most important details from a book for a great cover, here.

I’ll say it again: forget about details. We’re after great emotive, eye-catching images that tell a story. Not the details. We can’t do detail.

Unfortunately, MidJourney isn’t very good at being detail-oriented. 

So for me as a book cover designer, if we’re going down an AI route with a commission I like to know what’s important to you, in order! It’s helpful. 

Details might be important to you, but they’re just not that important to MidJourney. It might improve with version seven or eight, but I wouldn’t hold my breath because from version five to six it was meant to improve and it didn’t.

Save your details for the inside of a book. You know, your writing. 

And I know I’m banging on about ‘details’ here and you’re going to say to me: but everything is important! All my details. The vision in my head.

And I’m going to say: but that’s not the way it works, Veruca Salt. You can’t have an Oompa Loompa. 

And cutting down your priorities gives better results; it’s the only way we can cut through all the randomness MidJourney generates and get strong, usable images.

And you could say to me: but keep generating until all the water lands in my seventeen different glasses, it should eventually! Keep doing it! Keep doing it, NOW!

Sorry, not going to happen, because time is very much limited, and I’m not just talking about my precious designing time. We need to talk about …

Working on GPU Time

There’s a fantastic joke by Simon Munnery: 

“They did give infinite typewriters to infinite monkeys. It’s called the internet.”

Okay, the original saying is something like: infinite typewriters plus infinite monkeys would equal the complete works of Shakespeare.

My point being, given an infinite number of generations on MidJourney we could come up with the image that is in an author’s mind’s eye. Simple. 

Well, apart from the fact that I don’t have an infinite amount of time to work on infinite images. I have a dinner date at 8pm today for starters.

And then comes the heavier catch. 

I don’t have infinite GPU time. 

With MidJourney I get 30 GPU hours a month to play with; basically, the service generating the images is a cloud computing set-up, doing all the fancy stuff behind the scenes, and then giving me what it’s generated.

Basically around 150-200 images equals about 1 hour of GPU time. 

Which sounds like a lot of images to pick from but there is a vast amount of redundancy!

Generally over the last couple of years what I’ve found is that if I have a project where an author has commissioned me, say for a single cover, I will generate about 300 images and about 25-30 images will be good enough to show the author to choose from.

A whopping 90-95% is simply trash. Unpresentable.

Images where the water has gone everywhere, and I know it’s not what an author has asked for. Or doesn’t look right in terms of tone. Or bad composition. Or has funny limbs. Or the colours are yucky. Or MidJourney interpreted my words in a rather amusing way. On and on.

So for each set of images I need to produce for a cover, I’m usually using up 1-2 hours of my GPU time. As well as the time spent perfecting my prompt to get great images, usually about 3-4 hours of human time too!
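To put rough numbers on all of that, using my averages above:

    30 GPU hours a month × ~150-200 images per hour ≈ 4,500-6,000 images a month, absolute tops
    1 cover ≈ 300 generations ≈ 1.5-2 GPU hours (plus those 3-4 hours of my human time)
    300 generations → 25-30 presentable images → roughly 90-95% straight in the bin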

It would be great if I had infinite GPU time but that’s actually not possible. So unfortunately it can’t work any other way!

It would be utterly wonderful if an AI could read an author’s mind and do exactly what it’s told.

And I think sometimes authors think if they just give me more words and if I work hard enough it’ll happen. 

It. Will. Not. 

Plus I’m running out of GPU time — and patience.

So the watchword here, I think, is: compromise. Because, as we learnt earlier, communicating an Author’s Vision with mere language is ineffective.

There is a more powerful tool at an author’s disposal for getting great images with MidJourney, when I’m doing a commission for them, if they’re willing to open their mind.

But let’s start off with something simple before we get to that …

MidJourney Tools

Once we’ve decided on an image out of those 25-30 images I present to an author, we do then have the possibility of MidJourney editing that image, within certain limitations.

The big one, for me, and it’s super practical, is probably panning. Without panning I wouldn’t have any space for your lovely title.

Here’s an example:

Also, and here’s another big one: we can actually re-roll parts of the image.

“Great, that means I can change a hairstyle to whatever I want, right? Change clothes. We’ll do all that later, then?”

Nope. Not so fast. 

Because we can only replace rectangles of that image. So if that rectangle covers both something you do want to change and something you don’t, it will change both. Also, it’s not always perfect. We get messy results. Because we’re back to that 90% redundancy of duff re-rolls.

Here’s an example of what I mean with your hairstyle idea.


Pretty bad, right? How about if we just change some smaller elements? Let’s try a couple of experiments.

So there are some edits possible but only if an author can think in terms of squares and be prepared for me to turn around and say, “that didn’t even work.”

But there’s a more powerful solution at our disposal, one less MidJourney-related …

Give Thanks for the Happy Accident

I’m going to ask you, as an author: if we only have four or five descriptive elements to go on, which would you prefer to see, ten images that are very similar and match your vision, or ten images in completely different styles?

For me, I’m very much in the latter camp. Simply because choice is good. And it’s what MidJourney is really good at: trying out lots of different crazy ideas.

Having a set vision in our heads closes us off from all the possibilities out there in the universe. And me, I’m a happy traveller with a vague destination in mind, trying not to hamper myself with too many expectations.

In this way visual creation is very different to telling a story. With writing you’re always going somewhere, a story is linear, but with the visual arts you get to play and explore until you stumble on something truly fantastic.

It’s the only way happy accidents occur. 

In fact, MidJourney works best when you push it to the edges. It even has its own parameter for this, which I’m a master at, called ‘Chaos’. What a great name!

And given enough chaos and seemingly disparate concepts it comes up with all manner of fun things! And do you know what a good premise for a book is built on? Yep, you guessed it: disparate concepts. It’s literally the best tool for making ace images for a book cover.
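For the curious, Chaos is literally just a parameter tacked onto the end of a prompt. A minimal sketch (the value runs from 0 to 100, and higher means the four results come back wilder and more varied):

    /imagine prompt: a cat lazing under a tree, loose watercolour illustration --chaos 60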

And the more you create an environment for me to play in, the more interesting images come out of the whole process. Allowing me to explore rather than sticking to a set vision. Like playing with Sref, which we talked about in my last post.

But the catch is, to go down that avenue you need to let go of the image you might have in your head and …

Be a More Vibes-based Author and Less Details-oriented

I know this is hard for some authors that commission me, but it’s a powerful way to get a powerful cover. And it’s how MidJourney actually works best. How I work best. Yep, my happy place.

But if we are trying to follow your vision, which I don’t mind doing either, you just need to remember you can have your cake and eat it, but only half the cake!

So to Recap

When undertaking a commission with me where you have something very specific in mind, you need to think about what your priorities are for that image. Because MidJourney can’t do it all. If only. And I hope this post has helped explain why it can’t do it all, and how it all actually works.

So I need something like:

  1. It needs to be in a soft illustrated style like X cover
  2. She’s ditzy looking because it’s a plot point
  3. She’s walking in Dartmoor National Park
  4. She’s wearing bright yellow waterproofs
  5. She has a Jack Russell dog with her

After about point five, MidJourney is going to start to get very confused. 
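To make that concrete, those five points might collapse into a prompt shaped something like this (my illustrative wording, with the bracketed placeholder standing in for ‘like X cover’ as a style reference):

    /imagine prompt: soft storybook illustration, a ditzy-looking young woman walking across Dartmoor, wearing bright yellow waterproofs, a Jack Russell trotting beside her --sref [X cover image] --ar 2:3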

So you’ll need to forget about all those other facts you might want:

  1. She has a bob haircut with bangs …
  2. It’s mousey in colour …
  3. The dog has a spot over his left eye …
  4. She’s carrying a knobbled branch as a walking stick …
  5. It’s about 6pm in the evening …
  6. On the 19th of November …
  7. And there is a slight South-westerly breeze …
  8. and you can see a village in the background …
  9. On and on …

But if you promise to let go of detail, I’ll promise to create you the best image I possibly can, to design you a great book cover.

And with that I’m signing off from writing this blog, and signing on to get on with some of these commissions I have stacked up already.

Until next time I will remain one educational …

James,

GoOnWrite.com

Great Book Cover Aesthetics – Using Sref with MidJourney

The title of this blog post sounds rather boring and technical but it is anything but. Over the last week there has been a seismic change in the tool I use to generate images, the images I then use to design book covers. And it would be remiss of me not to talk to you authors about it.

So here goes.

But there is a little bit of groundwork to get through before we talk about how powerful Sref is. My favourite new little toy in MidJourney.

We’re definitely travelling very much into the future with this one. So maybe get yourself a futuristic beverage and get comfy, because we’re heading into the future of aesthetics.

Aesthetics: The Medium is the Message

When it comes to book cover design, a lot of the time the medium is the message. But what do I mean by that?

Let’s break down what a book cover actually consists of. 

It’s very easy to think in terms of the subject of a book cover, to say something like ‘there’s a knight standing in a forest’, but there is a lot more going on with an image.

Things we can break down.

Things that convey tone.

Things that will always speak to a potential reader, other information that we need to consider, when designing a book cover.

So I would split off the information into probably two very simple categories. Something like this:

So you can think of these things as the stuff on the cover (the subject) and the way you present the stuff (the aesthetics).

If you’ve read my earlier essay, What Makes a Book Cover Intriguing, you’ll know that any good book cover needs to hit that sweet spot between what people have seen a thousand times before and what is new! That blog post is a bit of a long one, about 25-30 minutes to read, but it’s well worth becoming acquainted with that information.

But one thing that I say in that essay (near the end) is you can easily achieve this ‘intrigue’ by being playful with the aesthetics. You don’t want to confuse potential readers by sending out the wrong signals about your book, but you do have wiggle room to be different.

What do I mean by that? Let me explain. 

You can go and look through any category on Amazon books and find the same sort of cover, presented in the same aesthetic, over and over again.

You know the drill; all horror covers will be moody or grungy or blurry, all thriller books will feel muted and desolate, all rom coms will be presented as twee and cute simple vector graphics. Etc. Etc.

They are aesthetic tropes that work. And they work because they convey a message about the book. And the thing I hear and read a lot in ‘The Cult of Self-publishing’ is stick to the tropes, don’t confuse the reader. You will sell more books. And to that, I say poppycock.

There is a tension at play here, one that doesn’t get considered when people talk in this way, a tension with my job as a book cover designer. And that’s actually getting your book noticed. 

As a book cover designer, that’s what my job is! It’s my primary directive.

Think about it, if everyone uses exactly the same aesthetic tropes how are you going to get noticed? It’s just going to be a set of images that all look the same. As if everyone’s turning up to the party in the same dress.

And that’s where playing with these aesthetics comes in. 

Creating something that’s ever so slightly different. Something that gives potential readers that little brain fizz of The New!

If a book cover looks a little different, then potential readers are going to think, this book is a little bit different from the standard fare. Do you want that subtle psychological advantage or not? I thought so.

Everyone likes to think that their book is a little different from everyone else’s, right? 

That is the message we’re trying to get across, and if your book is about a forest knight and you want a forest knight on your cover, we can only get it across with the aesthetics.

So that being said, this little blog post is about the new tool I have at my disposal to achieve that. An aesthetic toy.

But before that let’s talk about how MidJourney actually works. I might be ‘teaching your grandma to suck eggs’ here but it’s worth covering.

What’s Prompting

We will get onto Sref, but let’s first talk about how prompting works.

If you’ve not been hiding under a rock for the last year or so, you’ll probably know about AI image generation and that it works with something we call ‘prompting’.

Prompting works with natural language, or sort of, because it’s an art unto itself. You basically just tell MidJourney what you want and it creates an image for you.

So to take it back to the original thing I mentioned, what you generally do is ask it for those two things: the subject, what you want it to draw, and the aesthetic, how you want it to draw the thing.

So let’s take ‘the knight in a forest’ and do a few very simple prompts just so I can show you how the prompting works.  
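The prompts were nothing fancier than variations along these lines (illustrative wording rather than my verbatim prompts):

    /imagine prompt: a knight standing in a forest, minimal flat vector illustration
    /imagine prompt: a knight standing in a forest, woodcut engraving
    /imagine prompt: a knight standing in a forest, loose watercolour and ink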

Quality prompting, to make something interesting, is a bit more complex than just those simple terms. But you can see from these images that we can get very different images from the same subject.

These different aesthetics say very different things to a potential reader about the sort of story they are in store for. The aesthetic informs the reader.

So, for example, the minimal vector image might say: this book is more for children. The woodcut engraving version might say to potential readers, this book is more historically accurate because you’ve chosen a more historically accurate way of presenting the image. 

All aesthetics have a message they convey. They set a tone and expectation. 

In fact, in terms of design, aesthetics are the best tool at our disposal to stand out. And we can play around with the prompting to our heart’s content to get the messaging right and keep the book cover eye-catching.

In fact, half of my job these days is simply learning how to mix prompts together and come up with the right amazing tone for you authors.

The other half of my job is learning new prompting tricks from others, playing around myself, and then remembering to note down what those prompt tricks were. I’m 50 and my memory isn’t what it used to be.

But my job has just got a lot more interesting with ‘Sref’. 

So let’s have a chat about it.

Introducing Sref

Sref, or Style Referencing, is something that came out for MidJourney about a week ago, at time of writing (start of Feb 2024). And it’s been so fun, I’ve just had to sit down and write a blog post about it. Simply because it is the perfect tool for achieving that ‘ever so slightly different’ aesthetic we need to catch potential readers’ eyes. Our sweet spot.

What that is, is a way of taking a style reference from a picture and using that to make a new picture. Almost like the AI is learning styles on the fly and applying them to your subject.

So where I’d normally put the aesthetic part of the prompt, such as watercolour, I can simply use an image as a reference instead. Or even both! Some style prompting and a style image reference. 
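In practice it’s just another parameter on the end of the prompt. A minimal sketch, with a placeholder URL standing in for wherever your reference image lives:

    /imagine prompt: a knight standing in a forest --sref https://example.com/reference.jpg
    /imagine prompt: a knight standing in a forest, watercolour --sref https://example.com/reference.jpg

The first uses the image alone for the aesthetic; the second mixes style prompting with a style image reference.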

But let me show you what I mean, because it’s easier that way.

So let’s take our knight in a forest and do a few generations using different images as reference and see what we get.

Admittedly some of these would work better than others for a book cover, but you start to see the power of what MidJourney can now do with Sref.

It seems to be able to take the colour palette, the structure of the way something is presented, the style it’s drawn in, the angles of composition, and make brand new images using that complete aesthetic. Which is a game changer (more on that soon).

But not only can we take a reference image, we can also take more than one image and use both aesthetics to make a third.

So let’s take two of the images we’ve already made as style references, and make a third new image. See what happens.

So let’s try that, shall we. 
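Mechanically, that’s just a matter of listing both URLs after --sref (placeholder URLs again; there’s also an --sw ‘style weight’ parameter, running 0-1000 with a default of 100 as I understand it, for dialling the references’ influence up or down):

    /imagine prompt: a knight standing in a forest --sref https://example.com/styleA.png https://example.com/styleB.png --sw 200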

Pretty stunning results, with aesthetics I’ve never seen before! Brand spanking new aesthetics pulled from the ether.

And I could keep any of these aesthetics and use them later for something completely unrelated, or even remix them yet again with some other aesthetics. On and on.

Let’s take our remix and make something new with it. 

Pretty interesting. It’s a mix of a mix with a new subject.

But why is this so important, why am I telling you all this? Why is it so seismic? Well, there are a good few reasons, so let’s go through them.

Important Reason #1: I Like The New

I have some friends that stick to the same sort of music. Which is fine. But that’s not my path. It’s sort of the reason why I hate trap music, it’s sonically too samey. ‘Sonically’ being the audio equivalent of visual’s ‘aesthetic’. Trap is always that same heavy 808 kit and the same crispy 32-bar snares. Always. Zzzzz.

So I always go hunting for new tunes, new sonics, this is my jam at the moment, just for example. Nutty stuff. Great video. Maybe because it appeals to the sonic mixing of an African sound and a Berlin sound.

It’s also why I like going to fancy restaurants with odd flavour pairings.

What makes me happy is The New.

And this doesn’t just extend to my down-time. In my job, what keeps me interested in doing book cover design work is The New.

So at the moment I’m as giddy as a kid at Christmas with a new Lego set. Yep, as a kid I was given Lego every single Christmas! Loved that stuff. Because I could make stuff from it. I never followed the instructions.

Important Reason #2: The Creative Journey

I think anyone who is honest with themselves creatively has learnt to let go of their ego and just let the process take over at certain points in the journey of making anything. Sometimes the art just takes over, and the destination you had in mind actually changes.

Something new is discovered along the way. Things change and you accept the change. And then you end up with your glorious result. 

This letting go makes good art. 

Imagination is always just a guide and not the be all and end all.

Trusting yourself to go with the flow and be in the moment.

And this Sref stuff is the perfect example of this: 50% my idea and 50% letting the process take over.

Previously, when you needed to use words, it was a way more rigid way of working. Like: this prompt didn’t work, I have a vision, how do I make it work? Right, change some words. That’s closer. Frustrating.

With this new way of working, I get to let go a lot more and get my hands dirty and make a mess and just see what comes out of the other end. And everyone loves the mess of finger painting. It’s pure joy!

Important Reason #3: Peak Sweet Spot

You’ll see that some of our previous examples of aesthetics applied to our ‘knight in a forest’ using Sref probably wouldn’t cut the mustard for a book cover. Maybe the green background one for a female knight with more romantic overtones, or the bloodier one for a more gruesome fantasy novel. But that’s not to say we couldn’t take style references to make something just a bit different.

And that’s what I always try to do. Make something different. 

At the time of writing I’ve had only two or three working days of playing with Sref, but what’s become abundantly clear is its power, and that I need to start building and collecting my own images together as aesthetic references to remix.

So don’t worry. I’ll be making work that hits that perfect sweet spot between trope and far too zany, collecting the perfect new aesthetics for each genre and remixing them to my heart’s content.

Important Reason #4: Peak Correctness

And here is where it gets even more interesting. On Friday night me and my good friend Mike were watching Leeds United play — something we do on the regular over FaceTime. We won, by the way.

But afterwards I wanted to show him the power of Sref; I love the fact that he takes interest in my work. 

So we started with this picture here. Yep, vintage Leeds United team photo. 

It’s an awful picture in terms of quality, but it completely says 1970s, it has that perfect blur and colouration. It’s so time-specific.

So, I asked him what he wanted to make from that photo. Suffice it to say we’d had a few beers watching the match, it was Friday night after all. These were our examples. Quite daft, I know.

But amazingly aesthetically-correct results based on the original reference. It sort of blew our tiny beer-addled minds. 

So I now hope that example has got your brain cogs whirring about how I can make anything aesthetically-correct! All by just using the right reference, some prompt-craft and a little bit of imagination.

So just off the top of my head, here are some random examples.

Imagine we have a book about a 50s newspaper journalist that’s investigating a case involving two gangster brothers. We could find an old photo from a newspaper and try to go with that aesthetic.

Or maybe we want to get a vintage look from a really old romance novel with that lovely muted palette. I didn’t really ask for David Hasselhoff on the cover, but fair play. I was a big fan of Knight Rider as a kid.

Or maybe we just want a really strong fun colour palette and even an angle from something we’ve found but with a completely different subject. 

So it’s a wonderful tool for thinking about how we can get source ideas to inform the aesthetics of an image for a book cover.

Important Reason #5: Aesthetic Uniqueness

A few years ago, when I was limited to stock images, there was a rash of book covers that used exactly the same image in different ways. It was unavoidable. Completely. Simply because there were only so many really good images to go around. And the best images got used all the time.

It made me super duper happy when AI came along, simply because I started making images that were completely unique that no one else was seeing. Because all the images were made from scratch, by my own hand. 

Authors were no longer going to get the same image that someone else could come along and use for their book cover. It was great.

Side note: In fact, I seem to be getting more and more authors coming to me who got covers three or four years ago, when we used stock images, saying that another author (or readers) has spotted the same image on another book. Well, yep, it’s a stock image. And I think authors and readers notice it more now because there is a lot more utterly unique AI stuff out there.

So AI has made my work unique and specific to that very book, and that book alone. I love this fact because it’s an advantage to the author. It matches the uniqueness of their book! 

But what Sref means is that not only can the image be unique, all my aesthetics can also be unique now. I’m no longer reliant on some sort of style prompt such as ‘Illustration in Watercolour & Ink Lines’, ‘Highly-detailed cartoon illustration’ or ‘Simple vector in the style of a corporate logo’ or what-have-you. Those made unique images, but with somewhat similar aesthetics; now I can change up the aesthetics as well.

Unique aesthetics for the win!

Important Reason #6: An Aesthetic Singularity

Something that gets talked about a lot when it comes to AI, especially AGI (Artificial General Intelligence), is something called The Singularity. That’s basically the wonderful or the super-scary point, depending on your viewpoint, when AI surpasses human intelligence. In terms of AGI, it’s the point when the AI is smarter than any human, so it starts creating, coding and training itself to be even smarter, and smarter, and smarter, and so on.

Sref is a game changer, completely, but is it the graphical equivalent of the Singularity? Nope. Definitely not. 

But the feeling I get from it is that, from an aesthetic point of view, yes, it is. We’re getting styles from MidJourney that no human could think of, and definitely couldn’t execute.

It has this wonderful feeling of treading into the aesthetical unknown. Like we’ve passed a point where things are going to get rather wonky and wonderful. I’m already seeing other AI artists’ work that I couldn’t have imagined and have never seen before.

Another thing happened on Friday night when I was showing my friend Sref: I showed him what other people in the MidJourney community were doing with it, to get this same point across.

How we’re now in that wonky and wonderful post-human aesthetic world.

The original isn’t one of my pictures. It’s someone else’s. But we then went ahead and used that image to make — you guessed it — a Sad Elvis.

To me this is next level stuff. This is the point where things really start to get interesting when it comes to aesthetic exploration. It’s beyond what we as humans could think to execute.

Here’s another thing I made from a remix of a remix of a style from two images for a commission I did last Friday, using Sref. 

Again beyond what I could even imagine in my head. And that’s the point, if you can’t imagine the aesthetic in your head but the AI is creating it then we’re in new creative territory. We’ve crossed some Rubicon, for sure. 

Something that gets levelled against AI image generation by those that don’t understand it is that everything it makes is soulless. It’s a phrase that gets regurgitated all the time. It hasn’t been made by humans so it’s therefore soulless. 

There is an obvious flaw in that argument: it’s me playing with the AI, so it is a human making it. It’s all my artistic journey. But what those people really fail to understand is that humans connect with the tone of everything. Those feelings come for the most part from the aesthetic presentation, as well as the subject.

Yes, a lot of the earlier AI stuff was very samey. All had that same aesthetic. Some of the non-MidJourney generators (Dall-e I’m looking at you) still suffer from this problem.

What these people were saying was: this image has an AI aesthetic so I don’t like it.

The line is still being trotted out without much meaning behind it any more.

But things have passed that point now. Sref is an even further journey into that unknown, using human play and discovery. Little artistic journeys. If I wasn’t busy as a book cover designer, it’s actually what I’d be doing. Playing all day with Sref.

And there is an irony at work with that future Elvis image, which is a perfect one for me.

It’s cold, sad and sterile all at the same time. It’s almost the perfect representation of lonely and cool in equal measure. As if the image is the soulless future the anti-AI brigade are scared of. But at the same time, it’s emotive. The tone is great! I love the dichotomy of the image, it stirs something in me intellectually and creatively. Which is what good art is about. Right? But it was made with AI.

But it did take me and my friend about 50 or 60 generations to settle on this final picture. I didn’t magically get that Elvis on the first try. It was our journey together.

So that’s what my next blog post is about, how the process actually works with my authors if they commission me. Because it’s worth chatting about.

So I better finish this post here and get on with finishing that one. Which I actually started before this one but then started writing this one instead, so I could play with and talk about Sref. I was that excited.

So, catch you next time when I talk about my process using MidJourney.

Kindness and smiles,

James

PS If you feel I’ve used the word ‘aesthetics’ a bit too much in this post, then spare a thought for me. I’ve had to type it all these times. It’s a horrid word to type. 

PPS The next thing that’s coming down the line with MidJourney is called Cref. It’s at the top of their development list. If ‘S’ stands for ‘Style’ what do you think ‘C’ stands for? Yep that’s right: Character. Could be interesting for people wanting the same character on multiple book covers. It’s definitely interesting. No doubt I’ll talk about that when we get to that point.
