Last week, I helped create visuals for a fictitious athleisure brand called Limitless, using Midjourney. While a major piece of the branding was developed using ChatGPT, we used Midjourney to create a visual for the ad campaign featuring their latest product: a sports bra, which, interestingly, was the product recommended by ChatGPT.
In the beginning…
My prompts for creating this campaign visual started with “South Asian girl wearing a sports bra and leggings, on a billboard” and ended in “A youthful photoshoot of happy South Asian girls wearing a sports bra and sports leggings, on a wide billboard”, and many more in between.
Here is a quick summary of what I learnt in the process.
Disclaimer: This was my first time experimenting with Midjourney. So if there was something I could have done better, it’s probably because I didn’t know better.
Anyway, the learnings.
1. Your first prompt teaches you a lot!
As you can see, there are issues here with posture, proportion, muscle density, body shape, etc. The resulting image comes with a lot of distortion. You also can’t unsee an element of objectification. There were multiple problems to tackle here: the overall form of the body, the design of the sports bra, the layout of the campaign ad, the photograph of the billboard, etc.
I gave up on two things very early on: the design of the sports bra and the layout of the ad campaign image. I decided to let Midjourney dictate both of these.
2. Gaming the system
When it rendered grimacing and incorrectly constructed female body forms, I had to change my prompt. I put myself in the shoes of an art director at an advertising agency and wondered what sort of brief they would give their junior visualisers 🤔
My imaginary AD: “This product is supposed to be aspirational! The girls need to look happy, cheerful. The overall imagery should be pleasing to look at.”
My imaginary Visualiser: “Yes, sir! Right away, sir!”
I began using happy, smiling, pretty, etc. This helped me arrive at visuals that were different from the original set. I’m not proud of using ‘pretty’, but words that relate to conventional standards of visual appeal do help in ‘gaming the system’ to get to a result that is closer to what I imagined my Visualiser would want. However, I noticed that after I moved to Midjourney v5 at a later stage, I didn’t need ‘pretty’ as a keyword to avoid the unnaturally muscular body proportions. Interestingly, the move also resulted in less-sexualised visuals with the product in focus.
Having said that, it is good to be mindful of the fact that Midjourney and other AI programs are largely trained on conventional standards and a whole lot of biases.
3. The AI cannot always understand your request
I wanted a billboard showing the photo described in the prompt. Several times, however, it gave me results where the girls, or parts of their bodies, were outside the image on the billboard. Midjourney was unable to deconstruct and distinguish certain parts of the command in the prompt. Perhaps this is a syntax issue that I haven’t yet figured out. Therefore, I decided to keep the prompts relatively short to make sure the requirement was easy enough to interpret.
Later, after last week’s post, I learnt that you can create longer prompts with significant detail for accuracy. One of the examples that I found had this prompt: “Editorial style photo, medium closeup shot, off-center, a young, brunette, french woman, sitting at a Marble Table, wearing a black gucci dress and diamond necklace, in an Art Deco Dining Room with Velvet, Brass, and Mirror accents, Jewel Toned color palette, West Elm, Chandelier, Restaurant, Evening, natural lighting, Fujifilm, Luxurious, Historical, 4k --ar 16:9 --stylize 1000”
When I tried this, the image below was the first set that I got, and the method appeared to be effective.
This led me to try the following prompt: “Editorial style photo, long shot, off-center, on a billboard, two South Asian girls smiling, wearing a black sports bra and black leggings, in an empty gym with natural light, Fujifilm, 4k --ar 16:9 --stylize 1000”
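To make the structure of these longer prompts easier to see, here is a minimal Python sketch that assembles one from its parts. It is only an illustration: the `build_prompt` helper and its argument names are my own invention, not anything Midjourney provides. The one thing it relies on is that Midjourney parameters are typed with two hyphens (`--ar` for aspect ratio, `--stylize` for stylization strength) and go at the end of the prompt.

```python
def build_prompt(descriptors, aspect_ratio=None, stylize=None):
    """Join descriptive phrases with commas, then append Midjourney parameters.

    `build_prompt` is a hypothetical helper for illustration only.
    """
    # Drop empty phrases so stray blanks do not leave double commas.
    text = ", ".join(d.strip() for d in descriptors if d.strip())
    params = []
    if aspect_ratio:
        params.append(f"--ar {aspect_ratio}")
    if stylize is not None:
        params.append(f"--stylize {stylize}")
    # Parameters always come after the descriptive text.
    return " ".join([text] + params)


billboard_prompt = build_prompt(
    [
        "Editorial style photo",   # medium / style
        "long shot",               # framing
        "off-center",              # composition
        "on a billboard",          # placement
        "two South Asian girls smiling",
        "wearing a black sports bra and black leggings",
        "in an empty gym with natural light",
        "Fujifilm",
        "4k",
    ],
    aspect_ratio="16:9",
    stylize=1000,
)
print(billboard_prompt)
```

Keeping each phrase on its own line makes it easy to swap one element at a time (say, the framing or the setting) and see which change actually affects the result.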
The hardest bit was trying to get two images in one: an image of a campaign and a long shot of the billboard. In my attempt with the detailed prompt, I think I managed to get one part right, the image for the campaign. Getting that image on a billboard was proving to be tricky. One solution is to create two images and then merge them in a photo-editing application. I still couldn’t figure out why it worked with some prompts but not with others. Perhaps I should use ChatGPT to fine-tune my prompts.
In 2022, I had come across a piece on how AI programs might effectively create visuals but lacked an understanding of them. These AI programs are trained to predict, not to infer or understand a request. This is probably why Midjourney was unable to distinguish the request for an image within an image.
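If you go the two-image route, the merge itself is mostly arithmetic: scale the campaign image so it fits inside the billboard panel without distortion, then centre it. The sketch below works that out in plain Python. All the numbers and the `fit_and_center` name are made up for illustration; the actual pasting would still happen in a photo editor or an imaging library.

```python
def fit_and_center(image_size, panel_box):
    """Return the scaled size and top-left position for an image placed in a panel.

    image_size: (width, height) of the campaign image, in pixels.
    panel_box:  (left, top, right, bottom) of the billboard panel
                inside the billboard photo, in pixels.
    """
    image_w, image_h = image_size
    left, top, right, bottom = panel_box
    panel_w, panel_h = right - left, bottom - top

    # Use the smaller ratio so the whole image fits and keeps its proportions.
    scale = min(panel_w / image_w, panel_h / image_h)
    new_w, new_h = int(image_w * scale), int(image_h * scale)

    # Centre the scaled image inside the panel.
    offset_x = left + (panel_w - new_w) // 2
    offset_y = top + (panel_h - new_h) // 2
    return (new_w, new_h), (offset_x, offset_y)


# Hypothetical sizes: a 16:9 campaign image and a wider billboard panel.
campaign_size = (1600, 900)
billboard_panel = (200, 150, 1200, 600)

new_size, position = fit_and_center(campaign_size, billboard_panel)
print(new_size, position)
```

Because the panel here is wider than 16:9, the image fills the panel's height and leaves equal margins on the left and right, which is where a slogan or logo could go.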
4. Midjourney v5 is superior to v4
Halfway through the prompts, I realized that the default setting was v4. I immediately changed it to v5. The results were noticeably better: the quality of the render improved, the faces were less disproportionate, the poses were more natural, and the designs for the sports bra, t-shirt, leggings and joggers improved substantially. Midjourney v5 was able to render outcomes that were complex. It had clearer-looking logos and slogans (even though they were still gibberish). There were also a lot more external details around the key visual.
However, there were two problems that I could see. One, text was illegible or treated like an abstract form, so you will not be able to get a clear logo or slogan on these images. That said, in one of the renders, I did see a logo that resembled Adidas.
Two, there were still cases where the fingers, arms, and waist were not fully or accurately constructed.
5. Generate repeatedly through the same prompt text
For most of the process, I thought I needed to tweak the prompt to get better results, and tweaking often did help. For example, from the moment I started adding “A photoshoot for a youthful brand…” I did get better results.
The other thing that worked was to not give up on your prompt. Use the same prompt again and again until you get an image that’s closest to what you want. After that, you can create multiple variations of the selected image till you arrive at a workable result. This is how the final image was created.
For those curious, here are all the prompts that I used:
– A South Asian girl wearing sports bra and legging, photograph, on a billboard
– Happy South Asian girl wearing sports bra with coverage and a pair of leggings, photograph, on a billboard with trees and sky behind it
– A group of happy South Asia girls wearing sport t-shirt, sports bra and joggers, photograph, on a billboard with trees and sky behind it
– Girl with no muscles wearing sports bra with coverage and a pair of leggings, photograph, on a billboard with trees and sky behind it
– A group of happy, pretty South Asian girls wearing sports t-shirt, sports bra and joggers, photograph, on a billboard with sky behind
– Three happy, active, pretty Indian girls wearing sports t-shirt, sports bra and joggers, on a wide billboard with space for content, sky behind
– Two happy, pretty South Asian girls wearing sports t-shirt, sports bra and sports joggers, indoor fashion shoot photograph, W:H, 16:9
– A cool sport brand photoshoot with a happy South Asia girl in sports bra and sports joggers, on a billboard
– A cool youthful brand photoshoot with a smiling South Asian girl in sports t-shirt and sports joggers, on a billboard
– A cool youthful brand photoshoot with a softly smiling South Asian girl in a white sports t-shirt and blue sports leggings, on a wide billboard, frontal shot
– A cool youthful brand photoshoot with two smiling South Asian girls in sports bra and sports joggers, 8k photograph on a wide billboard, frontal-angle
– A cool youthful brand photoshoot with two smiling South Asian girls in sports vest and leggings, 8k photograph on a wide billboard
As you may have noticed, one of the hardest bits was trying to get a convincing image with the billboard. There are two images here, the ad campaign and the shot of the billboard, which made the prompting exercise a bit tricky.
In the end, I did use Adobe Photoshop to clean up some parts of the image and then introduce the slogan and the logo for THC’s fictitious brand, Limitless.
I am not sure if this is the ideal solution for a brand, as we are not showcasing the product we have designed. Midjourney does let you include an image that you have uploaded into the prompt. Perhaps that would help with integrating the actual product design with the photograph.
While this was an engaging exercise, I would like to point out that all images that I was able to produce through prompts were based on visuals that the AI model is constantly being trained on. We have no way to determine the original owners of this work.
While there are issues of ethics and ownership that need to be tackled, in its current form, this tool can help in creating proof-of-concept visuals, which can then be used to explore newer ideas, brief creatives and design teams, run design workshops, and augment (not replace) our creative process.