Representative take:
If you ask Stable Diffusion for a picture of a cat it always seems to produce images of healthy looking domestic cats. For the prompt “cat” to be unbiased Stable Diffusion would need to occasionally generate images of dead white tigers since this would also fit under the label of “cat”.
These guys.
Exactly this. Generative AI shows that most people doing technical work are men? It probably also shows that most construction workers are men, most social workers are women, etc… Guess what, that reflects reality. If you want something else? You can ask for it. “Picture of a woman welding.” “Picture of a black, male social worker.” You’ll get it, no problem.
Over on Mastodon I linked to an NPR article where they kept asking Midjourney to generate images of BLACK doctors treating WHITE children in Africa and it was largely unable to do so, even with the prompts. Not “no problem,” but Midjourney sometimes literally interspersed giraffes and elephants into images with Black doctors.
I had severe decision paralysis trying to pick out quotes cause every post in that thread is somehow the worst post in that thread (and it’s only an hour old so it’s gonna get worse) but here:
Just inject random ‘diverse’ keywords in the prompts with some probabilities to make journalists happy. For an online generator you could probably take some data from the user’s profile to ‘align’ the outputs to their preferences.
solving the severe self-amplifying racial bias problems in your data collection and processing methodologies is easy, just order the AI to not be racist
…god damn that’s an actual argument the orange site put forward with a straight face
So this is how the tokenism sausage is made!
It works with other obvious stuff. Putting the words “best, good, high quality” in your prompt actually makes the generated images better.
brb throwing away your account
I did not expect to get back to my laptop late on a Friday and see someone “Well Akshoewally, If You Just Sing Gentle Sweet Songs To The Prompt then you get the socks you wanted”
but I guess the orange site had a spillover and has me covered today!
leading to the obvious question: if putting the words “best, good, high quality” in your generative AI prompt isn’t a placebo, then why is all the AI art I’ve seen absolute garbage
Ah, but have you considered how much worse they could be if they weren’t prompted with “high quality, masterpiece, best”?
all of my generative AI results have been disappointing because I didn’t give it the confidence it needed to succeed
One of the reasons I dislike this technology so much is that some of the ridiculous tricks actually (sometimes, sort of) work. But they don’t work for the reasons the interface invites the user to think that they do, they don’t work reproducibly or consistently, so the line between “getting large neural networks to behave requires strange tricks” and pure cargo-cult thinking is blurred.
I have no idea what exactly went into the training sets of Midjourney (or DALL-E), except that it’s probably safe to assume it’s a set of (image, text) pairs like the open source image generators. The easy thing to put in the text component is the caption, any accessibility alt-text the image might have, and whatever a computer vision system decides to classify the image as. When the scrapers appropriate images from artists’ forums, personal webpages and social media accounts, they could then also scrape any comments present, process them and put some of them into the text component as well.
So, it’s entirely possible that 1. some of the images the generator saw during training had “masterpiece”, “great work” etc. in the text component, and 2. there is a statistically significant correlation between those words being present in the text and the image being something people like looking at. So, when the generator is trying to pull images out of Gaussian noise, it’ll be trying to spot patterns that match “masterpiece-ness” if prompted with “masterpiece”.
Clearly this doesn’t work consistently - e.g. if the generator has never seen a masterpiece-tagged painting of a snake, it’s not at all obvious that its model of “masterpiece-ness” can be applied to snakes at all. Neural networks infamously tend to learn shortcuts rather than what their builders want them to learn.
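To make the guess above a bit more concrete, here’s a toy sketch of the kind of correlation I mean. Everything in it is made up for illustration: the captions, the “liked” flags and the helper names (`like_rate`, `cooccurrence`) are hypothetical, and it says nothing about what is actually in any real training set.

```python
# Toy illustration only: hypothetical (text, liked) pairs, NOT real training data.

# Each pair: (text component scraped alongside an image, whether viewers liked the image).
PAIRS = [
    ("oil painting of a harbour, masterpiece, great work", True),
    ("masterpiece landscape at dusk", True),
    ("portrait study, masterpiece", True),
    ("masterpiece? more like a mess", False),
    ("photo of my lunch", False),
    ("blurry cat on a sofa", False),
    ("sketch of a snake", True),
    ("snake in the grass, phone photo", False),
]


def like_rate(pairs, word, present=True):
    """Fraction of liked images among texts that do (or do not) contain `word`."""
    subset = [liked for text, liked in pairs if (word in text.lower()) == present]
    return sum(subset) / len(subset) if subset else None


def cooccurrence(pairs, word_a, word_b):
    """Number of texts that contain both words."""
    return sum(
        1 for text, _ in pairs
        if word_a in text.lower() and word_b in text.lower()
    )


# The tag correlates with "liked" in this toy data, but imperfectly.
with_tag = like_rate(PAIRS, "masterpiece", present=True)
without_tag = like_rate(PAIRS, "masterpiece", present=False)

# No text pairs "masterpiece" with "snake", so there is nothing to learn from directly.
snake_masterpieces = cooccurrence(PAIRS, "masterpiece", "snake")

print(with_tag, without_tag, snake_masterpieces)
```

In this made-up data the tag goes with liked images more often than not, and there isn’t a single masterpiece-tagged snake, which is the gap described above: whatever gets learned for “masterpiece” comes from harbours and portraits, and nothing guarantees it carries over.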
Even then, most of it still looks like the result of a mugging in the Uncanny Alley. There’s almost always something “off” about it, even when it is technically impressive. Details that make no sense, weird lighting, shadows and textures, and a feeling of “eeriness” that I’d probably have the vocabulary to describe if I were a visual artist.
(PS: Does the idea of using well-intentioned accessibility features and kind words to artists to create a machine intended to destroy their livelihood make you feel a bit iffy? Congratulations, you are probably not a sociopath.)
I forget where I saw it, but the phrase/comparison stuck with me and I think of it often: all of this shit is a boring person’s idea of interesting
but the “just slap some prompt qualifiers on it (to deal with the journalists)” …god. it is of course entirely unsurprising to have an orange poster be so completely assured of their self-correctness to not even question anything, but the outright direct “just dress it up in vibes until they shut up”
you just have to wonder what (and who?) else in their life they treat the same way
a boring person’s idea of interesting
Agh this is such a good way of putting it. It has all the signifiers of a thing that has a lot of detail and care and effort put into it but it has none of the actual parts that make those things interesting or worth caring about. But of course it’s going to appeal to people who don’t understand the difference between those two things and only see the surface signifiers (marketers, executives, and tech bros being prime examples of this type of person)
ETA: and also of course this explains why their solution to bias is “just fake it to make the journalists happy.” Why would you ever care about the actual substance when you can just make it look ok from a distance
This is gonna be a little off the rails but bear with me:
I recently watched a YouTube video that talked about how the contemporary jazz musician Laufey* and her audience run the risk of erasing the history and culture of jazz because they don’t take the time to engage with it. Instead, they are content with replacing it with an idealised parody/pastiche of that culture. Like how people wear Mexican costumes and drink on Cinco de Mayo, or Irish costumes on St Pat’s Day, or German costumes for Oktoberfest, or 1920s rich white people costumes for a Gatsby party etc.
I’ve also been thinking about how the immortality fetish faction of treacles want to do brain uploading so they can live in a simulation forever. I think anyone would agree that such an existence is essentially the same as plopping on a VR headset and watching AI-generated content.
Putting these two ideas together, I’ve essentially reformulated what we already know about treacles et al, which is that they don’t want to acknowledge actual reality. Their model of the world is a pastiche of lazy stereotypes, reinforced by cherry-picked statistics. They want to live in a space that confirms all their biases, basically an echo chamber in the cloud.
So yeah, when confronted with an observation about how generative AI produces biased results, we see an expression of the above. The AI-produced parody is the reality they want to live in, so there’s no issue.
*I love Laufey. She’s great. You should give her a listen.
I commented about this when it was first posted but I’m still angry. These motherfuckers never consider that “reflecting reality” perpetuates that reality. And if AI art never surprises you, it isn’t art. But they don’t care.
Because reflecting “reality” never affects reality, right? …Right?
sounds like something linus tech tips would say
The amount of lazy “it is what it is” takes makes me want to vomit.
Every system has some form of bias, more or less, and a system that has less of a functional bias than another system isn’t necessarily a better one
I can’t even begin to comprehend how asinine this take is.