Within the photography industry, generative AI technologies seem to generate enthusiasm and trepidation in equal measure. As Wilco Versteeg rightly points out, such ambivalence is not new: early camera technologies were met with a similar apprehension that photography would bring an end to the art world. While we worry that AI may threaten the future of photography, we’re also aware that AI could open doors for creative possibility, and many of us struggle to situate ourselves within this evolving landscape.
In an attempt to keep up with the rapidly changing technologies, I’ve bookmarked dozens of articles about the potential, pitfalls, and power of AI. I’ve read about the use of AI image generation technologies in fundraising campaigns, documentary storytelling, and propaganda; about compensation for visual artists whose work is used to train AI; and about bias built into AI’s ‘neural networks’. I attended the PhotoVogue Festival, where I heard lectures on governmental regulations for the use of AI, corporate-led initiatives to protect trust in photography, the copyright of AI-generated imagery, racism in AI technologies, and the societal implications of AI image generation technologies.
Given the massive growth of generative AI technologies throughout 2023, it’s important to take stock of the challenges and opportunities for ethical photographic practice that they present. However, one article cannot possibly address the multitude of ethical considerations that come with these technologies, which are growing in number day by day. Instead, I’ll narrow the focus of this article to three objectives: to articulate the ethical imperative for photographers to broaden the dataset; to consider how to ethically make work about others using generative AI; and to problematise the speed of AI in a culture of productivity in which visual artists necessarily operate. While these themes may seem disparate, I will demonstrate how they’re interrelated, and how understanding this interrelation can help photographers situate their practice in light of these emergent technologies.
Terminology
It’s important to begin by clarifying some terminology. First, I will primarily be discussing generative AI, as opposed to curation AI. Mel McVeigh helpfully outlines the difference between these two strands of development in AI technology: curation AI has to do with making recommendations for a user based on algorithms that learn from that person’s behaviours, whereas generative AI has to do with creating something based on a user-generated prompt.
Second, AI-generated images are not photographs. Writing with Light offers a useful glossary of terms to navigate these new technologies, suggesting that computer-generated images could be described as ‘synthetic images’ or ‘computational photography’. Others, including Elke Reinhuber, have put forward the term ‘synthography’. For clarity, in this article I will use the long form, referring to ‘AI-generated imagery’.
Although AI-generated images are not photographs, they can be photorealistic. While there are important ethical questions that warrant discussion across all uses of AI generation in the arts, it is photorealistic AI-generated imagery that often provokes the most complicated discussions. This is in part due to the way that we read photographic images as representative of reality.
I will not recap what has previously been written about photography’s claim to indexicality, its relationship to positivism, or its purported mechanical objectivity, but I do wish to reference Versteeg, who writes that ‘the photographic still serves as a hallmark of visual reliability’. This statement seems to echo the work of Friedrich Tietjen, who argues that, despite the rise of digital photography, photographs have maintained their special status as uniquely tied to reality, because the iconography of photorealism has come to connote an indexical relationship with the objects it depicts, rather than the other way around. Such a discussion of the iconography of indexicality is particularly relevant when considering the ethical implications of photorealistic AI-generated images.
Photography as a dataset
As I explained in my recent TEDx Talk, visual media can be thought of as datapoints: each time we see an image of a specific person, place, community, topic, or thing, it adds another datapoint in our mind about what that thing looks like. This is a particularly apt comparison when we consider generative AI technologies.
AI image-generation tools learn from a dataset of images to understand what something looks like. For example, these tools learn what a chair should look like by processing a range of images of chairs, all identified by the keyword ‘chair’, so that they can produce an image of a chair when prompted. However, there is an inherent problem built into this model: AI image generators are guided by the dominant visual representations of people, places, and things. In other words, AI reproduces stereotypes. As Victoria Turk writes,
'bias in AI image generators is a tough problem to fix. After all, the uniformity in their output is largely down to the fundamental way in which these tools work. The AI systems look for patterns in the data on which they’re trained, often discarding outliers in favor of producing a result that stays closer to dominant trends. They’re designed to mimic what has come before, not create diversity.'
While AI image generators produce images that are based on existing datapoints, photographers can add new datapoints, broadening the visual representation of people, places, and communities. Whereas AI can only work from the images that have already been made, photographers have a 360-degree view on the ground, in real life. From this vantage point, photographers can look around them and find visual narratives that push beyond our expectations and assumptions. Photographers can expand the dataset; AI can only repeat it.
Collaboration as a salve
Part of the reason AI image generators reproduce such a narrow view of the world is that they are learning from the existing photographic dataset, which has been dominated by white, male, colonial perspectives. In an effort to expand the dataset and create a more balanced visual record that is representative of a multitude of perspectives, a number of important initiatives have emerged, including Women Photograph, Diversify Photo, and Black Women Photographers. This critical discussion taking place across the industry – questioning not only who is being represented and how, but also who is doing the representing – has given rise to a renewed interest in the insider/outsider debate.
There have been many discussions about the utility of the insider versus the outsider perspective in photography. I believe that both perspectives have their merits. For example, an insider may have a much better understanding of and access to the context that surrounds an image. On the other hand, an outsider may be able to bring a different perspective that, if the photographer is sufficiently well-informed about the context, could be useful for seeing things in a new light. While there is value in having both insider and outsider perspectives in photography, the merits of the outsider perspective do not hold up when it comes to AI-generated images.
Since AI-generated imagery can be made at great temporal and geographic distance, the creator need not have actually been to the place or spoken with members of the community that they are representing. In fact, one can create work about people on the other side of the world without leaving their living room. Therefore, when a person uses AI image-generation technologies to create images about a place or community that is not their own, they are reliant on what they already know about that place or community. If they have done background research, they may know a lot. However, as is often the case, their work may be guided by preconceptions that are rooted in stereotypes. The assumptions of the creator, compounded by the reliance of AI-generation technologies on existing datasets that replicate visual tropes, result in AI-generated images that perpetuate harmful and essentialising narratives.
Given that AI image-generation technologies are known to replicate stereotypes, and that AI-generated imagery can be made without any direct engagement with a place, it’s worth considering the ethics of using AI to represent a community that is not one’s own. To this end, collaboration may be an important tool. Collaboration would be particularly useful for projects like Michael Christopher Brown’s 90 Miles, which he describes as ‘AI reportage illustration’. Brown used Midjourney to create a visualisation of the ninety-mile journey many Cubans made across the Straits of Florida in the years following the Bay of Pigs Invasion. However, collaboration is not only useful for projects that seek to visualise an historic event; it is a useful tool for all photorealistic works generated by AI, even if the work is intended as a purely artistic endeavour. All photorealistic images shape our understanding of others, and so all such images have the potential to either replicate or challenge stereotypes.
Time as a resource
Collaboration became an important element of Exhibit A-i, an AI-generated set of images based on the testimony of refugees and asylum seekers who were held in Australia’s offshore detention centres. The production team from the law firm Maurice Blackburn asked the individuals who contributed testimonies whether they would like to be involved in the image-making process and, if so, to what extent. Out of eighty contributors, fifty wanted to be involved, and twenty-five to thirty wanted to be ‘intimately involved’. Of course, collaboration takes time. For example, lawyer Nicki Lees says that it took three weeks of back and forth between the image-making team and a contributor to make sure that their memory of a mouldy tent on Nauru was accurately represented.
While collaboration takes more time, AI technologies are often used to speed things up. As photographer Mayan Toledano explains, ‘the technology allows us to not spend too much time on the technicality of things and more on the connection, the storytelling, the humanity within the story’. Therefore, when making work about others, perhaps the time saved by using AI image-generation technologies could be put toward implementing more collaborative methods.
Moreover, Myesha Evon Gardner describes how AI helps to circumvent tedious and monotonous tasks: instead of manually removing an entire forest tree by tree in Photoshop, AI can remove all the trees in one click. Gardner suggests that the potential of AI to speed up the process can ‘expand your own imagination’. Similarly, Sebastian Rodriguez, a staff member at Google, explains that Google seeks to use AI to ‘remove the friction of creating’, so that artists can ‘spend as much time creating and discovering the technology’. This idea of friction stuck with me, and made me wonder: Do we need friction? Are frustration, tedium, and slack time essential components for creativity?
Researchers like John Eastwood suggest that boredom may be an important component in the creative process, for ‘boredom triggers mind-wandering, and then mind wandering leads to creativity’. In my own creative practice, I have found that the hours spent in trial and error led me to creative results that I would never have discovered otherwise. While making slow still life, I experimented with using sustainable and non-toxic alternatives to traditional darkroom chemistry. This was an often frustrating and unpredictable process that took time, patience, and an abdication of perfection. In the end, I embraced the idiosyncratic results as a constitutive element of the work, and I believe that it was the better for it. In this sense, I relate to Mel McVeigh, who states that ‘sometimes we want a bit of boredom. We also want a bit of serendipity. And technology finds it really hard to do serendipity’. According to Fred Ritchin, AI image-generation technologies may struggle with serendipity because they ultimately want to please us, the consumers.
I wonder if the surge of interest in AI image-generation technologies might be symptomatic of a culture of productivity in which artists need to be creative quickly. Indeed, there are numerous articles suggesting that AI can help us work better and faster, with some studies predicting that generative AI will precipitate the ‘next productivity frontier’. However, if some ‘friction’, like boredom or frustration, is actually generative of creativity, how might artists incorporate these technologies into their creative practice?
Possible paths forward
While some photographers have been spurred back to analogue methods in response to the rise of AI image generators, others have embraced AI as another tool in their proverbial toolbox. Whichever camp you sit in, the reality remains the same: AI image-generation technologies are readily available, and they’re not going away. Therefore, I would like to conclude by considering how some people have sought to develop thoughtful engagements with these emerging technologies.
Andrea Sommer, a lecturer in visual communication at the Basel Academy of Art and Design FHNW, explained to me that she often uses the AI image generator Disco Diffusion to challenge the immediacy of AI technologies. Sommer prefers Disco Diffusion to other generators because it almost simulates the experience of darkroom printing as the image slowly emerges on the screen. She also posits that AI may have other similarities to analogue photography, because the creator feels uncertainty about what the final image will look like. Although the creator writes the prompt, the resulting image is ultimately shaped by the dataset of images from which the generator has learned. Therefore, the resulting image may be far from what the creator anticipated.
Similarly, Ritchin has found that the AI generator sometimes creates images from his prompts that he didn’t expect. He explains, ‘AI is not a tool. It’s not like a hammer and a nail. You hit the nail, it does what you tell it to do. [AI] has its own mind or perspective. It does things you’re not expecting’. In other words, AI doesn’t simply produce what we’re asking for but rather works to produce what it thinks we want – and maybe it knows what we want better than we do. This understanding of how AI operates has shaped how Ritchin works with image generators: ‘What I do is often give it space to reflect as opposed to trying to get it to give me what I want. I don’t want what I want; I want what I don’t know I want’. This kind of critical engagement with AI image-generation technologies is self-reflexive, acknowledging the technology’s limitations – and its possibilities. For while Ritchin warns of the many risks associated with AI-generated imagery, he also leaves room for possibility, asking, ‘how can AI be helpful?’
Speaking from a product design perspective, Florian Koenigsberger suggests that there may be ways of building reflexivity into the user interface:
'You would never want to interrupt somebody’s intent in a product to do the thing that they came to do. But in the spirit of challenging the norm, if we know that the future is that people are going to flock to and live in these tools, . . . where is the obligation to create some part of that education in the experience of using the tool? . . . I think it is also the responsibility of the builders to think critically about how you prompt people to think about the act it is that they’re performing. How do I expand the surface area of learning that can happen at the moment of intent?'
This idea of building teachable moments into the interface, so that the user is prompted to think critically about their use of the technology as part of the process of AI image generation, is very interesting, and I would like to see it put into action.
But, as Zahra Rasool warns, we cannot wait for the corporations to act. She explains, ‘I don’t think any tech company is going to act responsibly if it doesn’t benefit their profit margins’. For now, it’s up to us, the consumers, users, creators, and image-makers, to think critically about the images we produce – whether we’re working through a lens or an algorithm.