08 June 2023

Publishing industry needs consensus over transparency and accuracy before throwing stones at AI, says chief executive of Impress UK

Publishers can help mediate the harms of generative AI but they must also get their own houses in order over transparency, said Lexie Kirkconnell-Kawana, chief executive of Impress UK, the independent press monitor and advocate for trusted news.

Speaking on day two of FIPP Congress 2023, where generative AI has been the major topic of discussion so far, Kirkconnell-Kawana – who is also a trained barrister in her native New Zealand – aimed to dispel some AI myths from a legal perspective by asking how a regulator might approach the issue.

“I’ve been working in this space for over a decade,” she said. “And I’ve seen how we tend to be reactionary, to have a bit of a hype lens, when it comes to new disruptive technologies. Our instinct is always to create a new framework in response to new tech, to create an entire new set of rules, sometimes throwing the baby out with the bathwater. But we need to hit pause.”

Sounding the harm alarm

For Kirkconnell-Kawana, harms are a good starting point for thinking about this technology.

“Generative AI is a tool. We’ve heard a lot here about how new technologies like this can enable business, enable innovation. But we have to remember that the tech itself doesn’t have intentions – we are only looking at a set of functions.”

How, then, do we understand the functions of this technology and how do we best address the harms brought about by it?

Viewed through a functional lens, generative AI is doing four things:

Scraping content
Running machine learning to create images, text, audio, video
Storing information for training
Recombining and conditioning data

The harms affecting publishers are obviously very different to the harms to the public.

“For publishers, the scraping and storage functions are where they are implicated, for example through copyright,” Kirkconnell-Kawana explained. “Generative AI may threaten business models, especially when it can do a job better than a human. There are also competitional harms.”

For consumers, the main issue is that the end product is indistinguishable from something human-made. “Obviously this erodes the user relationship with the outputs, and their sense of how accurate they are,” said Kirkconnell-Kawana.

The accuracy and reliability of those outputs are also at stake. “We have to remember that accuracy is not the object of this technology,” she added. “It is designed holistically, modelling plausibility – not accuracy, not veracity.” Nor does it use cited sources.

“The broader question we might want to be asking may therefore actually be one about advertising and advertising regulation, whether generative AI can be marketed as accurate. Also whether users understand that what they are looking at is not accurate.”

Those in glass houses …

Kirkconnell-Kawana referenced a study from April this year in which data was scraped from 49 websites advertised as news sites, all of them almost wholly generated by AI with little or no human oversight. All failed the test for transparency. “So there are AI sites that don’t cite, don’t source, don’t attribute, that don’t take steps to ensure accuracy and veracity,” she said.

But while generative AI’s total lack of sourcing might be unnerving, humans are pretty bad at this too, said Kirkconnell-Kawana.

“I can tell you this having been in the business of regulating news for a while now. [A lack of] transparency, not disclosing ownership, etc. is all business as usual. Problems around accuracy are not unique. They might be worsened or thrown into relief by AI, but they are not new.

“To solve this, what we need is consensus across the news publishing industry to ensure that our house is in order, before we start pointing the finger. And that requires cultural and structural change on how news is organised and regulated.”

Legal challenges already underway

In the US, civil litigation has already begun over issues including storage, copyright licensing and attribution.

However, warned Kirkconnell-Kawana, it is really difficult to predict how the courts are going to decide. It is a process that could take many months, or even years. Crucially, whatever precedent is set if these cases reach court judgement may not change how businesses operate or how the tech functions.

“Despite these claims being launched, they may not be successful. There may be credible defences by the AI companies,” she said. “For example, what is copyrightable? In the news business, if the information that is being scraped and sourced on news sites is found to be factual, it is not capable of being copyrighted.”

The courts might also assess fair use by looking at the proportion of each work used in the AI recombination process to create the end product.

“This is going to be extremely disruptive and transformative for copyright law,” said Kirkconnell-Kawana. “The outcome of these judgements will inform it. We may see a wellspring of copyright regulators emerge in response to this.”

Away from the US, the most imminent regulations that are likely to touch generative AI are the EU’s Digital Services Act (DSA), AI Directive and Cyber Resilience Act. The DSA in particular demands enhanced transparency, so service providers will have to supply risk assessments to regulatory authorities.

Publishers as trust mediators?

Evidently there is a role for regulators, but also for publishers.

“My provocation to you is: how are we going to come to a consensus about this technology? How is the publishing industry going to mediate harms and improve trust?” said Kirkconnell-Kawana.

“There are calls that validators should be required, demanding sources, validation, credentials etc. But these validators will not necessarily ensure accuracy, only listing of the sources used. We need to do more work here on how we tackle concerns around transparency and accuracy.

“The consensual use of content is obviously a big issue for the publishing industry. But on issues relating to accuracy and transparency, publishing really needs to get its own house in order first before it starts throwing stones at AI tech. Once we have that consensus building, we can help users understand trustworthiness of sources better.”

Impress recently published a Standards Code for publishers to use, which reflects the fact that AI is likely to be a tool in their arsenal. Kirkconnell-Kawana hopes that it will be widely adopted.

In this interim period between court cases and regulatory interventions, something publishers can do immediately is to help the public to understand why they should not blindly trust what generative AI produces, and instead follow trusted sources.

“At this stage, what’s really going to shape the outcome of AI tech is market dynamics and public uptake,” Kirkconnell-Kawana said. “We are already behind the curve.

“There’s been a suggestion that publishers should list terms and conditions for content scraping on their websites. Licensing agreements are also being entered into (e.g. the recent deal between Shutterstock and OpenAI). But what we don’t want to see is the diminishment of open-source aspirations by shutting things down and reducing access to information online.

“We want to make sure that these generative AI operators are being pinned to jurisdictions and existing regulatory/legislative frameworks. What we don’t want is for the tech to reach scale, to develop in a way that makes them indispensable to public life, and then cherry-pick which jurisdiction is most favourable to them, profiting on their indispensability.

“We want freedom of expression, privacy, legal and human rights to be preserved by this tech and not stifled.”