Overview

Local News and AI

Welcome to API’s Local News and AI series. I’m Elite Truong, API’s Vice President of Product Strategy. Over the next four weeks in August, we’ll be sharing resources and advice on how AI can serve your news organization. We’re here to help cut through the noise and share what you need to know about AI as a local journalist. 

An intro guide to understanding AI

Every day there seems to be news about AI that affects us and our industry, like OpenAI partnering with the AP for local news and hiring an ex-Microsoft lawyer to handle publisher licensing negotiations.

National newsrooms are recruiting AI and automation engineers, and local newsrooms are automating calendar events. It seems that everyone in the media business is considering AI; even Google is starting to pitch AI-enabled tools to help journalists automate headlines and descriptions for SEO (meta, isn’t it?).

Even after a year outside a newsroom, and despite speaking at multiple journalism conferences this year as an emerging tech and news expert, my head is spinning trying to keep up with all the developments. Should local news leaders be worried? How do we decide in real time what will help our newsrooms and what won’t? In this series, I’ll be answering some of the most pressing questions for local newsrooms considering AI.

DIVE IN

Today, we’ll start with what’s worth paying attention to — an intro guide to help you decide whether or not AI is right for your newsroom.

  • Defining AI:
    • There are many kinds of AI, but the branch that is most accessible to us is generative AI. The most famous example is ChatGPT, which works as a prompt system: you give it a prompt, and it returns a response from an AI model trained on billions of web pages, including news sites. You can think of it as a student who has memorized a billion pages and will answer from that general research without personalizing it. DALL-E does similar work with images. Both are easy to sign up for so you can see how they work.
  • What could I use AI for?
    • Right now, the most trouble-free use case for AI in your newsroom is automating busy work so reporters’ time can be put to better use and they can focus on bigger stories. Can we use ChatGPT to start a service-piece outline on how to register to vote ahead of the elections? Can we automate taking meeting notes, or building focus-time blocks across teams? (For a concrete example, see the sketch just after this list.)
    • Look through the use cases in Generative AI Newsroom and the Center for Cooperative Media’s Beginner ChatGPT Prompt Handbook to learn how newsrooms are using generative AI to summarize legal documents, generate social posts, copyedit work, create templates for internal documentation and create news quizzes, on top of the AI-enabled transcription tools you might already be using, such as Otter.ai or Trint. I also recommend the Newsroom Robots podcast for a global perspective on how newsrooms and researchers are using generative AI around the world.
  • How will I know whether or not it’s worth using AI? 
    • If you decide to use AI tools or experiment with generative AI in your newsroom’s workflow, create an AI success metric to make sure you’re getting something out of it. Handing newsgathering tasks to AI tools may feel like a win on its own, but the metric should measure what your reporters produce with the time saved and whether your audience engages with it. Otherwise, the time, energy and resources invested in that AI tool may not pay off.
  • Is it ethical to use AI?
    • This is the first decision you’ll have to make for your news organization. My answer to this question, as a seasoned news product leader, is “no, but our ethical qualms about using this technology are outweighed by how much it can help us contribute to the end goal — serving our audience.” In the meantime, we have to contend with this uneasiness and consider how we can contribute to getting licensing in place so local newsrooms receive compensation when AI uses their work.
  • Do I have to use AI in my newsroom? 
    • It’s easy to look at the AP and other, often better-resourced, newsrooms establishing first-time use cases and feel like you’re lagging behind. However, this is not a good mindset for intentional innovation that helps your newsroom reach, engage and serve your audience in better ways. Keep exploring AI resources to see if a use case piques your interest. And once you decide, communicate the decision clearly to your staff and audience.
  • How should I keep up with the news about AI?
    • This is comparable to how you expect your audience to follow election coverage. A small minority needs to know everything that happens, likely because they work in politics. The majority needs to know top headlines, how it affects them and the opportunities they should pursue. Understand what’s helpful to you, and if horse-race AI reporting isn’t serving you, choose only a few outlets and reporters to follow. For a stepped-back look at how companies are working with AI across industries and privacy concerns, I follow Karen Hao and Madhumita Murgia. My favorite AI researcher and journalist who studies algorithmic accessibility and bias is Meredith Broussard. For local news use cases, I recommend the resources in the above section.
  • What else is interesting about AI in journalism?
    • AI is more than ChatGPT, but no matter the format, these programs generally work best if you give them a simple job: recognizing patterns in data, documents and images that are worth reporting on, or getting you started on labor-intensive work, such as suggesting social copy based on your article. Beyond those, there are compelling use cases in media that use machine learning for paywalls and content discovery, and computer vision for visual investigations and reconstructing events. The market is flooded with AI tools to help newsrooms create more video content from their written content. What I’m most optimistic about is how much AI can help readers through smarter recommendations and by resurfacing evergreen content from your newsroom’s archives, which should help more readers engage with and support your journalism.
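To make the busy-work idea above concrete, here is a minimal sketch of a script that asks ChatGPT to draft a service-piece outline. It assumes the OpenAI Python SDK with an API key set in your environment; the model name and prompt are placeholders to adapt, and the output is only a starting point a reporter still has to verify and rewrite.

```python
# A minimal sketch, assuming the OpenAI Python SDK (pip install openai)
# and an OPENAI_API_KEY environment variable. The model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumption: any chat-capable model works here
    messages=[
        {"role": "system",
         "content": "You draft outlines for a local newsroom's service journalism."},
        {"role": "user",
         "content": "Draft a section-by-section outline for a service piece on "
                    "how to register to vote ahead of the elections. Include "
                    "deadlines, eligibility and where to verify official local info."},
    ],
)

# Print the draft outline; an editor and reporter take it from here.
print(response.choices[0].message.content)
```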

NEXT STEPS

If you’ve made it this far, you’re likely grappling with the question: Should I be doing more with AI? The takeaway this week is to make a decision before your leadership or board makes one for you. Consider the questions of whether your newsroom should be using AI and how you can keep up with AI news. In the weeks ahead, we’ll continue down the road of how to make your decision transparent to your staff and audience to keep your credibility and trustworthiness.

Chapter 2

API engineers weigh in on automation

Behind API’s Tech Talks, our two products (Metrics for News and Source Matters) and the friendly customer service we offer our 100+ local newsrooms are two engineers who work tirelessly to keep our products up and running and who collaborate with our customer-facing teammates to process feedback and inform future product development.

Meet senior application engineer Stephen Jefferson and web applications engineer Marita Pérez Díaz. The two have been closely following all the generative AI trends this year — including the assumption that certain products or technologies are silver bullets that fix huge categories of problems on their own. We sat down with Stephen and Marita to discuss the trends they’re seeing, their favorite resources and what you need to know to continue evolving through this era.

[Photos of Marita Pérez Díaz and Stephen Jefferson]

As journalists and engineers who have been working in news for the majority of your careers, how are you feeling about generative AI and local news? What concerns do you have, and what are you looking forward to?

Marita: I’m very concerned about how much content will be generated [using generative AI] and deepfakes. How will newsrooms keep up with fact-checking? It’s going to be huge — it’s already huge. How will tools be developed to keep up with the misinformation? 

But I’m also concerned about people who are ignoring AI, who only see it as a trend. This is happening so fast that if people don’t get trained and educated on how to use it to enhance their work, they will fall behind in the industry. [News leaders] need to stay informed to make the best decisions for their newsrooms.

So many people are starting to use AI tools but keeping their old structures in place for delivering news to the user. But now we have to take the user into account more than ever. In 2021, the Modular Journalism project created a way to build articles out of modules depending on what the audience wanted; the article would change based on the audience persona. I find that very interesting. It doesn’t mean all news has to be experimented on that way, but the goal is to figure out better, updated ways to deliver news.

Stephen: AI is not a totally separate period of technology; it’s a continuation of what we’ve been working on for a while, where we can reuse resources we already have.

I’m concerned that there are organizations thinking they can just jump into it [using AI in their work], but they should follow some of the resources out there. There’s a big gap of misunderstanding, and it doesn’t seem like the journalism industry is taking this as seriously as it should. How are they thinking about experiments and the time spent on this?

There are task forces spinning up and guidebooks and guidelines and ethics policies for AI, but none of those are new concepts — especially for fact-checking and personal data and privacy. We need to apply some of those practices.

Stephen, how will AI affect local newsrooms and what should they do to prepare? Editor’s note: Stephen hosted a presentation on AI preparations for newsrooms in May.

S: The fun and nerve-wracking thing about that presentation in May was that it was about preparation and readiness for AI in journalism — so many newsrooms were prepping at that time, and just the day before, Google announced a hub [dedicated to the effort]. It was a timely topic to present, but there was also so much to learn.

Preparing for AI isn’t a single linear solution. It’s a continuing process of making sure you have the right things in place to stand on, so you have eyes in the air for new announcements from partners, or new regulations and policies to guide you. How do organizations see themselves positioned in that landscape? Having an approach helps them understand, when an opportunity comes up, whether or not they should participate.

Marita, how might local newsrooms take advantage of AI tools and approaches to meet their editorial goals?

M: I would recommend taking advantage of education, guidelines and training first, not only on what to do but also on what NOT to do with AI. It is easy to make mistakes, especially ethically or by delivering wrong information generated by machines. Not learning how to use it and not being transparent about the use of AI could have a huge negative impact.

I believe that journalists could learn the basics of coding as well, especially if they are part of a small, understaffed local newsroom. That could facilitate communication with engineers on the team or let reporters set up tools that contribute to their work. For example, when I was a journalist in a small newsroom, I automated some processes using Zapier, which helped me capture subscribers for MailChimp using a Google Form and Instant Articles. Canva’s AI tools could also help a reporter automate social media posts.

A good strategy that many experts recommend is to reflect first on what problems your newsroom would like to solve, and then look for an AI tool that could help solve them. But even tools we use daily, like Google Workspace, Google Meet or Zoom, have already integrated AI features we can take advantage of, so that, for example, reporters no longer have to take notes manually and can free up that time for something more important.

In many communities, reporters could also connect to open data sources and automate reporting on local crime, restaurant inspections or school sports. For example, in South Florida, open data could be used to tell your audiences automatically whether beaches are open or not.
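Editor’s note: here is a hypothetical sketch of the kind of open-data automation Marita describes. The endpoint and field names are invented for illustration; many city and county open-data portals expose similar JSON APIs.

```python
# A hypothetical sketch: poll a public open-data feed and turn it into a
# one-line brief. The URL and field names below are assumptions, not a real API.
import requests

DATA_URL = "https://example.gov/api/beach-status.json"  # hypothetical endpoint

def beach_status_brief() -> str:
    records = requests.get(DATA_URL, timeout=10).json()
    closed = sorted(r["beach_name"] for r in records if r.get("status") == "closed")
    if not closed:
        return "All monitored beaches are open today."
    return "Closed today: " + ", ".join(closed)

if __name__ == "__main__":
    # Run this on a schedule (e.g., cron) and pipe the result into a
    # newsletter or social post for a hands-off daily brief.
    print(beach_status_brief())
```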

Establishing partnerships with other newsrooms and with local universities is also a great way to experiment with LLMs and AI in general, as computer science departments may be able to share resources that local journalists don’t have access to. In the last JournalismAI course, there was a great example of that kind of collaboration: Ojo Público worked with a professor at Central European University to develop FUNES, a tool that lists government contracts and assigns each a corruption risk score.

Stephen, most journalists are interested in finding ways to automate busy work. What are your thoughts on setting up automation workflows and maintaining those products over time? What should local news folks know?

S: I like this article about thinking about AI preparation like baking a cake. There was a study on whether people who want to make a cake prefer just adding water to a mix or a lengthier process of adding eggs and other ingredients. It found that people don’t want to just add water — something human is missed when there isn’t more involvement in the process. Cake mixes were changed strategically to have humans add eggs, to get them involved in the making. The same goes for AI tools: humans want to be part of the process, not just writing up a scope of work and saying “build this.” They want that kind of interaction.

I’m a skeptic on automating workflows. I see 50% of these projects as unnecessary. For example, at an information architecture conference I attended, there was an organization that wanted to use AI to build a system for knowledge management — basically, for institutional knowledge. You don’t have to use AI for that; you just need to organize your information so it’s more accessible to you.

This comes down to how newsrooms approach “problem framing.” Many folks are employing AI to take on “repetitive tasks,” which are seen as problematic inefficiencies. However, other lenses could propose other solutions — perhaps the task being repeated could be reorganized to remove the repetition altogether. This is “dissolving the problem,” one of Russell Ackoff’s four problem frames. In my own work, I usually look through the “dissolving” lens first.

I worked with newsrooms on data structuring for over a decade, and…when it comes to tagging something correctly and structuring data the right way, journalists will do that only to an extent, and they want to skip steps to make themselves more efficient by going straight to AI.

I’m not for automating a workflow just because it’s tedious. If we keep going down the path of skipping steps, we’re going to become more dependent on those technologies. I’m nervous about that tendency to “skip” in the name of efficiency.

M: I agree that you need to be organized as a starting point. But once you get that sorted out, if a task takes manual time and doesn’t need creative elements, you absolutely should automate it, because it’s taking time away from more important things like fact-checking or generating ideas for reporting.


Chapter 3

Should I give AI a job in my newsroom?

In the first two weeks of this series, we advised you on how to consume news about AI and heard from journalism engineers on their concerns and hopes for AI and how it’ll change the way we work in newsrooms to better reach and engage our audiences. We learned that we don’t have to learn or know everything.

This week, we’ll talk about the question that helped me build a presentation at Media Party this year: what should I do with AI, or any other emerging technology?

The answer is: nothing, if you don’t have a problem worth solving.

WHERE TO START

The tricky thing I’ve found about working with emerging technologies over the years is that technologies are solutions looking for problems. Technology is a hammer, and to a hammer, everything looks like a nail. The difference is, AI doesn’t get to decide — you do, and you have a full toolkit of potential ways to fix problems. The key is to know which of your problems are most important and most worth solving, and to start learning about new tools that might solve them, including AI. Without a problem worth solving, experimenting with AI or any other technology won’t be worth the time.

In my presentation, roughly 50 people were tasked with identifying where the biggest, most important problems worth solving existed in their newsroom. This is where you should start — know your newsroom well and the solutions will follow.

Where do the biggest operational problems exist in your newsroom? Identify a few from the production cycle and list them.

[Illustration: “Get to know your newsroom better,” a cycle through ideation; pitching and evaluating ideas; planning ideas; production and collaboration; publishing and distribution; and audience feedback and iteration.]

Arguably, one of the most neutral ways to experiment with generative AI is outside editorial content, in business operations tasks. I’ve had to write job descriptions in every newsroom manager role I’ve held, and the benefit of using generative AI is leveraging the collective intelligence of the best job descriptions written and posted by companies across the internet. Without good editors in HR, which increasingly few of us have, this is a tool we can use to write more accessible and inclusive job descriptions for internships, fellowships and staff roles, and a foundation for growing our teams.

What's a problem that needs a solution? Consider how ChatGPT, machine learning, automation, structured data processing or pattern recognition could help solve a problem in your newsroom. Beyond job descriptions, we discussed how ChatGPT could help start structured stories, including timely evergreen articles (e.g., publishing passport guidelines as a supplement to a news piece on TSA delays), as well as prompting brainstorms and retrospectives that draw on more best practices than the structure you may have been using for years. A sketch of that structured approach follows below.
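Here is a minimal sketch of that structured-story idea, again assuming the OpenAI Python SDK; the module names and prompt are illustrative, and the model’s JSON mode simply makes the draft easier to drop into a story template.

```python
# A sketch of prompting for structured story starts: ask the model for labeled
# modules you can place into a template. Module names are assumptions.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompt = (
    "Return JSON with keys 'summary', 'what_to_do', 'deadlines' and 'sources' "
    "for an evergreen explainer on renewing a U.S. passport, meant to run "
    "alongside a news piece on TSA delays."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    response_format={"type": "json_object"},  # request machine-readable output
    messages=[{"role": "user", "content": prompt}],
)

modules = json.loads(response.choices[0].message.content)
print(modules["summary"])  # every module still needs an editor's verification
```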

It’s also important to know what ChatGPT shouldn’t do in your newsroom. We know it’s not the tool to fact-check, provide unbiased content, manage embargoed or sensitive materials (material is no longer private once you put it into ChatGPT) or report original content without an editor.

If AI is the tool to solve a problem worth solving in your newsroom, by all means, dive in head first. I’ll leave you with these guidelines and an assignment for setting a responsible AI strategy when you take that leap. Best of luck!

The AI investment isn't worth it if: using AI introduces a lot of uncertainty in the material it produces; editing AI content takes longer than reporting and editing it yourself; or it isn't going to contribute to revenue or audience goals.

TRY IT OUT NOW

Is your company looking to solve a problem using AI? As the person deciding the AI strategy in your newsroom, write a mission statement of intention that includes the following:

  • How we’ll use AI in our work (behind the scenes or in our articles) and why we’re using it (audience, business or editorial goals)
  • Where the AI-assisted or -generated content will appear onsite
  • How you anticipate using AI will affect jobs
    • If applicable, whether AI use is being discussed with your newsroom union

Some good references can be found at AP and Wired.

Are you looking for help creating an AI strategy for your newsroom? To work with API on creating a clear, transparent and audience-centered AI strategy, reach out to elite.truong@pressinstitute.org with the subject line “AI Strategy.”

Chapter 4

It’s time to zoom out from using AI and ask a basic question: How can “training” large language models (LLMs) — the structure that ChatGPT, DALL-E and other generative AI models are built on — with anything and everything on the internet, including all of our content, possibly be legal? Especially when that AI is built into other people’s commercial products? I’ve heard from countless local news leaders this year who are concerned, and convinced they’re seeing the end of their news business, with no protections to fall back on.

We reached out to Danielle Coffey, the CEO of American Press Institute’s parent corporation, the News/Media Alliance, to learn more about the legal fight for news organizations’ rights with AI. 

[Photos of Elite Truong and Danielle Coffey]

DANIELLE: We filed with the U.S. Patent and Trademark Office a few years ago about data retention and mining — it was already going on, but nobody cared. Then in January or February, all of a sudden, every article in my inbox and every call was about AI, and that was because of ChatGPT. Once it was commercialized, it spread like wildfire, and it has maintained a level of buzz like nothing I’ve ever seen.

News organizations are uniquely situated because our content is used to feed the training that becomes AI tools that feed into creating more journalism — it’s almost cannibalistic. How do we feed into the training of AI and machines that produce datasets that are fed into commercial products? 

The very first thing we [News Media Alliance] did was a report that showed the business landscape and how these players use our information. We did a survey of membership to determine what licenses exist, who’s using what, how, and if they have permission.

We have IP, copyright-protected content and exclusive rights. If our content is being used in a way that permission wasn’t granted for, that use is unauthorized and requires compensation and/or a license. You need permission, essentially.

ELITE: There are some privacy concerns here. When you put something into ChatGPT, you’re giving the company rights to it, so don’t use your embargoed investigations for an outline. The bigger concern, which NMA is uniquely positioned to address, is that you don’t know whether your content can be accessed, or has already been accessed, for machine learning. Currently, that’s not legal for companies like OpenAI to use, right? What rights do publishers have, and how do you know if they’ve used your content?

D: If it’s being used, how do we stop them? Whether or not it’s legal, this is a deep question.

There are two developments that [tech companies] are worried about. Lawmakers and the public have been made aware of how LLMs work, so the companies don’t know if they’re on solid ground anymore. OpenAI has now said you can opt out, and Google, in a fuzzy way (via a Guardian article), said they’ll respect opt-outs. But the problem is: how do we know if they’re using our content?

That’s where the technology piece comes in. We realized that if you want to block something like the nonprofit Common Crawl, that’s easy — you know what its bot looks like when it comes to your site, so you can deny it — but do you want to, and how will it affect your traffic? Most of our companies have blocked them.
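Editor’s note: this blocking typically happens in a site’s robots.txt file. Common Crawl’s crawler identifies itself as CCBot, and OpenAI’s as GPTBot; a minimal example that denies both while leaving search crawlers untouched:

```
# robots.txt — deny known AI-training crawlers; other bots are unaffected
User-agent: CCBot
Disallow: /

User-agent: GPTBot
Disallow: /
```

Note that robots.txt is honored voluntarily by well-behaved crawlers; it is a request, not an enforcement mechanism.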

OpenAI is more complicated. What they use will be in a lot of commercial products. If I opt out, and [my content could be] used at a later stage where I could get compensated, do I want to? Hence the litigation against OpenAI — don’t tell me I can opt out, because then I may not get compensated. Opting out would be waiving your right to payment, and I don’t think companies are ready to do that. Hence Sarah Silverman and others filing a lawsuit against these tools.

However, with Google’s SGE and its products, it becomes much more complicated. Google currently crawls and indexes our content for search, which you want them to do, but that exacerbates the previous problem. If you’re being crawled for search, and that crawl is meshed with grounding for AI datasets, and they use that to put together super snippets, how do you stop one activity but not the other — and do you want to? I want to be found.

I think ultimately, it’s a business question. Even if we can use a tech solution to opt out or to grant permissions for specific uses of our content, will we?

Editor’s note: After Danielle and Elite’s conversation, The New York Times blocked OpenAI’s web crawler.

E: Your eyes lit up when we mentioned the archives issue: ChatGPT retroactively using our content, for free, to learn and improve its own product. It’s a mature product because of everything that’s been fed into it. It’s already been trained. What does archive use look like, legally, in the fight you’re experiencing?

D: Sam Han at the Washington Post says you can’t untrain a model. If you were to say they’ve illegally used our information, there’s no taking it back — only payment is a remedy. When you go to court and say this guy used my stuff and it’s illegal, the court will ask if there’s market harm. When it comes to archives, one company alone might make millions off of them. Think about the sites where you have to log in or pay for access, like Newspapers.com, ProQuest or LexisNexis — these companies already exist, and there is a market for us to be paid for our archival content. So when AI crawlers take it for free, a court can determine that we’re being deprived of what we could have charged.

E: That seems like the easiest path forward — showing that what AI has already done is a problem. That’s promising to me because that’s a huge concern to people just now exploring their options. 

I’m curious — OpenAI hired Tom Rubin, a lawyer who used to work for Microsoft, to represent publishers in this issue. Is that weird? Has that happened before? Why would they do that and how would that benefit us?

D: It’s certainly not unusual. Tom cares about the industry and is a familiar face, which will be helpful. It’s a win-win: the company has someone who is familiar to and welcomed by our publishers, and publishers have someone they trust on the inside. That’s why they do it. They are having conversations — OpenAI seems open to payment, as its partnerships with the AP and the American Journalism Project have shown.

E: Is there anything that makes it difficult for the media industry to tackle AI and licensing?

D: When I first started working, I noticed every other content industry is like “my content” — FBI warnings when you’re watching movies, intense music copyrights. They’re very conscious of protecting their property and not letting it be used by others. But when I started in this industry, the mentality was: go ahead and take it, it will come back and we’ll make money through advertising. That’s been our downfall. We now get cents on the dollar, and we let everyone have it on the faith that it will come back, but because of the distribution model, it went out the door and never came back. It was hard to shift that mentality toward holding on tighter and putting up paywalls, but people came to recognize the need for exclusive use and for requiring permissions to use content. What I saw when AI came was a quick embrace of the mentality that my content is being used, it has value, it’s being used without permission and that’s not allowed. That was a big conceptual shift: our companies recognized the value of our content out of the gate.

E: That’s an interesting mentality and something that’s tricky to deal with because we think news should be for everyone, we want the biggest reach possible, but it does bite us. Anything else you want to talk about for API readers?

D: AI can be incredibly empowering. It can create productivity and innovate in all ways. AI tools, natural language processing — it can be an amazing thing for newsrooms. But the two are connected: if you don’t fix the part where we need to be compensated, or at least grant permissions, for the use of our content to train these machines that could become our replacement, what happens when the journalism goes away and there’s nothing to feed on in the first place? What happens over time? What becomes of the system that relies on this content? It’s not a good thing for society and democracy.

DIG DEEPER

  • Check out N/MA’s work on artificial intelligence
  • During our conversation, Danielle said this about the legal cases to watch and learn from as we navigate this:
    • Read about the Getty cases in the U.K. and U.S. Getty has always been aggressive, but it’s easier with images, since there’s a strong foundational precedent in the courts because there’s a lot of clarity around that law. In their complaint you can see the watermark [in the AI-generated images]! But how do you attach metadata to a string of words?
    • Judge Orrick’s comments [in a lawsuit brought by artists against text-to-image AI developers] look at the input stage only, but I think output is essential to determining how the output is harming my product.
    • In U.S. legislation, Senate Majority Leader Chuck Schumer has taken this issue up, which is interesting because majority leaders typically don’t run with legislation. Senator Chris Coons is handling a piece that has to do with copyright, so there will be forum meetings this fall, and NMA will be weighing in. As far as legislation in Congress, I don’t think it will happen before the end of the year, but when it comes to disclosures, transparency, compensation, and IP around the use of our content and its outputs, the legislative proposals are exploring all of the above.
  • Are you looking for help creating an AI strategy for your newsroom? To work with API on creating a clear, transparent and audience-centered AI strategy, reply to this email or reach out to elite.truong@pressinstitute.org with the subject line “AI Strategy.”