Climate Tech

Google Is Using AI to Fill a Flood Risk Data Gap

Researchers at the hyperscaler say they can predict flash floods with a new Gemini-produced dataset.

A flooded house and the Gemini logo.
Heatmap Illustration/Getty Images

Flash floods, when stormwater pools and rises rapidly in an area within just a few hours of a storm's onset, are one of the more dangerous hazards of a warming planet prone to heavier rainfall. They are also notoriously difficult to predict. But research out of Google on Thursday shows how artificial intelligence could unlock better forecasts and help communities prepare.

Google researchers used Gemini, the tech giant’s signature AI agent, to process millions of news articles from around the world about past floods and extract data on when and where the deluges occurred. After assembling this vast new dataset — the largest of its kind to date — they used it to train a flood prediction model that uses local, hourly meteorological data to produce 24-hour forecasts for urban flash floods in more than 150 countries.

The dataset, which Google has named Groundsource, is free for anyone to download and use, and the forecasts are now live on Google’s Flood Hub, an online portal that also predicts river-related flood events. The tool is somewhat crude — it simply indicates whether there is a medium or high likelihood of a flash flood occurring in the next 24 hours in a given area. It only covers urban areas, and it doesn’t tell you how severe the flood could be. The resolution is also pretty coarse, indicating risks at the scale of a city rather than a street or neighborhood.

Still, the researchers said the forecasts would be useful for alerting authorities to potential risks.

“People have been very interested, even at that level of granularity,” Gila Loike, a product manager at Google Research, told reporters in a press conference this week.

According to Google, a regional disaster authority in Southern Africa caught a flash flood alert while the tool was still in beta, confirmed the flood on the ground, and then deployed a humanitarian worker to oversee the response. “We’re still in the early days of seeing the impact of Groundsource, but that chain of events from a prediction in Flood Hub to boots on the ground is exactly what Flood Hub was built for,” Juliet Rothenberg, the product director for Google’s crisis resilience work, said.

One of the key reasons it’s so hard to predict flash floods is the lack of historical data. We have decent flood models for “riverine” flooding, when rivers overflow, because of physical gauges in rivers around the world that have collected water levels for decades, but there’s no equivalent for city streets.

News articles present a largely untapped source to fill this gap. The challenge is that the key bits of information, such as where and when the flood occurred, are buried in narrative texts and expressed in wildly inconsistent formats. It would take human experts untold hours and resources to wade through each one and record the data in a standardized manner. An AI agent such as Gemini, however, can do it much faster.

Google’s research team started out by crawling the web for news articles describing flood events going back to the year 2000, gathering an initial pool of more than 9 million stories from around the world. After stripping out ads, menus, and other page clutter and translating the non-English articles into English, they fed them to Gemini.

“You are a meticulous flood event analyst,” the researchers told the AI agent. The rest of the elaborate prompt is included in a non-peer-reviewed preprint paper detailing the group’s methods for producing the dataset. In essence, they prompted Gemini to take a sentence such as “Main Street flooded on Tuesday,” and work out where, exactly, this Main Street was located, and which Tuesday the article was referring to.
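To make the "which Tuesday" problem concrete, here is a minimal Python sketch of one way a bare weekday mention could be pinned to a calendar date using the article's publication date. It is illustrative only: Google's pipeline relies on Gemini and the prompt in its preprint, not on a rule like this. The function name `resolve_weekday` and the rule itself (take the most recent matching weekday on or before publication) are assumptions made for the example.

```python
from datetime import date, timedelta

# Weekday names in the order used by date.weekday() (Monday == 0).
WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(mention: str, published: date) -> date:
    """Map a weekday name to a calendar date.

    Assumption (not from the article): the mention refers to the most
    recent such weekday on or before the publication date.
    """
    target = WEEKDAYS.index(mention.strip().lower())
    # Number of days to step back from publication to reach the target weekday.
    days_back = (published.weekday() - target) % 7
    return published - timedelta(days=days_back)


# Hypothetical article published on Thursday, March 14, 2024.
flood_date = resolve_weekday("Tuesday", date(2024, 3, 14))
print(flood_date.isoformat())
```

A fixed rule like this breaks on phrasing such as "next Tuesday" or "Tuesday of last week," which is part of why a language model that reads the surrounding narrative is attractive for the task.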

The resulting dataset contains 2.6 million historical flood events across more than 150 countries. As a comparison, the next largest public dataset, the National Oceanic and Atmospheric Administration’s Storm Events database, contains about 2 million storm events from 1950 to the present, only about 230,000 of which are flood events. The biggest global dataset, the United Nations Office for Disaster Risk Reduction’s DesInventar system, contains 500,000 events, only a fraction of which are records of floods. It’s also restricted to participating nations and inconsistently updated.

“Oftentimes, the first question our researchers will ask when we talk about going into a new domain within crisis resilience is, what data do you have? How many data entries do you have?” Rothenberg said. “That’s what really unlocks the ability to make breakthroughs here.”

Humberto Vergara, an assistant professor of civil and environmental engineering at the University of Iowa who studies flash floods, agreed that the lack of flood observation data has been a significant obstacle for the field. He told me the Groundsource dataset will “definitely be of great interest” and that there is “definitely great need for things like this.” Using news reports to fill out the global picture of flooding is something researchers have been thinking about doing for a while, he added.

While Vergara was cautiously optimistic the data would be useful, he was quick to note that it would take additional efforts to validate. His lab is working on its own dataset based on satellite estimates of rainfall that could be used to prove out Google’s records, he said.

The Google team already made some efforts to validate Groundsource, cross-checking it with manual annotations of the news reports as well as with other existing databases. It found that about 82% of the events were labeled with the correct location and timeframe. “From a research perspective, using an 82% accurate dataset is actually acceptable,” Loike said. “A well-trained model can smooth out the inconsistencies and thereby learn the dominant patterns while ignoring the 18% of labeling errors.”
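As a rough illustration of what a figure like 82% measures, the sketch below scores extracted events against manual annotations of the same articles. The article does not say how a match was scored, so the exact-equality rule, the function name `label_accuracy`, and the sample records are all assumptions made for the example.

```python
from typing import List, Tuple

# Each event is a (location, ISO date) pair; this shape is hypothetical.
Event = Tuple[str, str]


def label_accuracy(extracted: List[Event], annotated: List[Event]) -> float:
    """Share of extracted events whose location and date both match the
    manual annotation for the same article (exact match is an assumption)."""
    if not annotated or len(extracted) != len(annotated):
        raise ValueError("inputs must be non-empty and the same length")
    matches = sum(1 for e, a in zip(extracted, annotated) if e == a)
    return matches / len(annotated)


# Made-up sample: three of the four extracted labels agree with the annotators.
extracted = [("Springfield", "2024-03-12"), ("Riverton", "2024-03-13"),
             ("Lakeside", "2024-03-15"), ("Hilltop", "2024-03-16")]
annotated = [("Springfield", "2024-03-12"), ("Riverton", "2024-03-13"),
             ("Lakeside", "2024-03-14"), ("Hilltop", "2024-03-16")]
print(label_accuracy(extracted, annotated))
```

A real evaluation would likely allow some tolerance, for instance counting a date within a day or a location within the same city as correct, rather than requiring exact equality.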

They also validated the Flood Hub predictions by comparing the tool’s U.S. outputs to flood and flash flood warnings produced by the National Weather Service. “Achieving performance metrics comparable to such a sophisticated, instrumentation-rich framework demonstrates how AI can bridge the warning gap in underserved regions that lack equivalent infrastructure,” the researchers wrote in a second non-peer-reviewed preprint describing the model development.

Part of the reason Vergara was cautious in praising the effort is that predicting flash floods is challenging for reasons beyond the lack of historical data. “Most of the driving force is rainfall,” he said. “Everybody in the community knows that predicting rainfall is extremely difficult. The best models out there cannot predict rainfall with the accuracy that is needed for flash floods with more than one or two hours of lead time.”

The utility of Google’s Flood Hub depends on who will be consuming the information, he said. It’s probably not high-resolution enough to be useful for emergency responders, but there might be agencies at the city or regional level that can use it as a situational awareness tool.

Rothenberg, of Google, is optimistic that this same method can produce useful predictions for other kinds of extreme events.

“Applying this methodology to flash flood reports is just the beginning,” she told reporters at the press conference. “We think there’s an immense opportunity in thinking about how we could use publicly available information to help predict heat waves or landslides, for example — other events that are hard to predict because the data hasn’t been centralized or it doesn’t exist.”
