A transformative, cloud-computing approach to analyzing data helps researchers better understand seismic activity
LOS ALAMOS, N.M., May 19, 2020 - For the first time, seismologists can use cloud computing to characterize seismic signals produced by some industrial human activity on a continent-wide scale. In two recently published papers in Seismological Research Letters, scientists from Los Alamos National Laboratory demonstrate how what was previously characterized as “noise” can now be viewed as a specific signal across a large geographical area, thanks to an innovative approach to seismic data analysis.
“In the past, human-caused seismic signals as a result of industrial activities were viewed as ‘noise’ that polluted a dataset, resulting in otherwise useful data being dismissed,” said Omar Marcillo, a seismologist at Los Alamos National Laboratory and lead author of the study. “For the first time, we were able to identify this noise from some large machines as a distinct signal and pull it from the dataset, allowing us to separate natural signals from anthropogenic ones.”
The study used a year’s worth of data from more than 1,700 seismic stations in the contiguous United States. Marcillo detected approximately 1.5 million industrial noise sequences, an average of about 2.4 detections per day at each station.
“This shows us just how ubiquitous industrial noise is,” said Marcillo. “It’s important that we’re able to characterize it and separate it from the other seismic signals so we can understand exactly what we’re looking at when we analyze seismic activity.”
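As a rough check on the per-station average quoted above, the rate follows from dividing the detection count by the number of station-days. The sketch below assumes exactly 1,700 stations and a 365-day year; the release says only “more than 1,700” stations and “a year’s worth” of data, so the function name and inputs are illustrative and not the study’s own calculation.

```python
def detections_per_station_day(total_detections, stations, days):
    """Average number of detections per station per day."""
    return total_detections / (stations * days)


# Assumed round figures taken from the release (not exact study values).
rate = detections_per_station_day(
    total_detections=1.5e6,  # "approximately 1.5 million" sequences
    stations=1700,           # "more than 1,700" stations
    days=365,                # "a year's worth" of data
)
print(f"{rate:.1f} detections per station per day")  # prints 2.4
```

Because the real station count is somewhat above 1,700, the true average would be slightly lower than this estimate.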
This data was accessed and processed using cloud computing, a novel approach that allows for greater scalability and flexibility in seismological research. The approach is detailed in a companion paper, which demonstrated how cloud computing services can be used to do large-scale seismic analysis ten times faster than traditional computing, which requires data to be downloaded, stored, and processed. Using Amazon Web Services’ cloud computing, researchers were able to acquire and process 5.6 terabytes of compressed seismic data in just 80 hours. Doing this with traditional computing methods would have taken several weeks.
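The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 terabyte = 1,000 gigabytes) and applies the stated tenfold speedup directly to the 80-hour run; both are assumptions made for illustration, and the function names are hypothetical.

```python
def throughput_gb_per_hour(terabytes, hours):
    """Average processing rate in gigabytes per hour (decimal units)."""
    return terabytes * 1000 / hours


def traditional_weeks(cloud_hours, speedup):
    """Estimated duration, in weeks, of the same job without the speedup."""
    return cloud_hours * speedup / (24 * 7)


gb_per_hour = throughput_gb_per_hour(terabytes=5.6, hours=80)
mb_per_second = gb_per_hour * 1000 / 3600
weeks = traditional_weeks(cloud_hours=80, speedup=10)

print(f"{gb_per_hour:.0f} GB/hour")   # prints 70 GB/hour
print(f"{mb_per_second:.1f} MB/s")    # prints 19.4 MB/s
print(f"{weeks:.1f} weeks")           # prints 4.8 weeks
```

Under these assumptions, a tenfold slowdown puts the traditional approach at roughly 800 hours, which is consistent with the “several weeks” cited.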
Marcillo said that his work to characterize industrial noise across the country would not have been possible without this new cloud-computing approach. “My colleagues and I had figured out how to separate the industrial noise signal from the rest of the seismic signal, but we couldn’t scale it,” he said. So Marcillo collaborated with Jonathan MacCarthy to find a way to expand the method to cover a large geographical area; cloud computing was the answer. Cloud computing is also flexible enough to adapt to the evolving needs of many research applications, including processing speed, memory requirements, and different processing architectures.
“Seismology is a data-rich field,” said MacCarthy, lead author of the paper on the cloud-based approach. “Previously, seismic data would have to be downloaded and processed by each individual researcher. Cloud computing allows all of that data to be stored in one place, and for researchers to easily access and work with it in a community-based way. It’s a huge development and has the potential to totally transform the way seismological research on large datasets is done.”
Omar E. Marcillo, Jonathan MacCarthy; Mapping Seismic Tonal Noise in the Contiguous United States. Seismological Research Letters; 91 (3): 1707–1716. doi: 10.1785/0220190355
Jonathan MacCarthy, Omar Marcillo, Chad Trabant; Seismology in the Cloud: A New Streaming Workflow. Seismological Research Letters; 91 (3): 1804–1812. doi: 10.1785/0220190357