Machine Learning for Gravitational Waves

As gravitational-wave observatories detect growing numbers of compact binary mergers, the computational cost of Bayesian parameter estimation becomes a critical bottleneck. Simulation-based inference (SBI) bypasses this bottleneck by training neural networks to learn the posterior distribution directly from simulated data.

Background

GW parameter estimation

Extracting source properties (masses, spins, sky location, orientation) from a gravitational-wave observation requires Bayesian inference. Given a likelihood \(p(d\mid\theta)\) for data \(d\) given parameters \(\theta\), and a prior \(p(\theta)\), the posterior is \[ p(\theta|d) = \frac{p(d\mid\theta)p(\theta)}{p(d)}. \] Standard samplers (MCMC, nested sampling) require millions of waveform evaluations per event, each costing milliseconds to seconds depending on the model. This makes traditional methods prohibitively expensive at high event rates.

Simulation-based inference

SBI replaces likelihood evaluations with simulations. A neural network \(q_\phi(\theta \mid d)\) is trained to approximate the posterior by minimizing \[ L[\phi] = \mathbb{E}_{p(\theta, d)}\left[-\log q_\phi(\theta\mid d)\right], \] over simulated data pairs \((\theta, d)\) drawn from the joint distribution. To obtain these pairs, we sample hierarchically: first draw from the prior \(\theta \sim p(\theta)\), then simulate data \(d\sim p(d\mid\theta)\). Once trained, the network produces posterior samples via fast forward passes, and can be re-used for every new observation—amortizing the training cost.
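The training loop above can be sketched on a toy one-dimensional problem. Everything here is illustrative: a real GW analysis conditions a normalizing flow on detector strain, whereas this sketch uses a Gaussian density with a learned linear mean, trained by finite-difference gradient descent purely to show the structure of the loss.

```python
# Minimal sketch of neural posterior estimation (NPE) on a toy 1D problem.
# The "network" q_phi(theta|d) is a Gaussian with learned linear mean; all
# names and distributions are illustrative stand-ins, not Dingo's model.
import numpy as np

rng = np.random.default_rng(0)

def sample_prior(n):
    return rng.uniform(-3.0, 3.0, size=n)                  # theta ~ p(theta)

def simulate(theta):
    return theta + rng.normal(0.0, 0.5, size=theta.shape)  # d ~ p(d|theta)

# phi = (a, b, log_sigma); q_phi(theta|d) = N(theta; a*d + b, sigma^2).
def nll(phi, theta, d):
    a, b, log_s = phi
    mu, s = a * d + b, np.exp(log_s)
    return np.mean(0.5 * ((theta - mu) / s) ** 2 + log_s + 0.5 * np.log(2 * np.pi))

# Minimize E_{p(theta,d)}[-log q_phi(theta|d)] over fresh joint samples,
# exactly the loss L[phi] above (finite-difference gradients, toy scale only).
phi = np.array([0.0, 0.0, 0.0])
lr, eps = 0.05, 1e-5
for step in range(2000):
    theta = sample_prior(256)
    d = simulate(theta)
    grad = np.zeros_like(phi)
    for i in range(3):
        dphi = np.zeros_like(phi)
        dphi[i] = eps
        grad[i] = (nll(phi + dphi, theta, d) - nll(phi - dphi, theta, d)) / (2 * eps)
    phi -= lr * grad

a, b, log_s = phi
print(f"learned q(theta|d): mean slope {a:.2f}, std {np.exp(log_s):.2f}")
```

Once trained, evaluating or sampling the conditional density for a new observation `d` is a single cheap forward pass; no further simulator calls are needed, which is the amortization the text describes.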

Our research

Highlights

My group develops SBI methods for gravitational waves, in an interdisciplinary collaboration spanning physics and machine learning. Our work ranges from advancing methodology to astrophysical applications. Highlights include:

2020 – Initial 5D proof-of-principle for black hole binaries using simulated data [1] (with C. Simpson and J. Gair). Later expanded to 15D, achieving excellent agreement with standard samplers on GW150914 [2].

2021 – Together with PhD student Max Dax at MPI-IS, achieved full amortization by conditioning on the noise power spectral density \(\to q_\phi(\theta \mid d, S_n)\) [3]. This allows the network to be tuned to the detector noise characteristics at the time of the event. Also introduced group-equivariant NPE, which exploits physical symmetries for higher performance [4].

2022 – Introduced neural importance sampling to verify and correct NPE results [5]. This also yields the Bayesian evidence and can flag systematic issues (data quality, signal modeling).

2023 – Application to hierarchical Bayesian inference for populations of binaries  [6] (with K. Leyde). This combines results from many observations to infer properties of the population as a whole, including the distribution of masses and the cosmological expansion rate.

2025 – Extension to binary neutron stars, published in Nature [7]. We coupled classical GW data-analysis techniques (heterodyning and multibanding) with prior conditioning to simplify the BNS signal morphology for the neural network. This enables complete inference in one second, including sky localization for multimessenger follow-up even before the merger takes place.

2025 – DINGO-T1, a transformer-based model for flexible inference (led by MPI-IS PhD student Annalena Kofler) [8]. We partition the frequency-domain data into small sub-domains and treat each of these as a token for a transformer. By randomly dropping tokens during training, the network learns to interpret incomplete data. Annalena used this to enable inference with variable detector combinations and frequency ranges—all with a single model.
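The tokenization-and-dropping idea can be sketched as follows. The band size, feature layout, and drop rate here are illustrative choices, not the actual DINGO-T1 configuration: the point is only that frequency bins become tokens with positional labels, and a random subset is withheld at training time so a set-based model learns to cope with missing bands or detectors.

```python
# Sketch of frequency-domain tokenization with random token dropping.
# Shapes, band size, and the keep probability are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(2)

n_bins, band = 1024, 64                   # frequency bins, bins per token
strain = rng.normal(size=n_bins) + 1j * rng.normal(size=n_bins)

# Tokenize: split into (n_tokens, band) and stack Re/Im as real features.
tokens = strain.reshape(-1, band)
features = np.concatenate([tokens.real, tokens.imag], axis=1)  # (16, 128)

# Positional labels (band indices) so a permutation-invariant model
# still knows where each token sits in frequency.
positions = np.arange(features.shape[0])

# Random token dropping: keep each token with probability 0.7 at train time.
keep = rng.random(features.shape[0]) < 0.7
kept_features, kept_positions = features[keep], positions[keep]

print(f"{kept_features.shape[0]} of {features.shape[0]} tokens kept")
```

Because a transformer attends over whatever token set it receives, the same trained model can then ingest data with a missing detector or a truncated frequency range simply by omitting the corresponding tokens at inference time.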

We also develop and maintain the Dingo code, which implements these methods. Dingo has already been used to drive GW science forward, including finding evidence for orbital eccentricity and making forecasts for next-generation observatories.

Posterior distribution for the first detection, GW150914, comparing our code (Dingo, orange) against MCMC (LALInference, blue). The distributions are in excellent agreement.

Beyond speed, SBI brings qualitative advantages: it works even without an analytic likelihood (e.g., training on real detector noise rather than assuming Gaussian stationarity), and the network can ingest data in any representation — time domain, frequency domain, or multi-banded — enabling analysis strategies that are difficult with traditional samplers.

GW science with Dingo

Evidence for eccentric binaries

In work led by Nihar Gupte (AEI), we used Dingo to analyze a large number of detections from the first three LIGO-Virgo-KAGRA observing runs (O1–O3) and found evidence for orbital eccentricity in three events [9].

In typical analyses, one assumes that orbits are quasi-circular — nearly circular and inspiralling due to gravitational radiation. However, in certain astrophysical contexts (e.g., triples, dynamical captures), the binary may still be eccentric when it enters the detector frequency band. Measuring eccentricity requires more expensive waveform models with two additional parameters.

Using standard techniques, analyzing a single event with an eccentric model can take \(O(\mathrm{weeks})\) on hundreds of CPU cores, making it infeasible to analyze all \(O(100)\) detections. Using Dingo, training takes about a week, but then each analysis takes under an hour — including importance sampling to the true posterior.

Evidence for eccentricity in three events [9]. Using a uniform-in-eccentricity prior, all three events have support well away from zero eccentricity.

With the Bayesian evidence from importance sampling, one can compare eccentric versus quasi-circular models. Nihar found consistent evidence for eccentricity, both from individual Bayes factors and from a population-level analysis. For further details see the full paper  [9].
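The reweighting behind this verification step can be illustrated on a toy one-dimensional problem. All distributions and names below are stand-ins, not the Dingo implementation: samples from an imperfect proposal (playing the role of the trained network) are weighted by the unnormalized true posterior, which both corrects the approximation and exposes its quality through the sample efficiency.

```python
# Sketch of importance-sampling verification of an approximate posterior:
# draw from q(theta|d), weight by p(d|theta)p(theta)/q(theta|d).
# Toy 1D Gaussian stand-ins; constants are dropped since weights are
# normalized below (the unnormalized mean weight would give the evidence).
import numpy as np

rng = np.random.default_rng(1)

# "True" unnormalized posterior: likelihood N(d; theta, 1) x prior N(0, 2^2).
d = 1.5
def log_target(theta):
    return -0.5 * (d - theta) ** 2 - 0.5 * (theta / 2.0) ** 2

# Imperfect proposal (the trained network): slightly too broad and offset.
mu_q, s_q = 1.0, 1.5
theta = rng.normal(mu_q, s_q, size=100_000)
log_q = -0.5 * ((theta - mu_q) / s_q) ** 2 - np.log(s_q)

# Stabilized, normalized importance weights.
log_w = log_target(theta) - log_q
w = np.exp(log_w - log_w.max())
w /= w.sum()

# Low effective sample size flags a poor proposal (or model systematics);
# the weighted average corrects the proposal's bias toward the true mean.
ess = 1.0 / np.sum(w ** 2)
corrected_mean = np.sum(w * theta)
print(f"sample efficiency {ess / len(theta):.1%}, corrected mean {corrected_mean:.3f}")
```

For this toy setup the exact posterior is Gaussian with mean 1.2, so the weighted mean recovers it despite the offset proposal; in the GW setting the analogous efficiency diagnostic is what serves as the flag for data-quality or waveform-model problems.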

Other applications

  • Next-generation observatories. Filippo Santoliquido et al. have applied Dingo to infer the properties of high-redshift sources for the Einstein Telescope [10]. In later work, they used Dingo in injection studies comparing various Einstein Telescope and Cosmic Explorer configurations [11], showing that certain combinations (e.g., two misaligned ET detectors) are better at breaking sky-position degeneracies.
  • Gravitational lensing. Several works have applied Dingo to search for GW signals distorted by intervening mass distributions [12–14]. Lensing analyses are generally expensive because of the cost of computing the frequency modulation due to the lens, making Dingo especially useful. Dingo has also been used to calculate background lensing rates and to search for lensing in GW231123.

References

[1]
S. R. Green, C. Simpson, and J. Gair, Gravitational-wave parameter estimation with autoregressive neural network flows, Phys. Rev. D 102, 104057 (2020).
[2]
S. R. Green and J. Gair, Complete parameter inference for GW150914 using deep learning, Mach. Learn. Sci. Tech. 2, 03LT01 (2021).
[3]
M. Dax, S. R. Green, J. Gair, J. H. Macke, A. Buonanno, and B. Schölkopf, Real-Time Gravitational Wave Science with Neural Posterior Estimation, Phys. Rev. Lett. 127, 241103 (2021).
[4]
M. Dax, S. R. Green, J. Gair, M. Deistler, B. Schölkopf, and J. H. Macke, Group equivariant neural posterior estimation, (2021).
[5]
M. Dax, S. R. Green, J. Gair, M. Pürrer, J. Wildberger, J. H. Macke, A. Buonanno, and B. Schölkopf, Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference, Phys. Rev. Lett. 130, 171403 (2023).
[6]
K. Leyde, S. R. Green, A. Toubiana, and J. Gair, Gravitational wave populations and cosmology with neural posterior estimation, Phys. Rev. D 109, 064056 (2024).
[7]
M. Dax, S. R. Green, J. Gair, N. Gupte, M. Pürrer, V. Raymond, J. Wildberger, J. H. Macke, A. Buonanno, and B. Schölkopf, Real-time inference for binary neutron star mergers using machine learning, Nature 639, 49 (2025).
[8]
A. Kofler, M. Dax, S. R. Green, J. Wildberger, N. Gupte, J. H. Macke, J. Gair, A. Buonanno, and B. Schölkopf, Flexible Gravitational-Wave Parameter Estimation with Transformers, (2025).
[9]
[10]
[11]
[12]
J. C. L. Chan, L. Magaña Zertuche, J. M. Ezquiaga, R. K. L. Lo, L. Vujeva, and J. Bowman, Identification and characterization of distorted gravitational waves by lensing using deep learning, Phys. Rev. D 113, 024041 (2026).
[13]
M. Caldarola, S. Goyal, N. Gupte, S. R. Green, and M. Zumalacárregui, Accelerated inference of microlensed gravitational waves with machine learning, (2025).
[14]
J. C. L. Chan, J. M. Ezquiaga, R. K. L. Lo, J. Bowman, L. Magaña Zertuche, and L. Vujeva, Discovering gravitational waveform distortions from lensing: a deep dive into GW231123, (2025).