Pollsters trying to predict presidential election results and physicists searching for distant exoplanets have at least one thing in common: They often use a tried-and-true scientific technique called Bayesian inference.
Bayesian inference allows these scientists to effectively estimate some unknown parameter—such as who won an election—from data such as poll results. But Bayesian inference can be slow, sometimes taking weeks or months of computational time, or requiring a researcher to spend hours deriving difficult equations by hand.
Researchers at MIT and elsewhere have introduced an optimization technique that speeds things up without requiring a scientist to do extra work. Their method can achieve more accurate results faster than another popular approach to speed up Bayesian inference.
Using this new automated technique, a scientist can input their model, and then the optimization method does all the calculations under the hood to provide an approximation of some unknown parameter. The method also provides reliable uncertainty estimates that can help a researcher understand when to trust their predictions.
This versatile technique could be applied to a wide range of scientific problems that involve Bayesian inference. For example, it could be used by economists studying the impact of microcredit loans in developing countries, or by sports analysts using a model to rank the best tennis players.
“When you really dig into what people are doing in the social sciences, physics, chemistry, or biology, they're often using the same tools. There's a lot of Bayesian analysis out there. If we can create a better tool that makes life easier for these researchers, it's really transformative for a lot of people in different research fields,” says senior author Tamara Broderick, an associate professor in MIT's Department of Electrical Engineering and Computer Science (EECS) and a member of the Laboratory for Information and Decision Systems and the Institute for Data, Systems, and Society.
Broderick is joined on the paper by co-lead authors Ryan Giordano, an assistant professor of statistics at the University of California, Berkeley; and Martin Ingram, a data scientist at the AI firm KONUX. The paper was recently published in the Journal of Machine Learning Research.
Quick results
When researchers seek a faster form of Bayesian inference, they often turn to a technique called automatic differentiation variational inference (ADVI), which is often both fast to run and easy to use.
But Broderick and her collaborators found several practical problems with ADVI. It must solve an optimization problem and can only do so approximately. As a result, ADVI may still require a lot of computation time and user effort to determine whether the approximate solution is adequate. And once it arrives at a solution, it tends to provide poor uncertainty estimates.
Instead of reinventing the wheel, the team took many of the ideas from ADVI but reworked them to create a technique called deterministic ADVI (DADVI) that doesn't have these drawbacks.
With DADVI, it is very clear when the optimization is done, so the user does not have to spend extra computational time to make sure that the best solution has been found. DADVI also allows the incorporation of more powerful optimization methods that provide an additional speed and performance boost.
Once it reaches a result, DADVI is set up to allow uncertainty corrections to be applied. These corrections make its uncertainty estimates more accurate than those of ADVI.
DADVI also enables users to clearly see how much error they have incurred in approximating the optimization problem. This prevents a user from needlessly rerunning the optimization over and over to try to reduce the error.
“We wanted to see if we could live up to the promise of black-box inference. Once users build their model, they can run Bayesian inference and not have to derive everything by hand, and they don't have to figure out when to stop their algorithm, and they can see how accurate their approximate solution is,” Broderick says.
Defying conventional wisdom
DADVI can be more effective than ADVI because it uses an efficient approximation method called sample average approximation, which estimates an unknown quantity by taking a series of exact steps.
Because the steps along the way are exact, it is clear when the objective has been reached. Additionally, reaching that objective usually requires fewer steps.
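The fixed-draw idea can be illustrated with a minimal sketch. This is not the authors' implementation: the toy one-dimensional Gaussian target, the function names (`saa_objective_and_grad`, `fit_saa`), and the use of plain gradient descent in place of a more powerful optimizer are all assumptions made for illustration. The random draws `z` are generated once and then held fixed, so the objective is an ordinary deterministic function with an exact gradient, and "done" simply means the gradient is numerically zero. A stochastic-gradient method would instead redraw `z` at every step and have no such clear stopping point.

```python
import math
import random


def saa_objective_and_grad(m, r, z, mu, sigma):
    """Negative ELBO (up to a constant) for q = N(m, exp(r)^2).

    The target is a toy N(mu, sigma^2) density (an illustrative assumption).
    The draws z are FIXED, so this is a deterministic function of (m, r).
    """
    s = math.exp(r)
    n = len(z)
    resid = [m + s * zi - mu for zi in z]
    value = sum(e * e for e in resid) / (2 * sigma**2 * n) - r
    grad_m = sum(resid) / (sigma**2 * n)
    grad_r = s * sum(e * zi for e, zi in zip(resid, z)) / (sigma**2 * n) - 1.0
    return value, grad_m, grad_r


def fit_saa(mu=3.0, sigma=2.0, n_draws=30, seed=0,
            step=0.1, tol=1e-8, max_iter=100_000):
    """Minimize the fixed-draw objective with plain gradient descent."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]  # drawn once, then reused
    m, r = 0.0, 0.0
    for it in range(max_iter):
        _, gm, gr = saa_objective_and_grad(m, r, z, mu, sigma)
        # Exact gradients give an unambiguous convergence test.
        if math.hypot(gm, gr) < tol:
            return m, math.exp(r), it, z
        m -= step * gm
        r -= step * gr
    raise RuntimeError("gradient descent did not converge")


if __name__ == "__main__":
    m, s, iters, _ = fit_saa()
    print(f"mean={m:.4f}  sd={s:.4f}  iterations={iters}")
```

With only 30 fixed draws, the fitted mean and standard deviation land near, but not exactly on, the true values of 3 and 2. That gap is the approximation error introduced by fixing the draws, and it is separate from the question of whether the optimizer itself has converged.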
Often, researchers expect sample average approximation to be more computationally intensive than a more popular method, called stochastic gradient, which is used by ADVI. But Broderick and her collaborators have shown that, in many applications, this is not the case.
“A lot of problems have really special structure, and you can use that special structure to be more efficient and get better performance. That's something we really saw in this paper,” she adds.
They tested DADVI on several real-world models and datasets, including a model used by economists to evaluate the effectiveness of microcredit loans and one used in ecology to determine whether a species is present at a particular site.
Across the board, they found that DADVI can estimate unknown parameters faster and more reliably than other methods, and that its accuracy is as good as or better than ADVI's. Because it is easier to use than other techniques, DADVI could provide a boost to scientists in a wide variety of fields.
In the future, the researchers want to dig deeper into correction methods for uncertainty estimates so they can better understand why these corrections produce such accurate uncertainties, and when they could fall short.
“In applied statistics, we often have to use approximate methods for problems that are too complex or high-dimensional to allow exact solutions to be computed in a reasonable time. This new paper provides an interesting set of theoretical and empirical results that point to an improvement over a popular existing approximate method for Bayesian inference,” says Andrew Gelman '85, '86, a professor of statistics and political science at Columbia University, who was not involved with the study. “As one of the team involved in the development of that earlier work, I'm excited to see our algorithm replaced by something more stable.”
This research was supported by a National Science Foundation CAREER Award and the U.S. Office of Naval Research.