Statistical Challenges in 21st Century Cosmology, 20-25 May, Valencia, Spain

Last week I was in Valencia for a conference on statistical methods in modern cosmology. The week began with a summer school for PhD students and a few postdocs on machine learning, sparsity and Bayesian methods. I was familiar with the Bayesian methods, but sparsity (dealing with data matrices where the majority of elements are zero) was completely new to me. I am looking forward to implementing some of the machine learning methods, perhaps for the Herschel Extragalactic Legacy Project or for work I am about to do for Public Health England (more about that in a later blog post).

The introductory lecture by Stéphane Mallat (École Normale Supérieure) gave an overview of neural network approaches to scientific problems. One particularly striking example was calculating molecular energies to higher accuracy than Density Functional Theory (DFT), and in far less time. My PhD research used DFT heavily and we were always limited by computing resources. The fact that a neural network can learn to predict ground state energies without including any physics in the model (!) was remarkable, to say the least. We are certainly entering a brave new world.

There were, however, some dissenting voices. Neural networks, and machine learning in general, need some work to make their results more reliable. Google has started work on TensorFlow Probability, which aims to attach some measure of uncertainty to results. These methods also generally require a representative training sample. Often we know that our samples are not representative, and we aim to model the selection biases explicitly. I think both of these issues need to be addressed before ‘classical’ methods such as Bayesian inference are consigned to history.
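To make the point about non-representative samples concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the Gaussian "population", the logistic `selection_probability` function and the sample size are all made up and do not come from any talk at the conference. It shows a naive mean being biased by selection, and a simple inverse-probability weighting recovering the population value when the selection function is known.

```python
import math
import random


def selection_probability(x):
    """Hypothetical selection function: larger values are more likely to be observed."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n=20000, seed=1):
    """Draw a toy population and the biased subset that survives selection."""
    rng = random.Random(seed)
    population = [rng.gauss(0.0, 1.0) for _ in range(n)]
    observed = [x for x in population if rng.random() < selection_probability(x)]
    return population, observed


def naive_mean(sample):
    """Plain average, ignoring how the sample was selected."""
    return sum(sample) / len(sample)


def weighted_mean(sample):
    """Average with each object weighted by 1 / (probability it was selected)."""
    weights = [1.0 / selection_probability(x) for x in sample]
    return sum(w * x for w, x in zip(weights, sample)) / sum(weights)


population, observed = simulate()
true_mean = naive_mean(population)
biased = naive_mean(observed)
corrected = weighted_mean(observed)

print(f"population mean:          {true_mean:+.3f}")
print(f"naive mean of observed:   {biased:+.3f}")
print(f"selection-corrected mean: {corrected:+.3f}")
```

In real survey work the selection function is rarely known this cleanly, which is exactly why it usually has to be modelled alongside the quantity of interest.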

I also presented a poster on ongoing work on deblending. Now that we have a prototype algorithm, I need to get on with implementing and testing it. It was great to see talks by Peter Melchior (Princeton) and Rachel Mandelbaum (Carnegie Mellon), both of which drew attention to the problem of blending for pretty much all science cases from the Large Synoptic Survey Telescope (LSST) and the space telescope Euclid. Clearly this problem is not going to go away, and analysis of galaxy images will be limited by blending issues in the near future.

You can see the poster here.

I would recommend that any PhD students or postdocs attend future summer schools and conferences. It was excellent to see so many researchers from around the world working on problems related to my research. The summer school offered an excellent introduction to modern statistical methods that can be quite simple to implement and may help you with your own research.