This week I have been in Crete for a conference on the Space Infrared Telescope for Cosmology and Astrophysics (SPICA). SPICA is a proposed far-infrared telescope with a slightly scaled-down Herschel-style mirror that, crucially, will be cooled to below 8 K, giving it a huge increase in sensitivity over the larger but warm (80 K) Herschel mirror. The cold mirror means that the telescope will be six orders of magnitude more sensitive than Herschel and will no longer be limited by the telescope’s own emission but by the true far-infrared background. Figure 1 shows the current design for the telescope, which is aiming for a launch around 2030.
Figure 1. The SPICA telescope design.
There is currently no instrument observing the far-infrared region of the electromagnetic spectrum that lies between the James Webb Space Telescope (JWST, 2021 launch) and the Atacama Large Millimetre Array (ALMA). The telescope will feature three instruments covering imaging, spectroscopy, and polarimetry over the range 12-230 micrometres.
The week in Crete has involved investigating all the possible science that can be done with the instrument. This is critical to the designers of the telescope when making engineering decisions about the three instruments. When designing an experiment it is critical to iterate between thinking about key science questions that could be answered by conceivable technology and then designing possible configurations. It is then possible to go back to work on possible measurements and how the designs could be improved and so on until we converge on a compromise between world class science, cost, and technological feasibility.
Figure 2. The obligatory group photo.
SPICA is an extremely exciting prospect for investigating the far-infrared universe at a time when no other instrument will be working in that region. The Polycyclic Aromatic Hydrocarbon (PAH) features in galaxy spectra at those wavelengths contain a wealth of information about star formation and the redshift of objects. There are three proposals for the European Space Agency’s ‘M5’ call, which will be decided in around a year. The competition will be fierce but SPICA would certainly have a huge impact on many areas of astronomy.
Over the last two days I have been attending a celebration of the ten-year anniversary (1512h 14 May 2009) of the launch of the Herschel space observatory. It has been a pleasure to visit the European Space Astronomy Centre just outside Madrid. The centre is dotted with numerous models of European Space Agency telescopes.
Herschel was launched together with Planck. There is still an ongoing connection between the two instruments and it was fascinating to see the engineering behind launching two satellites together.
It was inspiring to see the great range of science that was done with the instrument. I have to admit I knew very little about the aspects outside my own field of extragalactic astronomy (extraextragalactic?). The whole enterprise has been ongoing for around thirty years and clearly occupied a large part of a number of people’s careers. It is testament to their hard work that the data is still leading to new scientific results.
For the last three months, I have been working with researchers from across the physical sciences at the University of Sussex on some software to classify videos of court cases as either deceitful or truthful. Building on work by Yusufu Shehu, we have constructed a neural network (a type of software partly inspired by neurons in the human brain) that can classify these videos with an accuracy of over 80%. Figure 1 shows our network design. For comparison, human ability to spot lying is often below 60%. The software could in theory be trained on other data sets, and we are actively looking for people in commercial areas who might be interested.
One interesting aspect of neural networks is that we don’t tell the network how to work, but rather we train it. This means that the network could be trained on other features. Any combination of video, audio, and text can in principle be classified into any sets where there is information in the data. Some possible applications in other situations are classifying phone conversations according to emotional tone or likelihood of a successful sale, classifying music into genres, classifying speakers by gender or age, or identifying specific speakers for security reasons. Figure 2 shows some clips from the court case videos. A court case is a very particular environment, so we are interested in applying the software to other settings.
A paper in 2018 by Krishnamurthy et al. developed a ‘multi-modal neural network’ and trained it on 120 court case videos taken from the Miami University deception detection database (Lloyd et al. 2019). They showed that it was possible to detect deception with an accuracy of 75%, a significant improvement on human performance. We have further developed their model and achieved improved results. In the multi-modal network video, text transcripts, audio and ‘micro-features’ are treated independently and then the results are combined to get a final probability. Figure 3 shows how the network is designed.
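To make the idea of combining independent modalities concrete, here is a minimal sketch of the final combination step. It assumes each branch (video, audio, text, micro-features) has already produced its own probability that a clip is deceptive, and it combines them with a simple weighted average. The function names, the example numbers, and the averaging rule itself are illustrative assumptions, not the actual network, which may learn the combination rather than use fixed weights.

```python
def fuse_probabilities(modality_probs, weights=None):
    """Combine per-modality P(deceptive) values into one probability.

    modality_probs: dict mapping a modality name to a probability in [0, 1].
    weights: optional dict of non-negative weights (hypothetical; defaults to equal).
    """
    names = sorted(modality_probs)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    # Weighted average of the independent branch outputs.
    return sum(weights[name] * modality_probs[name] for name in names) / total


def classify(modality_probs, threshold=0.5):
    """Return a label and the fused probability for one clip."""
    p = fuse_probabilities(modality_probs)
    label = "deceptive" if p >= threshold else "truthful"
    return label, p


# Made-up branch outputs for a single clip (illustration only).
example = {"video": 0.7, "audio": 0.6, "text": 0.4, "micro_features": 0.9}
label, p = classify(example)
print(label, round(p, 2))
```

Treating the branches independently until the last step means a modality can be dropped (audio only, say) by simply leaving it out of the dictionary.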
We are interested to have conversations with potential industry partners who might wish to take this forward with us. Please don’t hesitate to get in touch if you think this research could be useful to you. We are interested in applying these networks both to video and also to audio only. We see particular possibility for collaboration with an industrial partner in an area that relies on large volumes of audio data from, for instance, telephone calls.
My work on deblending is concentrating on more expensive and robust methods that could not be applied in the pipeline which must be run essentially every night on incoming data. It was clear that other methods must be developed for each given science case. There will have to be more work on resolved objects for instance. The real challenge is that the objects are not point sources. It is this combination of resolved and confused images that makes deblending such a challenge.
The conference was also a chance to find out more about the project as a whole, including updates on the construction. The telescope is really taking shape, as images from the El Peñón peak of Cerro Pachón show, and it is extremely exciting to see all the work, by scientists and engineers, going into the project’s success.
“Is it not curious, that so vast a being as the whale should see the world through so small an eye, and hear the thunder through an ear which is smaller than a hare’s? But if his eyes were broad as the lens of Herschel’s great telescope; and his ears capacious as the porches of cathedrals; would that make him any longer of sight, or sharper of hearing? Not at all.- Why then do you try to “enlarge” your mind? Subtilize it.” – Moby Dick.
For the last year I have been working on the Herschel Extragalactic Legacy Project (HELP), an EU-funded project to use far-infrared imaging from the Herschel Space Observatory to understand galaxy formation and evolution. We are gearing up for our first data release, DR1, on 1 October, but we are making a lot of the data available now for beta testing.
We are very keen for the astronomical community to start using this huge dataset, comprising 170 million galaxies over 1270 square degrees of extragalactic sky, and indeed to use and develop the code used to produce it. We have released all the code to perform the reduction on GitHub in the spirit of open science and reproducibility. The data can be accessed as raw data files from the Herschel Database at Marseille (HeDaM) and queried from a dedicated Virtual Observatory server. Although Herschel imaging has been the main focus of the project, we have taken public data from many different instruments spanning all the way from ultraviolet to radio. Tying together these different data sets is a major challenge and will be required to make the most of the upcoming wide surveys such as those from the Large Synoptic Survey Telescope (optical), the Euclid space telescope (optical) and the Square Kilometre Array (radio).
We are also in the process of setting up mirrors here at Sussex and I plan to blog more about that soon. There is a vast amount of data and we are working on squeezing every last ounce of science out of all the public data from a wide array of different instruments which make up the full multi-wavelength data we have collated.
If you have any questions about how to use this database please leave a comment or email me.
Last week I was in Valencia for a conference on statistical methods in modern cosmology. The week began with a summer school for PhD students and a few postdocs on machine learning, sparsity and Bayesian methods. I was familiar with the Bayesian methods, but sparsity (dealing with data matrices where the majority of elements are zero) was completely new. I am looking forward to implementing some of the machine learning methods, perhaps for the Herschel Extragalactic Legacy Project or for work I am about to do for Public Health England (more about that in a later blog post).
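For anyone else new to the idea, here is a small sketch of what a sparse data matrix looks like in practice. It stores only the non-zero entries as (row, column, value) triples and multiplies the matrix by a vector without ever touching the zeros. The helper names and the toy matrix are my own for illustration; real work would more likely use a dedicated sparse matrix library.

```python
def to_triples(dense):
    """Keep only the non-zero entries as (row, column, value) triples."""
    return [
        (i, j, value)
        for i, row in enumerate(dense)
        for j, value in enumerate(row)
        if value != 0
    ]


def sparsity(dense):
    """Fraction of elements in the matrix that are zero."""
    n_elements = sum(len(row) for row in dense)
    return 1 - len(to_triples(dense)) / n_elements


def sparse_matvec(triples, n_rows, x):
    """Multiply a matrix stored as triples by the vector x."""
    y = [0.0] * n_rows
    for i, j, value in triples:
        # Only non-zero entries contribute, so the zeros cost nothing.
        y[i] += value * x[j]
    return y


# Toy 3x4 matrix in which 9 of the 12 elements are zero.
dense = [
    [0, 0, 3, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 2],
]
triples = to_triples(dense)
y = sparse_matvec(triples, n_rows=3, x=[1, 2, 3, 4])
print(sparsity(dense), triples, y)
```

The saving is trivial here, but for a matrix with millions of elements and only a few thousand non-zero values the storage and the multiplication both scale with the non-zero count.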
The introductory lecture by Stéphane Mallat (École Normale Supérieure) gave an overview of neural network approaches to scientific problems. One particularly striking example was calculating molecule energies to higher accuracy than Density Functional Theory (DFT) in very short times. My PhD research used DFT heavily and we were always limited by computer resources. The fact that a neural network can learn how to predict ground state energies without including any physics in the model (!) was remarkable to say the least. We are certainly entering a brave new world.
There were, however, some dissenting voices. Neural networks, and machine learning in general, need some work to make results more reliable. Google has started work on TensorFlow Probability, which aims to assign some measure of errors to results. These methods also in general require a representative sample. Often we know that our samples are not representative and we aim to model selection biases. I think both of these issues need to be addressed before ‘classical’ methods such as Bayesian inference are consigned to history.
I also presented a poster on ongoing work on deblending. Now that we have a prototype algorithm I need to get on with implementing and testing. It was great to see talks by Peter Melchior (Princeton) and Rachel Mandelbaum (Carnegie Mellon), which both brought attention to the problem of blending for pretty much all science cases from the Large Synoptic Survey Telescope (LSST) and the space telescope Euclid. Clearly this problem is not going to go away and analysis of galaxy images will be limited by blending issues in the near future.
I would recommend that any PhD students or postdocs attend future summer schools and conferences. It was excellent to see so many researchers from around the world working on problems related to my research. The summer school offered an excellent introduction to modern statistical methods that can be quite simple to implement and may help you with your research.
The Herschel Extragalactic Legacy Project (HELP) is a European research initiative to capitalise on the vast imaging data that was collected by the Herschel space telescope. The figure below shows the 23 fields that comprise HELP overlaid on the Planck map of galactic dust. These are mainly the famous extragalactic fields and come in different sizes and depths.
Last week we had a conference here at Sussex to show the astronomy community the data we are about to release, discuss the methods used to create it and talk about the science results from Herschel and HELP, past, present and future.
I gave a talk on the HELP masterlist, the slides for which are available below.
We have a great deal of work to do to finish running the whole data pipeline for all 23 fields, containing photometry, photometric redshifts, a full analysis of the Herschel fluxes and fitted galaxy spectral energy distributions for all the Herschel objects. It will all be worth it when we start to see the science results come through from this very wide area data release covering around 1300 square degrees.
I spent last week at the Institute of Astronomy in Cambridge discussing how the UK can take advantage of the incredible imaging data that promises to be produced by the Large Synoptic Survey Telescope. The telescope is set to receive first light in 2019 and there is a vast amount of work to do to prepare for the deluge of data that is about to flow out of Chile. One of the challenges is making sure we make best use of UK expertise and work in close collaboration with the majority of LSST scientists in the US.
We were meeting to discuss how best to target UK research to complement work being done elsewhere. There are some definite niches available to us, partly because of access we have to some UK data and partly for the expertise in multiwavelength science that has been built up here.
There were a number of excellent talks about Active Galactic Nuclei (AGN) and galaxy formation based on studies right across the wavelengths (X-rays to radio waves). There were also a number of talks about photometric redshifts, which are of direct relevance to the Herschel Extragalactic Legacy Project (HELP) that we are currently working on in Sussex. Ultimately it seems that building some software within the LSST stack that can handle UK near-infrared images may be the best first step in preparing for possible multiwavelength LSST science.
We have around two years to prepare for the first LSST images and it is vital that we work to have software in place ready for it. On a personal note I think developing any code for multiwavelength pixel-based image analysis within the LSST software stack is an opportunity for us early career scientists to build expertise that will make us employable over the lifetime of LSST.
On a completely separate note, being back in Cambridge was a great chance to have a look around the West Cambridge site, which has changed drastically since I was an undergraduate at the Cavendish. I visited the Department of Chemical Engineering and Biotechnology, which was extremely impressive. There has clearly been a massive investment in the various science departments that have been built or extended there. I look forward to seeing how it continues to develop and all the research that will be generated there by what is essentially a load of geeks in a field.
I wasn’t very familiar with the MeerKAT International Giga-Hertz Tiered Extragalactic Exploration (MIGHTEE) survey or even the Karoo Array Telescope (MeerKAT)* which is a precursor to the enormously ambitious Square Kilometre Array (SKA). Gotta Love Physics Acronyms (GLPA). It reminded me what an exciting time to be doing astronomy it is with some huge data sets on the way at unprecedented scales. It was a chance to think about how to tie together the quite disparate data from various wavelength regimes which fed in quite well to the LSST meeting the following week.
A lot of the fields overlap with the LSST deep drilling fields as well as the Herschel extragalactic fields. The four fields are XMM-LSS, COSMOS, ELAIS-S1 and CDFS (names of areas on the sky that have been previously imaged)**. The challenge will be to move beyond the catalogue based cross matching done so far and towards dealing directly with pixel data.
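For reference, catalogue-based cross matching in its simplest form looks something like the sketch below. Each source in one catalogue is paired with the nearest source in another catalogue, provided the two lie within a search radius on the sky. The catalogue contents, the identifiers, and the 1 arcsecond radius are made up for illustration, and the brute-force loop is only practical for small catalogues. This is not the HELP or MIGHTEE code.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def cross_match(cat_a, cat_b, radius_arcsec=1.0):
    """Match each source in cat_a to its nearest neighbour in cat_b.

    Catalogues are dicts of id -> (ra_deg, dec_deg). A source with no
    neighbour inside the search radius is matched to None.
    """
    radius_deg = radius_arcsec / 3600
    matches = {}
    for id_a, (ra_a, dec_a) in cat_a.items():
        best_id, best_sep = None, radius_deg
        for id_b, (ra_b, dec_b) in cat_b.items():
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= best_sep:
                best_id, best_sep = id_b, sep
        matches[id_a] = best_id
    return matches


# Two tiny made-up catalogues with positions in degrees.
optical = {"opt1": (150.1000, 2.2000), "opt2": (150.2000, 2.3000)}
radio = {"rad1": (150.1001, 2.2001), "rad2": (151.0000, 2.0000)}
matches = cross_match(optical, radio, radius_arcsec=1.0)
print(matches)
```

The limitation is visible in the sketch: everything depends on positions from catalogues extracted separately at each wavelength, which is exactly what working directly with pixel data aims to get beyond.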
I did my masters project on the SKA back in 2006 and it is amazing to see it starting to take shape with actual radio dishes on the ground in South Africa.
Being in Oxford was also a useful opportunity to meet with other members of the Herschel Extragalactic Legacy Project (HELP) to talk about the last stages of the project and how we are going to deliver all the final data. That is something we can talk about further at the HELP meeting in Sussex in October.
* I can’t find where the Meer in MeerKAT comes from. I think there are actual meerkat populations near the telescope but this might be a prime example of acronym nesting.
** XMM-LSS: X-ray Multi Mirror telescope Large Scale Structure survey
COSMOS: Cosmological Evolution Survey***
ELAIS-S1: European Large Area ISO Survey, South 1
CDFS: Chandra Deep Field South
A couple of weeks ago I was in Boston for a meeting of the SERVS team and I thought I should get round to blogging about it. The small conference was organised by Anna Sajina at Tufts and was concerned with determining priorities for presenting and analysing data from the Spitzer telescope. I was there because a large part of my work is concerned with building a multiwavelength catalogue for the Herschel Extragalactic Legacy Project (HELP) and we are ingesting a number of Spitzer surveys including SERVS.
SERVS data is a key part of the HELP pipeline because we typically use the Infrared Array Camera (IRAC) fluxes to select objects to define our samples. It was also a chance to hear about all the research being done with these Spitzer fluxes, which cover the mid-infrared part of the spectrum.