Deep Learning of Quasar Spectra to Discover and Characterize Damped Lyα Systems

14 Sep 2017  ·  David Parks, J. Xavier Prochaska, Shawfeng Dong, Zheng Cai

We have designed, developed, and applied a convolutional neural network (CNN) architecture using multi-task learning to search for and characterize strong HI Lyα absorption in quasar spectra. Without any explicit modeling of the quasar continuum or application of the Lyα line profile predicted by quantum mechanics, our algorithm predicts the presence of strong HI absorption and estimates the corresponding redshift z_abs and HI column density N_HI, with emphasis on damped Lyα systems (DLAs, absorbers with log N_HI > 20.3). We tuned the CNN model using a custom training set of DLAs injected into DLA-free quasar spectra from the Sloan Digital Sky Survey (SDSS), data release 5 (DR5). Testing on a held-back validation set demonstrates a high incidence of DLAs recovered by the algorithm (97.4% as DLAs and 99% as an HI absorber with log N_HI > 19.5) and excellent estimates for z_abs and N_HI. Similar results are obtained against a human-generated survey of the SDSS DR5 dataset. The algorithm yields a low incidence of false positives and negatives but is challenged by overlapping DLAs and/or very high N_HI systems. We have applied this CNN model to the quasar spectra of SDSS-DR7 and the Baryon Oscillation Spectroscopic Survey (BOSS, data release 12) and provide catalogs of 4,913 and 50,969 DLAs, respectively (including 1,659 and 9,230 high-confidence DLAs that were previously unpublished). This work validates the application of deep learning techniques to astronomical spectra for both classification and quantitative measurements.
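
The multi-task idea in the abstract (one model that both classifies an absorber and measures its redshift and column density) can be illustrated with a minimal sketch of a combined loss. This is not the authors' implementation: the function names, the equal weighting of the terms, and the choice of squared error are assumptions made for illustration. Only the DLA threshold (log N_HI > 20.3) comes from the abstract, and the Lyα rest wavelength of 1215.67 Å is the standard value.

```python
import math

LYA_REST_ANGSTROM = 1215.67  # standard rest wavelength of HI Lyα
DLA_LOG_NHI = 20.3           # DLA threshold quoted in the abstract


def lya_observed_wavelength(z_abs):
    """Observed wavelength (Å) of Lyα absorption at redshift z_abs."""
    return LYA_REST_ANGSTROM * (1.0 + z_abs)


def is_dla(log_nhi):
    """True if the column density exceeds the DLA threshold."""
    return log_nhi > DLA_LOG_NHI


def multitask_loss(p_dla, pred_z, pred_log_nhi,
                   label, true_z, true_log_nhi, eps=1e-12):
    """Hypothetical per-sample multi-task loss (assumed form, equal weights).

    p_dla : predicted probability that an absorber is present
    label : 1 if an absorber is present in the training sample, else 0
    """
    # Classification task: binary cross-entropy on the presence label.
    bce = -(label * math.log(p_dla + eps)
            + (1 - label) * math.log(1.0 - p_dla + eps))
    # Regression tasks: squared error on z_abs and log N_HI, counted only
    # when an absorber is actually present (label == 1).
    regression = label * ((pred_z - true_z) ** 2
                          + (pred_log_nhi - true_log_nhi) ** 2)
    return bce + regression
```

Masking the regression terms by the label means that samples without an absorber contribute only to the classification task, which is one common way to train several outputs from a shared network.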
