Transcript of arXiv:2110.01012v1 [astro-ph.IM] 3 Oct 2021

Time delay estimation in unresolved lensed quasars

L. Biggio^a, A. Domi^b, S. Tosi^c, G. Vernardos^d, D. Ricci^e, L. Paganin^c, G. Bracco^c

^a Eidgenössische Technische Hochschule Zürich, Rämistrasse 101, CH–8092 Zürich, Switzerland

^b University of Amsterdam, Science Park 105, 1098 XG Amsterdam, the Netherlands

^c Università degli Studi di Genova and Istituto Nazionale di Fisica Nucleare (INFN), Sezione di Genova, via Dodecaneso 33, 16146, Genoa, Italy

^d Institute of Physics, Laboratory of Astrophysics, École Polytechnique Fédérale de Lausanne (EPFL), Observatoire de Sauverny, 1290 Versoix, Switzerland

^e Istituto Nazionale di Astrofisica (INAF), Osservatorio di Padova, Vicolo dell'Osservatorio 5, 35122 Padova, Italy

Abstract

The fractional uncertainty on the H0 measurement performed with time-delay cosmography decreases linearly with the number of analysed systems and is directly related to the uncertainty on the relative time delays measured between multiple images of gravitationally lensed sources. Analysing more lensed systems, and collecting data in regular, long-term monitoring campaigns, helps mitigate these uncertainties. The ideal instruments would be large telescopes, thanks to their high angular resolution, but the very large number of observational requests they must fulfil makes them hardly suitable for the purpose. Small/medium-sized telescopes, on the other hand, are much more accessible and often run more versatile observational programs. However, their limited resolving power and their often less privileged geographical locations may prevent them from providing well-separated images of lensed sources. Future campaigns are expected to discover a huge number of lensed quasar systems that small/medium-sized telescopes will not be able to resolve.
This work presents a deep learning approach that exploits the capabilities of convolutional neural networks to estimate the time delay in unresolved lensed quasar systems. Experiments on simulated unresolved light curves show the potential of the proposed method and pave the way for future applications in time-delay cosmography.

1. Introduction

It is well known that the Hubble parameter H0 quantifies the current expansion rate of the universe. However, measurements of H0 from different observations have led to a tension in its estimated value. In particular, early universe (EU) observations [1] yield H0 = 67.4 ± 0.5 km s^{-1} Mpc^{-1}, whereas late universe (LU) observations [2] give H0 = 74.03 ± 1.42 km s^{-1} Mpc^{-1}, resulting in a tension of about 4.4σ.

As first pointed out in Ref. [3], time-delay cosmography offers an alternative way of determining the Hubble parameter by measuring the time delay (ΔT) between multiple images of a gravitationally lensed quasar (GLQ). The time delay is related to the Hubble parameter as H0 ∝ 1/ΔT. The most relevant results obtained via time-delay cosmography come from the H0LiCOW collaboration [4], who found H0 = 73.3^{+1.7}_{-1.8} km s^{-1} Mpc^{-1} from a sample of six GLQs monitored by the COSMOGRAIL project [5]. This result, combined with the other LU observations [2], raises the H0 tension to 5.3σ. However, a more recent analysis of 40 strong gravitational lenses, from TDCOSMO+SLACS [6], found H0 = 67.4^{+4.1}_{-3.2} km s^{-1} Mpc^{-1}, relaxing the Hubble tension and demonstrating the importance of a deep understanding of the mass density profiles of the lenses.

In this context, further studies are needed for a more precise estimation of the Hubble parameter. To this end, the fractional error on H0, for an ensemble of N GLQs, is related to the uncertainties on the time-delay estimation σ_ΔT, the line-of-sight convergence σ_los and the lens surface density σ_⟨κ⟩ as [7]:

\[
\frac{\sigma_H^2}{H_0^2} \sim \frac{\sigma_{\Delta T}^2/\Delta T^2 + \sigma_{\mathrm{los}}^2 + \sigma_{\langle\kappa\rangle}^2}{N} \tag{1}
\]

where the first two terms are dominated by random uncertainties, so that their contribution to the overall uncertainty scales as N^{-1/2}. Therefore, increasing the sample of analysed GLQs reduces the uncertainty on H0.
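As a rough numerical illustration of the N^{-1/2} scaling implied by Eq. (1), the following sketch evaluates the right-hand side for assumed per-lens uncertainty values (the percentages are illustrative assumptions, not values from this work):

```python
# Illustrative only: evaluate Eq. (1) for assumed per-lens uncertainties to
# show how the fractional error on H0 shrinks as the sample size N grows.
import numpy as np

sigma_dt_over_dt = 0.03   # assumed 3% relative time-delay uncertainty
sigma_los = 0.02          # assumed line-of-sight convergence uncertainty
sigma_k = 0.03            # assumed lens surface density uncertainty

per_lens_var = sigma_dt_over_dt**2 + sigma_los**2 + sigma_k**2
for n in (1, 6, 40, 220):
    frac_err = np.sqrt(per_lens_var / n)   # sigma_H / H0 from Eq. (1)
    print(f"N = {n:3d}:  sigma_H / H0 ~ {100 * frac_err:.2f}%")
```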



Figure 1: Top-left: distribution of known GLQs as a function of the number of multiple images. Top-right: distribution of known GLQs as a function of the maximum image separation. Bottom (left and right): magnitude of the multiple images versus the maximum image separation. The grey region contains 70% of the total GLQ sample.

To date, a sample of about 220 GLQs is available (see https://research.ast.cam.ac.uk/lensedquasars/index.html). However, only a very small subset, with well-separated multiple images, has been used to measure H0, because it is easier and safer to extract information from such systems and, consequently, to reduce the uncertainty on H0.

Fig. 1 (bottom) shows the magnitude of the multiple images versus the maximum image separation for the known GLQs. Systems falling in the grey region, which represents 70% of the total sample, have a maximum image separation below 2 arcsec. In addition, Ref. [8] presents a new method to find GLQs from unresolved light curves; such systems have an even smaller separation between multiple images.

This would make large telescopes the ideal instruments for monitoring lensed quasars, due to their high sensitivity, their optimal geographical locations, where the effects of atmospheric turbulence are less prominent, and their state-of-the-art adaptive optics systems. However, because of the time scales of the intrinsic variations of the sources, such observation campaigns should last years [5]. Consequently, due to the number of observational requests that large telescopes have to fulfil, they can hardly be employed for these purposes. On the other hand, small/medium-sized telescopes (≈ 1–2 m) [9] can often guarantee better availability of observation time. Unfortunately, their already reduced sensitivity is further worsened by their often less privileged geographical positions, in terms of clear nights and atmospheric seeing, which can reach 3 arcsec [10].

The majority of GLQs already known, together with future discoveries, will mainly appear as a single image to small/medium-sized telescopes. For this reason, this work proposes a novel approach, based on deep learning (DL) algorithms, to estimate the time delay from unresolved GLQ light curves. For simplicity, this work focuses on doubly-imaged GLQs. This choice is further motivated by the fact that the majority of the already known systems are doubles [11, 12], as shown in Fig. 1 (top-left). However, a large number of quadruply-imaged GLQs is expected to be detected in the future [8]; an extension of this work to systems with N > 2 images is therefore planned.

The paper is structured as follows: Sec. 2 describes the DL-based method used for evaluating the time delay between multiple images. Sec. 3 describes the simulation of the light curves needed for training the DL algorithm. Sec. 4 shows the results of the proposed method on a test dataset.

2. Time delay estimation with deep learning

The method proposed here exploits the ability of modern convolutional neural networks (CNNs) to extract informative features directly from raw data, in an end-to-end fashion, given a pre-specified downstream task of interest. Here, this task is the estimation of the time delay between two unresolved quasar light curves.

The approach is motivated by the surprisingly good performance of machine learning and, in particular, deep learning (DL) methods in a wide range of engineering fields, including natural language processing, computer vision and speech recognition, and in the natural sciences, with remarkable results obtained in cosmology, fluid dynamics and molecular dynamics, among many other domains.


In particular, CNNs have been successfully used in astrophysics for automated tasks on large datasets from wide surveys [13, 14, 15, 16, 17, 18].

Most of these techniques are based on the supervised learning paradigm, i.e. labelled data are available and the algorithm can rely on explicit supervision signals. For most DL algorithms, the extent of such supervision is significant, meaning that large labelled datasets are needed to make learning effective. This results in excellent performance when labelled data are abundant and their collection is easy and inexpensive. These conditions, however, are often not satisfied and, in the absence of such labelled datasets, one must resort either to unsupervised or self-supervised learning strategies, for which only unlabelled data are used, or to synthetic data generation, producing labelled datasets in which the artificial data resemble the real ones as closely as possible.

This work follows the second route: the training data are constructed with the physics-grounded lens simulator described in Section 3. The benefit of this approach is that, within the limits of the available computational resources, arbitrarily large datasets can be generated. The downside is that the performance of the model at inference time, i.e. when tested on real data, will depend strongly on the fidelity of the artificial data with respect to the real ones. This problem is known as the sim2real gap and is a very important aspect of many disciplines, including robotics and computer vision.

The goal of this paper is to show that, given a simulator as accurate as possible (in the sense specified above), a fully data-driven CNN is able to retrieve the time delay from a single time series representing the overlap of two unresolved light curves. The approach is modular: if a more precise simulator becomes available, it can simply be used to generate new data and retrain our CNN models. Furthermore, since the models proposed in the literature have yielded state-of-the-art results for time-series classification and regression, and they are used in this work with very few changes compared to their original form, significant changes to the architectures should not be needed to obtain results in line with those shown here, even for data generated by a different simulator.

In the following, we motivate our choice of CNNs as our data-driven models and introduce the basic principles behind their architecture. We then detail some design choices and provide information about our training procedure.

2.1. Convolutional Neural Networks

CNNs were initially proposed in the context of computer vision applications such as image classification and segmentation. They differ from standard fully-connected neural networks (where each node in a given layer is linked to every node in the subsequent one) in that they implement a convolution operation, conferring on them two biologically-inspired properties: weight sharing and local connectivity. The first means that the same weights are applied repeatedly to different areas of the input data, whereas the second imposes that the action of such weights is realized only locally, on small regions of the input space. Modern CNNs consist of multiple stacked layers implementing this operation in a hierarchical fashion. Besides computer vision, CNNs have also been fruitfully applied to time-series regression and classification. The choice of such networks in these contexts is motivated by the structural assumptions (or inductive biases) underlying the design of CNN models. Such architectures implement a series of convolutions at each level of the hierarchy along their depth. They work by extracting local features from the raw input data, whose representation assumes increasingly higher levels of complexity in the deeper layers of the network.

This work assumes that possible traces of the magnitude of the time delay between two curves manifest themselves at a local level, motivating the choice of a CNN as feature extractor. The application of CNNs to the problem of time-delay estimation is described in the following subsection; first, a minimal numerical illustration of the two properties above is sketched below.
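The sketch uses a toy 3-tap kernel; all values are illustrative and are not part of this work:

```python
# Minimal illustration of the two CNN properties named above: the same small
# kernel (weight sharing) slides over the sequence and each output depends
# only on a short local window of the input (local connectivity).
import numpy as np

x = np.array([0., 1., 4., 9., 16., 25.])  # toy input sequence (n^2)
w = np.array([1., -2., 1.])               # one shared 3-tap kernel

# valid 1-D cross-correlation: each output uses a length-3 window of x
y = np.array([np.dot(w, x[i:i + 3]) for i in range(len(x) - 2)])
print(y)  # second finite difference of x: [2. 2. 2. 2.]
```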

2.2. Time-delay estimation with CNNs

The input of the CNN models is a time series representing an unresolved quasar light curve, $x = \{x_t\}_{t=1}^{T}$, where T is the length of the sequence. The output of the model is a real number $y \in \mathbb{R}$ indicating the time delay between the two curves overlapped in the input time series. Fig. 2 summarises the methodology. Specifically, models are trained on a generated dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$, where N is the total number of training examples and $y_i$ is the ground-truth time delay associated with the i-th training instance; a model sketch is given below.
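The following PyTorch sketch shows the input/output contract just described: a 1-D CNN mapping a light curve of length T to a single real output. The layer sizes and pooling choice are assumptions for illustration, not the architecture used in this work:

```python
# A minimal sketch of a 1-D CNN regressor: light curve (1, T) -> scalar delay.
import torch
import torch.nn as nn

class DelayCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),   # pool over time -> length-independent
        )
        self.head = nn.Linear(64, 1)   # single real output y: the time delay

    def forward(self, x):              # x: (batch, 1, T)
        return self.head(self.features(x).squeeze(-1)).squeeze(-1)

model = DelayCNN()
x = torch.randn(8, 1, 3000)           # batch of 8 simulated light curves
print(model(x).shape)                 # torch.Size([8])
```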



Figure 2: Overview of the proposed methodology. Artificial data are used to train the CNN model. Inference is then performed on the original non-resolved light curves using the trained model.

This initial dataset D is split into three parts: a training dataset Dtr, a validation dataset Dval and a testing dataset Dts. The first is used to train the weights of our model, the second to check the generalization performance during training, and the last to evaluate the model once the training phase has terminated; a split of this kind is sketched below. Such a split is necessary to monitor the occurrence of the so-called overfitting phenomenon, i.e. when the neural network simply memorizes the training set and does not generalize outside the training distribution.
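A minimal sketch of such a split; the 80/10/10 proportions and the helper name split_dataset are illustrative assumptions:

```python
# Shuffle the dataset once and carve out validation and test subsets.
import numpy as np

def split_dataset(X, y, rng, frac_val=0.1, frac_test=0.1):
    idx = rng.permutation(len(X))
    n_val, n_test = int(frac_val * len(X)), int(frac_test * len(X))
    test = idx[:n_test]
    val = idx[n_test:n_test + n_val]
    train = idx[n_test + n_val:]
    return (X[train], y[train]), (X[val], y[val]), (X[test], y[test])

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 300))          # 1000 toy light curves of length 300
y = rng.uniform(0.0, 100.0, size=1000)    # toy ground-truth delays in days
d_tr, d_val, d_ts = split_dataset(X, y, rng)
print(len(d_tr[0]), len(d_val[0]), len(d_ts[0]))   # 800 100 100
```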

The mean-squared error (MSE) is used as the loss function, measuring the error the network makes in predicting $\hat{y}$ instead of $y$:

\[
\mathcal{L} = \frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2 \tag{2}
\]

During training, the weights of the network are varied so that the value of this loss is minimized. The optimization algorithm used for this procedure is stochastic gradient descent; in this work, a popular variant of this algorithm called Adam [19] is used.

During training, we periodically evaluate the network on the validation set and check its performance in terms of the MSE. As is commonly done in the literature, whenever the validation loss decreases, the weights of the network at that step of training are saved; the sketch below illustrates the procedure.
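A sketch of this procedure in PyTorch, combining the MSE loss of Eq. (2), the Adam optimizer and checkpointing on validation improvement; hyper-parameters and the checkpoint filename are illustrative assumptions:

```python
# Training loop sketch: MSE loss, Adam optimizer, save on validation improvement.
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=100, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()                       # Eq. (2)
    best_val = float("inf")
    for epoch in range(epochs):
        model.train()
        for x, y in train_loader:                # mini-batch gradient steps
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            opt.step()
        model.eval()
        with torch.no_grad():                    # periodic validation pass
            val = sum(loss_fn(model(x), y).item() for x, y in val_loader)
        if val < best_val:                       # checkpoint on improvement
            best_val = val
            torch.save(model.state_dict(), "best_model.pt")
```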

2.3. Models

Three different CNN models are tested here on the time-delay estimation task. The first two, ResNet and InceptionTime, are the results of recent research efforts in the area of time-series classification and are adapted to our scope with minimal changes compared to their original implementation. The choice of these models is motivated by the wish to show that state-of-the-art time-series analysis neural networks can be efficiently applied, with minor modifications, to a challenging cosmological problem. To the best of the authors' knowledge, this is the first time ResNet and InceptionTime have been applied to a problem involving time series in the context of cosmology, and we hope our work can inspire future applications of such models. The third model is a relatively simple deep CNN, and it is fine-tuned to maximize the performance on the validation set.

3. Generation of Light Curve Datasets

The proposed method is based on mock data to model the lensed systems. Specifically, the MOLET software package [20] has been used to generate light curves of GLQ multiple images. MOLET also allows the inclusion of the microlensing effect, which strongly affects time-delay estimates [5]. In this work, two different lensed quasar systems have been simulated. The first, hereafter denoted as system A, is a basic test system for an AGN point source, with a simple intrinsic variability and with microlensing variability light curves generated from magnification maps. The second system is RXJ 1131-1231 [21], hereafter denoted as RXJ for brevity. It consists of four multiple images, here combined to mimic an unresolved doubly-imaged quasar.

The assumed cosmological model is ΛCDM with Ω_m = 0.3 and H0 = 67.7 km s^{-1} Mpc^{-1} for system A, whereas H0 = 84 km s^{-1} Mpc^{-1} for RXJ. This choice of parameters has been made for consistency with Ref. [20]. To properly simulate the observed light curves, the intrinsic variability of the quasar source is needed. The assumed intrinsic variability curves of the quasars of system A and RXJ are consistent with Ref. [20] and are shown in Fig. 3.

The second step of the simulation needs the magnification maps, which have been generated, for both systems, with the GERLUMPH tool [22]. Finally, the last simulation step also accounts for the assumed instrumental gaps (daily or seasonal), to simulate a realistic campaign with an optical telescope. More details on the simulation of RXJ can be found in Ref. [20].

2000 simulations of system A and 8000 simulations of RXJ are performed.



Figure 3: Assumed intrinsic variability of the quasar point source of system A (top) and RXJ 1131-1231 (bottom).

Each simulation produced four resolved light curves, one for each multiple image. Each light curve is actually composed of two curves: a continuous curve, which accounts for the intrinsic quasar variability as well as the microlensing effect, and a discrete curve, which also accounts for the realistic experimental gaps of a campaign with an optical telescope.

Since the goal is to measure the time delay in unresolved doubly-imaged systems as seen by medium/small optical telescopes, the continuous curves have been combined two at a time, to simulate the single light curve of an unresolved doubly-imaged GLQ. In this way, a total of six realisations of light curves of an effective doubly-imaged quasar are obtained from the light curves of the four images of the original GLQ. Each of these six light curves is therefore characterised by a different time delay, given by the absolute difference of the time delays of the underlying light curves.

The combination of the light curves is performed adopting the following definition of the magnitude of a generic source X:

\[
\mathrm{mag}_X = -2.5\,\log_{10} F_X + K, \tag{3}
\]

where F_X is the flux of the source X, namely the energy per unit time per unit area incident on the detector, and the constant K defines the zero point of the magnitude scale. When two images A and B of a GLQ cannot be resolved, a single image O will appear on the detector, with a flux given, to a first approximation, by the sum F_O = F_A + F_B of the fluxes of the two overlapped images. According to Equation (3), the corresponding magnitude is

\[
\mathrm{mag}_O(t) = -2.5\,\log_{10}\big(F_A(t) + F_B(t + \Delta t)\big) + K. \tag{4}
\]

If B is delayed by Δt with respect to A, the information about the time delay Δt should still be present in the features of the light curve of the image sum O. Using all the possible pairwise combinations of the multiple images, this yields 6 · 2000 = 12000 unresolved light curves for system A and 6 · 8000 = 48000 light curves for RXJ; a sketch of the combination is given below. Fig. 4 and Fig. 5 show ten sample light curves each, obtained from system A and RXJ respectively.
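The combination of Eqs. (3)-(4) can be sketched as follows; the interpolation used to shift curve B onto the common time grid and the toy input curves are assumptions of this illustration:

```python
# Eqs. (3)-(4): convert image magnitudes to fluxes, delay image B by dt,
# sum the fluxes, and convert the blended flux back to a magnitude.
import numpy as np

def unresolved_magnitude(t, mag_a, mag_b, dt, K=0.0):
    """Magnitude of the blended image O given curves A, B and delay dt."""
    flux_a = 10 ** (-0.4 * (mag_a - K))          # invert Eq. (3)
    mag_b_shifted = np.interp(t + dt, t, mag_b)  # B delayed by dt
    flux_b = 10 ** (-0.4 * (mag_b_shifted - K))
    return -2.5 * np.log10(flux_a + flux_b) + K  # Eq. (4)

t = np.arange(0.0, 3000.0, 1.0)                      # days
mag_a = 16.2 + 0.1 * np.sin(2 * np.pi * t / 500.0)   # toy intrinsic curve
mag_b = mag_a + 0.3                                  # fainter second image
mag_o = unresolved_magnitude(t, mag_a, mag_b, dt=30.0)
```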


Figure 4: Continuous sample light curves for System A.


Figure 5: Continuous sample light curves for RXJ.


4. Results

This section shows the performance of the three proposed CNN architectures on the previously described simulated data. All the results reported in the following evaluation are obtained on test data, namely light curves that our models never "see" during their training phase. The goal is indeed to verify their level of generalization on new test curves drawn from the same distribution as the training data. Note that, since the test data are generated with the same simulation engine used to produce the training set, the ability of our algorithms to generalize out-of-distribution is not tested here, i.e. we do not test their robustness with respect to the sim2real gap phenomenon.

The performance of the models on both system A and RXJ is analysed at different levels. The first evaluation studies the distribution of the error ε = t_predicted − t_true between the predicted time delay t_predicted and the ground-truth one t_true. Several training runs are performed for each model and, for this experiment, the best performing one for each architecture is selected. Kernel density estimation [23, 24] is used to approximate the error distributions for each model and each system; a sketch is given below. The results are shown in Fig. 6 for system A and in Fig. 7 for RXJ. For both systems, the distribution provided by the CNN model (shown in blue in the figures) is much narrower and symmetric around zero. ResNet (green) seems to outperform InceptionTime (red), even though both provide broader and more asymmetric curves than our CNN model.
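A sketch of this first evaluation, using SciPy's Gaussian KDE; the prediction arrays are placeholders standing in for actual model outputs on the test set:

```python
# Kernel density estimation of eps = t_predicted - t_true on the test set.
import numpy as np
from scipy.stats import gaussian_kde

# placeholders for the model outputs and ground truth on the test set
t_true = np.random.uniform(0.0, 100.0, size=2000)
t_pred = t_true + np.random.normal(0.0, 2.0, size=2000)

eps = t_pred - t_true
kde = gaussian_kde(eps)
grid = np.linspace(-15.0, 15.0, 301)
density = kde(grid)                    # curve to plot, as in Figs. 6-7
print(grid[np.argmax(density)])        # mode of the error distribution
```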


Figure 6: Kernel density estimation of the error distributions on the test set provided by our CNN model (blue), InceptionTime (red) and ResNet (green), for system A.


Figure 7: Kernel density estimation of the error distributions on the test set provided by our CNN model (blue), InceptionTime (red) and ResNet (green), for the RXJ system.

As a second evaluation, the distribution of the r² coefficient of determination [23, 24] is investigated, to establish to what extent prediction and ground truth are aligned. For each system, each model is trained 20 times, and the performance of each trained model is tested on the test set in terms of the r² score; a sketch of the procedure is given below. This analysis results in the so-called "violin plots" shown in Fig. 8 for system A and in Fig. 9 for RXJ. In both cases, the CNN (blue) is again the best performing model. However, in the case of system A, its r² distribution is characterized by a slightly larger variance, resulting in an overlap with the ResNet (green) r² distribution. InceptionTime is outperformed by the other two baselines on both systems and, in the case of RXJ, it produces a high-variance distribution, with r² scores ranging between a minimum of 0.6 and a maximum of 0.8.

In summary, the analysis in terms of error distribution and r² score highlights that all models perform very well on both systems. It is important to emphasize that InceptionTime and ResNet were not fine-tuned, in order to leave them as similar as possible to the original implementations. This was done on purpose, to showcase the flexibility of these models on very heterogeneous types of data. We therefore expect them to further improve their results with more careful architectural and hyper-parameter design choices.
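A sketch of this repeated-runs evaluation; train_fn and predict_fn are hypothetical stand-ins for the training and test-set prediction routines of each architecture:

```python
# Train each architecture 20 times and collect the test-set r2 score of
# every run; one such array per architecture feeds the violin plots.
import numpy as np
from sklearn.metrics import r2_score

def r2_over_runs(train_fn, predict_fn, n_runs=20):
    """train_fn() -> model; predict_fn(model) -> (y_true, y_pred) on Dts."""
    scores = []
    for seed in range(n_runs):
        np.random.seed(seed)          # one independent training run per seed
        model = train_fn()
        y_true, y_pred = predict_fn(model)
        scores.append(r2_score(y_true, y_pred))
    return np.array(scores)
```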



Figure 8: r² score distributions provided by the three considered models, for system A.


Figure 9: r² score distributions provided by the three considered models, for the RXJ system.

After having verified that the proposed methods generalize well on the test set, their robustness is assessed when noise is injected into the data. This operation effectively introduces a bias between the training and testing distributions, whose severity depends on the intensity of the perturbation. Fig. 10 and Fig. 11 show how the r² score of each model decreases as the standard deviation of the injected Gaussian noise increases; a sketch of the sweep is given below. Again, the CNN (blue) generally outperforms the other models. However, on system A, its curve tends to align with the InceptionTime one as the noise level increases. Interestingly, the CNN model seems to be more robust on RXJ, providing satisfactory values of the r² score even for relatively large noise standard deviations.
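A sketch of this robustness sweep; model.predict and the test arrays are assumed interfaces, not code from this work:

```python
# Corrupt the test curves with Gaussian noise of increasing standard
# deviation and record the r2 score, as in Figs. 10-11.
import numpy as np
from sklearn.metrics import r2_score

def r2_vs_noise(model, X_test, y_test, stds=(1e-5, 1e-4, 1e-3, 1e-2)):
    scores = []
    for std in stds:
        X_noisy = X_test + np.random.normal(0.0, std, size=X_test.shape)
        scores.append(r2_score(y_test, model.predict(X_noisy)))
    return scores
```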


Figure 10: r² scores for the three models as the injected noise standard deviation increases, for system A.


Figure 11: r² scores for the three models as the injected noise standard deviation increases, for the RXJ system.

As a last experiment, we study how the previous analysis changes if we train our best performing CNN on data perturbed by noise with variable standard deviation. To do so, at training time, for each batch of data, we randomly sample a value for the noise standard deviation from a uniform distribution between 10^{-5} and 10^{-2} and feed the corrupted data into the network; a sketch of the corruption step is given below. Once training has terminated, we repeat the previous experiments by testing the model on data corrupted by different noise perturbations. The results for system A and RXJ are shown in Figs. 12 and 13 respectively.
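A sketch of the per-batch corruption step; the helper name corrupt_batch is illustrative, and the loop it would sit in is the training sketch shown earlier:

```python
# Per batch: draw one noise standard deviation uniformly in [1e-5, 1e-2]
# and corrupt the inputs before the forward pass.
import torch

def corrupt_batch(x, low=1e-5, high=1e-2):
    """Add Gaussian noise with a per-batch random standard deviation."""
    std = torch.empty(1).uniform_(low, high).item()
    return x + std * torch.randn_like(x)
```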



Figure 12: r² scores for our CNN model trained with random noise injection, as the noise standard deviation increases, for system A.


Figure 13: r² scores for our CNN model trained with random noise injection, as the noise standard deviation increases, for the RXJ system.

The result of injecting noise at training time is, as one might have guessed, a model that is more robust to perturbations in the test data. On the other hand, this positive effect comes at the price of a slight degradation in performance when the noise level is low. This experiment suggests that randomly injecting noise into the data at training time represents, in this case, an effective strategy to obtain more robust models.

In light of the presented experiments, the proposed CNN architectures appear to guarantee satisfactory performance on the task of predicting the time delay from unresolved quasar light curves. Overall, the CNN model proposed here seems to outperform the others, even though more fine-tuning and more carefully designed choices might close this gap.

5. Conclusions

In this work, a new class of DL-based methods to extract the time delay from unresolved quasar light curves has been presented. Our method is motivated by two main factors: first and foremost, by the existing tension on the estimated values of the Hubble constant H0; second, by the necessity to process and extract information from unresolved quasar light curves, which are usually the data made available by small/medium-scale astronomical observatories. Our experimental evaluation shows that the approach performs excellently on simulated data describing two different lensed quasar systems.

The method proposed here has several advantages compared to classical approaches. First, as previously mentioned, it is designed to process unresolved light curves, which represent the main type of data small/medium-sized telescopes are expected to provide. Second, after an amortization phase due to network training, inference time is very small, which is a desirable property for online applications. Third, the method is fully data-driven, its performance scales favourably with the dataset size, and it makes few to no assumptions on the nature of the time-delay estimation problem.

On the other hand, the current implementation is still affected by some limitations that open new, interesting research questions. Specifically, the method is highly reliant on the quality of the simulated data and on their level of fidelity with respect to real quasar light curves. Preliminary experiments based on transferring the proposed approach to real data, not reported in this version of the manuscript, show a degradation in performance attributable to the above observation. This problem is a manifestation of the sim2real gap phenomenon described in Sec. 4. In order to alleviate it, the gap between training and test data must be reduced as much as possible. Conversely, one can take advantage of the combination of the sim2real gap phenomenon and our method to quantify the quality of the simulated data: the better the performance of the network, the more closely the simulator grasps the details of the data it is trying to emulate.


One last future improvement to the method discussed here is the processing of irregularly sampled time series, which are commonly encountered in the context of cosmological applications. These new possibilities will be investigated in future publications.

References

[1] N. Aghanim et al., Planck 2018 results, A&A 641 (2020) A6.

[2] A. G. Riess et al., Large Magellanic Cloud Cepheid standards provide a 1% foundation for the determination of the Hubble constant and stronger evidence for physics beyond ΛCDM, ApJ 876 (2019) 85.

[3] S. Refsdal, On the possibility of determining Hubble's parameter and the masses of galaxies from the gravitational lens effect, MNRAS 128 (1964) 307.

[4] K. C. Wong et al., H0LiCOW XIII. A 2.4% measurement of H0 from lensed quasars: 5.3σ tension between early- and late-Universe probes, MNRAS 498 (2019) 1420–1439.

[5] M. Millon et al., COSMOGRAIL XIX: Time delays in 18 strongly lensed quasars from 15 years of optical monitoring, A&A (2020).

[6] S. Birrer et al., TDCOSMO, A&A 643 (2020) A165.

[7] S. S. Tie and C. S. Kochanek, Microlensing makes lensed quasar time delays significantly time variable, MNRAS 473 (2017) 80–90.

[8] Y. Shu, V. Belokurov and N. W. Evans, Discovering strongly-lensed QSOs from unresolved light curves, 2020.

[9] U. Borgeest et al., A Dedicated Quasar Monitoring Telescope, in Examining the Big Bang and Diffuse Background Radiations, pp. 527–528, (1996).

[10] H. Karttunen, P. Kröger, H. Oja, M. Poutanen and K. J. Donner, Fundamental Astronomy, Springer-Verlag Berlin Heidelberg, 2017, 10.1007/978-3-662-53045-0.

[11] M. Oguri and P. J. Marshall, Gravitationally lensed quasars and supernovae in future wide-field optical imaging surveys, MNRAS 405 (2010) 2579–2593.

[12] T. E. Collett, The population of galaxy-galaxy strong lenses in forthcoming optical imaging surveys, ApJ 811 (2015).

[13] G. Cabrera-Vives et al., Supernovae detection by using convolutional neural networks, 2016 International Joint Conference on Neural Networks (IJCNN) (2016) 251.

[14] A. Kimura et al., Single-epoch supernova classification with deep convolutional neural networks, IEEE 37th International Conference on Distributed Computing Systems Workshops (ICDCSW) (2017).

[15] D. George, H. Shen and E. A. Huerta, Deep transfer learning: A new deep learning glitch classification method for advanced LIGO, Physical Review D 97 (2017).

[16] K. Schawinski et al., Generative adversarial networks recover features in astrophysical images of galaxies beyond the deconvolution limit, MNRAS 467 (2017) 110.

[17] N. Sedaghat and A. Mahabal, Effective image differencing with convnets for real-time transient hunting, MNRAS 476 (2018).

[18] C. J. Shallue and A. Vanderburg, Identifying exoplanets with deep learning: A five-planet resonant chain around Kepler-80 and an eighth planet around Kepler-90, The Astronomical Journal 155 (2018).

[19] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, 2015.

[20] G. Vernardos, Simulating time-varying strong lenses, arXiv:2106.04344.

[21] J. F. Claeskens, D. Sluse, P. Riaud and J. Surdej, Multi wavelength study of the gravitational lens system RXS J1131-1231. II. Lens model and source reconstruction, A&A 451 (2006) 865 [astro-ph/0602309].

[22] F. Neira et al., A quasar microlensing light-curve generator for LSST, MNRAS 495 (2020) 544–553.

[23] B. W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman and Hall, 1986.

[24] D. W. Scott, Multivariate Density Estimation, Wiley, 1992.
