Fermi Gamma-ray Space Telescope

Likelihood Analysis with Python

The python likelihood tools are a very powerful set of analysis tools that expand upon the command line tools provided with the Fermi Science Tools package. Not only can you perform all of the same likelihood analysis with the python tools that you can with the standard command line tools but you can directly access all of the model parameters. You can more easily script a standard analysis like light curve generation. There are also a few things built into the python tools that are not available from the command line like the calculation of upper limits.

There are many user contributed packages built upon the python backbone of the Science Tools and we are going to highlight and use a few of them in this tutorial like likeSED, make3FGLxml, and the LATAnalysisScripts.

This sample analysis is based on the PG 1553+113 analysis performed by the LAT team and described in Abdo, A. A. et al. 2010, ApJ, 708, 1310. At certain points we will refer to this article as well as the Cicerone. After you complete this tutorial you should be able to reproduce all of the data analysis performed in this publication, including generating a spectrum (individual bins and a butterfly plot) and producing a light curve with the python tools. This tutorial assumes you have the most recent ScienceTools installed. We will also make significant use of python, so you might want to familiarize yourself with it (there's a beginner's guide at http://wiki.python.org/moin/BeginnersGuide). This tutorial also assumes that you've gone through the non-python based unbinned likelihood thread. This tutorial should take approximately 8 hours to complete (depending on your computer's speed) if you do everything, but there are some steps you can skip along the way which shave off about 4 hours of that.

Note: In this tutorial, commands that start with '[user@localhost]$' are to be executed within a shell environment (bash, c-shell etc.) while commands that start with '>>>' are to be executed within an interactive python environment. Commands are in bold and outputs are in normal font. Text in ITALICS within the boxes are comments from the author.

Get the Data

For this thread the original data were extracted from the LAT data server with the following selections (these selections are similar to those in the paper):

  • Search Center (RA,Dec) = (238.929,11.1901)
  • Radius = 20 degrees
  • Start Time (MET) = 239557417 seconds (2008-08-04T15:43:37)
  • Stop Time (MET) = 256970880 seconds (2009-02-22T04:48:00)
  • Minimum Energy = 100 MeV
  • Maximum Energy = 300000 MeV

We've provided direct links to the event files as well as the spacecraft data file if you don't want to take the time to use the download server. For more information on how to download LAT data please see the Extract LAT Data tutorial.

You'll first need to make a file list with the names of your input event files:

[user@localhost]$ ls -1 *PH*.fits > PG1553_events.list

In the following analysis we've assumed that you've named your list of data files PG1553_events.list and the spacecraft file PG1553_SC.fits.

Perform Event Selections

You could follow the unbinned likelihood tutorial to perform your event selections using gtselect, gtmktime etc. directly from the command line, and then use pylikelihood later. But we're going to go ahead and use python. The gt_apps module provides methods to call these tools from within python. This'll get us used to using python.

So, let's jump into python:

[user@localhost]$ python
Python 2.7.8 (default, Apr 8 2015, 02:38:01)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Ok, we want to run gtselect from inside python, but we first need to import the gt_apps module to gain access to it.

>>> import gt_apps as my_apps

Now, you can see what objects are part of the gt_apps module by executing:

>>> help(my_apps)

Which brings up the help documentation for that module (type 'q' to exit). The python object for gtselect is called filter and we first need to set all of its options. This is very similar to calling gtselect from the command line and inputting all of the variables interactively. It might not seem that convenient to do it this way but it's really nice once you start building up scripts (see Building a Light Curve) and reading back options and such. For example, towards the end of this thread, we'll want to generate a light curve and we'll have to run the likelihood analysis for each datapoint. It'll be much easier to do all of this within python and change the tmin and tmax in an iterative fashion. Note that these python objects are just wrappers for the standalone tools so if you want any information on their options, see the corresponding documentation for the standalone tool.

>>> my_apps.filter['evclass'] = 128
>>> my_apps.filter['evtype'] = 3
>>> my_apps.filter['ra'] = 238.929
>>> my_apps.filter['dec'] = 11.1901
>>> my_apps.filter['rad'] = 10
>>> my_apps.filter['emin'] = 100
>>> my_apps.filter['emax'] = 300000
>>> my_apps.filter['zmax'] = 90
>>> my_apps.filter['tmin'] = 239557417
>>> my_apps.filter['tmax'] = 256970880
>>> my_apps.filter['infile'] = '@PG1553_events.list'
>>> my_apps.filter['outfile'] = 'PG1553_filtered.fits'

Once this is done, run gtselect:

>>> my_apps.filter.run()

time -p ~ScienceTools/x86_64-unknown-linux-gnu-libc2.12/bin/gtselect
infile=@PG1553_events.list outfile=PG1553_filtered.fits ra=238.929
dec=11.1901 rad=10.0 tmin=239557417.0 tmax=256970880.0 emin=100.0
emax=300000.0 zmax=90.0 evclsmin=0 evclsmax=10 evclass=128 evtype=3
convtype=-1 phasemin=0.0 phasemax=1.0 evtable="EVENTS" chatter=2
clobber=yes debug=no gui=no mode="ql"
real 0.73
user 0.48
sys 0.10

Note that you can see exactly what gtselect will do if you run it by typing:

>>> my_apps.filter.command()

'time -p ~/ScienceTools/x86_64-unknown-linux-gnu-libc2.12/bin/gtselect
infile=@PG1553_events.list outfile=PG1553_filtered.fits ra=238.929
dec=11.1901 rad=10.0 tmin=239557417.0 tmax=256970880.0 emin=100.0
emax=300000.0 zmax=90.0 evclsmin=0 evclsmax=10 evclass=128 evtype=3
convtype=-1 phasemin=0.0 phasemax=1.0 evtable="EVENTS" chatter=2
clobber=yes debug=no gui=no mode="ql"'

You have access to any of the inputs by directly accessing the filter['OPTIONS'] options.
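
For example, a quick (illustrative) read-back of a couple of the options we just set would look like this:

>>> my_apps.filter['rad']        # returns the current value of the rad parameter
>>> my_apps.filter['emax']       # any other option can be inspected (or reset) the same way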

Next, you need to run gtmktime. This is accessed within python via the maketime object:

>>> my_apps.maketime['scfile'] = 'PG1553_SC.fits'
>>> my_apps.maketime['filter'] = '(DATA_QUAL>0)&&(LAT_CONFIG==1)'
>>> my_apps.maketime['roicut'] = 'no'
>>> my_apps.maketime['evfile'] = 'PG1553_filtered.fits'
>>> my_apps.maketime['outfile'] = 'PG1553_filtered_gti.fits'
>>> my_apps.maketime.run()

time -p ~/ScienceTools/x86_64-unknown-linux-gnu-libc2.12/bin/gtmktime
scfile=PG1553_SC.fits sctable="SC_DATA"
roicut=no evfile=PG1553_filtered.fits evtable="EVENTS"
outfile="PG1553_filtered_gti.fits" apply_filter=yes overwrite=no
header_obstimes=yes tstart=0.0 tstop=0.0 gtifile="default" chatter=2
clobber=yes debug=no gui=no mode="ql"
real 2.85
user 2.61
sys 0.17

We're using the most conservative and most commonly used cuts described in detail in the Cicerone.

Livetime Cubes and Exposure Maps

At this point, you could make a counts map of the events we just selected using gtbin (it's called evtbin within python) and I won't discourage you but we're going to go ahead and create a livetime cube and exposure map. This might take a few minutes to complete so if you want to create a counts map and have a look at it, get these processes going and open another terminal to work on your counts map (see the likelihood tutorial for an example of running gtbin to produce a counts map).
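
If you do want that quick look, a counts map can be made with the evtbin object; here is a minimal sketch (the pixel size, map dimensions, and output file name are just illustrative choices, not values from the paper):

>>> my_apps.evtbin['algorithm'] = 'CMAP'
>>> my_apps.evtbin['evfile'] = 'PG1553_filtered_gti.fits'
>>> my_apps.evtbin['outfile'] = 'PG1553_cmap.fits'
>>> my_apps.evtbin['scfile'] = 'NONE'
>>> my_apps.evtbin['nxpix'] = 160
>>> my_apps.evtbin['nypix'] = 160
>>> my_apps.evtbin['binsz'] = 0.25
>>> my_apps.evtbin['coordsys'] = 'CEL'
>>> my_apps.evtbin['xref'] = 238.929
>>> my_apps.evtbin['yref'] = 11.1901
>>> my_apps.evtbin['axisrot'] = 0
>>> my_apps.evtbin['proj'] = 'AIT'
>>> my_apps.evtbin.run()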

Livetime Cube

This step will take approximately 15 - 30 minutes to complete so if you want to just download the PG1553_ltCube from us you can skip this step.

>>> my_apps.expCube['evfile'] = 'PG1553_filtered_gti.fits'
>>> my_apps.expCube['scfile'] = 'PG1553_SC.fits'
>>> my_apps.expCube['outfile'] = 'PG1553_ltCube.fits'
>>> my_apps.expCube['zmax'] = 90
>>> my_apps.expCube['dcostheta'] = 0.025
>>> my_apps.expCube['binsz'] = 1
>>> my_apps.expCube.run()

time -p ~/ScienceTools/x86_64-unknown-linux-gnu-libc2.12/bin/gtltcube
evfile="PG1553_filtered_gti.fits" evtable="EVENTS" scfile=PG1553_SC.fits
sctable="SC_DATA" outfile=PG1553_ltCube.fits dcostheta=0.025 binsz=1.0
phibins=0 tmin=0.0 tmax=0.0 file_version="1" zmax=180.0 chatter=2
clobber=yes debug=no gui=no mode="ql"
Working on file PG1553_SC.fits
real 747.65
user 744.81
sys 0.93

While you're waiting, you might have noticed that not all of the command line science tools have an equivalent object in gt_apps. This is easy to fix. Say you want to use gtltcubesun from within python. Just make it a GtApp:

>>> from GtApp import GtApp
>>> expCubeSun = GtApp('gtltcubesun','Likelihood')
>>> expCubeSun.command()

time -p ~/x86_64-unknown-linux-gnu-libc2.12/bin/gtltcubesun
evfile="" evtable="EVENTS" scfile= sctable="SC_DATA"
outfile=expCube.fits body="SUN" dcostheta=0.025 thetasunmax=180.0
binsz=0.5 phibins=0 tmin=0.0 tmax=0.0 powerbinsun=2.7 file_version="1"
zmax=180.0 chatter=2 clobber=yes debug=no gui=no mode="ql"

Exposure Map

>>> my_apps.expMap['evfile'] = 'PG1553_filtered_gti.fits'
>>> my_apps.expMap['scfile'] = 'PG1553_SC.fits'
>>> my_apps.expMap['expcube'] = 'PG1553_ltCube.fits'
>>> my_apps.expMap['outfile'] = 'PG1553_expMap.fits'
>>> my_apps.expMap['irfs'] = 'CALDB'
>>> my_apps.expMap['srcrad'] = 20
>>> my_apps.expMap['nlong'] = 120
>>> my_apps.expMap['nlat'] = 120
>>> my_apps.expMap['nenergies'] = 37
>>> my_apps.expMap.run()

time -p ~/ScienceTools/x86_64-unknown-linux-gnu-libc2.12/bin/gtexpmap
evfile=PG1553_filtered_gti.fits evtable="EVENTS" scfile=PG1553_SC.fits
sctable="SC_DATA" expcube=PG1553_ltCube.fits
outfile=PG1553_expMap.fits irfs="CALDB" srcrad=20.0 nlong=120
nlat=120 nenergies=37 submap=no nlongmin=0 nlongmax=0 nlatmin=0
nlatmax=0 chatter=2 clobber=yes debug=no gui=no mode="ql"

The exposure maps generated by this tool are meant
to be used for *unbinned* likelihood analysis only.
Do not use them for binned analyses.
Computing the ExposureMap using PG1553_ltCube.fits
real 348.94
user 346.99
sys 0.51

Generate XML Model File

We need to create an XML file with all of the sources of interest within the Region of Interest (ROI) of PG 1553+113 so we can correctly model the background. For more information on the format of the model file and how to create one, see the likelihood analysis tutorial. We'll use the user contributed tool make3FGLxml.py to create a model file based on the 4-year LAT catalog (3FGL). You'll need to download the FITS version of the catalog at http://fermi.gsfc.nasa.gov/ssc/data/access/lat/4yr_catalog/, get the make3FGLxml.py tool from the user contributed software page, and put them both in your working directory. Also make sure you have the most recent galactic diffuse and isotropic model files, which can be found at http://fermi.gsfc.nasa.gov/ssc/data/access/lat/BackgroundModels.html. The catalog and background models are also packaged with your installation of the ScienceTools and can be found at $FERMI_DIR/refdata/fermi/galdiffuse/.
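
If you'd rather use the copies bundled with your installation, you can simply copy them into your working directory from the shell (a sketch; the exact file names under $FERMI_DIR can differ between ScienceTools versions):

[user@localhost]$ cp $FERMI_DIR/refdata/fermi/galdiffuse/gll_iem_v06.fits .
[user@localhost]$ cp $FERMI_DIR/refdata/fermi/galdiffuse/iso_P8R2_SOURCE_V6_v06.txt .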

Now that we have all of the files we need, you can generate your model file in python:

>>> from make3FGLxml import *
This is make3FGLxml version 01.
>>> mymodel = srcList('gll_psc_v16.fit','PG1553_filtered_gti.fits','PG1553_model.xml')
>>> mymodel.makeModel('gll_iem_v06.fits', 'gll_iem_v06', 'iso_P8R2_SOURCE_V6_v06.txt', 'iso_P8R2_SOURCE_V6_v06')
Creating file and adding sources from 3FGL
Added 58 point sources and 0 extended sources

For more information on the make3FGLxml.py module, see the usage notes.

You should now have a file called 'PG1553_model.xml'. Open it up with your favorite text editor and take a look at it. It should look like PG1553_model.xml. There should be seven sources within 4 degrees of the ROI center, nine sources between 4 and 8 degrees, and eight sources between 8 and 12 degrees (4 of which are beyond 10 degrees and frozen). In all, there are 38 sources beyond 10 degrees. The script designates these as outside of the ROI (which is 10 degrees) and instructs us to leave all of the variables for these sources fixed. We'll agree with this (these sources are outside of our ROI but could still affect our fit since photons from these sources could fall within our ROI due to the LAT PSF). At the bottom of the file, the two diffuse sources are listed (the galactic diffuse and extragalactic isotropic).

Notice that we've deviated a bit from the published paper here. In the paper, the LAT team only included two sources; one from the 0FGL catalog and another, non-catalog source. This is because the later LAT catalogs had not been released at the time. However, these 3FGL sources are still in the data we've downloaded and should be modeled.

Back to looking at our XML model file, notice that all of the sources have been set up with various spectral models (see the Cicerone for details on the different spectral models) and the module we ran filled in the values for the spectrum from the 3FGL catalog. Also notice that PG 1553+113 is listed in the model file as 3FGL J1555.7+1111 with all of the parameters filled in for us. It's actually offset from the center of our ROI by 0.008 degrees. How nice! The only problem with this is that the 3FGL model, which was fit to multiple years of data (Log Parabola), will cause us some issues for the short time duration we are analyzing here. Therefore we want to change the model for 3FGL J1555.7+1111 to a simple power law for the purposes of this analysis thread. You'll have to modify the relevant python scripts on your own to match whatever source model may be relevant for your data. In the xml file, change the entry for 3FGL J1555.7+1111 from:

<source ROI_Center_Distance="0.008" name="3FGL J1555.7+1111" type="PointSource">
 <spectrum type="LogParabola">
 <!-- Source is 0.00802140915217 degrees away from ROI center -->
  <parameter free="1" max="1e4" min="1e-4" name="norm" scale="1e-12" value="4.82531809648"/>
  <parameter free="1" max="5.0" min="0.0" name="alpha" scale="1.0" value="1.60433"/>
  <parameter free="1" max="10.0" min="0.0" name="beta" scale="1.0" value="0.0388407"/>
  <parameter free="0" max="5e5" min="30" name="Eb" scale="1.0" value="1491.38"/>
 </spectrum>
 <spatialModel type="SkyDirFunction">
  <parameter free="0" max="360.0" min="-360.0" name="RA" scale="1.0" value="238.936"/>
  <parameter free="0" max="90" min="-90" name="DEC" scale="1.0" value="11.1939"/>
 </spatialModel>
</source>

to:

<source ROI_Center_Distance="0.008" name="3FGL J1555.7+1111" type="PointSource">
 <spectrum type="PowerLaw">
 <!-- Source is 0.00802140915217 degrees away from ROI center -->
  <parameter free="1" max="1e4" min="1e-4" name="Prefactor" scale="1e-11" value="1.0"/>
  <parameter free="1" max="10.0" min="0.0" name="Index" scale="-1.0" value="2.5"/>
  <parameter free="0" max="5e5" min="30" name="Scale" scale="1.0" value="500"/>
 </spectrum>
 <spatialModel type="SkyDirFunction">
  <parameter free="0" max="360.0" min="-360.0" name="RA" scale="1.0" value="238.936"/>
  <parameter free="0" max="90" min="-90" name="DEC" scale="1.0" value="11.1939"/>
 </spatialModel>
</source>

Save the file, return to your python window, and you're off to the races once more.

Compute the diffuse source responses.

The diffuse source responses tell the likelihood fitter what the expected contribution would be for each diffuse source, given the livetime associated with each event. The source model XML file must contain all of the diffuse sources to be fit. The gtdiffrsp tool will add one column to the event data file for each diffuse source. The diffuse response depends on the instrument response function (IRF), which must be in agreement with the selection of events, i.e. the event class and event type we are using in our analysis. Since we are using SOURCE class, CALDB should use the P8R2_SOURCE_V6 IRF for this tool.

If the diffuse responses are not precomputed using gtdiffrsp, then the gtlike tool will compute them at runtime (during the next step). However, as this step is very computationally intensive (often taking ~hours to complete), and it is very likely you will need to run gtlike more than once, it is probably wise to precompute these quantities.

>>> my_apps.diffResps['evfile'] = 'PG1553_filtered_gti.fits'
>>> my_apps.diffResps['scfile'] = 'PG1553_SC.fits'
>>> my_apps.diffResps['srcmdl'] = 'PG1553_model.xml'
>>> my_apps.diffResps['irfs'] = 'CALDB'
>>> my_apps.diffResps.run()
time -p /usr/local/ScienceTools/bin/gtdiffrsp evfile=PG1553_filtered_gti.fits
evtable="EVENTS" scfile=PG1553_SC.fits sctable="SC_DATA" srcmdl=PG1553_model.xml
irfs="CALDB" evclsmin=0 evclass="INDEF" evtype="INDEF" convert=no chatter=2
clobber=no debug=no gui=no mode="ql"
adding source gll_iem_v06
adding source iso_P8R2_SOURCE_V6_v06
Working on...
real 2725.55
user 2713.77
sys 3.47

Run the Likelihood Analysis

It's time to actually run the likelihood analysis now. First, you need to import the pyLikelihood module and then the UnbinnedAnalysis functions (there's also a binned analysis module that you can import to do binned likelihood analysis which behaves almost exactly the same as the unbinned analysis module). For more details on the pyLikelihood module, check out the pyLikelihood Usage Notes.

>>> import pyLikelihood
>>> from UnbinnedAnalysis import *
>>> obs = UnbinnedObs('PG1553_filtered_gti.fits','PG1553_SC.fits',expMap='PG1553_expMap.fits',expCube='PG1553_ltCube.fits',irfs='CALDB')

>>> like = UnbinnedAnalysis(obs,'PG1553_model.xml',optimizer='NewMinuit')

By now, you'll have two objects: 'obs', an UnbinnedObs object, and 'like', an UnbinnedAnalysis object. You can view these objects' attributes and set them from the command line in various ways. For example:

>>> print obs
Event file(s): PG1553_filtered_gti.fits
Spacecraft file(s): PG1553_SC.fits
Exposure map: PG1553_expMap.fits
Exposure cube: PG1553_ltCube.fits
>>> print like
Event file(s): PG1553_filtered_gti.fits
Spacecraft file(s): PG1553_SC.fits
Exposure map: PG1553_expMap.fits
Exposure cube: PG1553_ltCube.fits
Source model file: PG1553_model.xml
Optimizer: NewMinuit

or you can get directly at the object's attributes and methods by:

>>> dir(like)
['NpredValue', 'Ts', 'Ts_old', '_Nobs', '__call__', '__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setitem__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_errors', '_importPlotter', '_inputs', '_isDiffuseOrNearby', '_minosIndexError', '_npredValues', '_plotData', '_plotResiduals', '_plotSource', '_renorm', '_separation', '_setSourceAttributes', '_srcCnts', '_srcDialog', '_xrange', 'addSource', 'covar_is_current', 'covariance', 'deleteSource', 'disp', 'e_vals', 'energies', 'energyFlux', 'energyFluxError', 'fit', 'flux', 'fluxError', 'freePars', 'freeze', 'getExtraSourceAttributes', 'logLike', 'maxdist', 'minosError', 'model', 'nobs', 'normPar', 'observation', 'oplot', 'optObject', 'optimize', 'optimizer', 'par_index', 'params', 'plot', 'plotSource', 'plotSourceFit', 'reset_ebounds', 'resids', 'restoreBestFit', 'saveCurrentFit', 'scan', 'setFitTolType', 'setFreeFlag', 'setPlotter', 'setSpectrum', 'sourceFitPlots', 'sourceFitResids', 'sourceNames', 'srcModel', 'state', 'syncSrcParams', 'thaw', 'tol', 'tolType', 'total_nobs', 'writeCountsSpectra', 'writeXml']

or get even more details by executing:

>>> help(like)

There are a lot of attributes and here you start to see the power of using pyLikelihood since you'll be able (once the fit is done) to access any of these attributes directly within python and use them in your own scripts. For example, you can see that the like object has a 'tol' attribute which we can read back to see what it is and then set it to what we want it to be.

>>> like.tol
>>> like.tolType
The tolType can be '0' for relative or '1' for absolute.
>>> like.tol = 0.0001
Now, we're ready to do the actual fit. This next step can take anywhere from 10 minutes to a few hours to complete, depending on your system. We're doing something a bit fancy here. We're creating the minimizer object (and calling it 'likeobj') from the logLike object so that we can access it later. We pass this object to the fit routine so that it knows which fitting object to use. We're also telling the code to calculate the covariance matrix so we can get at the errors.

>>> likeobj = pyLike.NewMinuit(like.logLike)
>>> like.fit(verbosity=0,covar=True,optObject=likeobj)

The number that is printed out here is the -log(Likelihood) of the total fit to the data. You can print the results of the fit by accessing like.model. You can also access the fit for a particular source by doing the following (the source name must match that in the XML model file).

>>> like.model['3FGL J1555.7+1111']
3FGL J1555.7+1111
Spectrum: PowerLaw
111 Prefactor: 3.395e+00 5.924e-02 1.000e-04 1.000e+04 ( 1.000e-11)
112     Index: 1.634e+00 8.154e-03 0.000e+00 1.000e+01 (-1.000e+00)
113     Scale: 5.000e+02 0.000e+00 3.000e+01 5.000e+05 ( 1.000e+00) fixed

You can plot the results of the fit by executing the plot command. The results are shown below:

>>> like.plot()
Plot of the likelihood fit
Plot of the fit residuals

The output of the plot function of the 'like' UnbinnedAnalysis object shows:

  • Left: the contribution of each of the objects in the model to the total model, and plots the data points on top.
  • Right: the residuals of the likelihood fit to the data. Notice that the fit is poor in the second to last bin.

Now, check if NewMinuit converged.

>>> print likeobj.getRetCode()

If you get anything other than '0' here, then NewMinuit didn't converge. There are several reasons why we might not get convergence:

  • The culprit is usually a parameter (or parameters) in the model that reach the limits set in the XML file. If this happens the minimizer cannot reach the formal minimum, and hence cannot calculate the curvature.
    • Often the problem is with spectral shape parameters (PL index, etc.), so when fitting a shorter time period, simply freezing the spectral shape parameters (certainly for the weaker background sources) to their values from the full time period may solve the problem. Remember that the 3FGL catalog used a full 4 years of data and we're using a much shorter time period here.
    • Weak background sources are more likely to cause problems, so you could consider just freezing them completely, or removing them from the model (a sketch of doing this programmatically is shown after this list). For example, a background source from the catalog that is detected at TS ≈ 25 in 2 years could cause convergence problems in a 1-month light curve, where it will often not be detectable at all.
  • If there are no parameters at their limits, then increasing the overall convergence tolerance may help - try using a value of 1E-8 for the absolute tolerance.
  • If that doesn't help then try to systematically simplify the model. Progressively freeze all sources, starting with those at the edge of the ROI and moving inward, until you get a model simple enough for the minimizer to work reliably. For example, if you are using a 10 degree ROI, you could start by freezing all background sources further than 7 degrees from the source of interest, and move to 5 degrees if that doesn't solve the problem.
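
If you'd rather not edit the XML by hand, here is a minimal sketch (an illustration using the freePars, par_index, and freeze methods, not a step from the published analysis) of freezing every free parameter of the weaker sources from within python; the TS < 9 threshold and the exclusion of the source of interest are just example choices:

>>> for source in like.sourceNames():
>>>    if source != '3FGL J1555.7+1111' and like.Ts(source) < 9:
>>>       for par in like.freePars(source):
>>>          like.freeze(like.par_index(source, par.getName()))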

Let's check the TS values of the sources in the model.

>>> sourceDetails = {}
>>> for source in like.sourceNames():
>>>    sourceDetails[source] = like.Ts(source)
>>> print sourceDetails
{'3FGL J1546.6+1812': 0.17268223245628178,
'3FGL J1622.9-0312': -0.16203457885421813,
'3FGL J1541.6+1414': 5.022071829531342,
'3FGL J1539.5+2746': -0.08898577070795,
'3FGL J1704.1+1234': -0.35892120632342994,
'3FGL J1625.1-0021': -0.21233836840838194,
'3FGL J1637.1+1314': -0.7221354788634926,
'3FGL J1543.5+0451': 13.312828778056428,
'3FGL J1511.8-0513': -0.01974983513355255,
'3FGL J1555.7+1111': 3131.243039842462,
'3FGL J1553.5+1256': 1293.1098725472111,
'3FGL J1656.0+2044': -0.1342923112679273,
'3FGL J1549.7-0658': -0.025776468915864825,
'3FGL J1517.0+2637': -0.00646698079071939,
'3FGL J1546.0+0818': 20.428234938299283,
'3FGL J1520.8-0348': -0.022230580914765596,
'3FGL J1644.4+2632': -0.06359240203164518,
'3FGL J1559.9+2319': -0.6947885756380856,
'3FGL J1516.9+1926': 0.9776446905452758,
'3FGL J1640.9+1142': -1.0533115186262876,
'3FGL J1506.4-0340': -0.016404472989961505,
'3FGL J1450.4+0911': 0.2836658810265362,
'3FGL J1601.9+2306': -0.08244648389518261,
'3FGL J1505.1+0326': 1.7678420771844685,
'iso_P8R2_SOURCE_V6_v06': 1705.6503756581806,
'3FGL J1549.4+0237': 209.98527692863718,
'3FGL J1650.0+0356': -0.10868578311055899,
'3FGL J1611.9+1404': 23.293160994770005,
'3FGL J1554.4+2010': 1.669362711487338,
'3FGL J1628.1+0254': -0.5462785894051194,
'3FGL J1504.4+1029': 129.09847361617722,
'3FGL J1630.8+1047': -3.604947129264474,
'3FGL J1636.7+2624': -0.024365422315895557,
'gll_iem_v06': 4373.269815435167,
'3FGL J1705.5+0948': -0.3167821641545743,
'3FGL J1548.4+1455': 28.76237250259146,
'3FGL J1714.0+0748': -0.03452549735084176,
'3FGL J1506.6+0811': 0.7437806166708469,
'3FGL J1653.6-0158': -0.16723551088944077,
'3FGL J1508.6+2709': -0.0016793855465948582,
'3FGL J1553.3-0322c': -0.06156782549805939,
'3FGL J1603.7+1106': 8.820508372969925,
'3FGL J1454.0+1622': 0.3378045877907425,
'3FGL J1549.5+1709': -4.093209114158526,
'3FGL J1500.9+2238': 0.021558950887992978,
'3FGL J1442.8+1200': 0.030036036856472492,
'3FGL J1440.9+0610': 0.08561139507219195,
'3FGL J1541.8+1105': 11.207513364264742,
'3FGL J1540.8+1449': -3.5654590828344226,
'3FGL J1650.8+0830': -0.9024815207812935,
'3FGL J1512.2+0202': 0.8020141257438809,
'3FGL J1507.4+1725': 0.08125757682137191,
'3FGL J1532.0+3018': -0.005078128073364496,
'3FGL J1533.2+1852': 24.718677337514237,
'3FGL J1608.6+1029': 93.42128267302178,
'3FGL J1607.0+1551': 208.8460137550719,
'3FGL J1552.1+0852': 57.02504306170158,
'3FGL J1543.5-0244': -0.07581187249161303,
'3FGL J1534.5+0128': 1.6464707893319428,
'3FGL J1550.5+0526': 60.54630964482203}

There are a number of sources with TS values less than 0. This should not happen unless the minimization failed (which it did in this case). The low TS indicates that our data set does not contain enough information to characterize a source at each of those locations. We should remove the very low-TS sources, fit again and then check the return code.

This process is usually the most frustrating part of any analysis and it often takes some work to get convergence. In the end, go back and address the items above to get things to work. It's very important that you do so. If your fit doesn't converge then you cannot trust your results.

>>> for source,TS in sourceDetails.iteritems():
>>>    print source, TS
>>>    if (TS < 2):
>>>       print "Deleting..."
>>>       like.deleteSource(source)
3FGL J1546.6+1812 0.17268223245628178
3FGL J1622.9-0312 -0.16203457885421813
3FGL J1541.6+1414 5.022071829531342
3FGL J1539.5+2746 -0.08898577070795
3FGL J1704.1+1234 -0.35892120632342994
3FGL J1625.1-0021 -0.21233836840838194
3FGL J1637.1+1314 -0.7221354788634926
3FGL J1543.5+0451 13.312828778056428
3FGL J1511.8-0513 -0.01974983513355255
3FGL J1555.7+1111 3131.243039842462
3FGL J1553.5+1256 1293.1098725472111

...Many more of these here...

iso_P8R2_SOURCE_V6_v06 1705.6503756581806
3FGL J1549.4+0237 209.98527692863718
3FGL J1650.0+0356 -0.10868578311055899
3FGL J1611.9+1404 23.293160994770005
3FGL J1554.4+2010 1.669362711487338
3FGL J1628.1+0254 -0.5462785894051194
3FGL J1504.4+1029 129.09847361617722
3FGL J1630.8+1047 -3.604947129264474
3FGL J1636.7+2624 -0.024365422315895557
gll_iem_v06 4373.269815435167
3FGL J1705.5+0948 -0.3167821641545743
3FGL J1548.4+1455 28.76237250259146
3FGL J1714.0+0748 -0.03452549735084176

...And yet more of these here...

3FGL J1533.2+1852 24.718677337514237
3FGL J1608.6+1029 93.42128267302178
3FGL J1607.0+1551 208.8460137550719
3FGL J1552.1+0852 57.02504306170158
3FGL J1543.5-0244 -0.07581187249161303
3FGL J1534.5+0128 1.6464707893319428
3FGL J1550.5+0526 60.54630964482203
>>> like.fit(verbosity=0,covar=True,optObject=likeobj)
>>> print likeobj.getRetCode()

Still didn't converge... I told you this could be frustrating. Let's try another trick: we can open our xml model back up and freeze the parameters of all sources that are further than a set distance away from the center of the ROI. The make3FGLxml script does this automatically for sources that are 10 degrees or further away, but sometimes that's not enough. Let's try freezing all parameters of sources further than 8 degrees away and rerun the fit (note that you must reload the xml model file in python every time you tinker with it; a sketch of the reload step is shown after these commands).

>>> like.fit(verbosity=0,covar=True,optObject=likeobj)
>>> print likeobj.getRetCode()
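
The reload step itself isn't spelled out above. A minimal sketch of one way to do it (the file name PG1553_model_frozen.xml is just an illustrative choice) is to write out the current model, edit it by hand, and then rebuild the analysis and minimizer objects before re-running the fit:

>>> like.logLike.writeXml('PG1553_model_frozen.xml')
Edit PG1553_model_frozen.xml in your text editor to freeze the sources you want, then:
>>> like = UnbinnedAnalysis(obs,'PG1553_model_frozen.xml',optimizer='NewMinuit')
>>> likeobj = pyLike.NewMinuit(like.logLike)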

Excellent. We have convergence and can move forward. Note, however, that this doesn't mean you have the 'right' answer. It just means that you have the answer assuming the model you put in. This is a subtle feature of the likelihood method.

Let's take a look at the residuals and overall model using matplotlib.

>>> import matplotlib.pyplot as plt
>>> import numpy as np
We need to import these two modules for plotting and some other stuff. They are included in the Science Tools.
>>> E = (like.energies[:-1] + like.energies[1:])/2.
The 'energies' array contains the bin endpoints, so we take the midpoints of the bins.
>>> plt.figure(figsize=(9,9))
>>> plt.ylim((0.4,1e4))
>>> plt.xlim((200,300000))
>>> sum_model = np.zeros_like(like._srcCnts(like.sourceNames()[0]))
>>> for sourceName in like.sourceNames():
>>>    sum_model = sum_model + like._srcCnts(sourceName)
>>>    plt.loglog(E,like._srcCnts(sourceName),label=sourceName[1:])
>>> plt.loglog(E,sum_model,label='Total Model')
>>> plt.errorbar(E,like._Nobs(),yerr=np.sqrt(like._Nobs()), fmt='o',label='Counts')
>>> plt.legend(bbox_to_anchor=(1.05, 1), loc=2)
>>> plt.show()
Plot of the likelihood fit
>>> resid = (like._Nobs() - sum_model)/sum_model
>>> resid_err = (np.sqrt(like._Nobs())/sum_model)
>>> plt.figure(figsize=(9,9))
>>> plt.xscale('log')
>>> plt.errorbar(E,resid,yerr=resid_err,fmt='o')
>>> plt.axhline(0.0,ls=':')
>>> plt.show()
Plot of the residuals

These are just some examples of how to access the underlying data from the python tools. As you can see, most of the information is accessible if you dig around. Note that the residuals aren't that great in several bins, especially at high energies. This could indicate a missing source, or that we're not modelling a bright source very well. If you look up at the model plot, the high energy tail of PG1553 might be biasing the fit at high energies and we might get a better fit with a log-parabola or broken power-law. That's for another day though.

Let's check out the final parameters of the fit for PG1553.

>>> like.model['3FGL J1555.7+1111']
3FGL J1555.7+1111
Spectrum: PowerLaw
38 Prefactor: 3.367e+00 2.068e-01 1.000e-04 1.000e+04 ( 1.000e-11)
39 Index: 1.631e+00 2.802e-02 0.000e+00 1.000e+01 (-1.000e+00)
40 Scale: 5.000e+02 0.000e+00 3.000e+01 5.000e+05 ( 1.000e+00) fixed
>>> like.flux('3FGL J1555.7+1111',emin=100)
>>> like.fluxError('3FGL J1555.7+1111',emin=100)

You can also get the TS value for a specific source:

>>> like.Ts('3FGL J1555.7+1111')

You can calculate how many standard deviations (σ) this corresponds to. (Remember that the TS value is ∼ σ².)

>>> np.sqrt(like.Ts('3FGL J1555.7+1111'))

Let's save the results.

>>> like.logLike.writeXml('PG1553_fit.xml')

Create some TS Maps

If you want to check on your fit you should probably make a test statistic (TS) map of your region at this point. This will allow you to check and make sure that you didn't miss anything in the ROI (like a source not in the catalog you used). A TS map is created by moving a putative point source through a grid of locations on the sky and maximizing the log(likelihood) at each grid point. We're going to make two TS Maps. If you have access to multiple CPUs, you might want to start one of the maps running in one terminal and the other in another one since this will take several hours to complete. Our main goal is to create one TS Map with PG1553 included in the fit and another without. The first one will allow us to make sure we didn't miss any sources in our ROI and the other will allow us to see the source. All of the other sources in our model file will be included in the fit and shouldn't show up in the final map.

If you don't want to create these maps, go ahead and skip the following steps and jump to the section on making a butterfly plot. The creation of these maps is optional and is only for double checking your work. You should still modify your model fit file as detailed in the following paragraph.

First open a new terminal and then copy PG1553_fit.xml as PG1553_fit_TSMap.xml. Modify PG1553_fit_TSMap.xml to make everything fixed by changing all of the free="1" statements to free="0". Modify the lines that look like:

<parameter error="1.670710816" free="1" max="10000" min="0.0001" name="Prefactor" scale="1e-13" value="6.248416874" />
<parameter error="0.208035277" free="1" max="5" min="0" name="Index" scale="-1" value="2.174167564" />

to look like:

<parameter error="1.670710816" free="0" max="10000" min="0.0001" name="Prefactor" scale="1e-13" value="6.248416874" />
<parameter error="0.208035277" free="0" max="5" min="0" name="Index" scale="-1" value="2.174167564" />

Then copy it and call it PG1553_fit_noPG1553.xml and comment out (or delete) the PG1553 source. If you want to generate the TS maps, in one window do the following:

>>> my_apps.TsMap['statistic'] = "UNBINNED"
>>> my_apps.TsMap['scfile'] = "PG1553_SC.fits"
>>> my_apps.TsMap['evfile'] = "PG1553_filtered_gti.fits"
>>> my_apps.TsMap['expmap'] = "PG1553_expMap.fits"
>>> my_apps.TsMap['expcube'] = "PG1553_ltCube.fits"
>>> my_apps.TsMap['srcmdl'] = "PG1553_fit_TSMap.xml"
>>> my_apps.TsMap['irfs'] = "CALDB"
>>> my_apps.TsMap['optimizer'] = "NEWMINUIT"
>>> my_apps.TsMap['outfile'] = "PG1553_TSmap_resid.fits"
>>> my_apps.TsMap['nxpix'] = 25
>>> my_apps.TsMap['nypix'] = 25
>>> my_apps.TsMap['binsz'] = 0.5
>>> my_apps.TsMap['coordsys'] = "CEL"
>>> my_apps.TsMap['xref'] = 238.929
>>> my_apps.TsMap['yref'] = 11.1901
>>> my_apps.TsMap['proj'] = 'STG'
>>> my_apps.TsMap.run()

In another window do the following:

>>> my_apps.TsMap['statistic'] = "UNBINNED"
>>> my_apps.TsMap['scfile'] = "PG1553_SC.fits"
>>> my_apps.TsMap['evfile'] = "PG1553_filtered_gti.fits"
>>> my_apps.TsMap['expmap'] = "PG1553_expMap.fits"
>>> my_apps.TsMap['expcube'] = "PG1553_ltCube.fits"
>>> my_apps.TsMap['srcmdl'] = "PG1553_fit_noPG1553.xml"
>>> my_apps.TsMap['irfs'] = "CALDB"
>>> my_apps.TsMap['optimizer'] = "NEWMINUIT"
>>> my_apps.TsMap['outfile'] = "PG1553_TSmap_noPG1553.fits"
>>> my_apps.TsMap['nxpix'] = 25
>>> my_apps.TsMap['nypix'] = 25
>>> my_apps.TsMap['binsz'] = 0.5
>>> my_apps.TsMap['coordsys'] = "CEL"
>>> my_apps.TsMap['xref'] = 238.929
>>> my_apps.TsMap['yref'] = 11.1901
>>> my_apps.TsMap['proj'] = 'STG'
>>> my_apps.TsMap.run()

This will take a while (over 3 hours each) so go get a cup of coffee. The final TS maps will be pretty rough since we chose to use only 25x25 bins of 0.5 degrees, but they will allow us to check if anything is missing and go back and fix it. These rough plots aren't publication ready but they'll do in a pinch to verify that we've got the model right.

Once the TS maps have been generated, open them up in whatever viewing program you wish. Here's how you could use Matplotlib to do it.

>>> import pyfits
>>> residHDU = pyfits.open('PG1553_TSmap_resid.fits')
>>> sourceHDU = pyfits.open('PG1553_TSmap_noPG1553.fits')
>>> fig = plt.figure(figsize=(16,8))
>>> plt.subplot(1,2,1)
>>> plt.imshow(residHDU[0].data)
>>> plt.colorbar()
>>> plt.subplot(1,2,2)
>>> plt.imshow(sourceHDU[0].data)
>>> plt.colorbar()
>>> plt.show()
TS maps

You can see that the 'noPG1553' map (right) has a prominent source in the middle of the ROI. This is PG 1553+113, and if you execute the following you can see that the TS value at the maximum agrees pretty well with the TS value of PG 1553+113 that we arrived at above.

>>> np.max(sourceHDU[0].data)

If you then look at the 'residuals' plot (left), you can see that it's pretty flat in TS space (the brightest pixel has a TS value of around 9).

>>> np.max(residHDU[0].data)

This indicates that we didn't miss any significant sources. If you did see something significant (TS > 20 or so), you probably should go back and figure out what source you missed.
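
If you do see such a hot spot, a quick way to locate it (an illustrative snippet, not part of the original thread) is to find the pixel holding the maximum and then translate that pixel into sky coordinates using the WCS keywords in the FITS header, or simply by hovering over it in a viewer like ds9:

>>> np.unravel_index(np.argmax(residHDU[0].data), residHDU[0].data.shape)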

Produce the Butterfly Plot

Now we'll produce the butterfly plot. First, get the differential flux, the decorrelation (pivot) energy, and the index. You'll also need the errors.

>>> N0 = like.model['3FGL J1555.7+1111'].funcs['Spectrum'].getParam('Prefactor').value()
>>> N0_err = like.model['3FGL J1555.7+1111'].funcs['Spectrum'].getParam('Prefactor').error()
>>> gamma = like.model['3FGL J1555.7+1111'].funcs['Spectrum'].getParam('Index').value()
>>> gamma_err = like.model['3FGL J1555.7+1111'].funcs['Spectrum'].getParam('Index').error()
>>> E0 = like.model['3FGL J1555.7+1111'].funcs['Spectrum'].getParam('Scale').value()

Calculate the Butterfly Plot

We need to calculate the differential flux at several energy points as well as the error on that differential flux. For the power law we are using, the differential flux is defined as:

F(E) = N0 (E/E0)^(−γ)

and the error on that flux is defined as:

ΔF(E) = F(E) × sqrt( (σ_N0/N0)² + (ln(E/E0))² cov_γγ )

where cov_γγ = σ_γ² and γ is the spectral index. To calculate cov_γγ, use this code:

>>> freeParValues = []
>>> for sourcename in like.sourceNames():
>>>    for element in like.freePars(sourcename):
>>>       freeParValues.append(element.getValue())
>>> g_index = freeParValues.index(like.freePars('3FGL J1555.7+1111')[1].getValue())
>>> cov_gg = like.covariance[g_index][g_index]

So, let's calculate F(E) and F(E) ± ΔF(E) so that we can plot them. First, define a function for the flux and one for the error on the flux.

>>> f = lambda E,N0,E0,gamma: N0*(E/E0)**(-1*gamma)
>>> ferr = lambda E,F,N0,N0err,E0,cov_gg: F*np.sqrt(N0err**2/N0**2 + ((np.log(E/E0))**2)*cov_gg)

Now, generate some energies to evaluate the functions above and evaluate them.

>>> E = np.logspace(2,5,100)
>>> F = f(E,N0,E0,gamma)
>>> Ferr = ferr(E,F,N0,N0_err,E0,cov_gg)

Now you can plot this. Here we're multiplying the flux by E² for clarity.

>>> plt.figure(figsize=(8,8))
>>> plt.xlabel('Energy [MeV]')
>>> plt.ylabel(r'E$^2$ dN/dE [MeV s$^{-1}$ cm$^{-2}$]')
>>> plt.loglog(E,E**2*(F+Ferr))
>>> plt.loglog(E,E**2*(F-Ferr))
>>> plt.plot(E,E**2*F)
>>> plt.show()

Butterfly plot of spectrum of PG 1553+113

The red line is the nominal fit and the blue and green lines are the 1 sigma contours.

Generate Spectral Points

Note: you need a version of the ScienceTools with ROOT installed to run through the steps in this section.

To generate spectral points to plot on top of the butterfly that we just produced, you need to go back to the data selection part and use gtselect (filter in python) to divide up your data set in energy bins and run the likelihood fit on each of these individual bins. Luckily, there's a script that does this for us that we'll employ to get a final result. This script also generates the butterfly plot we just produced so you won't have to redo that again and it also does a lot of checks on the data to make sure that everything is going ok. If you've made it this far, you might be a little curious as to why we didn't jump right into using this tool but now you're in a position to play with the python tools and make them do what you want them to. The script is also much more careful in handling units and saving everything to files than we have been in this interactive session.
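
Conceptually, the per-band event selection that the script automates looks something like the sketch below (just an illustration of the idea; the bin edges are arbitrary, and the real script also rebuilds the exposure maps and re-runs the likelihood fit in each band):

>>> import numpy as np
>>> ebins = np.logspace(2, np.log10(3e5), 10)
>>> for emin, emax in zip(ebins[:-1], ebins[1:]):
>>>    my_apps.filter['infile'] = 'PG1553_filtered_gti.fits'
>>>    my_apps.filter['outfile'] = 'PG1553_band_%i_%i.fits' % (emin, emax)
>>>    my_apps.filter['emin'] = emin
>>>    my_apps.filter['emax'] = emax
>>>    my_apps.filter.run()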

So download the likeSED.py user contributed tool (it's in the SED_scripts package) and load it up into python. You can find information on the usage of this tool on the same page where you downloaded it. It can be used to generate an SED plot for both binned and unbinned analyses but we're only going to work on an unbinned analysis here.

>>> from likeSED import *
This is likeSED version 12.1, modified to handle Pass 7 selections.

Now you need to create a likeInput object that takes our UnbinnedAnalysis object, the source name, and the number of bins we want as arguments. We're going to make 9 bins here like in the paper and we're also going to make custom bins and bin centers. You can have the module choose bins and bin centers for you (via the getECent function) but we're going to do it this way so we can have the double-wide bin at the end. We're also going to use the fit file we saved above, PG1553_fit.xml (the result of our fit on the full dataset with the NewMinuit optimizer), for the fits of the individual bins, but we need to edit it first to make everything fixed except for the integral (normalization) value of PG 1553+113. Go ahead and do that now. We're basically assuming that the spectral index is the same for all bins and that the contributions of the other sources within the ROI are not changing with energy.

>>> inputs = likeInput(like,'3FGL J1555.7+1111',nbins=9)
>>> inputs.plotBins()

This last step will take a while (approximately 30 minutes) because it's creating an expmap and event file for each of the bins that we requested (look in your working directory for these files). Once it is done we'll tell it to do the fit of the full energy band and make sure we request that it keep the covariance matrix. Then we'll create a likeSED object from our inputs, tell it where we want our centers to be and then fit each of the bands. After this is done, we can plot it all with the Plot function.

>>> inputs.fullFit(CoVar=True)
Full energy range model for 3FGL J1555.7+1111:
3FGL J1555.7+1111
Spectrum: PowerLaw
32 Prefactor: 5.928e-01 2.586e-02 1.000e-04 1.000e+04 ( 1.000e-11)
33 Index: 1.621e+00 2.674e-02 0.000e+00 1.000e+01 (-1.000e+00)
34 Scale: 1.448e+03 0.000e+00 3.000e+01 5.000e+05 ( 1.000e+00) fixed

Flux 0.1-300.0 GeV 7.2e-08 +/- 4.2e-09 cm^-2 s^-1
Test Statistic 3141.46
>>> sed = likeSED(inputs)
>>> sed.getECent()
>>> sed.fitBands()
-Running Likelihood for band0-
-Running Likelihood for band1-
-Running Likelihood for band2-
-Running Likelihood for band3-
-Running Likelihood for band4-
-Running Likelihood for band5-
-Running Likelihood for band6-
-Running Likelihood for band7-
-Running Likelihood for band8-
>>> sed.Plot()

This produces the following three plots: a TS plot, a counts spectrum, and a butterfly plot (statistical errors only):

likeSED TS plot

The TS plot shows the TS value for each bin in the spectrum produced by likeSED. Notice that the last bin is double-wide.

likeSED counts spectrum

Counts spectrum produced by likeSED along with the spectral points.

likeSED flux spectrum

Butterfly plot along with the spectral points produced by likeSED. Note that this plot does not include the systematic errors.

Mmm, looking at that spectrum, there might be some curvature there, as was hinted at in the model plot above. The butterfly is the likelihood output and will only tell us the best-fit power law since that is what we had in our model. The SED points are fits with a power law in individual energy bands and can show us whether the model we're assuming is a good assumption. Testing for curvature is for another day...

Now we're going to leave python and look at the files that likeSED produced. Open a new terminal. In your working directory you will find a bunch of 3FGL_J1555.7+1111_9bins files. These are just the event and exposure maps for each bin we selected. If you wanted to completely redo the likeSED analysis you should delete these. Some other files to note are 3FGL_J1555.7+1111_9bins_likeSEDout.fits which contains all of the data needed to reproduce the plots seen when we ran the Plot function. There is also a root file (3FGL_J1555.7+1111_9bins_histos.root) with dumps of the plots and eps files of all of the plots. Now that we've generated a butterfly SED and individual SED data points we can move on to working on a light curve.

Building a Light Curve

Coming soon...

Last updated by: Joseph R. Eggen 02/16/2018