Fermi Gamma-ray Space Telescope

Binned Likelihood Tutorial

The detection, flux determination and spectral modeling of Fermi LAT sources is accomplished by a maximum likelihood optimization technique, as described in the Cicerone (see also, e.g., Abdo, A. A. et al. 2009, ApJS, 183, 46). To illustrate how to use the Likelihood software, this narrative gives a step-by-step description for performing a binned likelihood analysis.

Binned vs Unbinned Likelihood

Binned likelihood analysis is the preferred method for most types of LAT analysis (see Cicerone). However, when analyzing data over short time periods (with few events), it is better to use the unbinned analysis. To perform an unbinned likelihood analysis see the Unbinned Likelihood tutorial.

Prerequisites

You will need an event data file, a spacecraft data file (also referred to as the "pointing and livetime history" file), and the current background models (available for download). You may choose to select your own data files, or to use the files provided within this tutorial. Custom data sets may be retrieved from the LAT Data Server.

Steps

  1. Make subselections from the event data

    Since there is computational overhead for each event associated with each diffuse component, it is useful to filter out any events that are not within the extraction region used for the analysis.

  2. Make a counts map from the event data

    These simple FITS images let us inspect our data and help to pick out obvious candidate sources.

  3. Create a 3D (Binned) counts map

    The binned counts map is used to reduce computation requirements in regions with large numbers of events.

  4. Download the latest diffuse model files

    The recommended Galactic and isotropic diffuse background models, along with a description of the models, are available here.

  5. Create a source model XML file

    The source model XML file contains the various sources and their model parameters to be fit using the gtlike tool.

  6. Compute livetimes and exposure

    Precomputing the livetime for the dataset speeds up the exposure calculation.

  7. Compute exposure map

    This accounts for exposure as a function of energy, based on the cuts made. The exposure map must be recomputed if any change is made to the data selection or binning.

  8. Compute source map

    Here the exposure calculation is applied to each of the sources described in the model.

  9. Run gtlike

    Fitting the data to the model provides flux, errors, spectral indices, and other information.

  10. Create a model map

    This can be compared to the counts map to verify the quality of the fit.

1. Make subselections from the event data

For this case we will use two years of LAT Pass 7 data. This is a longer data set than is described in the Extract LAT Data tutorial.

NOTE: The ROI used by the binned likelihood analysis is defined by the 3D counts map boundary. The region selection used in the data extraction step, which is circular (an acceptance cone on the sky), must fully contain the square spatial boundary of the 3D counts map. For this reason, the binned likelihood analysis requires a larger data selection than the unbinned likelihood analysis.

Selection of data:

  1. Search Center (RA, DEC) = (193.98, -5.82)
  2. Radius = 30 degrees
  3. Start Time (MET) = 239557417 seconds (2008-08-04 T15:43:37)
  4. Stop Time (MET) = 302572802 seconds (2010-08-04 T00:00:00)
  5. Minimum Energy = 100 MeV
  6. Maximum Energy = 100000 MeV

This two-year dataset consists of numerous data files. We provide the original event data files and the accompanying spacecraft file.

In order to combine the two event files for your analysis, you must first generate a text file listing the event files to be included. (If you do not wish to download all the individual files, you can skip to the next step and retrieve the combined, filtered event file. However, you will need the spacecraft file to complete the analysis, so you should retrieve that now.) To generate the file list, type:

prompt> ls *_PH* > binned_events.txt

This text file (binned_events.txt) will be used in place of the input FITS file name when running gtselect. The syntax requires an @ before the file name to indicate that the input is a text file list rather than a FITS file.

When analyzing point sources, it is recommended that you include only events with a high probability of being photons. To do this, use gtselect to cut on the event class, keeping only the source class events (event class 2), as recommended in the Cicerone:

prompt> gtselect evclass=2

Be aware that evclass is a hidden parameter; to use it for a cut, you must specify it on the command line.

We now apply a selection to the data we want to analyze. For this example, we consider the source class photons within our 30 degree region of interest (ROI) centered on the blazar 3C 279. For the selections that we already made with the data server and do not want to modify, we can use "INDEF" to instruct the tool to read those values from the data file header. Here, we are only filtering on event class and applying a zenith cut, so many of the parameters are designated as "INDEF".

We apply the gtselect tool to the data file as follows:

prompt> gtselect evclass=2
Input FT1 file[] @binned_events.txt
Output FT1 file[] 3C279_binned_filtered.fits
RA for new search center (degrees) (0:360) [] INDEF
Dec for new search center (degrees) (-90:90) [] INDEF
radius of new search region (degrees) (0:180) [] INDEF
start time (MET in s) (0:) [] INDEF
end time (MET in s) (0:) [] INDEF
lower energy limit (MeV) (0:) [] 100
upper energy limit (MeV) (0:) [] 100000
maximum zenith angle value (degrees) (0:180) [] 100
Done.
prompt>

In the last step we also selected the energy range and the maximum zenith angle value (100 degrees), as suggested in the Cicerone and recommended by the LAT instrument team. The Earth's limb is a strong source of background gamma rays, and we filter them out with a zenith-angle cut. The filtered data are provided here.

After the data selection is made, you must account for the livetime lost to these cuts in the exposure calculation. There are several options for calculating the livetime, depending on your observation type and science goals. For a detailed discussion of these options, see Likelihood Livetime and Exposure in the Cicerone. To deal with the zenith-angle cut in this analysis, we use the gtmktime tool to exclude from the list of good time intervals (GTIs) the time intervals when the zenith cut intersects the ROI. This is especially important if you are studying a narrow ROI (with a radius of less than 20 degrees), as your source of interest may come quite close to the Earth's limb. To make this correction, run gtmktime and answer "yes" at:

Apply ROI-based zenith angle cut [] yes

gtmktime also provides an opportunity to select GTIs by filtering on information provided in the spacecraft file. The current gtmktime filter expression recommended by the LAT team in the Cicerone is DATA_QUAL==1 && (LAT_CONFIG==1) && ABS(ROCK_ANGLE)<52. This excludes time periods when some spacecraft event has affected the quality of the data, ensures the LAT instrument was in normal science data-taking mode, and requires that the spacecraft be within the range of rocking angles used during nominal sky-survey observations.

Here is an example of running gtmktime for our analysis of the region surrounding 3C 279.

prompt> gtmktime
Spacecraft data file[] L12041614001899489F7F83_SC00.fits
Filter expression[] DATA_QUAL==1 && (LAT_CONFIG==1) && ABS(ROCK_ANGLE)<52
Apply ROI-based zenith angle cut[] yes
Event data file[] 3C279_binned_filtered.fits
Output event file name[] 3C279_binned_gti.fits

The data with all the cuts described above are provided at this link. A more detailed discussion of data selection can be found in the Data Preparation analysis thread.
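If you expect to repeat these selections (for several sources or time ranges, for example), the same two steps can be run from a Python script. The sketch below assumes the GtApp wrapper distributed with the Science Tools and uses the gtselect and gtmktime parameter names corresponding to the interactive prompts shown above; verify them against your installation before relying on it.

# Minimal sketch: scripted gtselect + gtmktime via the Science Tools GtApp wrapper.
# Assumes the Fermi Science Tools are set up in this Python environment.
from GtApp import GtApp

gtselect = GtApp('gtselect')
gtselect['evclass'] = 2                      # keep only source-class events
gtselect['infile'] = '@binned_events.txt'    # text file listing the event files
gtselect['outfile'] = '3C279_binned_filtered.fits'
gtselect['ra'] = 'INDEF'                     # take ROI center/radius/times from the file header
gtselect['dec'] = 'INDEF'
gtselect['rad'] = 'INDEF'
gtselect['tmin'] = 'INDEF'
gtselect['tmax'] = 'INDEF'
gtselect['emin'] = 100
gtselect['emax'] = 100000
gtselect['zmax'] = 100                       # zenith-angle cut
gtselect.run()

gtmktime = GtApp('gtmktime')
gtmktime['scfile'] = 'L12041614001899489F7F83_SC00.fits'
gtmktime['filter'] = 'DATA_QUAL==1 && (LAT_CONFIG==1) && ABS(ROCK_ANGLE)<52'
gtmktime['roicut'] = 'yes'                   # apply the ROI-based zenith angle cut
gtmktime['evfile'] = '3C279_binned_filtered.fits'
gtmktime['outfile'] = '3C279_binned_gti.fits'
gtmktime.run()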

To view the DSS keywords in a given extension of a data file, use the gtvcut tool and review the data cuts on the EVENTS extension. This provides a listing of the keywords reflecting each cut applied to the data file and their values, including the entire list of GTIs. We use the option "suppress_gtis=yes" to keep the output easily readable:

prompt> gtvcut suppress_gtis=yes
Input FITS file[] 3C279_binned_gti.fits
Extension name[EVENTS]
DSTYP1: BIT_MASK(EVENT_CLASS,2)
DSUNI1: DIMENSIONLESS
DSVAL1: 1:1

DSTYP2: POS(RA,DEC)
DSUNI2: deg
DSVAL2: CIRCLE(193.98,-5.82,30)

DSTYP3: TIME
DSUNI3: s
DSVAL3: TABLE
DSREF3: :GTI

GTIs: (suppressed)

DSTYP4: ENERGY
DSUNI4: MeV
DSVAL4: 100:100000

DSTYP5: ZENITH_ANGLE
DSUNI5: deg
DSVAL5: 0:100

prompt>

Here you can see the event class, the location and radius of the data selection, as well as the energy range in MeV, the zenith angle cut, and the fact that the time cuts to be used in the exposure calculation are defined by the GTI table.

Various science tools will be unable to run if you have multiple copies of a particular DSS keyword. This can happen if the position used when extracting the data from the data server differs from the position used with gtselect. It is wise to review the keywords for duplicates before proceeding. If you do find duplicated keywords, it is advisable to regenerate the data file with consistent cuts.

2. Make a counts map from the event data

Next, we create a counts map of the ROI, summed over photon energies, in order to identify candidate sources and to ensure that the field looks sensible as a simple sanity check. For creating the counts map, we will use the gtbin tool with the option "CMAP" as shown below:

prompt> gtbin
This is gtbin version ScienceTools-09-27-01
Type of output file (CCUBE|CMAP|LC|PHA1|PHA2) [] CMAP
Event data file name[] 3C279_binned_gti.fits
Output file name[] 3C279_binned_cmap.fits
Spacecraft data file name[] NONE
Size of the X axis in pixels[] 300
Size of the Y axis in pixels[] 300
Image scale (in degrees/pixel)[] 0.2
Coordinate system (CEL - celestial, GAL -galactic) (CEL|GAL) [CEL]
First coordinate of image center in degrees (RA or galactic l)[] 193.98
Second coordinate of image center in degrees (DEC or galactic b)[] -5.82
Rotation angle of image axis, in degrees[] 0.0
Projection method e.g. AIT|ARC|CAR|GLS|MER|NCP|SIN|STG|TAN:[] STG
prompt> ds9 3C279_binned_cmap.fits &

We chose an ROI radius of 30 degrees, corresponding to a diameter of 60 degrees. Since we want a pixel size of 0.2 degrees/pixel, we must select 60/0.2 = 300 pixels for the size of the x and y axes. The last command launches the visualization tool ds9 and displays the generated counts map.

3C279_binned_cmap.fits image

You can see several strong sources and a number of lesser sources in this map. Mousing over the positions of these sources shows that two of them are likely 3C 279 and 3C 273. It is important to inspect your data prior to proceeding to verify that the contents are as you expect. A malformed data query or improper data selection can generate a non-circular region, or a file with zero events. By inspecting your data prior to analysis, you have an opportunity to detect such issues early in the analysis. (A more detailed discussion of data exploration can be found in the Explore LAT Data analysis thread.)

3. Create a 3-D (binned) counts map

Since the counts map shows the expected data, you are ready to prepare your data set for analysis. For binned likelihood analysis, the data input is a three-dimensional counts map with an energy axis, called a counts cube. The gtbin tool performs this task as well, by using the CCUBE option.

The binning of the counts map determines the binning of the exposure calculation. The likelihood analysis may lose accuracy if the energy bins are not narrow enough to track the rapid variation of the effective area with energy below a few hundred MeV. For a typical analysis, ten logarithmically spaced bins per decade in energy are recommended. The analysis is less sensitive to the spatial binning, and 0.2 degree bins are a reasonable standard.

This counts cube is a square binned region that must fit within the circular acceptance cone defined during the data extraction step, and visible in the counts map above. This is the reason we expanded the data selection from 20 degrees (for unbinned analysis) to 30 degrees (for binned analysis). To find the maximum size of the region your data will support, find the side of a square that can be fully inscribed within your circular acceptance region (multiply the radius of the acceptance cone by sqrt(2)). For this example, the maximum length for a side is 42.43 degrees. This means the entire area covered by the smaller acceptance region (40 degree diameter) used in the Unbinned Likelihood analysis will be included in this analysis as well.

square inscribed in a circle

For a comparable analysis here, we use the same central coordinate, and a 40x40 degree counts cube.
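As a quick check on the geometry and binning choices above, here is the arithmetic written out (plain Python for illustration, not a Science Tools command):

import math

roi_radius = 30.0          # degrees, radius of the extraction region
binsz = 0.2                # degrees/pixel, spatial bin size
emin, emax = 100.0, 1e5    # MeV

# Largest square that fits inside the circular acceptance region
max_side = 2 * roi_radius / math.sqrt(2)      # = roi_radius * sqrt(2), about 42.43 deg
print('Maximum side of inscribed square: %.2f deg' % max_side)

# The 40x40 deg cube we actually use, and its pixel count
side = 40.0
print('Pixels per axis: %d' % round(side / binsz))    # 200

# Ten logarithmic bins per decade over 100 MeV - 100 GeV
decades = math.log10(emax / emin)
print('Energy bins: %d' % round(10 * decades))        # 30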

To create the counts cube we run gtbin as follows:

prompt> gtbin
This is gtbin version ScienceTools-09-27-01
Type of output file (CCUBE|CMAP|LC|PHA1|PHA2) [] CCUBE
Event data file name[] 3C279_binned_gti.fits
Output file name[] 3C279_binned_ccube.fits
Spacecraft data file name[NONE]
Size of the X axis in pixels[] 200
Size of the Y axis in pixels[] 200
Image scale (in degrees/pixel)[] 0.2
Coordinate system (CEL - celestial, GAL -galactic) (CEL|GAL) [CEL]
First coordinate of image center in degrees (RA or galactic l)[] 193.98
Second coordinate of image center in degrees (DEC or galactic b)[] -5.82
Rotation angle of image axis, in degrees[0.]
Projection method e.g. AIT|ARC|CAR|GLS|MER|NCP|SIN|STG|TAN:[] STG
Algorithm for defining energy bins (FILE|LIN|LOG) [LOG]
Start value for first energy bin in MeV[] 100
Stop value for last energy bin in MeV[] 100000
Number of logarithmically uniform energy bins[] 30
prompt>

The counts cube generated in this step is provided here.

If you open the counts cube with ds9, you see that it is made up of 30 images, one for each logarithmic energy bin. By playing through these images, it is easy to see how the PSF of the LAT changes with energy. You can also see that changing energy cuts could be helpful when trying to optimize the localization or spectral information for specific sources. Be sure to verify that there are no black corners on your counts cube. These corners correspond to regions with no data, and will cause errors in your exposure calculations.

4. Download the latest diffuse model files

When you use the current Galactic diffuse emission model (gll_iem_v05.fits) in a likelihood analysis, you also want to use the corresponding model for the extragalactic isotropic diffuse emission, which includes the residual cosmic-ray background. The recommended isotropic model is iso_source_v05.txt. For this example, you should save it in the same directory as your data. The isotropic spectrum is valid only for the P7REP_SOURCE_V15 response functions and only for data sets with front and back events combined. All of the most up-to-date background models, along with a description of the models, are available here.

5. Create a source model XML file

The gtlike tool reads the source model from an XML file. The model file contains your best guess at the locations and spectral forms for the sources in your data. A source model can be created using the model editor tool, by using the user contributed tool make2FGLxml.py, or by editing the file directly within a text editor.

Here we cannot use the same source model that was used to analyze 6 months of data in the Unbinned Likelihood tutorial: the 2-year data set contains many more significant sources, and a fit with such an incomplete model will not converge. Instead, we will use the 2FGL catalog to define our source model by running make2FGLxml.py:

prompt> python

...
>>> from make2FGLxml import *
This is make2FGLxml version 04r1.
For use with the gll_psc_v02.fit and gll_psc_v05.fit and later LAT catalog files.
>>> mymodel=srcList('gll_psc_v08.fit','3C279_binned_gti.fits','3C279_binned_input.xml')
>>> mymodel.makeModel('/FULL_PATH/diffmodels/gll_iem_v05.fits','gll_iem_v05','/FULL_PATH/diffmodels/iso_source_v05.txt','iso_source_v05')
Creating file and adding sources for 2FGL
Added 137 point sources and 0 extended sources
>>> exit()

prompt>

It is necessary to specify the entire path to the location of the diffuse models on your system. The resulting XML model contains 137 sources. Clearly, the simple 4-source model we used for the 6-month Unbinned Likelihood analysis would have been too simplistic.

This XML file uses the spectral model from the 2FGL catalog analysis for each source (in this case we are using the latest point source catalog, gll_psc_v08.fit, available at the LAT 2-yr Catalog page). However, that analysis used a subset of the available spectral models. A dedicated analysis of the region may indicate that a different spectral model is preferred. For more details on the options available for your XML models, see the source model definitions in the Cicerone.

Finally, the make2FGLxml.py script automatically adds 5 degrees to your ROI to account for sources that lie just outside your data region but whose photons could still be in your data. In addition, it frees all the spectral parameters for sources within your ROI and fixes them for the others. With 137 sources, this leaves too many free parameters to obtain a good spectral fit. It is advisable to revise these values so that only sources near your source of interest, or very bright sources, have all spectral parameters free. Farther away, you can fix the spectral form and free only the normalization parameter (or 'prefactor'). If you are working in a crowded region, or have nested sources (e.g., a point source on top of an extended source), you will probably want to fix parameters for some sources even if they lie close to your source of interest.

For this analysis we modified the automatically generated XML file: for all sources located more than 7° from 3C 279 we fixed the spectral parameters, and we removed the weakest of them (those with a "Prefactor" below 1e-14). The normalization parameter is left free for all the remaining sources within the ROI. Both changes are necessary: too many free parameters will keep the fit from converging (see the Run gtlike section for the fitting step). Instead of removing sources based on their intensity, one can use the TS values from the 2FGL catalog to purge sources that are unlikely to contribute significantly to the observed emission. Since running gtlike can take a long time to complete, it pays to run quick trials first: start by removing many sources from the XML file (keeping only those with relatively high "Prefactor" or TS values) and see which objects appear in the residual map (see the Create a Model Map step). Re-insert those sources into the XML file, refit the data, and regenerate the residual map; this second trial will take longer than the first. The process can be iterated until you are satisfied with the final result. The XML file used in this example can be found here. (A scripted approach to this thinning is sketched below.)
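Hand-editing a 137-source XML file is tedious, so part of this thinning can be scripted. The fragment below is only an illustrative sketch using the Python standard library (it is not part of the Science Tools or of make2FGLxml.py); it assumes the standard 2FGL XML layout with SkyDirFunction spatial models, and it freezes the spectral shape parameters of point sources lying more than 7° from 3C 279 while leaving their normalizations free. Removing the weakest sources, or selecting on catalog TS instead, is a straightforward extension.

# Illustrative sketch: freeze the spectral shape parameters of point sources
# more than 7 degrees from 3C 279, keeping their normalizations free.
import math
import xml.etree.ElementTree as ET

ROI_RA, ROI_DEC, FREE_RADIUS = 193.98, -5.82, 7.0
NORM_NAMES = ('Prefactor', 'Integral', 'norm', 'Normalization')

def ang_sep(ra1, dec1, ra2, dec2):
    """Angular separation in degrees between two sky positions."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    cosd = (math.sin(dec1) * math.sin(dec2) +
            math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosd))))

tree = ET.parse('3C279_binned_input.xml')
for src in tree.getroot().findall('source'):
    if src.get('type') != 'PointSource':
        continue                                  # leave the diffuse components alone
    pos = src.find('spatialModel')
    ra = float(pos.find("parameter[@name='RA']").get('value'))
    dec = float(pos.find("parameter[@name='DEC']").get('value'))
    if ang_sep(ra, dec, ROI_RA, ROI_DEC) > FREE_RADIUS:
        for par in src.find('spectrum').findall('parameter'):
            if par.get('name') not in NORM_NAMES:
                par.set('free', '0')              # freeze the spectral shape
tree.write('3C279_binned_thinned.xml')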

XML for Extended Sources

In some regions, the make2FGLxml.py script may add one or more extended sources to your XML model. The script will report the number of extended sources included in the model. In order to use these extended sources, you will need to download the extended source templates from the LAT Catalog page (look for "Extended Source template archive"). Extract the archive in the directory of your choice and note the path to the template files, which have names like "W44.fits" and "VelaX.fits". You will need to modify your XML model so that the path to the template file is correct.

Here is an example of the proper format for an extended source XML entry for Binned Likelihood analysis:

<source name="SpatialMap_source" type="DiffuseSource">
<spectrum type="PowerLaw2">
<parameter free="1" max="1000.0" min="1e-05" name="Integral" scale="1e-06" value="1.0"/>
<parameter free="1" max="-1.0" min="-5.0" name="Index" scale="1.0" value="-2.0"/>
<parameter free="0" max="200000.0" min="20.0" name="LowerLimit" scale="1.0" value="20.0"/>
<parameter free="0" max="200000.0" min="20.0" name="UpperLimit" scale="1.0" value="2e5"/>
</spectrum>
<spatialModel file="$(PATH_TO_FILE)/W44.fits" type="SpatialMap" map_based_integral="true">
<parameter free="0" max="1000.0" min="0.001" name="Normalization" scale="1.0" value="1.0"/>
</spatialModel>
</source>

6. Compute livetimes and exposure

To speed up the exposure calculations performed by Likelihood, it is helpful to pre-compute the livetime as a function of sky position and off-axis angle. The gtltcube tool creates a livetime cube: a HEALPix table, covering the entire sky, of the integrated livetime as a function of inclination with respect to the LAT z-axis.

Here is an example of how to run gtltcube:

prompt> gtltcube
Event data file[] 3C279_binned_gti.fits
Spacecraft data file[] L12041614001899489F7F83_SC00.fits
Output file[] 3C279_binned_ltcube.fits
Step size in cos(theta) (0.:1.)[] 0.025
Pixel size (degrees)[] 1
Working on file L12041614001899489F7F83_SC00.fits
.....................!
prompt>

Note: Values such as "0.1" for "Step size in cos(theta)" are known to give unexpected results. Use "0.09" instead.

The livetime cube generated for this analysis can be found here. For more information about the livetime cubes see the documentation in Cicerone, and also the explanation in the Unbinned Likelihood tutorial.
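When scripting, gtltcube can be run non-interactively in the same way as the other tools. A short sketch assuming the GtApp wrapper and the gtltcube parameter names corresponding to the prompts above (verify these against your installation):

# Sketch: scripted gtltcube run via the GtApp wrapper.
from GtApp import GtApp

expCube = GtApp('gtltcube')
expCube['evfile'] = '3C279_binned_gti.fits'
expCube['scfile'] = 'L12041614001899489F7F83_SC00.fits'
expCube['outfile'] = '3C279_binned_ltcube.fits'
expCube['dcostheta'] = 0.025   # step size in cos(theta)
expCube['binsz'] = 1           # pixel size in degrees
expCube.run()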

7. Compute exposure map

Next you must apply the livetime calculated in the previous step to your region of interest. To do this, we use the gtexpcube2 tool, which is an updated version of the older gtexpcube. This tool generates a binned exposure map, an accounting of the exposure at each position in the sky, which is a required input to the likelihood process.

NOTE: In the past, running gtsrcmaps calculated the exposure map for you, so most analyses skipped the binned exposure map generation step. With the introduction of gtexpcube2, this is no longer the case. You must explicitly create the exposure map as a separate analysis step.

In order to create an exposure map that accounts for contributions from all the sources in your analysis region, you must consider more than just the sources included in the counts cube. Because the LAT PSF is broad at low energies, sources from well outside your counts cube can affect the sources you are analyzing. To compensate for this, you must create an exposure map that includes sources up to 10 degrees outside your ROI. (The ROI is determined by the radius you requested from the data server, here 30°.) In addition, you should account for all the exposure that contributes to those additional sources; since the PSF is large at low energies, the exposure map should be expanded by another 10 degrees. For this data set, the exposure map therefore needs to extend roughly 20° beyond the counts cube on every side. Since the exposure map uses square pixels that match the 0.2° binning of the counts cube, we generate a 400x400 pixel map (80°x80°).

If you provide gtexpcube2 a file name for your counts cube, it will use the information from that file to define the geometry of the exposure map. This is legacy behavior and will not give you the 20° buffer needed to completely account for the exposure of nearby sources. (It will also cause an error in the next step.) Instead, specify the appropriate geometry for the exposure map yourself, remembering that the counts cube used 0.2° pixel binning.
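The pixel count for the exposure map then follows directly. A quick illustrative calculation (plain Python arithmetic, not a Science Tools command):

binsz = 0.2            # deg/pixel, same as the counts cube
cube_side = 40.0       # deg, side of the counts cube
pad = 20.0             # deg of extra exposure on each side of the cube

expmap_side = cube_side + 2 * pad             # 80 deg across
print('Exposure map pixels per axis: %d' % round(expmap_side / binsz))  # 400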

To generate the exposure map for this data set:

prompt> gtexpcube2
Livetime cube file[] 3C279_binned_ltcube.fits
Counts map file[] none
Output file name[] 3C279_binned_expcube.fits
Response functions to use[] P7SOURCE_V6
Size of the X axis in pixels[] 400
Size of the Y axis in pixels[] 400
Image scale (in degrees/pixel)[] .2
Coordinate system (CEL - celestial, GAL -galactic) (CEL|GAL) [] CEL
First coordinate of image center in degrees (RA or galactic l)[] 193.98
Second coordinate of image center in degrees (DEC or galactic b)[] -5.82
Rotation angle of image axis, in degrees[] 0
Projection method e.g. AIT|ARC|CAR|GLS|MER|NCP|SIN|STG|TAN[] AIT
Start energy (MeV) of first bin[] 100
Stop energy (MeV) of last bin[] 100000
Number of logarithmically-spaced energy bins[] 30
Computing binned exposure map....................!

The generated exposure map can be found here.

At this point, you may decide it is easier to simply generate exposure maps for the entire sky; this certainly simplifies the step when scripting. Making an all-sky map increases the processing time for this step, but the increase is modest. To generate an all-sky exposure map (rather than the 80°x80° exposure map we calculated above), specify the proper binning and explicitly give the number of pixels for the entire sky (360°x180°). Here is an example:

prompt> gtexpcube2
Livetime cube file[] 3C279_binned_ltcube.fits
Counts map file[] none
Output file name[] 3C279_binned_allsky_expcube.fits
Response functions to use[] P7SOURCE_V6
Size of the X axis in pixels[] 1800
Size of the Y axis in pixels[] 900
Image scale (in degrees/pixel)[] .2
Coordinate system (CEL - celestial, GAL -galactic) (CEL|GAL) [] CEL
First coordinate of image center in degrees (RA or galactic l)[] 193.98
Second coordinate of image center in degrees (DEC or galactic b)[] -5.82
Rotation angle of image axis, in degrees[] 0
Projection method e.g. AIT|ARC|CAR|GLS|MER|NCP|SIN|STG|TAN[] AIT
Start energy (MeV) of first bin[] 100
Stop energy (MeV) of last bin[] 100000
Number of logarithmically-spaced energy bins[] 30
Computing binned exposure map....................!

The all-sky exposure map can be found here.

8. Compute source map

The gtsrcmaps tool creates model counts maps for use with the binned likelihood analysis. To do this, it takes each source spectrum in the XML model, multiplies it by the exposure at the source position, and convolves that exposure with the effective PSF.

This is an example of how to run the tool:

prompt> gtsrcmaps
Exposure hypercube file[] 3C279_binned_ltcube.fits
Counts map file[] 3C279_binned_ccube.fits
Source model file[] 3C279_binned_input.xml
Binned exposure map[] 3C279_binned_allsky_expcube.fits
Source maps output file[] 3C279_binned_srcmaps.fits
Response functions[P7SOURCE_V6]
WARNING: Point source _2FGLJ1051.8+0107 lies 31.7368 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1058.4+0133 lies 30.2398 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1059.0+0222 lies 30.3034 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1059.9-2051 lies 31.8725 degrees from the ROI center at RA, Dec = 193.98, -5.82

>Some Output Suppressed<

Generating SourceMap for _2FGLJ1505.1+0324....................!
Generating SourceMap for _2FGLJ1510.9-0545....................!
Generating SourceMap for _2FGLJ1511.8-0513....................!
Generating SourceMap for _2FGLJ1512.2+0201....................!
Generating SourceMap for _2FGLJ1512.8-0906....................!
Generating SourceMap for gal_2yearp7v6_v0....................!
Generating SourceMap for iso_p7v6source....................!
prompt>

The output of gtsrcmaps can be found here.

As in the Unbinned Likelihood analysis, the exposure needs to be recalculated if the ROI, zenith angle, time, event class, or energy selections applied to the data are changed. For the binned analysis, this also includes the spatial and energy binning of the 3D counts map (which affects the exposure map as well).

Because your model includes sources outside your ROI, you will likely see a list of warnings like this at the beginning of the output:

WARNING: Point source _2FGLJ1051.8+0107 lies 31.7368 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1058.4+0133 lies 30.2398 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1059.0+0222 lies 30.3034 degrees from the ROI center at RA, Dec = 193.98, -5.82
WARNING: Point source _2FGLJ1059.9-2051 lies 31.8725 degrees from the ROI center at RA, Dec = 193.98, -5.82

These are expected and should cause no problem in your analysis. However, if your exposure map is too small for the region, you will see the following error:

Caught St13runtime_error at the top level:
Request for exposure at a sky position that is outside of the map boundaries.

The contribution of the diffuse source outside of the exposure
and counts map boundaries is being computed to account for PSF
leakage into the analysis region. To handle this, use an all-sky
binned exposure map. Alternatively, to neglect contributions
outside of the counts map region, use the emapbnds=no option when
running gtsrcmaps.

In this situation, you should increase the dimensions of your exposure map, or just move to the all-sky version.

Source map generation for the point sources is fairly quick, and maps for many point sources can take up a lot of disk space. If you are analyzing a single long data set, it may be preferable to skip pre-computing the source maps for these sources at this stage. gtlike will compute them on the fly if they appear in the XML definition and a corresponding map is not in the source maps FITS file. To skip generating source maps for point sources, specify "ptsrc=no" on the command line when running gtsrcmaps (a scripted example is sketched below). However, if you expect to perform multiple fits on the same set of data, precomputing the source maps will probably save you time.
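As with the other tools, gtsrcmaps can also be run non-interactively when scripting. A hedged sketch using the GtApp wrapper, with parameter names taken from the interactive prompts shown above (check them against the parameter file on your installation):

# Sketch: scripted gtsrcmaps run, skipping precomputed maps for point sources.
from GtApp import GtApp

srcmaps = GtApp('gtsrcmaps')
srcmaps['expcube'] = '3C279_binned_ltcube.fits'          # livetime cube
srcmaps['cmap'] = '3C279_binned_ccube.fits'              # counts cube
srcmaps['srcmdl'] = '3C279_binned_input.xml'             # XML source model
srcmaps['bexpmap'] = '3C279_binned_allsky_expcube.fits'  # binned exposure map
srcmaps['outfile'] = '3C279_binned_srcmaps.fits'
srcmaps['irfs'] = 'P7SOURCE_V6'
srcmaps['ptsrc'] = 'no'     # skip precomputing maps for point sources
srcmaps.run()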

9. Run gtlike

NOTE: Prior to running gtlike for Unbinned Likelihood, it is necessary to calculate the diffuse response for each event (when that response is not precomputed). However, for Binned Likelihood analysis the diffuse response is calculated over the entire bin, so this step is not necessary.

Now we are ready to run the gtlike application. Here, we request that the fitted parameters be saved to an output XML model file for use in later steps.

prompt> gtlike refit=yes plot=yes sfile=3C279_binned_output.xml

Statistic to use (BINNED|UNBINNED)[] BINNED
Counts map file[] 3C279_binned_srcmaps.fits
Binned exposure map[] 3C279_binned_allsky_expcube.fits
Exposure hypercube file[] 3C279_binned_ltcube.fits
Source model file[] 3C279_binned_input.xml
Response functions to use[P7SOURCE_V6]
Optimizer (DRMNFB|NEWMINUIT|MINUIT|DRMNGB|LBFGS)[] NEWMINUIT

Most of the entries prompted for are fairly obvious. In addition to the various XML and FITS files, the user is prompted for a choice of IRFs, the type of statistic to use, and the optimizer.

The statistics available are:

  • UNBINNED This should be used for short timescale or low source count data. If this option is chosen, then parameters for the spacecraft file, event file, and exposure file must be given. See the Unbinned Likelihood tutorial for details.
  • BINNED This is a standard binned analysis, described in this tutorial. It is used for long timescale or high-density data (such as in the Galactic plane), which can cause memory errors in the unbinned analysis. If this option is chosen, then parameters for the source map file, livetime file, and exposure file must be given.

There are five optimizers from which to choose: DRMNGB, DRMNFB, NEWMINUIT, MINUIT, and LBFGS. Generally speaking, the fastest way to find the parameter estimates is to use DRMNGB (or DRMNFB) to find initial values and then use MINUIT (or NEWMINUIT) to find more accurate results. If you have trouble achieving convergence at first, you can loosen the tolerance by setting the hidden parameter ftol on the command line. (The default value of ftol is 0.001.)
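The same fit can also be run from Python through the pyLikelihood interface, which is convenient for scripting and for adjusting the model between trials. The sketch below assumes the BinnedAnalysis module distributed with the Science Tools and the file and source names used in this tutorial; the tol attribute plays a role roughly analogous to the ftol parameter of gtlike.

# Minimal binned likelihood fit with pyLikelihood (illustrative sketch).
from BinnedAnalysis import BinnedObs, BinnedAnalysis

obs = BinnedObs(srcMaps='3C279_binned_srcmaps.fits',
                expCube='3C279_binned_ltcube.fits',
                binnedExpMap='3C279_binned_allsky_expcube.fits',
                irfs='P7SOURCE_V6')
like = BinnedAnalysis(obs, '3C279_binned_input.xml', optimizer='NEWMINUIT')
like.tol = 1e-3                      # fit tolerance

neg_loglike = like.fit(verbosity=0)  # returns -log(likelihood)
print('-log(Likelihood) = %.4f' % neg_loglike)
print('TS(3C 279) = %.1f' % like.Ts('_2FGLJ1256.1-0547'))
print('Flux(3C 279) = %g ph/cm^2/s' % like.flux('_2FGLJ1256.1-0547', emin=100, emax=100000))

like.logLike.writeXml('3C279_binned_output.xml')  # save the fitted model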

Analyzing a 2-year dataset can take many hours (in our case more than 2 days on a 32-bit machine with 1 GB of RAM). The running time is particularly long if your source is in the Galactic plane. Here is some output from our fit, where 2FGLJ1229.1+0202 and 2FGLJ1256.1-0547 correspond to 3C 273 and 3C 279, respectively:

>Some Output Suppressed<

Minuit did successfully converge.
# of function calls: 11692
minimum function Value: 492772.005644
minimum edm: 8.722875143868e-06

(MORE OUTPUTS HERE...)

Computing TS values for each source (114 total)
..................................................................................................................!

Photon fluxes are computed for the energy range 100 to 100000 MeV

(MORE OUTPUTS HERE...)

_2FGLJ1229.1+0202:
norm: 1.29092 +/- 0.0156756
alpha: 2.4524
beta: 0.111229
Eb: 231.151
TS value: 20479.6
Flux: 6.61064e-07 +/- 8.02153e-09 photons/cm^2/s

(MORE OUTPUTS HERE...)

_2FGLJ1256.1-0547:
norm: 6.27961 +/- 0.0814749
alpha: 2.24464 +/- 0.0186767
beta: 0.061845 +/- 0.0090505
Eb: 294.806
TS value: 22205.8
Flux: 5.48426e-07 +/- 8.28549e-09 photons/cm^2/s

(MORE OUTPUTS HERE...)

gal_2yearp7v6_v0:
Prefactor: 1.0685 +/- 0.00821222
Index: 0
Scale: 100
Flux: 0.000516126 +/- 3.96642e-06 photons/cm^2/s

iso_p7v6source:
Normalization: 0.994416 +/- 0.00845461
Flux: 0.000206217 +/- 1.75197e-06 photons/cm^2/s

WARNING: Fit may be bad in range [100, 158.489] (MeV)
WARNING: Fit may be bad in range [251.189, 398.107] (MeV)
WARNING: Fit may be bad in range [501.187, 630.957] (MeV)

Total number of observed counts: 486458
Total number of model events: 486463

-log(Likelihood): 492772.0056

Writing fitted model to 3C279_binned_output.xml

Since we selected 'plot=yes' on the command line, a plot of the fitted data appears.

The first plot shows counts/MeV vs. energy (MeV). The points are the data, and the lines are the models. Error bars on the points represent sqrt(Nobs) in that band, where Nobs is the observed number of counts. The black line is the sum of the models for all sources. The colored lines follow the sources as follows:

  • Black - summed model
  • Red - first source in results.dat file (see below)
  • Green - second source
  • Blue - third source
  • Magenta - fourth source
  • Cyan - the fifth source

If you have more sources the colors are reused in the same order. In our case we have, in order of decreasing value on the y-axis: summed model (black), the extragalactic background (black), the galactic background (cyan), 3C 273 (red), and 3C 279 (black).

The second plot gives the residuals between your model and the data. Error bars here represent sqrt(Nobs)/Npred, where Npred is the predicted number of counts in each band based on the fitted model.

To assess the quality of the fit, look first for the words at the top of the output "<Optimizer> did successfully converge." Successful convergence is a minimum requirement for a good fit. Next, look at the energy ranges that are generating warnings of bad fits. If any of these ranges affect your source of interest, you may need to revise the source model and refit. You can also look at the residuals on the plot (bottom panel). If the residuals indicate a poor fit overall (e.g., the points trending all low or all high) you should consider changing your model file, perhaps by using a different source model definition, and refit the data.

If the fits and spectral shapes are good, but could be improved, you may wish to simply update your model file to hold some of the spectral parameters fixed. For example, by fixing the spectral model for 3C 273, you may get a better quality fit for 3C 279. Close the plot and you will be asked if you wish to refit the data.

Refit? [y] n
Elapsed CPU time: 1571.805872
prompt>

Here, hitting 'return' will instruct the application to fit again. We are happy with the result, so we type 'n' and end the fit.

Results

When it completes, gtlike generates two standard output files: results.dat with the results of the fit, and counts_spectra.fits with predicted counts for each source for each energy bin. The points in the plot above are the data points from the counts_spectra.fits file, while the colored lines follow the source model information in the results.dat file. If you re-run the tool in the same directory, these files will be overwritten by default. Use the clobber=no option on the command line to keep from overwriting the output files.

Unfortunately, the fit details and the value for the -log(likelihood) are not recorded in the automatic output files. You should consider logging the output to a text file for your records by using '> fit_data.txt' (or something similar) with your gtlike command:

prompt> gtlike refit=yes plot=yes sfile=3C279_output_model.xml > fit_data.txt

In this example, we used the 'sfile' parameter to request that the model results be written to an output XML file. This file contains the source model results that were written to results.dat at the completion of the fit.

Note: If you have specified an output XML model file and you wish to modify your model while waiting at the 'Refit? [y]' prompt, you will need to copy the results of the output model file to your input model before making those modifications.

The results of the likelihood analysis have to be scaled by the quantity called "scale" in the XML model in order to obtain the total photon flux (photons cm^-2 s^-1) of the source. You must refer to the model formula of your source for the interpretation of each parameter. In our example, the 'Prefactor' of the power-law model of the first fitted source (2FGLJ1051.8+0107) has to be scaled by the factor scale = 10^-13. For example, the total flux of 2FGLJ1051.8+0107 is the integral between 100 MeV and 100000 MeV of:

Prefactor x scale x (E/100)^index = (1.3268 x 10^-13) x (E/100)^-2.07654
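For a simple power law this integral can be done analytically. The short check below (plain Python, for illustration only) integrates Prefactor x scale x (E/100)^index between 100 MeV and 100000 MeV:

# Illustrative check: integrate dN/dE = Prefactor * scale * (E/E0)**index
# over 100 MeV - 100 GeV to get the total photon flux (ph cm^-2 s^-1).
prefactor, scale, index, E0 = 1.3268, 1e-13, -2.07654, 100.0
N0 = prefactor * scale
emin, emax = 100.0, 100000.0

# Analytic integral of a power law (valid for index != -1)
flux = N0 * E0 / (index + 1) * ((emax / E0) ** (index + 1) - (emin / E0) ** (index + 1))
print('Photon flux: %.3e photons cm^-2 s^-1' % flux)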

Errors reported with each value in the results.dat file are 1σ estimates, based on the inverse Hessian at the optimum of the log-likelihood surface.

Other Useful Hidden Parameters

If you are scripting and wish to generate multiple output files without overwriting, the 'results' and 'specfile' parameters allow you to specify output filenames for the results.dat and counts_spectra.fits files respectively.

If you do not specify a source model output file with the 'sfile' parameter, then the input model file will be overwritten with the latest fit. This is convenient as it allows the user to edit that file while the application is waiting at the 'Refit? [y]' prompt so that parameters can be adjusted and set free or fixed. This would be similar to the use of the "newpar", "freeze", and "thaw" commands in XSPEC.

10. Create a model map

For comparison to the counts map data, we create a model map of the region based on the fit parameters. This map is essentially an infinite-statistics counts map of the region-of-interest based on our model fit.

The gtmodel application reads in the fitted model, applies the proper scaling to the source maps, and adds them together to get the final map.

prompt> gtmodel
Source maps file [] 3C279_binned_srcmaps.fits
Source model file [] 3C279_binned_output.xml
Output file [] 3C279_model_map.fits
Response functions [] P7SOURCE_V6
Exposure cube [] 3C279_binned_ltcube.fits
Binned exposure map [] 3C279_binned_allsky_expcube.fits
prompt>

To understand how well the fit matches the data, we want to compare the model map just created with a counts map over the same field of view. First we have to create a new counts map that matches the model map in size (the counts map generated in Step 2 encircles the ROI, while the model map is completely inscribed within it). We again use the gtbin tool with the "CMAP" option, as shown below:

prompt> gtbin
This is gtbin version ScienceTools-09-27-01
Type of output file (CCUBE|CMAP|LC|PHA1|PHA2) [] CMAP
Event data file name[] 3C279_binned_gti.fits
Output file name[] 3C279_binned_cmap_small.fits
Spacecraft data file name[] NONE
Size of the X axis in pixels[] 200
Size of the Y axis in pixels[] 200
Image scale (in degrees/pixel)[] 0.2
Coordinate system (CEL - celestial, GAL -galactic) (CEL|GAL) [CEL]
First coordinate of image center in degrees (RA or galactic l)[] 193.98
Second coordinate of image center in degrees (DEC or galactic b)[] -5.82
Rotation angle of image axis, in degrees[] 0.0
Projection method e.g. AIT|ARC|CAR|GLS|MER|NCP|SIN|STG|TAN:[] STG

Here we've plotted the model map next to the energy-summed counts map for the data.

energy-summed counts map and the corresponding model map

Finally, we create a residual map using the FTOOL farith to check whether the model can be improved:

prompt> farith
Name of 1st FITS file and [ext#][] 3C279_binned_cmap_small.fits
Name of 2nd FITS file and [ext#][] 3C279_model_map.fits
Name of OUTFIL FITS file[] 3C279_residual.fits
Operation to perform (ADD,SUB,DIV,MUL(or +,-,/,*),MIN,MAX)[] SUB
prompt>

The residual map is shown below. As you can see, the binning we chose probably used pixels that were too large. The primary sources, 3C 273 and 3C 279, have some positive pixels next to some negative ones. This effect could be lessened by either using a smaller pixel size or by offsetting the central position slightly from the position of the blazar (or both). If your residual map contains bright sources, the next step would be to iterate the analysis with the additional sources included in the XML model file.

Residuals map after subtraction


Last updated by: Elizabeth Ferrara - 12/21/2013