# Likelihood Model Fitting

We now have all the inputs necessary for fitting parameters using the likelihood tool, gtlike:

• An event file with the counts to be fit.
• A spacecraft file that covers the time range over which the counts in the event file were extracted.
• A source model.
• Precomputed quantities, such as the livetime cube and the exposure map.

Fitting involves finding the set of parameters that maximizes the likelihood. Since the likelihood is a non-linear function of the parameters, algorithms for maximizing non-linear functions can be used. The maximum is found by iteratively calculating the function for different sets of trial parameters; by estimating derivatives of the function with respect to the parameters, the algorithms choose new trial parameters that are progressively closer to the set that maximizes the function. The function is calculated for new sets of trial parameters until the change in the function value between iterations is sufficiently small (or the number of iterations reaches a maximum value). While iterating, these algorithms map out the dependence of the function on the parameters, particularly near the function's maximum. The uncertainties on the best fit parameters are related to this dependence. Different algorithms vary in how rapidly they converge to the function maximum, the amount of computer memory they require, and the accuracy with which they map out the dependence of the function on the parameters near the maximum (and thus estimate the uncertainty).
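The iteration scheme described above can be sketched with a toy one-parameter example. The Poisson log-likelihood, the Newton update, and the fractional-change stopping rule below are purely illustrative and are not the actual gtlike internals:

```python
import math

def log_like(mu, counts):
    # Poisson log-likelihood, up to a constant: sum_i [n_i * log(mu) - mu]
    return sum(n * math.log(mu) - mu for n in counts)

def fit(counts, mu0=1.0, tol=1e-8, max_iter=100):
    mu, prev = mu0, log_like(mu0, counts)
    for _ in range(max_iter):
        grad = sum(counts) / mu - len(counts)   # d(logL)/d(mu)
        hess = -sum(counts) / mu ** 2           # d^2(logL)/d(mu)^2
        mu -= grad / hess                       # Newton step toward the maximum
        cur = log_like(mu, counts)
        if abs(cur - prev) <= tol * abs(prev):  # stop when the fractional change is small
            break
        prev = cur
    # The curvature of the likelihood near its maximum sets the parameter uncertainty.
    sigma = 1.0 / math.sqrt(-hess)
    return mu, sigma

counts = [3, 5, 4, 6, 2]
mu_hat, sigma = fit(counts)  # the analytic maximum is the sample mean, 4.0
```

The same ideas scale up: with more parameters the derivative becomes a gradient vector and the curvature a Hessian matrix, and the optimizers described below differ mainly in how they approximate that Hessian and choose each step.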

In running gtlike you have a choice of algorithms, called optimizers, for maximizing the likelihood. The optimizers determine the best-fit spectral parameters, but not the location; the source coordinates are fixed. (To find the location, use the gtfindsrc tool.) In the Fermitools there are five optimizers to maximize the log likelihood function: DRMNGB, DRMNFB, NEWMINUIT, MINUIT and LBFGS.

• DRMNGB finds the local minimum of a continuously differentiable function subject to simple upper and lower bound constraints. It uses a variant of Newton's method with a quasi-Newton Hessian updating scheme and a model/trust-region technique to aid convergence from poor starting values. The original code, obtained from Netlib, is in Fortran; it was converted to C++ and has some convergence problems.
• DRMNFB interfaces with many of the same subroutines as DRMNGB, but handles the derivative information differently and does not seem to suffer from some of the convergence problems encountered with DRMNGB.
• MINUIT interfaces with the original FORTRAN Minuit processed by f2c (FORTRAN to C translator) and then adapted to compile as C++. In the Fermitools, only a few of MINUIT's possibilities are used. For example, all variables are treated as bounded. No user interaction is allowed, and only the MIGRAD algorithm is implemented. For more information about MINUIT, see the MINUIT Function Minimization and Error Analysis Reference Manual, available from the documentation section of the Minuit website.
• NEWMINUIT interfaces with an entirely new code written in 'true' C++ and designed in an object-oriented way. It is based on the original MINUIT algorithms and functionality but uses only a few of MINUIT's features: the MIGRAD and HESSE algorithms. All variables are treated as bounded. No user interaction is allowed and, while there is no hard limit on the number of free parameters, there is certainly a practical limit beyond which any fit is suspect. The MINUIT manual suggests a maximum of around 15.
• LBFGS was originally obtained from Netlib. The original code is in Fortran but, as with the others, it was translated to C++. The "L" in the name stands for "limited memory", meaning that the full approximate Hessian is not available.

Note: Generally speaking, a reasonable strategy is to run gtlike with the DRMNFB optimizer until convergence, and then to run gtlike using the NEWMINUIT optimizer with the best fit parameter values from this first run in order to calculate the uncertainties on the parameter values. DRMNFB is efficient at finding the maximum likelihood but approximates the parameter dependence near this maximum. Consequently, the uncertainties provided by this optimizer may not be reliable. On the other hand, NEWMINUIT is a conservative optimizer that converges more slowly than these other methods, and indeed may exhaust the number of permitted iterations before convergence. However, NEWMINUIT more accurately maps out the parameter space near the likelihood maximum, and thus provides more reliable uncertainty estimates.

The convergence criterion is controlled by the hidden parameter 'fit_tolerance', whose precise definition depends on the particular optimizer but is approximately the fractional change in the logarithm of the likelihood.

### Likelihood Output

Interpretation of the gtlike output requires knowledge of the parameters used in the source model. The output reports six quantities: Prefactor, Index, Scale, Npred, ROI distance, and TS value. The likelihood analysis generates a flux value, listed as the Prefactor. To convert the Prefactor to a flux, multiply it by the reported Scale value and then apply the scale factor used in the source XML model. The resulting value is in units of 10⁻⁸ ph cm⁻² s⁻¹.
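As a worked example of this conversion (all numbers below are invented for illustration; real values come from your gtlike output and source XML model):

```python
# Hypothetical gtlike output for a source (values are made up):
prefactor = 2.1      # fitted Prefactor reported by gtlike
scale = 1.0e-1       # Scale value reported in the output
xml_scale = 1.0e-7   # scale factor used in the source XML model

# Multiply the Prefactor by the Scale, then apply the XML scale factor.
flux = prefactor * scale * xml_scale   # ph cm^-2 s^-1
flux_in_1e8 = flux / 1.0e-8            # expressed in units of 1e-8 ph cm^-2 s^-1
```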

The value reported for Index is the fitted spectral index over the entire energy range used in the analysis. As the likelihood analysis is considered best for dealing with multidimensional data, there is often a desire to generate a spectral fit using this technique. To accomplish this, divide the data into discrete energy ranges using gtselect, and then perform the likelihood analysis separately over each individual energy range. The result is one fitted value per range, and these points can be combined to generate the spectrum. In this case, the energy error bar on each point is the extent of the range used in that fit.
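A sketch of how the energy ranges for such a binned spectral fit might be laid out (the overall range, number of bins, and geometric-mean placement below are illustrative choices, not fixed by gtlike):

```python
import math

# Hypothetical overall analysis range (MeV), split into logarithmic bins;
# each bin would be selected with gtselect and fit separately with gtlike.
e_min, e_max, n_bins = 100.0, 100000.0, 5
edges = [e_min * (e_max / e_min) ** (i / n_bins) for i in range(n_bins + 1)]

# One spectral point per bin: plotted at the geometric mean of the bin,
# with the "error" in energy simply the extent of the bin used in the fit.
points = [(math.sqrt(lo * hi), lo, hi) for lo, hi in zip(edges[:-1], edges[1:])]
```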

The number of photons the model predicts for the source is listed as Npred. A very small Npred value will produce large errors and may indicate the need for a longer time span in the fit. The ROI distance indicates the angular separation between the center of the ROI and the fitted location of that source.

Finally, the TS value is the Test Statistic resulting from the fit. As a general rule, the Test Statistic is approximately the square of the significance. For a more complete explanation of the test statistic resulting from a likelihood fit, see Mattox, J. et al. 1996, ApJ, 461, 396.
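For instance, under the rule of thumb above, a hypothetical TS of 25 corresponds to a detection significance of roughly √TS = 5σ:

```python
import math

ts = 25.0                      # hypothetical Test Statistic from a fit
significance = math.sqrt(ts)   # rough significance in sigma, here 5.0
```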
