# Introduction to the Data Primitives¶

Within the GBM Data Tools, there are data primitive classes that handle the properties and manipulation of the fundamental types of data that GBM produces. Often you won’t need to directly worry about these classes and how they work if you are doing sufficiently high-level work using the rest of the data tools (e.g. plotting lightcurves/spectra, standard data binning, background/spectral fitting, etc.), however the API exposes a full array of properties and methods that allow you to work directly with the data primitives.

In general, there are three primary primitive types that describe GBM Data: `Bins`

, `TimeEnergyBins`

, and `EventList`

. Specifically, `Bins`

refers to 1D binned data (histograms), and there are two derived types with special additional properties: `TimeBins`

and `EnergyBins`

. While `TimeEnergyBins`

is a class representing 2D binned data, where one axis is time and the other is energy. The final of these primitive types, `EventList`

, is a class representing a list of events, where
each event has some attributes, namely time and energy channel.

In addition to these primary primitives, there are two more additional “utility” primitives called `TimeRange`

and `GTI`

which represent a single span of time, and a series of disjoint spans of time, respectively.

## TimeBins and EnergyBins¶

These classes are 1D histograms with special properties. Both can be instantiated with an array of `counts`

(number of things in each bin), `lo_edges`

and `hi_edges`

(the edge boundaries for each bin), and `exposure`

(the exposure in each bin).

```
[1]:
```

```
import numpy as np
from gbm.data.primitives import TimeBins, EnergyBins
# create a TimeBins object: 4 time bins, with exposure < binwidth due to some deadtime
time_counts = np.array([1,3,5,2])
time_lo_edges = np.array([1.0, 2.0, 3.0, 4.0])
time_hi_edges = np.array([2.0, 3.0, 4.0, 5.0])
time_exposure = np.array([0.99, 0.98, 0.97, 0.96])
timebins = TimeBins(time_counts, time_lo_edges, time_hi_edges, time_exposure)
# create an EnergyBins object: 4 energy bins, with the same exposure for each bin
energy_counts = np.array([1,3,5,2])
energy_lo_edges = np.array([10.0, 35.0, 100.0, 250.0])
energy_hi_edges = np.array([35.0, 100.0, 250.0, 600.0])
energy_exposure = np.array([9.67]*4)
energybins = EnergyBins(energy_counts, energy_lo_edges, energy_hi_edges, energy_exposure)
```

There are a variety of properties that can be retrieved from the objects once they’ve been created. For example, maybe you want to know what the rate (and rate uncertainty) is for each of your bins:

```
[2]:
```

```
timebins.rates, \
timebins.rate_uncertainty
```

```
[2]:
```

```
(array([1.01010101, 3.06122449, 5.15463918, 2.08333333]),
array([1.01010101, 1.76739878, 2.30522472, 1.47313913]))
```

Or maybe you need the bin centers and full domain of the data because you want to make a plot:

```
[3]:
```

```
timebins.centroids, \
timebins.range
```

```
[3]:
```

```
(array([1.5, 2.5, 3.5, 4.5]), (1.0, 5.0))
```

Similarly, you can retrieve the same properties for `EnergyBins`

, with some important differences. `EnergyBins`

is defined to account for a spectrum that spans orders of magnitude, therefore the bin centroids are calculated as the geometric mean, instead of the arithmetic mean. Also, the rate is naturally defined as the differential rate (i.e. divided by the bin width).

```
[4]:
```

```
energybins.centroids, \
energybins.rates
```

```
[4]:
```

```
(array([ 18.70828693, 59.16079783, 158.11388301, 387.29833462]),
array([0.0041365 , 0.00477289, 0.00344709, 0.00059093]))
```

In addition to these properties (and more), these classes have some important methods. You can add multiple `TimeBins`

or `EnergyBins`

together to create a new class that is the sum (provided they cover the same time and energy range).

```
[5]:
```

```
timebins2 = TimeBins.sum([timebins, timebins])
timebins.counts, \
timebins2.counts
```

```
[5]:
```

```
(array([1, 3, 5, 2]), array([ 2., 6., 10., 4.]))
```

```
[6]:
```

```
energybins2 = EnergyBins.sum([energybins, energybins])
energybins.counts, \
energybins2.counts
```

```
[6]:
```

```
(array([1, 3, 5, 2]), array([ 2., 6., 10., 4.]))
```

You can merge multiple `TimeBins`

or `EnergyBins`

together into one (even if they are overlapping)

```
[7]:
```

```
new_timebins = TimeBins(np.array([10., 20.0, 3.0]), np.array([8.0, 9.0, 10.0]), np.array([9.0, 10.0, 11.0]), np.array([0.97, 0.96, 0.97]))
merged_timebins = TimeBins.merge([timebins, new_timebins])
merged_timebins.centroids, \
merged_timebins.counts
```

```
[7]:
```

```
(array([ 1.5, 2.5, 3.5, 4.5, 8.5, 9.5, 10.5]),
array([ 1., 3., 5., 2., 10., 20., 3.]))
```

And because data can have gaps in it (and that can complicate analysis if not properly handled), these classes have the ability to split into a list of contiguous bins:

```
[8]:
```

```
split_timebins = merged_timebins.contiguous_bins()
[(tb.centroids, tb.counts) for tb in split_timebins]
```

```
[8]:
```

```
[(array([1.5, 2.5, 3.5, 4.5]), array([1., 3., 5., 2.])),
(array([ 8.5, 9.5, 10.5]), array([10., 20., 3.]))]
```

Maybe we don’t want all of our data for an analysis, but only a snippet. We can make a slice:

```
[9]:
```

```
# slice energy bins so that we only have data in the 50-200 keV range, inclusive
slice_energybins = energybins.slice(50.0, 200.0)
slice_energybins.range
```

```
[9]:
```

```
(35.0, 250.0)
```

And finally we can rebin the data if we want, using some type of binning algorithm:

```
[10]:
```

```
from gbm.binning.binned import combine_by_factor
rebinned_energybins = energybins.rebin(combine_by_factor, 2) # rebin by a factor of 2
rebinned_energybins.centroids, \
rebinned_energybins.counts
```

```
[10]:
```

```
(array([ 31.6227766 , 244.94897428]), array([4, 7]))
```

## TimeEnergyBins¶

This class is a 2D representation of data that have 2 axes: time and energy. To successfully create such an object, you need to provide a 2D `counts`

array (the number of counts in each time bin for each energy channel), `tstart`

and `tstop`

, the edges for the time bins, `exposure`

the time exposure in each bin, and `emin`

and `emax`

, the edges for the energy bins. In general, many of the same properties and methods for `TimeBins`

and `EnergyBins`

are available here, keeping in
mind that there are different methods for each axis.

```
[11]:
```

```
from gbm.data.primitives import TimeEnergyBins
counts = np.array([[1,10], [3, 20], [5,3], [2,7]]) # 4 time bins, 2 energy channels
tstart = np.array([1.0, 2.0, 3.0, 4.0])
tstop = np.array([2.0, 3.0, 4.0, 5.0])
exposure = np.array([0.99, 0.98, 0.97, 0.96])
emin = np.array([35.0, 150.0])
emax = np.array([150.0, 350.0])
bins = TimeEnergyBins(counts, tstart, tstop, exposure, emin, emax)
# how many times, how many channels
bins.numtimes, \
bins.numchans
```

```
[11]:
```

```
(4, 2)
```

Similar to the 1D counterparts, we can access various properties like the bin centroids:

```
[12]:
```

```
bins.time_centroids, \
bins.energy_centroids
```

```
[12]:
```

```
(array([1.5, 2.5, 3.5, 4.5]), array([ 72.45688373, 229.12878475]))
```

Or the bin widths:

```
[13]:
```

```
bins.time_widths, \
bins.chan_widths
```

```
[13]:
```

```
(array([1., 1., 1., 1.]), array([115., 200.]))
```

And we can immediately return the rates and uncertainty for each bin:

```
[14]:
```

```
print(bins.rates)
print(bins.rate_uncertainty)
```

```
[[ 1.01010101 10.1010101 ]
[ 3.06122449 20.40816327]
[ 5.15463918 3.09278351]
[ 2.08333333 7.29166667]]
[[1.01010101 3.19421986]
[1.76739878 4.56340404]
[2.30522472 1.78561939]
[1.47313913 2.75599095]]
```

As for the available methods, you can perform all of following operations similar to the 1D counterparts: * Retrieve a list of contiguous bins (`contiguous_time_bins()`

and `contiguous_energy_bins()`

) * Perform a slice (`slice_time()`

and `slice_energy()`

) * Rebin (`rebin_time()`

and `rebin_energy()`

) * Merge multiple `TimeEnergyBins`

across one axis (`merge_time()`

and `merge_energy()`

)

In addition to these methods, you can also integrate along one axis of `TimeEnergyBins`

to return the corresponding `TimeBins`

or `EnergyBins`

object:

```
[15]:
```

```
# integrate over energy
energy_integrated = bins.integrate_energy()
type(energy_integrated), \
energy_integrated.counts
```

```
[15]:
```

```
(gbm.data.primitives.TimeBins, array([11, 23, 8, 9]))
```

```
[16]:
```

```
# integrate over time
time_integrated = bins.integrate_time()
type(time_integrated), \
time_integrated.counts
```

```
[16]:
```

```
(gbm.data.primitives.EnergyBins, array([11, 40]))
```

## EventList¶

Unlike the previous primitives, `EventList`

is design to work with *semi-unbinned* data. Generally `EventList`

is a list of events with attributes; specifically, it is a list of event arrival times with a PHA channel attribute. The *semi-unbinned* qualification comes from the fact that the time axis of the data is not binned (down to the precision of the instrument being able to measure the individual arrival times of photons), but the energy axis *is necessarily* binned into energy
channels. For this reason, `EventList`

not only contains a list of times, but has some metadata that translates the PHA channel attribute to an energy calibration.

Because of the metadata requirement, there are a couple of ways to create an `EventList`

object. One can instantiate with the relevant arrays in numpy.recarray or astropy.io.fits.fitsrec.FITS_rec format (e.g. from reading one of the GBM TTE data files). Or maybe we want to *create* an EventList from a simulation. Then we would use `EventList.from_lists()`

which requires a list of arrival times, a list of PHA channels, and lists of the lower and upper edges of the energy channel calibration.

```
[17]:
```

```
from gbm.data.primitives import EventList
# 7 events
times = [1.0, 1.5, 2.7, 3.2, 4.9, 8.8, 8.7]
phas = [3, 1, 0, 2, 1, 3, 0]
# 4 channels
emin = [10.0, 35.0, 100.0, 250.0]
emax = [35.0, 100.0, 250.0, 600.0]
eventlist = EventList.from_lists(times, phas, emin, emax)
eventlist.size, \
eventlist.time_range, \
eventlist.energy_range
```

```
[17]:
```

```
(7, (1.0, 8.8), (10.0, 600.0))
```

One of the nice properties of `EventList`

is that you can immediately retrieve the count spectrum as an `EnergyBins`

object:

```
[18]:
```

```
spectrum = eventlist.count_spectrum
type(spectrum), \
spectrum.counts
```

```
[18]:
```

```
(gbm.data.primitives.EnergyBins, array([2, 2, 1, 2]))
```

As with the binned datatypes, you can slice in time and energy, which returns a new `EventList`

containing the slice:

```
[19]:
```

```
time_sliced = eventlist.time_slice(2.0, 5.0) # return events with 2-5 s
chan_sliced = eventlist.channel_slice(1, 2) # return events in channels 1 & 2
energy_sliced = eventlist.energy_slice(10.0, 100.0) # return events in energy channels spanning 10-100 keV
time_sliced.size, \
chan_sliced.size, \
energy_sliced.size
```

```
[19]:
```

```
(3, 3, 4)
```

As with `TimeEnergyBins`

, you can rebin the `EventList`

in energy using `rebin_energy()`

. You can also *bin* in time, given a binning function, which returns a `TimeEnergyBins`

object:

```
[20]:
```

```
from gbm.binning.unbinned import bin_by_time
bins = eventlist.bin(bin_by_time, 1.0) # bins the data into 1.0 s wide bins
type(bins), \
bins.counts
```

```
[20]:
```

```
(gbm.data.primitives.TimeEnergyBins,
array([[0., 1., 0., 1.],
[1., 0., 0., 0.],
[0., 0., 1., 0.],
[0., 1., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[1., 0., 0., 1.]]))
```

You can also retrieve the exposure for all or part of the EventList. By default, this assumes the deadtime profile for GBM, but can be modified through kwargs:

```
[21]:
```

```
eventlist.get_exposure(), \
eventlist.get_exposure(time_ranges=(1.0, 5.0))
```

```
[21]:
```

```
(7.7999670000000005, 3.9999796)
```

Finally, multiple EventLists can be merged into a single `EventList`

. If there is a chance that the EventLists may contain duplicate events, you may want to set the *force_unique* keyword.

```
[22]:
```

```
merged_eventlist = EventList.merge([time_sliced, chan_sliced], force_unique=True)
merged_eventlist.size
```

```
[22]:
```

```
4
```

## TimeRange and GTI¶

These two primitives are designed to represent time ranges (surprise!) and series of time ranges. The `TimeRange`

class provides a simple class defining a time range and associated methods. `GTI`

(Good Time Intervals) can contain many `TimeRanges`

.

Let’s create a `TimeRange`

:

```
[23]:
```

```
from gbm.data.primitives import TimeRange, GTI
# a time range from 0 s to 10 s
timerange1 = TimeRange(0.0, 10.0)
# time range duration and center
timerange1.duration, \
timerange1.center
```

```
[23]:
```

```
(10.0, 5.0)
```

You can check to see if it contains a time of interest:

```
[24]:
```

```
timerange1.contains(3.5), \
timerange1.contains(-1.0)
```

```
[24]:
```

```
(True, False)
```

You can also do simple time range “arithmetic”:

```
[25]:
```

```
timerange2 = TimeRange(4.8, 22.0)
time_union = TimeRange.union(timerange1, timerange2)
time_intersect = TimeRange.intersection(timerange1, timerange2)
print(time_union)
print(time_intersect)
```

```
(0.0, 22.0)
(4.8, 10.0)
```

As for `GTI`

, you can instantiate it a few different ways. Let’s start with just one range:

```
[26]:
```

```
gti1 = GTI(*timerange1.as_tuple())
# number of contiguous intervals, full range
gti1.num_intervals, \
gti1.range
```

```
[26]:
```

```
(1, (0.0, 10.0))
```

And now we can insert a new time range into our `GTI`

:

```
[27]:
```

```
gti1.insert(23.0, 55.0) # insert new interval from 23 s to 55 s
gti1.num_intervals, \
gti1.range
```

```
[27]:
```

```
(2, (0.0, 55.0))
```

And we can merge two `GTI`

into one:

```
[28]:
```

```
# create another GTI from a list of (tstart, tstop) tuples
gti2 = GTI.from_list([(100.0, 200.0), (250.0, 275.0), (507.0, 508.)])
gti_merged = GTI.merge(gti1, gti2)
gti_merged.num_intervals, \
gti_merged.range
```

```
[28]:
```

```
(5, (0.0, 508.0))
```

As with the simple `TimeRange`

, we can check to see if a time of interest is contained in the `GTI`

:

```
[29]:
```

```
gti_merged.contains(17.0), \
gti_merged.contains(170.0)
```

```
[29]:
```

```
(False, True)
```

And if you want to see the full `GTI`

intervals:

```
[30]:
```

```
gti_merged.as_list()
```

```
[30]:
```

```
[(0.0, 10.0), (23.0, 55.0), (100.0, 200.0), (250.0, 275.0), (507.0, 508.0)]
```

Hopefully your eyes haven’t glazed over yet. If not, you’ve survived the deepest dive into code! Onward to the plugin architecture and capabilities!