I recently wrote a guest blog post for the Bio-computational Evolution in Action Consortium (BEACON) on the importance of teaching, or at least introducing, all STEM graduate students to cognitive science and the cognitive flaws that can affect human reasoning. The full post can be found here. The following is a summary of the post:
In a world in which scientific and technological breakthroughs dominate almost every aspect of human life, scientists and researchers are under ever-increasing pressure to expand the borders of human knowledge. As new discoveries demand higher levels of precision and reproducibility, excessive workloads and hyper-competitive work environments have made researchers more prone to cognitive biases. One solution to this emerging problem is to introduce all graduate students in STEM fields to the limitations of the human mind and of scientific instruments, and to their potential role in false-positive discoveries and research misconduct. I suggest that a full-semester course covering the relevant topics, including those mandated by the NSF as Responsible Conduct of Research, should be developed and tailored for each STEM field and offered as an integral core course of every graduate program across the world.
– BATSE Catalog trigger numbers for 1366 events classified as Long GRBs in this analysis can be found here.
– Spectral data for these triggers are available from BATSE Current & 4B GRB Catalogs.
– The spectral peak energy EP,obs estimates for all LGRBs can be found here.
– Full MCMC samples of the model parameters for different cosmic rates can be found in the project’s repository.
A Brief (incomplete) Introduction to the Story:
The Luminosity Function (LF) of Long-duration Gamma-Ray Bursts (LGRBs) has been the subject of much research in the GRB community. Early attempts to constrain the LF of LGRBs in the BATSE era were primarily aimed at finding the true origin of LGRBs: cosmological vs. Galactic. Back in the 1990s, there was a great debate over, and suspicion about, the cosmological origin of LGRBs, with some scientists arguing that a cosmological origin would imply an enormous energy output, on the order of 10^51 erg, in a matter of a few seconds. Nevertheless, observations of GRB afterglows in the late 1990s and the first measurement of a GRB redshift ruled out the Galactic models for GRBs, or at least for many classes of gamma-ray events. With the GRB distance puzzle solved, researchers turned to other interesting aspects of these bursts, such as GRB energetics and the correlations among the spectral parameters of the prompt gamma-ray emission from (mostly) LGRBs. Most prominently, some observational astronomers reported strong correlations between the total isotropic gamma-ray emission (Eiso), or the isotropic peak luminosity (Liso), and the spectral peak energies (EP,z) of LGRBs. Such correlations were later criticized by other researchers for their lack of significance and for sample incompleteness. The culprit turns out to be the unknown, complex selection effects in the GRB detection mechanism, spectral analysis, and redshift measurement, which make the observed sample of LGRBs differ from the true underlying population without leaving a clear trace. The debate continues to this day among GRB researchers (cf. Shahmoradi & Nemiroff 2011 for a complete review).
The goal of the presented analysis was to derive a multivariate model capable of reproducing the joint 4-dimensional distribution of 4 spectral parameters of 1366 BATSE LGRBs:
– Liso: the isotropic peak luminosity
– Eiso: the isotropic total emission
– EP,z: the comoving-frame time-integrated spectral peak energy
– T90,z: the comoving-frame duration
Examples of multivariate treatment of LGRB data are rare in the Gamma-Ray Burst literature, with the most recent (and perhaps the only) such work presented by Butler et al. (2010). Instead, many authors have focused primarily on the univariate distributions of the spectral parameters, most importantly on the luminosity function (LF). A variety of univariate models have been proposed for the LGRB LF and fit to data by approximating the complex detector threshold as a step function (e.g., Schmidt 1999), as an efficiency grid (e.g., the four-interval efficiency modeling of Guetta et al. 2005), or by other approximation methods. A more accurate modeling of the LF, however, requires at least two LGRB observables to be incorporated in the model: the observed bolometric peak flux (Pbol) and the observed spectral peak energy (EP). The parameter EP is required because most gamma-ray detectors are photon counters, and the photon count rate depends not only on Pbol but also on the EP of the burst. A bivariate distribution is therefore the minimum acceptable model for constraining the LF. The choice of model is otherwise almost unconstrained, since current theories of LGRB prompt emission do not set strong limits on the shape and range of the luminosity function (or of any other LGRB spectral variable).
A Multivariate (4-Dimensional) Model of LGRBs Spectral Parameters:
Here, the multivariate log-normal distribution is proposed as the simplest natural candidate model capable of describing the data. The motivation for this choice comes from the available observational data, which closely resemble a joint multivariate log-normal distribution for the four most widely studied spectral parameters of LGRBs in the observer frame:
– Pbol: the bolometric peak flux
– Sbol: the bolometric fluence
– Ep: the observed spectral peak energy
– T90: the observed duration
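To make the idea of a joint log-normal model concrete, here is a minimal sketch that draws synthetic bursts whose log10 values of (Pbol, Sbol, Ep, T90) follow a 4-dimensional normal distribution. The means, standard deviations, and correlation coefficients below are placeholders chosen only for illustration; they are not the fitted values from the analysis, and the function name `sample_lognormal_4d` is hypothetical.

```python
import numpy as np

def sample_lognormal_4d(mean_log10, cov_log10, size, seed=0):
    """Draw samples whose log10 values follow a multivariate normal distribution."""
    rng = np.random.default_rng(seed)
    log_samples = rng.multivariate_normal(mean_log10, cov_log10, size=size)
    return 10.0 ** log_samples

# Placeholder parameters (NOT fitted values). Column order: Pbol, Sbol, Ep, T90.
mean_log10 = np.array([0.0, -5.5, 2.2, 1.5])
sigma = np.array([0.6, 0.8, 0.4, 0.5])
corr = np.array([
    [1.0, 0.5, 0.3, 0.1],
    [0.5, 1.0, 0.2, 0.2],
    [0.3, 0.2, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
])
# Covariance in log10 space, built from the correlations and standard deviations.
cov_log10 = corr * np.outer(sigma, sigma)

samples = sample_lognormal_4d(mean_log10, cov_log10, size=20000)
log_samples = np.log10(samples)
print("sample means (log10):", log_samples.mean(axis=0))
print("sample correlation (log10):")
print(np.corrcoef(log_samples, rowvar=False))
```

Working in log10 space keeps the model an ordinary multivariate normal, so the whole distribution is specified by 4 means and a 4x4 covariance matrix.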
Since most LGRBs originate from moderate redshifts z~1-3 (a fact known thanks to Swift satellite observations), convolving these observer-frame parameters with the redshift distribution results in negligible variation in the shape of the comoving-frame joint distribution of the same LGRB parameters. Therefore, the redshift-convolved 4-dimensional (4D) observer-frame distribution can be well approximated as a linear translation (in logarithmic space) from the observer-frame parameter space to the comoving-frame parameter space, keeping the shape of the distribution almost intact. This implies that the joint distribution of the intrinsic LGRB variables, namely the isotropic peak luminosity (Liso), the total isotropic emission (Eiso), the rest-frame spectral peak energy (Ep,z), and the rest-frame gamma-ray duration (T90,z), might indeed be well described by a multivariate log-normal distribution.
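The observer-frame to rest-frame mapping can be sketched as follows. This is an illustration under stated assumptions, not the code used in the analysis: it assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3 (values picked here for convenience), uses the standard relations Liso = 4π dL² Pbol, Eiso = 4π dL² Sbol / (1+z), Ep,z = Ep (1+z) and T90,z = T90 / (1+z), and the function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458               # speed of light [km/s]
MPC_TO_CM = 3.0856775814913673e24  # centimeters per megaparsec

def luminosity_distance_cm(z, h0=70.0, omega_m=0.3, n_steps=2000):
    """Luminosity distance [cm] in an assumed flat LambdaCDM cosmology."""
    zs = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zs) ** 3 + (1.0 - omega_m))
    # Trapezoidal integration of dz / E(z) from 0 to z.
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zs))
    comoving_mpc = (C_KM_S / h0) * integral
    return (1.0 + z) * comoving_mpc * MPC_TO_CM

def to_rest_frame(pbol, sbol, ep_obs, t90, z):
    """Map observer-frame quantities to rest-frame ones.

    pbol in erg/s/cm^2, sbol in erg/cm^2, ep_obs in keV, t90 in seconds.
    """
    area = 4.0 * np.pi * luminosity_distance_cm(z) ** 2
    return {
        "Liso": area * pbol,              # erg/s
        "Eiso": area * sbol / (1.0 + z),  # erg
        "Epz": ep_obs * (1.0 + z),        # keV
        "T90z": t90 / (1.0 + z),          # s
    }

burst = to_rest_frame(pbol=1e-6, sbol=1e-5, ep_obs=200.0, t90=30.0, z=1.0)
for name, value in burst.items():
    print(f"{name}: {value:.3e}")
```

At a fixed redshift every one of these relations is a multiplication by a constant, which becomes an additive shift in log space. That is why a log-normal shape in the observer frame is approximately preserved when the redshift range is narrow.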
For this study, LGRB data were collected from the largest catalog of bursts available to date: the 2130 BATSE GRBs. First, an elaborate method was devised to separate the two classes of GRBs, short vs. long; details are described in Shahmoradi (2012). Then the proposed quadru-variate LGRB world model, convolved with the BATSE trigger threshold, was fit to the observational data of 1366 BATSE LGRBs. The best-fit parameters were obtained by maximizing the likelihood function of the model given the BATSE data, using a Markov chain to randomly explore the likelihood space for the global maximum. Extensive goodness-of-fit tests were performed to ensure that the model adequately describes the data.
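The fitting strategy can be illustrated with a deliberately simplified toy problem. The sketch below is not the 4D model or the BATSE trigger model of the analysis: it assumes a one-dimensional normal distribution in log flux and a sharp detection threshold, then uses a basic Metropolis random walk to explore the likelihood of the detected sample only. All names (`log_likelihood`, `metropolis`) and numbers are hypothetical.

```python
import math
import numpy as np

def log_likelihood(params, x, threshold):
    """Log-likelihood of detected values x under a normal model truncated at threshold."""
    mu, log_sigma = params
    sigma = math.exp(log_sigma)
    # Probability that an event drawn from the model exceeds the threshold.
    p_detect = 0.5 * math.erfc((threshold - mu) / (sigma * math.sqrt(2.0)))
    if p_detect <= 0.0:
        return -math.inf
    resid = (x - mu) / sigma
    log_pdf = -0.5 * resid ** 2 - log_sigma - 0.5 * math.log(2.0 * math.pi)
    # Each detected event is normalized by the detection probability.
    return float(np.sum(log_pdf)) - x.size * math.log(p_detect)

def metropolis(x, threshold, n_steps=5000, step=0.03, seed=1):
    """Random-walk Metropolis over (mu, log sigma), tracking the best point visited."""
    rng = np.random.default_rng(seed)
    current = np.array([x.mean(), math.log(x.std())])  # naive, biased starting point
    current_ll = log_likelihood(current, x, threshold)
    best, best_ll = current.copy(), current_ll
    chain = np.empty((n_steps, 2))
    for i in range(n_steps):
        proposal = current + step * rng.standard_normal(2)
        proposal_ll = log_likelihood(proposal, x, threshold)
        accept_prob = math.exp(min(0.0, proposal_ll - current_ll))
        if rng.random() < accept_prob:
            current, current_ll = proposal, proposal_ll
            if current_ll > best_ll:
                best, best_ll = current.copy(), current_ll
        chain[i] = current
    return chain, best, best_ll

# Synthetic population: only events above the threshold are "detected".
rng = np.random.default_rng(0)
threshold = -1.0
all_events = rng.normal(0.0, 1.0, size=6000)
detected = all_events[all_events > threshold]

chain, best, best_ll = metropolis(detected, threshold)
print("naive mean of detected sample:", detected.mean())
print("best-fit mu, sigma:", best[0], math.exp(best[1]))
```

The naive mean of the detected sample is biased high because faint events are missing; including the detection probability in the likelihood is what lets the fit recover the parameters of the underlying population.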
Here is an example prediction of the LGRB world model for the well-known Eiso-EP,z (the Amati) relation:
and here is an example plot of the reconstruction of the univariate BATSE LGRB data measured in the observer-frame according to the best-fit parameters:
The psychological literature is full of studies that demonstrate how humans’ limited senses can result in cognitive flaws and biases in our understanding of the universe. In fact, psychologists have pinpointed many specific biases that affect not only the way we see but also how we think about and perceive the world. Confirmation bias, for example, is the tendency to notice, accept, and remember data that confirm what we already believe, and to ignore, forget, or explain away data that contradict our beliefs. To make things worse, add the (unknown) limitations of the instruments with which humans probe the universe. The combined effects of human and instrument biases can result in erroneous conclusions and predictions.
Fortunately, many such biases are now well understood by scientists, in particular experimental physicists, observational astronomers, and quantitative biologists. An example is the well-known Malmquist bias in observational astronomy. Nevertheless, as our circle of knowledge expands, so does the circumference of darkness surrounding it, bringing with it new types of biases and selection effects that might affect humans’ understanding of natural phenomena.
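For readers unfamiliar with the Malmquist bias, here is a small simulated illustration. It is a toy sketch with made-up numbers, assuming sources spread uniformly through a Euclidean volume, log-normally distributed luminosities, and a hard flux limit. Intrinsically bright sources are over-represented in the detected sample.

```python
import numpy as np

rng = np.random.default_rng(42)
n_sources = 100000

# log10 luminosity in arbitrary units (assumed log-normal population).
log_lum = rng.normal(0.0, 0.5, size=n_sources)

# Distances uniform in volume inside a unit sphere; values lie in (0, 1].
dist = (1.0 - rng.random(n_sources)) ** (1.0 / 3.0)

# Inverse-square law in log space: flux = L / (4 pi d^2).
log_flux = log_lum - 2.0 * np.log10(dist) - np.log10(4.0 * np.pi)

flux_limit = -1.0  # log10 of the (hypothetical) detection limit
detected = log_flux > flux_limit

print("fraction detected:", detected.mean())
print("mean log L, all sources:     ", log_lum.mean())
print("mean log L, detected sources:", log_lum[detected].mean())
```

The detected subsample has a higher mean luminosity than the full population even though nothing about the sources themselves has changed; only the selection has.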
Back in 2008, my adviser, Professor Robert Nemiroff, and I began to model the unknown selection effects and biases that might arise during the detection process and spectral analysis of Long-duration Gamma-Ray Bursts (LGRBs). Surprisingly, it turned out that a large fraction of LGRBs apparently go undetected by gamma-ray detectors (e.g., BATSE, Swift). In addition, a significant number of detected LGRBs are generally ignored in data analysis because of low-quality data. The combined effects of these two biases, together with the unknown selection effects on the redshift measurements of LGRBs, can artificially amplify the strength of (or even create) some of the well-known LGRB spectral correlations. In particular, we found that a significant fraction (>19%) of BATSE LGRBs are likely inconsistent with the Amati relation at the >3σ significance level. This would in turn undermine the utility of LGRBs as cosmological tools to probe Dark Energy’s equation of state (e.g., Schaefer 2007, Amati et al. 2008).
Here is an example graph taken from Shahmoradi & Nemiroff (2011), demonstrating the extent to which BATSE LGRBs are inconsistent with the proposed Amati and Yonetoku relations for LGRBs:
– The spectral peak energy (EP,obs) estimates for 2130 BATSE GRBs can be downloaded from here.
– Conditional EP,obs probability density functions for all bursts can be downloaded collectively as a zip file.
One of the most widely used spectral parameters in studies of Gamma-Ray Bursts (GRBs) is the time-integrated νFν spectral peak energy of these cosmic events. Since the early 1990s, there has been a growing trend in the research community to plot GRB spectra in the form of E^2 dN/dE, or νFν, versus energy (E), where Fν is the spectral flux at the frequency ν. This has the advantage of making it easy to discern the energy at which the power from the burst peaks. The νFν plot of many bursts’ spectra shows a peak, denoted by EP,obs, where the subscript “P,obs” stands for OBServer-frame spectral Peak energy (in contrast to the Comoving-frame Peak energy).
Now, it turns out that measuring this quantity (i.e., the spectral peak energy) is a challenging task for GRB astronomers for many GRBs, in particular those with a very low Signal-to-Noise Ratio (SNR). This is primarily because most GRBs do not have spectral data of high enough quality to fit and constrain all the parameters of GRB spectral models (in particular the Band spectral model). Therefore, many GRBs detected by gamma-ray satellites are generally excluded from spectral analysis. For example, by the year 2008, the spectra of only 350 out of 2704 GRBs detected by the BATSE Large Area Detectors on board the CGRO satellite had been analyzed in detail using a variety of spectral models (e.g., Kaneko et al. 2006). These 350 GRBs were bright enough to allow for accurate spectral analyses and were selected according to researcher-defined inclusion rules, such as requiring a minimum flux or signal-to-noise ratio. Such inclusion rules, of course, may carry significant limitations and biases (e.g., Shahmoradi & Nemiroff 2009, Shahmoradi & Nemiroff 2011).
On a regular winter day in 2008, Professor Nemiroff and I were looking for an easy way to measure the peak energies (EP,obs) of Gamma-Ray Bursts without the hassle of spectral fitting, that is, for another spectral parameter of GRBs that would serve as a proxy for the EP,obs of the bursts. Following Dr. Nemiroff’s original suggestion to use hardness as an EP,obs proxy, it turned out that a new definition of GRB hardness could indeed serve as a good indicator of the spectral peak energy of the bursts. The correlation between EP,obs and GRB hardness had long been known. However, our rigorous analysis made it possible, for the first time, to quantify this correlation and use it to estimate the peak energies of the entire catalog of BATSE Gamma-Ray Bursts: 2130 GRBs.
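The general idea of calibrating a proxy can be sketched as below. This is only an illustration: it assumes a power-law relation between hardness and EP,obs (a straight line in log-log space), uses synthetic data with invented coefficients, and does not reproduce the hardness definition or the fitting procedure of the actual paper. The function names are hypothetical.

```python
import numpy as np

def fit_hardness_proxy(hardness, ep_obs):
    """Least-squares line in log-log space: log10(Ep) = intercept + slope * log10(HR)."""
    slope, intercept = np.polyfit(np.log10(hardness), np.log10(ep_obs), deg=1)
    return slope, intercept

def estimate_ep(hardness, slope, intercept):
    """Predict Ep,obs [keV] from hardness using the calibrated relation."""
    return 10.0 ** (intercept + slope * np.log10(hardness))

# Synthetic calibration sample (invented numbers): bright bursts with measured Ep,obs.
rng = np.random.default_rng(3)
hardness = 10.0 ** rng.uniform(-0.5, 1.0, size=350)
true_slope, true_intercept = 0.7, 2.0
scatter = rng.normal(0.0, 0.1, size=hardness.size)
ep_measured = 10.0 ** (true_intercept + true_slope * np.log10(hardness) + scatter)

slope, intercept = fit_hardness_proxy(hardness, ep_measured)
print("fitted slope, intercept:", slope, intercept)

# Apply the calibration to bursts that have a hardness but no spectral fit.
print("estimated Ep,obs [keV]:", estimate_ep(np.array([0.5, 2.0, 8.0]), slope, intercept))
```

Once calibrated on the bright bursts with reliable spectral fits, such a relation can be applied to every burst with a measured hardness, which is what makes a catalog-wide estimate possible.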
Here is an example plot of the “Hardness – EP,obs” correlation taken from Shahmoradi & Nemiroff (2010):
Perhaps more important than the correlation itself was the (re)discovery of the two classes of Short-duration GRBs (SGRBs) and Long-duration GRBs (LGRBs) through the double-component shape of the EP,obs distribution of the 2130 BATSE GRBs:
For the first time in GRB research history, the Hardness-EP,obs relation enabled us to unravel the potential shape of the EP,obs distributions of the two classes of GRBs in the largest catalog of GRBs available to date (as of August 2012). The above plot shows that the EP,obs distribution of the 2130 BATSE GRBs is likely the superposition of two Gaussian components, with the long-soft class of BATSE GRBs having a Gaussian EP,obs distribution centered at ~140 keV and the short-hard class having a Gaussian EP,obs distribution centered at ~520 keV. According to the observed ratio of the two Gaussian components in the plot above, ~70% of BATSE GRBs belong to the long-soft class of bursts (LGRBs) and ~30% belong to the short-hard class (SGRBs).
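A two-component decomposition of this kind can be sketched with a simple expectation-maximization (EM) fit. The example below runs on synthetic data, not the BATSE catalog. It assumes the components are Gaussian in log10(EP,obs) with a width of 0.2 dex (both assumptions made here for illustration), uses the ~140 keV and ~520 keV centers and the 70/30 split quoted above to generate the sample, and the function names are hypothetical.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def fit_two_gaussians(x, n_iter=500):
    """EM fit of a two-component 1D Gaussian mixture; returns weights, means, sigmas."""
    # Crude starting point: components placed at the quartiles of the data.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.array([x.std(), x.std()])
    weight = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        dens = weight * normal_pdf(x[:, None], mu, sigma)  # shape (n, 2)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update parameters from the responsibility-weighted data.
        n_k = resp.sum(axis=0)
        weight = n_k / x.size
        mu = (resp * x[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    return weight, mu, sigma

# Synthetic sample of 2130 bursts: 70% long-soft, 30% short-hard.
rng = np.random.default_rng(7)
log_ep = np.concatenate([
    rng.normal(np.log10(140.0), 0.2, 1491),
    rng.normal(np.log10(520.0), 0.2, 639),
])

weight, mu, sigma = fit_two_gaussians(log_ep)
print("component weights:", weight)
print("component centers [keV]:", 10.0 ** mu)
print("component widths [dex]:", sigma)
```

Because the two components overlap, individual bursts near the boundary cannot be assigned to a class with certainty; the responsibilities computed in the E-step give each burst a membership probability rather than a hard label.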
After months of research and article reading, my undergraduate degree project is complete and available to view here.