Overdisp: A Stata (and Mata) Package for Direct Detection of Overdispersion in Poisson and Negative Binomial Regression Models

  • Luiz Paulo Lopes Fávero University of São Paulo
  • Patrícia Belfiore Federal University of ABC
  • Marco Aurélio dos Santos Montvero Consulting
  • R. Freitas Souza
Keywords: Overdisp, Overdispersion, Count-Data Models, Stata

Abstract

Stata offers several procedures for analyzing count-data regression models and, more specifically, for studying the behavior of the dependent variable conditional on explanatory variables. Identifying overdispersion in count-data models is one of the most important steps in allowing researchers to choose correctly between estimations such as Poisson and negative binomial, given the distribution of the dependent variable. The main purpose of this paper is to present a new command for identifying overdispersion in the data as an alternative to the procedure presented by Cameron and Trivedi [5], since it identifies overdispersion directly, without the need to first estimate a specific type of count-data model. When estimating Poisson or negative binomial regression models in which the dependent variable is quantitative, with discrete and non-negative values, the new Stata package overdisp helps researchers to directly propose more consistent and adequate models. As a second contribution, we also present a simulation to show the consistency of the overdispersion test using the overdisp command. Findings show that, if the test indicates equidispersion in the data, there is consistent evidence that the distribution of the dependent variable is, in fact, Poisson. If, on the other hand, the test indicates overdispersion in the data, researchers should investigate more deeply whether the dependent variable actually exhibits better adherence to the Poisson-Gamma distribution.

Author Biographies

Luiz Paulo Lopes Fávero, University of São Paulo
Full Professor in Econometrics at the School of Economics, Business and Accounting
Patrícia Belfiore, Federal University of ABC
Professor at the Center of Engineering, Modeling and Applied Social Sciences

References

Z. Y. Algamal, Variable selection in count data regression model based on firefly algorithm, Statistics, Optimization and Information Computing, vol. 7, pp. 520–529, 2019.

E. Avci, Flexiblity of using Com-Poisson regression model for count data, Statistics, Optimization and Information Computing, vol. 6, pp. 278–285, 2018.

A. C. Cameron, and P. K. Trivedi, Econometric models based on count data: comparisons and applications of some estimators and tests, Journal of Applied Econometrics, vol. 1, no. 1, pp. 29–53, 1986.

A. C. Cameron, and P. K. Trivedi, Microeconometrics: Methods and Applications, Cambridge University Press, New York, 2005.

A. C. Cameron, and P. K. Trivedi, Microeconometrics using Stata, Stata Press, College Station, 2010.

A. C. Cameron, and P. K. Trivedi, Regression Analysis of Count Data, Cambridge University Press, Cambridge, 2013.

L. P. Fávero, and P. Belfiore, Data Science for Business and Decision Making, Academic Press Elsevier, Cambridge, 2019.

L. P. Fávero, and P. Belfiore, overdisp: module to detect overdispersion in count-data models using Stata, Statistical Software Components, Boston College Department of Economics. https://ideas.repec.org/c/boc/bocode/s458496.html, 2018.

L. P. Fávero, M. A. Santos, and R. G. Serra, Cross-border branching in the Latin American banking sector, International Journal of Bank Marketing, vol. 36, no. 3, pp. 496–528, 2018.

W. Gardner, E. P. Mulvey, and E. C. Shaw, Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models, Psychological Bulletin, vol. 118, no. 3, pp. 392–404, 1995.

S. Gurmu, Tests for detecting overdispersion in the positive Poisson regression model, Journal of Business & Economic Statistics, vol. 9, no. 3, pp. 215–222, 1991.

J. A. Hausman, B. H. Hall, and Z. Griliches, Econometric models for count data with an application to the patents-R&D relationship, Econometrica, vol. 52, no. 4, pp. 909–938, 1984.

G. Leckie, runmixregls: a program to run the mixregls mixed-effects location scale software from within Stata, Journal of Statistical Software, vol. 59, pp. 1–41, 2014.

J. S. Long, and J. Freese, Regression Models for Categorical Dependent Variables using Stata, Stata Press, College Station, 2006.

M. Manjón, and O. Martínez, The chi-squared goodness-of-fit test for count-data models, Stata Journal, vol. 14, pp. 798–816, 2014.

R Core Team, R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/, 2016.

S. Rabe-Hesketh, and A. Skrondal, Multilevel and Longitudinal Modeling using Stata: Categorical Responses, Counts, and Survival, Stata Press, College Station, 2012.

SAS Institute Inc, SAS/STAT Software, Version 9.22, Cary, NC, http://www.sas.com/, 2018.

StataCorp, Stata Data Analysis Statistical Software, Release 15, College Station, TX, http://www.stata.com/, 2018.

H. Zhang, Y. Liu, and B. Li, Notes on discrete compound Poisson model with applications to risk theory, Insurance: Mathematics and Economics, vol. 59, pp. 325–336, 2014.

M. L. Zwilling, Negative binomial regression, The Mathematica Journal, vol. 15, pp. 1–18, 2013.

Published
2020-06-14
How to Cite
Fávero, L. P. L., Belfiore, P., Santos, M. A. dos, & Souza, R. F. (2020). Overdisp: A Stata (and Mata) Package for Direct Detection of Overdispersion in Poisson and Negative Binomial Regression Models. Statistics, Optimization & Information Computing, 8(3), 773-789. https://doi.org/10.19139/soic-2310-5070-557
Section
Research Articles