Date of Award


Document Type

Campus Access Dissertation


Epidemiology and Biostatistics



First Advisor

Wilfried Karmaus


Traditional influenza surveillance methods require every reported case to have a laboratory-confirmed diagnosis, which delays outbreak detection. Many public health authorities have begun monitoring influenza-like illness (ILI), which does not require a patient to seek medical care or undergo formal influenza testing and can be used to detect influenza outbreaks earlier than traditional surveillance methods. ILI consists of influenza symptoms and is usually defined as fever plus sore throat or cough, with the possible addition of other cold symptoms such as runny nose and malaise. Another advantage is that ILI can be monitored through syndromic surveillance, which uses readily available data sources to monitor syndromes associated with specific diseases, such as ILI for influenza. The New York City Department of Health and Mental Hygiene (NYCDOHMH) developed its emergency department-based syndromic surveillance system after the events of September 11, 2001. Electronic emergency department visit logs are transmitted daily and include visits for the previous day, which are coded into four age groups (0-4, 5-17, 18-59, and ≥60 years) and into the ILI syndrome based on the patient's chief complaint. Visits are monitored weekly for temporal and temporal-spatial increases (signals) in the ILI syndrome. Our main objective was to compare three statistical modeling techniques, Poisson cyclical regression, Seasonal Autoregressive Integrated Moving Average (SARIMA), and Generalized Additive Modeling (GAM), to find the optimal technique for modeling ILI visits from January 2002 to June 2007. The optimal modeling technique was then used to validate forecasts from July 2007 to December 2008 and to study the 2009 H1N1 pandemic in relation to seasonal influenza during the period of January 2002 to February 2010.
The optimal modeling technique was also assessed for the feasibility of developing backcasts for January 2000 to December 2001, with the aim of eventually developing long-term backcasts of historic 20th-century pandemics. Comparison of modeling techniques indicated that SARIMA models produced the smallest difference between observed and expected values. Short-term SARIMA models provided useful forecasts, but long-term models (backcasts) produced large differences between observed and expected values. The difference between observed and expected values in long-term models grew exponentially over time, indicating that the SARIMA models may only be useful for forecasting up to 90 days and backcasting up to 60 days. White noise tests for the long-term SARIMA models were significant, suggesting that other potential model predictors may need to be considered to account for the unexplained variance that remained in the models. Short-term and long-term ILI models may need to be developed separately to provide optimal forecasts, and may require different modeling approaches and different potential model predictors. Potential model predictors for long-term forecasts need to be re-evaluated so that they appropriately measure underlying population changes over time (socio-economic status, access to healthcare, population density, and life expectancy). Caution should be taken when applying these results to other ILI data, since underlying population characteristics may have a direct effect on ILI models. Although SARIMA models did not provide useful long-term forecasts, the signals from these models did confirm a known signature pattern for annual influenza epidemics and pandemics: a shift in morbidity from the elderly toward younger age groups during pandemics relative to seasonal influenza. Future studies are needed to evaluate and appropriately compare ILI modeling approaches and the measurement of all potential model predictors that may affect short-term and long-term ILI models.
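
The syndrome coding described in the abstract (fever plus sore throat or cough, tallied in four age groups) can be illustrated with a minimal Python sketch. The keyword patterns and the function names `is_ili`, `age_group`, and `count_ili_by_age` are hypothetical and chosen only for illustration; the abstract does not give the actual chief-complaint coding rules used by the NYCDOHMH system, which are likely more elaborate.

```python
import re
from collections import Counter

# Hypothetical keyword patterns (an assumption, not the NYCDOHMH rules).
# The leading \b keeps "afebrile" from matching "febrile".
FEVER = re.compile(r"\b(?:fever|febrile)", re.IGNORECASE)
RESPIRATORY = re.compile(r"\b(?:sore throat|cough)", re.IGNORECASE)


def is_ili(chief_complaint: str) -> bool:
    """ILI as commonly defined: fever plus sore throat or cough."""
    has_fever = FEVER.search(chief_complaint) is not None
    has_respiratory = RESPIRATORY.search(chief_complaint) is not None
    return has_fever and has_respiratory


def age_group(age_years: int) -> str:
    """Map an age in years to the four groups named in the abstract."""
    if age_years < 0:
        raise ValueError("age must be non-negative")
    if age_years <= 4:
        return "0-4"
    if age_years <= 17:
        return "5-17"
    if age_years <= 59:
        return "18-59"
    return "60+"


def count_ili_by_age(visits):
    """Count ILI visits per age group from (age_years, chief_complaint) pairs."""
    counts = Counter()
    for age_years, complaint in visits:
        if is_ili(complaint):
            counts[age_group(age_years)] += 1
    return counts
```

Counts of this kind, aggregated by day or week, would form the time series to which models such as SARIMA are fitted. Free-text matching of this sort is sensitive to misspellings and abbreviations, so a production system would need a more robust classifier.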