
ORNL has played a key role in developing novel Big Data toolkits for syndromic disease surveillance. Our platform, the Oak Ridge Bio-surveillance Toolkit (ORBiT), enables large-scale analysis of heterogeneous data sources, including environmental data, climate and weather data, prescription records, and novel data streams emerging from social media (e.g., Twitter, Instagram). ORBiT focuses on developing novel statistical and machine learning tools rather than serving as a central data collection interface for these heterogeneous sources. It also provides an application programming interface (API) that end users can use to target specific bio-surveillance applications. Machine learning tools are tightly integrated with visualization tools in a web-based framework to help end users and analysts explore potential links between heterogeneous data sets, detect patterns and correlations across multiple data streams, identify emerging disease outbreaks, forecast emerging epidemics, and monitor control strategies. ORBiT is implemented as a component-based, plug-and-play toolkit that exploits existing distributed, cloud-based analytics frameworks.