Abstract
The LUCID DOE consortium, part of the Department of Energy’s Biological and Environmental Research (BER) program, advances Low Dose Radiation (LDR) research through multidisciplinary efforts across seven key thrusts. This document focuses on Thrust 1, which creates curated multimodal population health datasets and supports broader efforts within the LUCID program, including AI-based hypothesis generation, experimental design, and the study of LDR-induced health risks. Specifically, it describes the identification and cataloging of Thrust 1’s curated LDR datasets and biodata, and emphasizes their role in supporting the consortium’s other research thrusts, with potential applications in healthcare and public policy. The document also evaluates three Large Language Models (LLMs), GPT-4, SOLAR-10B, and Mixtral-8x7B, on their ability to extract features from 25 LDR studies. GPT-4 performed best, while Mixtral-8x7B demonstrated limited knowledge. Overall, this work advances understanding of radiation protection, risk assessment, and medical treatments, and provides resources for researchers, educators, and policymakers.