The Million Veteran Program (MVP) was launched in 2011 by the US Department of Veterans Affairs (VA) to enroll at least 1 million veterans in a longitudinal cohort to better understand how genes, lifestyle, military experience, and environmental exposures interact to influence health and illness and, ultimately, to enable precision health care. The MVP has established a national, centralized infrastructure for recruitment and enrollment, biospecimen and data collection and storage, data generation and curation, and secure data access. When the COVID-19 pandemic hit in 2020, the MVP was leveraged to support research through 3 key infrastructure components: (1) the MVP recruitment and enrollment platform, which was used to support COVID-19 vaccine and treatment trials and to collect COVID-19 data from MVP participants; (2) MVP Phenomics, which was used to clean and curate COVID-19 research data, assist with the development of a VA Severity Index for COVID-19, and form 6 scientific working groups to coordinate COVID-19 research questions; and (3) the VA/MVP and US Department of Energy (DOE) partnership, which assisted in responding to COVID-19 research questions identified by the US Food and Drug Administration (FDA). This article describes these infrastructure components in more detail and highlights key findings from the MVP COVID-19 research efforts.
MVP Infrastructure
The Veterans Health Administration (VHA) Office of Research and Development (ORD) oversaw efforts to develop the VA Coronavirus Research Volunteer List (the COVID-19 registry). To support the registry, the MVP leveraged its infrastructure to facilitate a rapid response. The MVP is designed as a full-service and centralized recruitment and enrollment platform. This includes MVP office oversight; MVP coordinating centers that manage the centralized platform; an information center that handles inbound and outbound calls; an informatics system built for recruitment and enrollment monitoring and tracking; and a network of more than 70 participating MVP sites with dedicated staff to conduct recruitment and enrollment activities. The MVP used its informatics infrastructure to support secure data storage for the registry volunteer information. MVP coordinating center staff worked with the COVID-19 registry to invite > 125,000 MVP participants from approximately 20 MVP sites. Additionally, MVP information center staff made > 4000 calls to prospective registry volunteers. This work resulted in 1300 volunteers agreeing to be contacted by COVID-19 vaccine clinical trial study teams (including Moderna, Janssen, AstraZeneca, and Novavax). About 20 MVP site staff (spanning 14 MVP sites) also were deployed to support COVID-19 work for clinical care capabilities or vaccine trials.
New Data Collection
The MVP protocol was approved by the VA Central Institutional Review Board (IRB) in 2011. As part of initial enrollment in MVP, participants consented to recontact for additional self-report information along with access to their electronic health record (EHR). This allows for the linkage of EHR and survey response data, thus providing a comprehensive understanding of health history before and after a self-reported COVID-19 diagnosis. Between May 2020 and September 2021, the MVP COVID-19 survey was distributed to existing MVP participants via mail, telephone, and email with the ability to complete the survey by paper and pencil or through the MVP online system. Dissemination of the survey was approved by the VA Central IRB in 2020, with nearly 730,000 eligible MVP participants contacted. As of June 2022, 255,737 MVP participants (35% of the eligible cohort) had completed the survey; 86% completed a paper survey while 14% completed it online. Respondents were primarily older (≥ 65 years); 90% were male; close to 7% reported Hispanic ethnicity, and 11% reported Black race.
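The survey figures above can be checked with simple arithmetic. The following is a minimal sketch, assuming the "nearly 730,000" contacted participants can be approximated as exactly 730,000; the variable names are illustrative only, and the paper and online counts are estimates derived from the rounded percentages rather than reported values.

```python
# Approximate check of the survey response figures quoted in the text.
# ASSUMPTION: "nearly 730,000" is treated as exactly 730,000.
eligible_contacted = 730_000
completed = 255_737

# Overall response rate among eligible participants who were contacted.
response_rate = completed / eligible_contacted

# Estimated split by completion mode, from the rounded 86% / 14% shares.
estimated_paper = round(completed * 0.86)
estimated_online = completed - estimated_paper

print(f"Response rate: {response_rate:.1%}")
print(f"Estimated paper surveys: {estimated_paper:,}")
print(f"Estimated online surveys: {estimated_online:,}")
```

Because the denominator and the mode percentages are rounded in the text, these results should be read as approximations.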
Findings from this survey provide insight into pandemic experiences and behaviors not consistently captured in EHRs. These include psychosocial factors, such as social and emotional support and loss of tangible and intangible resources, as well as COVID-19–related behaviors, such as social distancing and self-protective practices.1 MVP COVID-19 survey data combined with veteran EHRs, responses to other MVP surveys, and genetic data enable MVP researchers to better understand epidemiological, clinical, and psychosocial aspects of the disease. Future COVID-19 studies may use self-reported survey responses to enrich understanding of the effects of the disease on a veteran’s daily life, and possibly to validate existing EHR COVID-19 diagnoses and hospitalization findings. This comprehensive data resource provides a unique opportunity to identify new targets for disease prevention, treatment, and management with an emphasis on individual variability in genes, environment, and lifestyle.
COVID-19 Research
In early 2020, the burden of COVID-19 on the US was unprecedented, and little was known about risk factors for severe COVID-19 and death. The MVP Phenomics team quickly responded with a large-scale phenome-wide association study (PheWAS) of > 1800 phenotypes (physical and biochemical traits) and COVID-19 progression. Its goal was to characterize risk factors and outcomes associated with COVID-19 disease progression.2 Data curation and assembly occurred rapidly through integrated efforts led by MVP and VA COVID-19 initiatives. Using VA EHR data, the MVP drew on its phenomics core resource to characterize the progression of COVID-19, defined by 4 milestones: SARS-CoV-2 infection, hospitalization, intensive care unit admission, and 30-day mortality.
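To illustrate how a progression outcome of this kind might be derived from EHR-style records, the sketch below assigns each patient the most severe milestone reached. It is a hypothetical example, not the MVP's actual phenotyping logic: the `CovidCourse` record, its field names, and the choice to measure the 30-day mortality window from the date of the positive test are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CovidCourse:
    """Hypothetical per-patient summary; field names are illustrative only."""
    positive_test: date                 # date of first positive SARS-CoV-2 test
    hospitalized: bool = False
    icu_admitted: bool = False
    death_date: Optional[date] = None   # None if the patient is not known to have died


def progression_stage(course: CovidCourse) -> str:
    """Return the most severe milestone reached, checked from worst to mildest."""
    if course.death_date is not None:
        days_to_death = (course.death_date - course.positive_test).days
        # ASSUMPTION: the 30-day window starts at the positive test date.
        if 0 <= days_to_death <= 30:
            return "30-day mortality"
    if course.icu_admitted:
        return "ICU admission"
    if course.hospitalized:
        return "hospitalization"
    return "infection"


if __name__ == "__main__":
    example = CovidCourse(positive_test=date(2020, 4, 1), hospitalized=True)
    print(progression_stage(example))
```

Checking the milestones from most to least severe means each patient receives a single, mutually exclusive category, which is convenient when the outcome is used as an ordinal variable in association analyses.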
To broaden disease progression data curation and fit the specific needs of the VA, we operationalized and validated the World Health Organization clinical severity scale and used VA EHR data to create the VA Severity Index for COVID-19 (VASIC).3 The VASIC category is now part of the MVP core data repository, where large volumes of data from multiple activities are integrated through an automated process to create monthly research-ready data cubes. These activities include extensive data curation, mapping, phenotyping, and adjudication, which are performed to curate oxygen supplementation status and other treatment-related procedures that are processed and understood in real time. The data cubes were provisioned to MVP COVID-19 researchers. In addition, the VASIC scale variable is now integrated within the larger VA system for all researchers to use as part of its wider COVID-19 initiative. The VA Centralized Interactive Phenomics Resource (CIPHER) phenomics library now hosts the details of VASIC, codes, metadata, and related COVID-19 data products for all VA communities. In partnership with CIPHER and other internal and external COVID-19 initiatives, the MVP continues to play an integral part, for the VA and beyond, in the development of a phenomics algorithm for long COVID, or post-acute COVID-19 syndrome (PACS).