How the cloud powers Type 1 diabetes research

From preventing infectious diseases with vaccines to countering bacterial infections with antibiotics, modern medical treatments are possible because of modern research. Health research paves a path for medical developments and cures for diseases. At the heart of modern health research is the intersection of people—patients, researchers, medical professionals—and data.

Advances in technology are transforming the way health research can be conducted. It is now possible to integrate data from siloed sources into a data lake, a central repository where health data are aggregated and analyzed at scale. Now, more than ever, there are opportunities for collaborative research to accelerate life-saving medical innovation – and that’s exactly what JDRF International is doing on the AWS Cloud.

JDRF reimagines what is possible in T1D prevention, treatment, and cure

Type 1 diabetes (T1D) affects millions of people worldwide today. It is an autoimmune disease that strikes suddenly and has no known prevention or cure. Patients must rely on daily insulin treatments for survival and constantly monitor their blood-sugar level. Fluctuations in blood glucose levels put patients at risk for potentially life-threatening episodes and devastating complications. T1D is a lifelong and relentless burden on all patients.

JDRF is the leading global T1D research and advocacy organization. With a vision of eliminating T1D permanently, JDRF accelerates the delivery of new therapies through its research and development pipeline by focusing investments on the most promising cure opportunities and by advancing regulatory approvals and health care access.

Historically, a key challenge in T1D research has been collecting and harmonizing data from stakeholders and turning it into impactful action. Siloed datasets reside in different spaces and formats, each telling an important yet incomplete piece of a patient’s clinical journey. To break this barrier, JDRF designed the Diabetes Data Platform (D2 Platform) Project to curate and publish these datasets into a “research store,” making them available to researchers, clinicians, and key diabetes stakeholder organizations. As a winner of the AWS IMAGINE Grant in 2020, JDRF received unrestricted funding, AWS Promotional Credit, and guidance from AWS technical specialists to begin implementing this project.

D2 Platform leverages the cloud to power T1D research

The D2 Platform leverages AWS Cloud technology to create a comprehensive, scalable, and secure registry of diabetes data collected from disparate sources to drive progress across the research ecosystem for cure, prevention, and treatment of T1D. Acting as a network facilitator, JDRF brings experts together and guides them toward clinically impactful research, focusing on the most important questions concerning key stakeholders (e.g., patients, regulators, and drug developers), such as: Who gets T1D? When do people get T1D? What is the treatment for T1D?

From there, the team at JDRF identifies the best available data sources to answer these questions and leverages a data lake—the D2 Platform—to enable researchers to investigate salient clinical or environmental risk factors for timely diagnoses and treatments.

In addition to facilitating research, JDRF translates and deploys these findings to decision makers and service providers like the Food and Drug Administration (FDA) and healthcare institutions. Being at the nexus of research and clinical activities allows JDRF to create a comprehensive strategy toward the highest-impact pathway to combat T1D. Ultimately, the goal is for the D2 Platform to help inform and advance how the public is screened for T1D and how current patients are treated.

Global health researchers collaborate on the cloud

Combining datasets makes way for more comprehensive analyses and insights, but the data needs to be hosted on an independent platform that allows for equal ownership and secure protection of sensitive data. In partnership with JDRF, the research entities chose AWS for the task. In addition to satisfying security and data privacy requirements, AWS handles the undifferentiated heavy lifting of technology maintenance, freeing JDRF to focus on the overall research strategy and coordination among various stakeholders. Hosting and analyzing the datasets on the AWS Cloud also means the teams can conduct collaborative research virtually from anywhere in the world.

The following architectural diagram illustrates the technical design of the platform, from the point of sourcing data to publication.

Figure 1. Architectural diagram for the D2 Platform. From left to right: data from various sources and producers is ingested through appropriate AWS services into AWS storage services. This data is then processed, harmonized, and curated to be published and used in research. A key challenge is to bring all the data originating from various sources into a common data dictionary format and store, and then to process it in the subsequent phases so it can be added to the curated datasets. The flexibility of AWS services accommodates these processes as and when needed.
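
To make the common data dictionary idea concrete, here is a minimal Python sketch of one way to rename fields from different sources to shared names. It is an illustration only, not JDRF's implementation: the source names (registry_a, clinic_b), the field names, and the harmonize function are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical per-source mappings from local column names to the names
# used in a shared data dictionary. None of these names come from the
# D2 Platform itself.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "registry_a": {"pid": "participant_id", "dx_age": "age_at_diagnosis_years"},
    "clinic_b": {"patient": "participant_id", "age_dx": "age_at_diagnosis_years"},
}


def harmonize(
    source: str, row: Dict[str, object]
) -> Tuple[Dict[str, object], Dict[str, object]]:
    """Rename a row's fields to shared dictionary names.

    Returns the harmonized row plus any fields that had no mapping, so
    that unmapped fields are surfaced for review instead of being
    silently dropped.
    """
    mapping = FIELD_MAPS[source]
    harmonized: Dict[str, object] = {"source": source}
    unmapped: Dict[str, object] = {}
    for name, value in row.items():
        if name in mapping:
            harmonized[mapping[name]] = value
        else:
            unmapped[name] = value
    return harmonized, unmapped
```

With mappings like these, rows from two different sources end up with the same field names and can be combined in one curated dataset, while the record of where each row came from is kept in the source field.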

The D2 Platform supports multiple data producers: shared file systems, object storage, databases, and Internet of Things (IoT) devices. Research partners make data available in shared file systems and databases and send a message to an Amazon Simple Queue Service (Amazon SQS) queue. AWS Lambda picks up the message and copies the data to Amazon Simple Storage Service (Amazon S3) and to Amazon Redshift. On a schedule, jobs in Amazon Redshift clean the data to create curated datasets for downstream consumption. In the future state, IoT devices send events that are processed by AWS IoT rules and streamed into Amazon Kinesis Data Firehose for storage on Amazon S3. Amazon EMR then processes the data to create curated datasets in Amazon Redshift.
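
The queue-driven ingestion step might look something like the following Python sketch. It assumes a hypothetical message body with source_uri and dataset fields and a hypothetical raw/ key prefix; none of these appear in the D2 Platform description. The actual copy is passed in as copy_fn so the sketch runs without AWS credentials. In a real Lambda function, that callable would wrap the S3 and Redshift calls. The Records, body, and messageId fields and the batchItemFailures response follow the standard Lambda event format for Amazon SQS.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class CopyRequest:
    """Hypothetical message contract: where a partner dataset lives and its name."""
    source_uri: str
    dataset: str


def parse_record(record: Dict) -> CopyRequest:
    """Decode one SQS record whose body is a JSON document."""
    body = json.loads(record["body"])
    return CopyRequest(source_uri=body["source_uri"], dataset=body["dataset"])


def target_key(request: CopyRequest) -> str:
    """Build the destination object key (the layout is an assumption)."""
    filename = request.source_uri.rstrip("/").rsplit("/", 1)[-1]
    return f"raw/{request.dataset}/{filename}"


def handle_event(event: Dict, copy_fn: Callable[[CopyRequest, str], None]) -> Dict:
    """Process a batch of SQS records and report the ones that failed.

    Returning failed message IDs lets Lambda retry only those messages,
    provided partial batch responses are enabled on the event source.
    """
    failures: List[Dict[str, str]] = []
    for record in event.get("Records", []):
        try:
            request = parse_record(record)
            copy_fn(request, target_key(request))
        except Exception:
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

Keeping the parsing and key-building logic separate from the copy itself makes the handler easy to test locally, and a malformed message from one partner does not block the rest of the batch.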

A vision beyond T1D

Looking ahead, the vision for the D2 Platform goes beyond its immediate use. First, there is still a plethora of clinically valuable data that is yet to be collected, such as data stored in medical devices worn by T1D patients. Second, with technological advancement in health research, there are similar efforts for other diseases that use genomic datasets to search for cures and treatments. There is great hope for these data sources to be aggregated and harmonized into an even more robust dataset, potentially allowing for even more impactful research efforts across the medical field. Cloud technology not only accelerates research on individual diseases, but also advances health research as an entire field.

*Partner Press Announcement*