For decades, biomedical research has struggled with accessibility, interoperability and computational limitations. In 2018, the industry took a giant leap forward on each front, with the Google Genomics API as a driving factor. By partnering with GCP, the medical research and development community at large now has access to Google’s cutting-edge AI (Artificial Intelligence) and ML (Machine Learning) technologies to pair with genomic analysis.
The Genomics API Breaks Down Barriers in Scalability, Access and Cost
Of the barriers to cloud adoption for genomics, scalability is a major concern. After all, genomics data is the exemplar of big data: a human genome comprises about 3 billion base pairs, and a single chromosome ranges from roughly 50,000,000 to 300,000,000 base pairs. To address scalability, Google has built its own implementation of the htsget protocol to enable easy access to and sharing of data without copying huge files between VMs. The Genomics API also makes the (now reduced) effort of loading genomics data worthwhile: with Requester Pays buckets, data owners can have the people who access their data cover the cost of that access (documentation available here).
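To make the "no copying huge files" point concrete, here is a minimal sketch of how a client might ask an htsget server for just one genomic region. The `/reads/<id>` path, the `format`, `referenceName`, `start` and `end` query parameters, and the JSON "ticket" layout follow the public htsget specification; the base URL and read ID are hypothetical placeholders, and this is not a description of Google's own implementation.

```python
from urllib.parse import urlencode


def build_htsget_url(base_url, read_id, reference_name, start, end, fmt="BAM"):
    """Build an htsget reads request for a single genomic region.

    base_url and read_id are hypothetical; substitute your server's values.
    start and end are 0-based, half-open coordinates per the htsget spec.
    """
    query = urlencode(
        {
            "format": fmt,
            "referenceName": reference_name,
            "start": start,
            "end": end,
        }
    )
    return f"{base_url.rstrip('/')}/reads/{read_id}?{query}"


def block_urls(ticket):
    """Extract the data-block URLs from a parsed htsget JSON ticket."""
    return [block["url"] for block in ticket["htsget"]["urls"]]
```

The server answers such a request with a small ticket listing the URLs of the data blocks that cover the region, so the client downloads only those blocks rather than the whole file.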
As we mentioned recently, the National Institutes of Health (the largest public funder of biomedical research) announced Google as the first partner for its STRIDES (Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability) initiative. This high-profile partnership gives 2,500 academic institutions across the US access to Google Cloud’s storage, compute and infrastructure. Google Cloud’s implications for healthcare reach further than large, well-funded public organizations, though: the Genomics API makes genome data freely available from initiatives such as 1000 Genomes, The Cancer Genome Atlas, Illumina Platinum Genomes, and even the 1000 Cannabis Genomes Project. This ready availability of genomics data (both human and non-human) puts innovation within reach of small genetics-based tech startups, not just large, established players.
From a cost perspective, Google Genomics doesn’t charge for loading or exporting data, only for storage. Customers can expect a 30x whole human genome to be billed at around $25 a year, and prospective customers can forecast their own costs using the cost calculator here. It’s worth remembering that time is money: variant analysis that used to take hours or even days can now be run using BigQuery, Python or your bioinformatics tool of choice in just seconds.
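For a quick back-of-the-envelope forecast, the sketch below simply multiplies out the roughly $25 per 30x genome per year figure quoted above. The function name and its default rate are our own illustration, not part of any Google API, and the real cost calculator will account for details this ignores.

```python
def forecast_storage_cost(num_genomes, years, cost_per_genome_year=25.0):
    """Estimate storage cost in USD for a cohort of 30x whole genomes.

    cost_per_genome_year defaults to the approximate $25 figure quoted in
    this post; it is an assumption, not an official price.
    """
    if num_genomes < 0 or years < 0:
        raise ValueError("num_genomes and years must be non-negative")
    return num_genomes * years * cost_per_genome_year
```

Under that assumption, storing 100 genomes for three years comes to `forecast_storage_cost(100, 3)`, or about $7,500.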
New Tech Powered by ML for More Than Just Researchers and Labs
Over the past twelve months, Google has added much more to the Genomics API than just storage and accessibility. Its open source DeepVariant technology applies image recognition machine learning techniques to more accurately identify SNPs (single nucleotide polymorphisms) in a sequenced genome. Nascent industries such as precision medicine and telegenetics see the increased ease of variant analysis as a business opportunity. All of a sudden, clinicians, researchers and even consumers (not just biostatisticians trained in Python and R) could potentially run genetic analyses on demand through off-the-shelf, GUI-driven tools.
What does this mean for the future of the Genomics API and Google’s healthcare offerings in general? For Google, the focus is on building out its infrastructure to be more welcoming for organizations of all sizes and business needs. Although Google already offers to enter into a BAA covering its comprehensively HIPAA-compliant Cloud Platform, organizations need guidance and assurance that they are not alone in the move to cloud. It helps that companies follow trends in customer demand, and DNA screening-as-a-service companies such as Counsyl and 23andMe have warmed public perception to the sharing of genetic data. Companies seeking an audience for direct-to-consumer genetics services need look no further.
Industry Experts Lead Towards a Smart, Connected Cloud Implementation
There’s still the issue of interoperability, which guards the gateway to practical, everyday clinical use: will my cloud data play nicely with the systems we already use? The Healthcare API’s commitment to the FHIR, DICOM, and HL7v2 standards bridges the gap between the AI and genomics workloads possible in the cloud and the patient’s clinical workup generated on the ground.
The problem facing emerging health tech is that all factors (security, interoperability, scalability, and cost-effectiveness) must be in place before smart, interconnected research or patient care can become a reality. One company that is trying to connect all the pieces is Verily, Alphabet’s own “in-house” life sciences arm. Not only is the company taking on sci-fi-sounding projects, such as glucose-detecting contact lenses, it is also behind a lot of the open source development, such as DeepVariant, that feeds back into the Genomics API and Healthcare API. With former higher-ups from the Broad Institute of MIT and Harvard and the National Institute of Mental Health, Verily should help companies dipping their toes into cloud-based genomics research feel comfortable.
The beauty of living in 2018 with the Genomics API at our fingertips is that you can test out its functionality and analyze your first genome in just a 30-minute Qwiklabs tutorial here. If you like what you see and can envision the impact that the Genomics API and Healthcare API could have on your organization, reach out to SADA Systems’ team of engineers today. Not only are we up to date on the latest health tech offerings at Google Cloud, we can help you architect a solution and forecast the costs of making your vision a reality.