Behind the scenes at Sano: Where genomic data meets engineering

A bioinformatician's view

At Sano, data is at the core of everything we do. From mining our patient database to identify candidates for trials to analyzing sequencing data, our engineering and clinical teams use data to deliver impact for patients in precision medicine. In this post, we follow the journey of genomic data behind the scenes with a Sanosaur.

Jonny Talbot-Martin is a bioinformatician on our clinical genetics team. His work helps ensure that Sano’s platform delivers high-quality, reproducible results across all our studies. We sat down with him to explore what a typical week looks like, how the team tackles challenges, and how data and automation come together to support better outcomes.

A week in the life: Balancing precision, speed, and scale

Jonny’s weeks are structured around routine meetings, ongoing studies requiring data analysis, proactive process improvement, and occasional time-sensitive fixes. Mornings begin with a stand-up in which the genetics team aligns on tasks, projects, or questions that need attention. The majority of Jonny’s day is dedicated to what he calls “deep focus time,” which might involve: 

  • Returning results from recent sequencing runs or genotype array data
  • Making incremental updates to Sano’s internal pipelines to accommodate study-specific needs
  • Working on proactive development as part of longer-term initiatives and goals

Each week, he joins a “Coffee & code” meeting with other coders at Sano from the back-end engineering and platform teams. This is typically followed by “Coffee & genetic code” with the genetics team, a pairing that neatly sums up the core of Jonny’s work (powered by caffeine). Another aspect of Jonny’s role is acting as a bridge between the clinical genetics team and the engineering and platform teams: he communicates technical requirements and relevant study updates to ensure the pipeline and platform evolve in sync with scientific and clinical needs.

Translating complexity into meaningful results

A large part of Jonny’s work centers on developing and maintaining the return of results pipeline, a critical component of Sano’s genetic reporting infrastructure. The pipeline orchestrates how each type of genetic data Sano handles is processed: whole genome sequencing data may be routed to an external provider for interpretation, for example, while genotype array data might be analyzed within the Sano platform, forming the basis of a report that is generated internally and returned to the participant. This orchestration reduces manual steps and ensures consistency across studies.
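To make that routing concrete, here is a minimal sketch in Python of a dispatch table mapping data types to handlers. All names below are hypothetical, invented for illustration; Sano’s actual pipeline code is not public.

```python
from typing import Callable

# Hypothetical handlers, one per data type the pipeline supports.
def send_wgs_to_external_provider(sample_id: str) -> str:
    """Route whole genome sequencing data out for interpretation."""
    return f"{sample_id}: WGS dispatched to external provider"

def analyze_array_internally(sample_id: str) -> str:
    """Analyze genotype array data on the internal platform."""
    return f"{sample_id}: array analyzed internally, report drafted"

# Dispatch table: adding a new data type is a one-line change,
# which keeps routing consistent across studies.
ROUTES: dict[str, Callable[[str], str]] = {
    "wgs": send_wgs_to_external_provider,
    "genotype_array": analyze_array_internally,
}

def route_sample(sample_id: str, data_type: str) -> str:
    try:
        return ROUTES[data_type](sample_id)
    except KeyError:
        raise ValueError(f"No route defined for data type {data_type!r}")

print(route_sample("SAMPLE-001", "wgs"))
print(route_sample("SAMPLE-002", "genotype_array"))
```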

One of Jonny’s recent accomplishments, achieved in collaboration with other members of the team, was consolidating the main coding scripts into a single automated pipeline that runs on a core platform. Every analysis now runs in exactly the same computational environment, an important step toward making results more reliable and reproducible across the board.
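A common way to get that “same environment everywhere” guarantee is to run every step inside a container image pinned by digest rather than by a mutable tag. The sketch below assumes Docker and uses placeholder names throughout; it illustrates the general technique, not Sano’s actual setup.

```python
import subprocess

# Pinning by digest (rather than a tag like ":latest") means every run
# uses a byte-for-byte identical environment. The image name and digest
# here are placeholders for illustration.
IMAGE = "registry.example.com/genomics-pipeline@sha256:PLACEHOLDER_DIGEST"

def run_step(args: list[str]) -> None:
    """Execute one pipeline step inside the pinned container."""
    subprocess.run(["docker", "run", "--rm", IMAGE, *args], check=True)

run_step(["python", "scripts/generate_qc_report.py", "--sample", "SAMPLE-001"])
```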

This was part of a wider goal: maximize reproducibility and increase automation. In line with it, the team recently built a dashboard that monitors every sample interacting with Sano’s platform at a finer granularity, so issues are visible and can be flagged earlier in the process. That visibility helps prevent samples from falling through the cracks.
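As a toy illustration of the idea (the states and thresholds below are invented, not Sano’s), this kind of monitor boils down to tracking how long each sample has sat in its current state and flagging outliers:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Invented states and thresholds, purely for illustration; real limits
# would be study- and provider-specific.
MAX_TIME_IN_STATE = {
    "kit_shipped": timedelta(days=21),
    "at_sequencing_facility": timedelta(days=14),
    "awaiting_interpretation": timedelta(days=7),
}

@dataclass
class Sample:
    sample_id: str
    state: str
    state_entered_at: datetime

def stalled(samples: list[Sample]) -> list[Sample]:
    """Return samples that have sat in their current state too long."""
    now = datetime.now(timezone.utc)
    return [
        s for s in samples
        if now - s.state_entered_at > MAX_TIME_IN_STATE.get(s.state, timedelta.max)
    ]

queue = [Sample("SAMPLE-003", "awaiting_interpretation",
                datetime(2025, 1, 1, tzinfo=timezone.utc))]
for s in stalled(queue):
    print(f"Flag {s.sample_id}: stuck in state {s.state!r}")
```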

The positive effect on speed and efficiency is already clear: average turnaround time (measured from when a test kit arrives at the sequencing facility) has decreased by 25% compared to 2024, and the number of reports returned to patients in 2025 has increased by 89% over the same period.

Fusing biology with technology

At the core of Sano’s approach is a commitment to reproducible, transparent science. That’s why the bioinformatics team leans on community-developed tools like nf-core and Nextflow, and builds internal pipelines that are version-controlled and easily auditable.
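Concretely, much of that auditability comes from pinning each run to an exact released pipeline revision, so the code that produced a result can be re-run later. Here is a sketch of launching a pinned nf-core pipeline from Python; the pipeline choice, revision, and file paths are illustrative, not a description of Sano’s internal runs.

```python
import subprocess

# -r pins the pipeline to a tagged release, and -profile docker runs
# every tool in its published container, so the whole software stack
# is recorded and reproducible.
cmd = [
    "nextflow", "run", "nf-core/sarek",
    "-r", "3.4.2",                 # exact pipeline revision (illustrative)
    "-profile", "docker",          # containerized, reproducible software stack
    "--input", "samplesheet.csv",  # illustrative input sheet
    "--outdir", "results/",
]
subprocess.run(cmd, check=True)
```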

Jonny also leads development of Sano’s variant calling pipeline, which, in some cases, uses deep learning models like DeepVariant, developed by Google. Jonny explains, “We’re highly selective in the way we choose tools when needed, ensuring that they have been rigorously validated for our specific use case. DeepVariant, for instance, is widely regarded as one of the most accurate tools for identifying germline variants, with extensive literature showing superior performance to many other callers in terms of precision and recall, especially for SNVs and indels in challenging regions.”
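For readers unfamiliar with it, DeepVariant ships as a Docker image with a documented run_deepvariant entry point. Below is a minimal sketch of an invocation; the version and file paths are illustrative, and this is not necessarily how Sano’s pipeline wraps the tool.

```python
import subprocess

# run_deepvariant bundles DeepVariant's three stages (make_examples,
# call_variants, postprocess_variants) behind a single command.
cmd = [
    "docker", "run", "--rm",
    "-v", "/data:/data",                 # expose inputs/outputs to the container
    "google/deepvariant:1.6.1",          # illustrative version
    "/opt/deepvariant/bin/run_deepvariant",
    "--model_type=WGS",                  # model trained for whole genome data
    "--ref=/data/reference.fasta",
    "--reads=/data/sample.bam",
    "--output_vcf=/data/sample.vcf.gz",
    "--num_shards=8",                    # parallelize across CPU cores
]
subprocess.run(cmd, check=True)
```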

Drawing on both the literature and extensive hands-on experience with genomic data, Jonny has developed a strong understanding of genomic regions that present particular challenges for variant calling. His deep domain knowledge and analytical rigor enable him to critically evaluate the quality of variant calls and determine whether the evidence is sufficient to support a confident, conclusive interpretation.
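What does evaluating that evidence look like mechanically? As a simplified illustration (the thresholds here are invented, and real criteria are assay- and region-specific), a first-pass check over a VCF with pysam might filter on call quality and read depth:

```python
import pysam

MIN_QUAL = 30.0   # Phred-scaled variant call quality (illustrative)
MIN_DEPTH = 20    # reads covering the site, from the DP field (illustrative)

def confident_calls(vcf_path: str):
    """Yield variant records that pass simple evidence thresholds."""
    with pysam.VariantFile(vcf_path) as vcf:
        for rec in vcf:
            depth = rec.info.get("DP", 0)
            if rec.qual is not None and rec.qual >= MIN_QUAL and depth >= MIN_DEPTH:
                yield rec

for rec in confident_calls("sample.vcf.gz"):
    print(rec.chrom, rec.pos, rec.ref, rec.alts, rec.qual)
```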

He illustrates this with a gene at a locus that is notoriously difficult to interpret due to high sequence similarity with a nearby pseudogene. This homology often results in ambiguous read alignment and potential miscalls. As Jonny explains, “In this case, we have to determine what we can reliably extract from the data.”
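One way to quantify that ambiguity is mapping quality (MAPQ): aligners assign low MAPQ to reads they could have placed in more than one location, which is exactly what high gene/pseudogene similarity causes. Here is a sketch using pysam; the coordinates and MAPQ cutoff are illustrative, not Sano’s criteria.

```python
import pysam

def ambiguous_fraction(bam_path: str, region: str, min_mapq: int = 20) -> float:
    """Fraction of primary reads in a region with low mapping quality,
    a hint that the aligner could not place them uniquely (as happens
    when a gene shares high similarity with a nearby pseudogene)."""
    total = low = 0
    with pysam.AlignmentFile(bam_path) as bam:  # requires a BAM index
        for read in bam.fetch(region=region):
            if read.is_unmapped or read.is_secondary or read.is_supplementary:
                continue
            total += 1
            if read.mapping_quality < min_mapq:
                low += 1
    return low / total if total else 0.0

# Illustrative coordinates; a real check would target the exact locus.
print(ambiguous_fraction("sample.bam", "chr1:1000000-1010000"))
```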

He also highlights the importance of collaboration with the clinical genetics team, who define which genes or regions are relevant for each study. That guidance shapes what the bioinformatics team extracts from the raw data and how results are structured for return.

Jonny’s long-term mission is to continue to upgrade and streamline Sano’s bioinformatics infrastructure in line with evolving industry standards and tools. Whether it’s through rethinking pipelines, refining developer experience, or embedding smart monitoring systems, Jonny and the team are making sure that data flows seamlessly and leads to meaningful insights for patients, delivered with speed and confidence.

Get in touch