
About the client

imec is a world-leading R&D and innovation hub in nanoelectronics and digital technologies. elPrep is developed by ExaScience Life Lab, a division of imec that focuses on scalable software solutions for data-intensive and high-performance computing problems, primarily in life sciences.

Project overview

elPrep is a high-performance tool for analyzing .sam/.bam files (up to and including variant calling) in sequencing pipelines.

A comparison between elPrep and GATK has already been published, but elPrep had not yet been compared with other variant callers such as freebayes and DeepVariant. The goal of this project was to compare the run time of elPrep with that of these variant callers.

As a bonus, the comparison gave BioLizard more insight into the capabilities of elPrep. BioLizard aims to use these insights to better assist future clients with questions about elPrep and about setting up variant calling workflows.

“Streamline your genomic research with an integrated tool that excels in speed and accuracy.”


- imec

 

Our approach

  • Set up EC2 instances for elPrep and its peer software (freebayes & DeepVariant).
  • Perform variant calling on a public WGS dataset from Genome in a Bottle. Variant calling was run on the full dataset and on several subsampled datasets.
  • Collect runtimes of the different variant callers and runs, and normalize them to core hours.
  • Generate clear tables and graphs to compare the results.
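
The normalization step in the list above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes that core hours are computed as wall-clock time in hours multiplied by the number of cores available to the run, which is the usual definition but is not spelled out here. The `Run` record, its field names, and the `core_hours` function are hypothetical, and no measured values are included.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One variant-calling run (hypothetical record, for illustration only)."""
    caller: str          # e.g. "elPrep", "freebayes" or "DeepVariant"
    dataset: str         # e.g. "full" or a label for a subsampled dataset
    wall_seconds: float  # elapsed wall-clock time of the run, in seconds
    cores: int           # number of CPU cores available to the run


def core_hours(run: Run) -> float:
    """Assumed definition: wall-clock hours multiplied by the core count."""
    return run.wall_seconds / 3600.0 * run.cores


def core_hours_table(runs: list[Run]) -> dict[tuple[str, str], float]:
    """Map each (caller, dataset) pair to its core hours for easy comparison."""
    return {(run.caller, run.dataset): core_hours(run) for run in runs}
```

Expressing every run in core hours makes runs on instances with different core counts comparable, at the cost of hiding how well each tool actually uses the cores it is given.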

Results

Run time comparison

 

Run time comparison between the different variant callers.

 

Possible optimizations

 

Identified further possible optimizations to improve the user experience.

 

Partnership

 

Partnership between imec and BioLizard to better assist clients.

Check out the imec website for more information on elPrep!

This work was performed in partnership with imec.

