Developing predictive models to revolutionise diagnosis and treatment options
When asked about the approach of BioLizard, Volodimir Olexiouk, team lead for Data Analytics & AI, explains, “For each and every project, the right lizards are assembled in order to provide the best solution for our clients. We interact in order to combine the ‘right’ expertise for each task in order to deliver optimal results to our clients.” Today, we’ll be diving into one case in which the broad expertise of the Data Analytics & AI team was used to develop an end-to-end, data-driven solution for predicting clinical outcomes in transplantation patients.
The focus of this particular project was on using RNA sequencing (RNA-Seq) data to predict different types of organ transplant rejection in patients. When BioLizard joined this project, gene set panels from whole blood RNA-Seq data of kidney transplant recipients had already been used successfully to predict subclinical rejection or early acute cellular rejection of transplants. BioLizard’s goal in this project was to expand this approach and make it more efficient by providing an end-to-end workflow, starting from the input of raw RNA sequencing data and spanning all the way to the prediction of transplantation rejection.
As always, BioLizard’s aim was to make the approach as data-driven as possible, and in the end, we successfully produced not one, but two state-of-the-art tests that could serve as predictive models for patient risk of early acute or subclinical rejection!
But let’s start at the beginning, when the Data Analytics & AI team first set out on an ambitious mission: to create predictive models for both acute and long-term kidney rejection using an extensive dataset covering 60,000 genes from 150 patients.
As often happens in the life sciences, having an abundance of valuable data creates huge possibilities, but also distinct challenges. In this case, the complexity of the data, compounded by the biological variability among different ethnicities, genders, and ages of the patients, made it pretty difficult to derive meaningful biological insights.
But the Data Analytics & AI team didn’t let that discourage them! They tackled the challenge head-on, utilising a wide range of bioinformatic and machine learning tools to identify biomarkers that could predict patient outcomes. They started with all of the available sequencing data, and from there narrowed things down to a more manageable set of genes that were found to have the highest predictive value. To make sure that this gene selection was of the highest possible quality, the team also preprocessed the raw data, for example by filtering out contaminants, removing genes with low read counts, and normalising the data across the different patient samples.
Once this smaller set of genes had been selected, the Data Analytics & AI team worked together with the expert client to choose a subset of clinical features that would also be included in the predictive models. This way, the resulting models would take into account not only information related to gene expression, but also other potentially predictive variables, such as the presence of patient comorbidities.
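To make the preprocessing and gene selection steps a little more concrete, here is a minimal sketch in Python. It is not BioLizard’s actual pipeline: the function names, the read count threshold, the counts-per-million normalisation, and the use of variance as a stand-in for “predictive value” are all assumptions made for illustration, and the data is randomly generated.

```python
import numpy as np

def preprocess_counts(counts, min_total_reads=10):
    """Filter low-count genes and normalise across samples.

    counts: (n_samples, n_genes) matrix of raw read counts.
    min_total_reads is a hypothetical threshold, not taken from the project.
    Returns the log2(CPM + 1) matrix and a boolean mask of the genes kept.
    """
    counts = np.asarray(counts, dtype=float)
    # Remove genes with too few reads summed over all samples.
    keep = counts.sum(axis=0) >= min_total_reads
    filtered = counts[:, keep]
    # Counts-per-million: scale each sample by its own library size so that
    # samples sequenced at different depths become comparable.
    library_size = filtered.sum(axis=1, keepdims=True)
    cpm = filtered / library_size * 1e6
    return np.log2(cpm + 1.0), keep

def top_variable_genes(log_expr, n_genes):
    """Return column indices of the n_genes most variable genes.

    Variance is only a simple placeholder here; a real project would rank
    genes by how well they predict the clinical outcome.
    """
    variances = log_expr.var(axis=0)
    return np.argsort(variances)[::-1][:n_genes]

# Toy data: 6 samples, 50 genes, 5 of which have no reads at all.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(lam=20, size=(6, 50))
toy_counts[:, :5] = 0

log_expr, keep = preprocess_counts(toy_counts)
selected = top_variable_genes(log_expr, n_genes=10)
```

In this toy example the five empty genes are dropped, the remaining 45 are normalised, and 10 of them are carried forward as candidate features.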
And then, finally, it was time to build and test the predictive models.
The outcome was truly remarkable.
In the end, the Data Analytics & AI team worked together with their clients to develop an end-to-end pipeline spanning from the initial processing of RNA-Seq data on the Illumina Connected Analytics platform, all the way to data-driven predictive models that took into account clinical variables, differential gene expression, and the interactions between the two. BioLizard iteratively improved and tested the models over time, and the final result was a highly standardised and data-driven, end-to-end solution that is now being validated for use in a clinical setting to predict patient outcomes.
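For readers wondering what “taking interactions into account” can look like in practice, one common option is to add explicit product terms between gene expression and clinical variables to the model input. The sketch below shows only that feature-building step. How the project’s models actually handle interactions is not described here, so the function name `build_design_matrix` and the product-term approach are assumptions for illustration.

```python
import numpy as np

def build_design_matrix(gene_expr, clinical):
    """Combine gene expression, clinical variables, and their interactions.

    gene_expr: (n_patients, n_genes) normalised expression values.
    clinical:  (n_patients, n_clinical) numeric clinical variables,
               e.g. 0/1 flags for comorbidities.
    Returns a matrix whose columns are: genes, clinical variables, then
    every gene x clinical product (ordered gene by gene).
    """
    gene_expr = np.asarray(gene_expr, dtype=float)
    clinical = np.asarray(clinical, dtype=float)
    # Broadcasting creates a (patients, genes, clinical) array of products,
    # which is then flattened to one column per gene/clinical pair.
    products = gene_expr[:, :, None] * clinical[:, None, :]
    interactions = products.reshape(gene_expr.shape[0], -1)
    return np.hstack([gene_expr, clinical, interactions])

# Toy example: 2 patients, 2 genes, 3 clinical variables.
genes = np.array([[1.0, 2.0],
                  [3.0, 4.0]])
clin = np.array([[1.0, 0.0, 5.0],
                 [0.0, 1.0, 2.0]])
X = build_design_matrix(genes, clin)
```

The resulting matrix can be passed to any standard classifier. With many genes, the number of product columns grows quickly, which is one more reason to narrow down the gene set first.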
In short: the predictive models developed by our Data Analytics & AI team have the potential to revolutionise diagnosis and treatment options for patients suffering from acute and long-term kidney rejection.
This is just one example of how the expertise of the Data Analytics & AI team can be put to use to the benefit of patients – but it captures the spirit of BioLizard’s ultimate goal of transforming raw data into clear assets, and leaving our clients with standardised, reusable, data-driven, and future-proof platforms to continue to get the most out of their data.