Case study

Unlocking immunogenicity insights from clinical reports with NLP

Most information about the immunogenicity of therapeutic peptides is locked in unstructured clinical reports. The client wanted to extract and structure this data to inform the development of safe and effective treatments.

Our Approach

We built an extraction pipeline, used large language models to pull immunogenicity data out of the reports, and evaluated tools for scaling up.

  • Flexible extraction pipeline: Designed a robust pipeline to handle diverse clinical report structures.
  • LLM-powered information extraction: Developed and validated a proof-of-concept pipeline using large language models to extract immunogenicity data into structured tables.
  • Tool evaluation & scaling: Evaluated PDF-processing tools, ran the pipeline on a larger set of reports, and outlined future opportunities for scaling.
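
To make the "structured tables" step concrete, here is a minimal sketch of how a model's response for one report could be validated and written out as table rows. It is an illustration under stated assumptions, not the pipeline built for the client: the column names (peptide, ada_positive, subjects_tested), the JSON response shape, and the helpers parse_llm_response and to_csv are all hypothetical, and the call to the language model itself is left out.

```python
import csv
import io
import json
from dataclasses import dataclass, asdict, fields
from typing import List, Optional


@dataclass
class ImmunogenicityRecord:
    # Hypothetical columns; the actual schema is not described in this case study.
    report_id: str
    peptide: str
    ada_positive: Optional[int]     # subjects positive for anti-drug antibodies
    subjects_tested: Optional[int]  # subjects tested in total


def parse_llm_response(report_id: str, raw: str) -> List[ImmunogenicityRecord]:
    """Turn a JSON array returned by a model into validated records."""
    records = []
    for item in json.loads(raw):
        positive = item.get("ada_positive")
        tested = item.get("subjects_tested")
        # Drop rows that are internally inconsistent instead of guessing a fix.
        if positive is not None and tested is not None and positive > tested:
            continue
        records.append(
            ImmunogenicityRecord(report_id, item["peptide"], positive, tested)
        )
    return records


def to_csv(records: List[ImmunogenicityRecord]) -> str:
    """Serialize records as CSV with one header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=[f.name for f in fields(ImmunogenicityRecord)]
    )
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))
    return buf.getvalue()
```

Checking each row against simple consistency rules before it reaches the table is one way to catch implausible model output early. Rows that fail would typically be routed to manual review rather than silently dropped as they are in this sketch.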

The Outcome

  • Provided the client with reliable, structured immunogenicity data extracted from unstructured clinical reports.
  • Validated the LLM-powered pipeline and ran it on the wider report set, giving confidence in its accuracy.
  • Delivered insights that support data-driven decision-making and highlight opportunities for further automation.

Why It Matters

AI-powered NLP can unlock valuable information hidden in clinical reports. This approach accelerates therapeutic development by delivering critical data without manual extraction.

Let’s discuss how we can turn your data into real scientific impact.

Contact us >

Why bioinformatics workflows require experienced software engineers

Bioinformatics pipelines break for the smallest reasons: package updates, shifting dependencies, or "it only works on my machine." This post explains why experienced software engineers and DevOps practices such as Git, CI/CD, and infrastructure as code (IaC) are essential for keeping workflows reproducible, stable, and scalable.