Understanding data management: Save time, money, and headaches in life sciences
07 Dec 2022
While most people will agree with the statement, “data management is important”, it might be harder to agree on exactly how or why scientists should implement a solid data management plan. One of the reasons behind this disconnect is that data management means different things to people with different specialties. But one thing is for sure: no matter what “data management” means to you, and regardless of your specialty, a good data management plan should allow you to focus on what you are really good at, instead of spending your time finding, organizing, and annotating your data.
Poor data management can take time away from your real focus at work. For instance, like many scientists, you may have spent hours of your life manually transferring raw data from an Excel sheet into another software system, or vice versa. Or perhaps you’ve revisited an experiment that you performed a year earlier, only to realise that you never finished documenting the details of your protocol because your documentation system is not very intuitive.
In this blog series on data management, we are going to cover the what, why, and how of data management, so that you can move toward implementing a data management plan that, instead of demanding work from you, will work for you.
What is data management, anyways?
Data management is the practice of stewarding the availability, usability, integrity, and security of data in an organization. In other words, data management is the work that needs to be done to really get the most out of the data that you produce, and to let your data work for you instead of the other way around.
When data management goes wrong, it can be for a multitude of reasons. Some of the most common pitfalls that we see are:
➤ Lack of communication between R&D and IT teams, which can create data silos: repositories of data that are controlled by one department and isolated from the rest of the organization. Siloed data is often stored in a fragmented system, and is incompatible with other data within the organization.
➤ Inflexible data structures, which can contribute to the creation of, or compound the impacts of, data silos.
➤ Lack of documentation. When working with many different types of data that each have their own internal standards for storage and documentation, it’s easy to forget to document properly or in a timely manner.
➤ Lack of automation, which costs both IT and R&D teams a lot of time that could be better spent on their true specialties.
This all sounds like quite the headache… So, how did the field of life sciences get to this point?
Well, historically, R&D teams managed just fine with each person working in their own data space and using their own organic data management approach, simply because in past decades far less data was generated per experiment. But those days are gone, and pen-and-paper solutions are no longer an option.
Nowadays, on average, researchers “generate 10,000x more data per experiment than they did a decade ago” and “spend 30%–40% of their time searching for, aggregating, and cleansing data” due to the exponential increase in dataset size. On top of that, collaboration is now the norm, but it can become difficult when scientists cannot easily access each other’s data. Nevertheless, many scientific researchers naturally find it difficult to transition away from the more individualistic approach that they were trained to use.
In our experience, communication between IT and R&D teams is key for adapting to the new, data-rich world of life sciences and drug development. But this communication can itself become a roadblock, because “data management” might mean something very different to R&D versus IT professionals. So, what does “data management” mean to these two types of specialists, both of whom are necessary for the successful implementation and use of a data management plan?
Data management and the R&D world
If you’re working in the wet lab, you might have found yourself unintentionally putting data management second to data collection. When you’re wildly pipetting and racing from instrument to instrument (which, of course, all use different data formats), your first priority is executing your experiments and gathering data. It’s natural, but it means that data management can become an afterthought.
Data management may also seem like a nebulous concept. Navigating the technical ins and outs of how to build a data management plan is probably not second nature to you in the same way that understanding the biological systems that you’re working with is. The very idea of data management may seem vague, intangible, and rather overwhelming. You may well understand the potential value that a better data management system could bring, and perhaps you even see the potential of applying advanced analytics, AI, or machine learning in order to access new and improved biological insights. But how do these futuristic tools connect to your cell cultures?
Indeed, in our experience, R&D scientists often feel that changing their entrenched data management systems, even if it would mean gathering better data, would come at a heavy cost. It’s no wonder that they feel this way, because most R&D scientists were never trained to work with cutting-edge data management systems; rather, they focus on cutting-edge laboratory equipment and techniques. But, as laboratory equipment and techniques start to provide larger and larger datasets, R&D scientists inevitably start to see the limits of pen-and-paper solutions. For instance, if you’ve ever worked with big data, you know exactly how cumbersome it is to deal with large datasets manually. This is precisely the kind of problem for which R&D might turn to IT departments for help.
Data management and the IT world
If you work in IT, you might already be a driving force behind the push to improve data management at your organization… but you may not have the power to make it happen. The problem is that, although IT professionals working in the life sciences often clearly see the potential opportunities (and challenges) for improving data management, they don’t own the data. In biotech, the data is owned by R&D, whereas IT is responsible for the systems that collect and report data.
As an IT professional, you have the technical knowledge to foresee the opportunities that could come from a solid data management system that would facilitate structured methods for collecting and maintaining data. You may also expect that integrating AI and machine learning could provide great leaps in scientific insights, which are hidden within the currently messy biological data harboured within your organization. Maybe you have already looked into updating your data management system or applying advanced analytics. But to do so effectively, you need to have knowledge about the specific data types and formats for seamless data integration into one overarching system – which R&D might tell you is simply not possible given the complexity of the biological systems they are investigating, and the diversity in instruments that they are using to collect data. And so, a lot of the potential of improving data management and analytics gets lost in translation.
In our experience, IT professionals are challenged by the messy biological data that they are asked to keep in check because IT teams are often involved too late in the process of data management. If fragmented data management solutions are already in place for each different type of data generated by each individual wet lab scientist, making sense of the data becomes more difficult for IT teams, and some data quality may well be lost. However, if people who understand information technology are brought in earlier in the data collection process, and are able to direct how data is collected and stored from the get-go, this often has great benefits for data management, and thereby for data quality. Unfortunately, this is still often not the case in the life sciences industry. As a result, gaps in understanding between R&D and IT teams can delay or prevent the design and implementation of effective data management strategies, and cause lost time, money, effort, and even data.
Bridging the gaps in data management
OK, now that we know what data management means to R&D specialists versus IT professionals, how do we make sure that they understand each other in order to put effective data management strategies in place?
We’re so glad that you asked.
This is exactly where BioLizard comes in. We Lizards are able to act as translators, and ensure that both parties see the same values in data management, and are aligned on how to implement a great data management plan. Lizards have a unique skill set that includes extensive experience in information technology and computer sciences alongside theoretical and practical biological expertise, meaning that we are able to speak the language of both parties.
At BioLizard, we believe that your data is one of your most valuable assets, so it needs to be managed effectively. We make sure to match your organization’s individual use cases to targeted technical capabilities, to ensure that your data is leveraged to the highest possible level. Ideally, BioLizard should come in early in your data management process, to allow for better planning and implementation of data collection, management, and analysis downstream. However, whenever BioLizard comes in, we have a proven ability to help you:
➤ Generate better and more accurate scientific insights, because we will implement a less biased approach to data collection and analysis.
➤ Make sure that no data is left behind. For instance, we often see that ‘failed experiments’ are excluded from analyses, but this data should be safeguarded just like the data from ‘experimental successes’, because it can generate unexpected insights later on.
➤ Save a lot of time and effort, and focus on what you’re really good at. A great data management plan allows you to spend your work days on your true area of expertise, instead of worrying and wondering how to store, or where to find, your data.
In our next blog in this series, we will introduce you to our best practices for implementing a successful data management plan. But if you are already convinced that it is time to invest in a data management plan that will work for you, make sure to reach out to BioLizard!