24 July 2023

How AI is revealing nature’s secrets by supercharging species identification

AI is taking the world by storm, and science is no exception. Our scientists are using AI to rapidly identify digitised herbarium specimens that underpin crucial research.


By Dr Isabel Larridon and Paul Figg


Believe it or not, one crucial factor can (and does) derail vital research into issues such as climate change and the biodiversity crisis, hindering progress on combatting these challenges, yet it is often overlooked.

In fact, it can stop research in its tracks, preventing publication and potentially invalidating years of science. All because the work is based on a herbarium specimen that has been misidentified, i.e. mistaken for a species that it is not.

That’s it. That’s all it takes to bring hard-earned results crumbling to inaccurate rubble. Largely because even similar species can radically differ in the biomolecules that make them up.  

But this makes sense when you consider that correctly identified species are the foundation for a lot of research within certain fields; the bedrock that the rest of the work is built on.  

Unfortunately, with a collection spanning hundreds of years, and with species sometimes taking 50 years to be identified, a small portion of specimens being misidentified or unidentified was always going to be an issue.

That is, until now.

Herbarium specimens from Kew's Herbarium © RBG Kew

Enter AI

Artificial intelligence (AI) seems to be influencing the world in an ever-growing number of ways, and science is no exception. It has now crept into the field of nature-based research and, because it feeds on big data as its primary fuel, our colossal collection of plant and fungal specimens makes a perfect dataset to apply this technology to.

Right now, we don’t know exactly how many of our dried herbarium and fungarium specimens are misidentified. The only way to find out would be for our taxonomists to go through our entire collection of over eight million dried plants and fungi, which would likely take the rest of their careers to complete.

Machine learning technology is having a positive impact in many areas of science.

Old meets new: Exposing ancient specimens to futuristic technology  

AI often shocks users with its speed and incredible human-like ability. At Kew, however, we’ve found a superhuman application: an ongoing project here is harnessing the power of machine learning and digitised specimens to revolutionise the identification of species.

In a pilot, our scientists are applying this technology to the specimen images obtained as part of our multi-million-pound Digitisation Project to help accelerate the identification of our physical specimens.

Only around one million specimens have been digitised to date. In theory, if fully implemented, this technology could flag to our taxonomists, in real time, every specimen whose assigned species identification needs further investigation as the other seven million are imaged.

This means that instead of having to check every specimen in our collection for misidentifications, AI could whittle this down to, for example, just 10% that are very likely to have been wrongly classified. Of course, having human input would be essential in confirming, reassessing or determining the specimen’s identification.  
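
As a rough illustration of how such flagging might work, the sketch below compares the name recorded on each specimen with a model's most confident prediction and queues only the confident disagreements for a taxonomist to review. This is not the method used in the pilot: the `Prediction` record, the `flag_for_review` function, the 0.9 threshold and the example identifiers and scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Prediction:
    specimen_id: str          # hypothetical identifier for a digitised sheet
    recorded_name: str        # species name currently assigned to the specimen
    scores: Dict[str, float]  # hypothetical model output: species name -> probability


def flag_for_review(predictions: List[Prediction],
                    min_confidence: float = 0.9) -> List[str]:
    """Return IDs of specimens whose top model prediction disagrees with
    the recorded name at or above min_confidence (an assumed threshold)."""
    flagged = []
    for p in predictions:
        # Pick the species the model considers most likely.
        top_name, top_score = max(p.scores.items(), key=lambda kv: kv[1])
        # Flag only confident disagreements; everything else is left alone.
        if top_name != p.recorded_name and top_score >= min_confidence:
            flagged.append(p.specimen_id)
    return flagged


# Made-up example data, for illustration only.
batch = [
    Prediction("SPEC-1", "Species A", {"Species A": 0.97, "Species B": 0.03}),
    Prediction("SPEC-2", "Species A", {"Species A": 0.04, "Species B": 0.96}),
    Prediction("SPEC-3", "Species B", {"Species A": 0.55, "Species B": 0.45}),
]

print(flag_for_review(batch))  # prints ['SPEC-2']
```

Raising `min_confidence` shortens the review queue at the cost of missing less obvious misidentifications. In every case the final decision on a flagged specimen stays with the taxonomist.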

Whatever you think of the emergence of AI, this is at least one great example of human-machine collaboration for the potential benefit of all.

The ultimate promise of this pilot project lies in its potential to expedite the identification process significantly, ushering in a transformative era for the analysis and use of herbarium collections. 

The machine learning tool used by our scientists requires the input of parameters to work in the desired manner.

New species hiding in plain sight 

It is believed that, of the world’s plant species yet to be named as new to science (around 80,000 to 100,000 in total), half are currently sitting in herbaria around the world, either unidentified or misidentified.

One such example, which took the media by storm last year, was Victoria boliviana: a species of giant waterlily that sat within our Herbarium at Kew for over 177 years before being noticed and named as a new species.

There could be 40,000 to 50,000 more examples of this at our fingertips that we’re currently unaware of; an entire world waiting to be explored, hidden secrets to be discovered. They may even hold the key to pressing challenges, whether through new plant-derived compounds to fight cancer or sustainable crops to feed people with less impact on the planet.

For the first time in Kew's history, you can now help us unlock nature's secrets by donating to our Digitisation Project, aiding the ambitious effort to make all eight million herbarium and fungarium specimens accessible to everyone around the world and accelerating research into pressing global issues.


When it comes to herbarium specimens, misidentifications can have tangible consequences, from misleading agricultural research to failing in the protection of threatened species that urgently require conservation efforts. 

By using machine learning, this project not only addresses this problem of misidentification but also aids in determining species boundaries, enabling informed decisions regarding species classification and grouping. 

What does the future look like?  

The potential benefits of this project extend far beyond the confines of Kew's Herbarium. If the pilot proves viable, it opens the door to an all-encompassing database of correctly identified species, not only within Kew but across herbaria worldwide, especially with global institutions joining forces to unite over a billion specimens, as was announced earlier this year.

A snapshot of the collections at the Smithsonian Institution © Chip Clark, Smithsonian Institution

The project’s long-term vision is nothing short of achieving 100% certainty in species identification for every herbarium collection, whether within or beyond Kew's domain. This collaborative endeavour has the potential to greatly enhance our understanding of biodiversity and unlock more effective conservation strategies on a global scale.

Kew’s Herbarium: The perfect big dataset for AI 

Currently in its pilot and training phase, the project relies heavily on an extensive dataset of digitised specimens to train the machine learning tool, underscoring the importance of ongoing digitisation efforts within the scientific community.

A rich dataset strengthens the accuracy and efficacy of machine learning algorithms, allowing them to identify specimens with unparalleled precision. It is this combination of a robust dataset (i.e. our over eight million expert-identified and -curated specimens) and advanced machine learning techniques that forms the foundation for achieving the project's objectives and unlocking the full potential of AI in species identification. 


The dawn of a new era? 

This project highlights the transformative potential of merging computer vision, machine learning, and digitised collections in the realm of species identification. 

By harnessing AI technologies to work hand-in-hand with expert taxonomists, misidentified specimens can be rectified, species boundaries can be delineated, and herbarium collections worldwide can be accurately catalogued, ultimately aiding research that is unravelling the mysteries of our natural world.

The significance of our herbarium collection in global scientific research cannot be overstated. It forms the bedrock for critical studies pertaining to the biodiversity crisis and climate change. Yet, if these studies rely on misidentified specimens, the integrity of their findings is compromised, posing a serious threat to each affected study and hindering vital research efforts.

Through this visionary project, our researchers are accelerating the identification process, enabling more efficient formulation of conservation action plans, and paving the way for a deeper understanding of global biodiversity.  

As we continue to digitise and harness the power of AI, the day when every herbarium collection is unequivocally identified draws closer, unlocking a wealth of knowledge and insights for the betterment of our planet's future. 

Acknowledgements

This pilot project is a collaborative partnership with Royal Holloway, University of London and Royal Botanic Garden Edinburgh. 
