
The Antibody Crisis: Leveraging machine learning for evidence-based antibody search

October 22, 2018


If I were ever granted a wish for a superpower, I would want to be able to actually see biomolecules. Wouldn’t it be convenient if we could just tweeze out our protein of interest from a cell under a microscope? Alas, until that dream comes true, antibodies are the answer!

**Illustration courtesy: Vinita Bharat, PhD, Fuzzy Synapse**

Are you a researcher in the life sciences? If so, whether your background is basic or translational, you have most probably used antibodies in research, diagnostics, or therapeutics. From selection to purification to identification, antibodies help us address our research questions and, by extension, come to our rescue like a knight in shining armor (or maybe in a lab coat). For several decades, biochemists and molecular biologists have sworn by routine procedures such as Western blots, pull-downs, ELISAs, flow cytometry, and immunohistochemistry. At the same time, antibodies have revolutionized diagnostics by detecting infectious agents and biomarkers.

At a time when a large part of the research world relies heavily on antibodies, it has become more crucial than ever to have access to the antibodies best suited to our purpose. On that note, BenchSci, a Toronto-based startup, offers an artificial intelligence (AI)-driven antibody search platform committed to enabling scientists to conveniently (that being the operative word) make an informed decision about their antibody of interest before purchase.

**Illustration courtesy: Bhrugu Yagnik, PhD**

The Antibody Crisis: Why should you care?

The progress of science is built on reproducible results, and reproducible results require a collective effort from all relevant stakeholders. As scientists plan their experiments, they need to apply rigor to ensure robust and unbiased experimental design, methodology, analysis, interpretation, and reporting. On the other side of the table, both publishers and granting agencies need to require well-documented protocols, including reagents, as part of manuscript and grant submissions. This allows fellow scientists to replicate and validate findings. Reagent vendors also need to ensure that their products, such as antibodies, chemical inhibitors, and cell lines, are properly validated and of consistent quality for their intended use.

Over the past three years, several articles have highlighted the role of antibodies in research reproducibility. The problem stems from factors both intrinsic and extrinsic to the antibodies.

Intrinsic Factors

Antibody selectivity is context dependent. Depending on how the sample is treated, an antibody may or may not detect the right target. For example, if an antibody detects an epitope on the native conformation of your target, it may no longer work once the sample is denatured, as is often the case in Western blots. Conversely, an antibody that works for Westerns may not work for immunohistochemistry, because it may recognize an epitope that is exposed only when the sample is denatured. Therefore, one should never assume that an antibody that worked for one technique will also work for others; it depends entirely on the context.
Antibody cross-reactivity occurs when the same epitope is present on multiple targets, so that the antibody detects multiple proteins. A worst-case scenario is when an antibody detects an off-target protein at a similar molecular weight, leading to incorrect conclusions from your findings. Therefore, all antibodies should be validated before use to ensure they are tagging the right protein.
Polyclonal antibodies vary from batch to batch. Different batches, even when generated from the same animal, may not target the same epitope, so a new batch may no longer work in your context. In contrast, monoclonal antibodies from the same clone will always target the same epitope.

Extrinsic Factors

In addition to the above inherent traits of antibodies, what makes the selection process more difficult is incomplete data regarding product specifications and usage.

There are over 300 companies that sell antibodies, but not all of them manufacture the antibodies themselves. Some companies purchase antibodies from original equipment manufacturers (OEMs) and relabel them under their own brand. Unfortunately, most end users won’t know this unless they are very experienced in working with a particular protein and have tested several antibodies themselves. Vendors that don’t manufacture their own antibodies often omit specific information on the immunogen, such as its sequence or whether the antibody was raised against a native, denatured, or fragmented antigen. As a result, product specifications vary widely between vendors, and most experienced scientists tend to stay away from certain vendors.
But the fault doesn’t lie solely with the antibody vendors. If an antibody cited in a publication cannot be traced back to its source, the results cannot be reproduced. It is estimated that only 44% of all antibodies mentioned in publications can be identified at all, and this fraction does not correlate with the impact factor of the journal. Fortunately, more journals are becoming aware of this issue and are starting to act on it. For example, the STAR Methods format implemented by Cell Press replaced the traditional Materials and Methods section to provide more detail on the reagents and protocols used.
There is no centralized database that collects antibody validation data. Very few organizations are dedicated to validating antibodies, and individual labs lack the incentive to publish work on antibody validation because such studies are rarely selected by high-impact-factor journals. To address this issue, the NIH recently put together an Antibody Portal that lists validated antibodies along with their validation data.

Leveraging machine learning (ML) to build a comprehensive antibody database based on published evidence

While all antibodies need to be validated before use to ensure they are detecting the right target (more on validation practices here), one question remains: How do you decide which antibodies to validate?

**A screenshot from BenchSci’s webpage showing filtered data. Picture courtesy: BenchSci**

The number of commercial antibodies has grown at the rate of Moore’s law over the past 15 years, from 10,000 in 2003 to over 4.6 million in 2018. A look through BenchSci’s database showed that the number of commercially available antibodies for a given target can range from 37 to tens of thousands. Even at the low end, it would be difficult, if not impossible, to validate all 37 antibodies.
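As a rough check on the Moore’s law comparison, the doubling time implied by the two figures quoted above can be worked out directly. The sketch below assumes smooth exponential growth between the two data points, which is a simplification, and the function name is purely illustrative.

```python
import math


def doubling_time_years(start_count, end_count, years):
    """Doubling time implied by exponential growth between two counts."""
    doublings = math.log2(end_count / start_count)
    return years / doublings


# Figures quoted in the text: 10,000 antibodies in 2003, 4.6 million in 2018.
implied = doubling_time_years(10_000, 4_600_000, 2018 - 2003)
print(f"Implied doubling time: {implied:.2f} years")
```

Under that assumption the doubling time comes out to roughly 1.7 years, in the same range as the roughly two-year doubling usually associated with Moore’s law.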

To decide which antibody candidates to test, the current standard practice is to search the literature extensively, find relevant published data, and compare them to judge which antibodies are most likely to work. However, this process is tedious, and given the number of papers available, a comprehensive manual literature analysis is practically impossible. BenchSci’s goal is to leverage machine learning (ML) to screen the literature for antibody usage and present the data to scientists in a searchable and filterable form, saving them the time and experimental cost of finding published figures and comparing antibodies.

So why ML? The main advantage of ML is that it can be trained to perform a single task at a massive scale. BenchSci’s ML algorithm first screens papers for ones that contain antibody citation information. Next, two specific branches of ML are applied:

Supervised deep learning: to understand the context surrounding the use of an antibody.
Convolutional neural network: to label figures with the correct technique.
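To make the staged idea concrete, here is a toy sketch of a screen-then-label pipeline. It is not BenchSci’s code: the class names, function names, and keyword table are all hypothetical, and simple keyword rules stand in for the trained models described above. It covers only the paper screening and figure labeling stages and leaves out the context-understanding step.

```python
from dataclasses import dataclass, field


@dataclass
class Figure:
    caption: str


@dataclass
class Paper:
    title: str
    text: str
    figures: list = field(default_factory=list)


def cites_antibody(paper):
    """Screening stand-in: keep papers that mention an antibody at all."""
    return "antibody" in paper.text.lower()


# Hypothetical keyword table standing in for a trained figure classifier.
TECHNIQUE_KEYWORDS = {
    "western blot": "Western blot",
    "immunohistochemistry": "Immunohistochemistry",
    "flow cytometry": "Flow cytometry",
    "elisa": "ELISA",
}


def label_technique(figure):
    """Labeling stand-in: guess the technique from the figure caption."""
    caption = figure.caption.lower()
    for keyword, label in TECHNIQUE_KEYWORDS.items():
        if keyword in caption:
            return label
    return "Unknown"


def run_pipeline(papers):
    """Screen papers first, then label every figure in the ones that pass."""
    records = []
    for paper in papers:
        if not cites_antibody(paper):
            continue
        for figure in paper.figures:
            records.append((paper.title, label_technique(figure)))
    return records


papers = [
    Paper(
        "Paper A",
        "We used an anti-GFP antibody for detection.",
        [Figure("Western blot of GFP expression"),
         Figure("Flow cytometry of sorted cells")],
    ),
    Paper("Paper B", "A purely computational study.", [Figure("Simulation results")]),
]
records = run_pipeline(papers)
print(records)
```

The point of the structure is that each stage only sees what the previous stage let through, so the expensive figure labeling runs on the screened subset rather than on every paper.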

The end result is an open-access resource for academic scientists on BenchSci’s website. BenchSci has the largest antibody database worldwide, at 5.1 million antibodies. The papers analyzed come from both open- and closed-access publishers, including Springer Nature, Wiley, and Karger, among others. Last but not least, BenchSci’s technology recently received funding support from Google.

Interested in learning how you can leverage BenchSci for your antibody search? Join us for the upcoming webinar on BenchSci’s Facebook page on Friday, Oct 26th at 1pm EST, and/or create a free account and give it a try here.

Author & Blog design:

Dolonchapa is a Postdoctoral Fellow at NYU Langone working on infectious disease, with a focus on cell wall metabolism to identify new targets for therapeutic attack against Pseudomonas aeruginosa, a common opportunistic human pathogen. She also serves as the Co-Chair of the National Postdoctoral Association’s Outreach Committee. She believes in the power of technical storytelling as an effective tool for scientific outreach and looks forward to practicing this art as an editor at Club SciWri. Follow her on Twitter.

Editor:

Rajamani is currently a Neuroscience Ph.D. student at the University of Connecticut Health, Farmington, CT. Her research focuses on understanding the interactions between growth factors and endocannabinoids in modulating acute synaptic transmission in the brain. After graduation, she is interested in pursuing a career in medical communications. She is passionate about STEM education and outreach for middle and high schoolers. She is also a mentor for the 1000 Girls, 1000 Futures program of the New York Academy of Sciences. Away from science, she is an artist and enjoys leisure travel. Follow her on LinkedIn.

Illustrators:

Bhrugu Yagnik, PhD

Bhrugu is a post-doc at Emory Vaccine Centre, Yerkes National Primate Research Centre, Emory University, Atlanta, GA, and works on the development of an HIV/AIDS vaccine. His doctoral research focused on the development of vaccines against Shigella using food-grade Lactococcus lactis as an antigen delivery vehicle. Bhrugu has many awards to his credit, including the Lady Tata Memorial Fellowship Award 2014 (India), the 2016 International Society Vaccine (ISV) Congress Award (Boston, MA), the Dr. G. P. Talwar Young Scientist Award 2017 (Indian Immunology Society, India), and the AIDS Vaccine 200 (AV200) Fellowship Award (Atlanta, GA). He is passionate about communicating science in creative ways. In his free time, Bhrugu explores spirituality and attempts to bring science and spirituality together. Connect with him on LinkedIn or ResearchGate.

Vinita Bharat, PhD

Vinita is a post-doc at Stanford University, USA, and was previously a PhD student at the International Max Planck Research School (IMPRS, Göttingen, Germany). Her research focuses on cellular and molecular neuroscience. Besides enjoying ‘being a scientist’, she also works on science education. Presenting science in an easy and fun way is what she loves doing through her platform “Fuzzy Synapse”. She is a fun, enthusiastic, and curious person who is passionate about traveling and loves celebrations and bringing smiles to those around her. Follow her work as Fuzzy Synapse on Instagram, Facebook, and Twitter.

The contents of Club SciWri are the copyright of PhD Career Support Group for STEM PhDs, a US non-profit 501(c)(3). PhDCSG is an initiative of the alumni of the Indian Institute of Science, Bangalore. The primary aim of this group is to build a network among scientists, engineers, and entrepreneurs.

This work by Club SciWri is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License