
Use of Generative AI in Life Sciences

Drug discovery is a pharmacological process in which time, cost, and accuracy are crucial. A successful drug discovery program can span a decade or more and cost a staggering $1.1 billion, with a failure rate of 90% in clinical testing. An estimated 10²³ to 10⁶⁰ drug-like molecules exist, yet only 10⁸ have been synthesized. Deep-learning models offer an alternative to purely experimental searches for drug candidates. Generative Adversarial Network (GAN) based frameworks, such as the deep adversarial autoencoder, have been used with chemical and biological datasets to develop and identify novel compounds for anticancer therapy.
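
To put those figures in perspective, the short Python sketch below computes what fraction of the estimated drug-like chemical space has been synthesized. It uses only the numbers quoted above; the constant and function names are illustrative.

```python
# Figures quoted in this article: the estimated size of drug-like chemical
# space (lower and upper bound) and the number of molecules synthesized.
SPACE_LOW = 10**23
SPACE_HIGH = 10**60
SYNTHESIZED = 10**8


def explored_fraction(synthesized: int, space: int) -> float:
    """Return the share of the chemical space that has been synthesized."""
    return synthesized / space


# Smallest estimated space gives the most optimistic fraction.
best_case = explored_fraction(SYNTHESIZED, SPACE_LOW)
# Largest estimated space gives the most pessimistic fraction.
worst_case = explored_fraction(SYNTHESIZED, SPACE_HIGH)

print(f"explored between {worst_case:.0e} and {best_case:.0e} of the space")
```

Even under the smallest estimate, only about one molecule in 10¹⁵ has been made, which is why exhaustive experimental screening is not an option.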

Drug discovery is no longer solely reliant on traditional experimental design. Molecular generative models such as MolGAN (Molecular Generative Adversarial Network) are emerging as powerful tools that adapt generative adversarial networks to operate directly on graph-structured data. The model is combined with reinforcement learning to steer generated molecules toward particular chemical attributes. MolGAN circumvents the need for expensive graph-matching procedures and has been shown to create nearly 100% valid molecules.

The MolGAN architecture consists of three main components: a generator, a discriminator, and a reward network.

  • The generator generates an annotated graph, representing a molecule.
  • The discriminator compares the generated graph with molecules from the input dataset.
  • The reward network optimizes metrics associated with the generated molecule, using reinforcement learning, so the model is trained to generate valid molecules.
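
The interplay of the three components can be sketched in a few lines of Python. This is a toy illustration, not the MolGAN implementation: the real components are neural networks, whereas here the generator samples random graphs, the discriminator compares one simple statistic with a tiny hand-written dataset, and the reward is a valence check. The names `generate_graph`, `discriminator_score`, `valence_reward`, `generator_objective`, and the weight `lam` are all hypothetical.

```python
import random

ATOMS = ["C", "N", "O", "F"]                      # node labels
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}    # usual valence limits


def generate_graph(n_atoms, rng):
    """Stand-in generator: sample node labels and a symmetric bond-order matrix."""
    atoms = [rng.choice(ATOMS) for _ in range(n_atoms)]
    bonds = [[0] * n_atoms for _ in range(n_atoms)]
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            # 0 means "no bond"; it is listed twice to keep graphs sparse.
            bonds[i][j] = bonds[j][i] = rng.choice([0, 0, 1, 2])
    return atoms, bonds


def mean_valence_used(graph):
    """Average sum of bond orders per atom."""
    atoms, bonds = graph
    return sum(sum(row) for row in bonds) / len(atoms)


def discriminator_score(graph, dataset):
    """Stand-in discriminator: closeness of one graph statistic to the dataset mean."""
    target = sum(mean_valence_used(g) for g in dataset) / len(dataset)
    return 1.0 / (1.0 + abs(mean_valence_used(graph) - target))


def valence_reward(graph):
    """Stand-in reward: fraction of atoms that respect their valence limit."""
    atoms, bonds = graph
    ok = sum(1 for atom, row in zip(atoms, bonds) if sum(row) <= MAX_VALENCE[atom])
    return ok / len(atoms)


def generator_objective(graph, dataset, lam=0.5):
    """Blend the adversarial signal with the reward; lam is an assumed weight."""
    return lam * discriminator_score(graph, dataset) + (1 - lam) * valence_reward(graph)


# Toy dataset of heavy-atom graphs: C=O and C-N.
DATASET = [
    (["C", "O"], [[0, 2], [2, 0]]),
    (["C", "N"], [[0, 1], [1, 0]]),
]

rng = random.Random(0)
candidate = generate_graph(4, rng)
score = generator_objective(candidate, DATASET)
print(candidate[0], round(score, 3))
```

In a trained model the generator's parameters would be updated to increase this kind of blended objective, so that generated graphs both resemble the dataset and score well on the rewarded chemical properties.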

Other deep generative models that rely on SMILES (Simplified Molecular Input Line Entry System) to represent molecules are prone to generating spurious molecules. Evaluations of the MolGAN model on the QM9 chemical database produced nearly 100% valid chemical compounds.
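
To see why string-based generators can emit spurious output, consider that a SMILES string must at least have balanced branches and paired ring-closure digits. The sketch below is a deliberately minimal syntactic check written for illustration. It is not a full SMILES parser (it ignores valence, aromaticity, and two-digit `%nn` ring closures), and a cheminformatics toolkit would be used in practice. The function names are hypothetical.

```python
def looks_like_valid_smiles(smiles: str) -> bool:
    """Check branch, bracket, and ring-closure pairing only (not chemistry)."""
    depth = 0              # current nesting level of "(" branches
    open_rings = set()     # ring-closure digits seen an odd number of times
    in_bracket = False     # inside a "[...]" atom description
    for ch in smiles:
        if ch == "[":
            if in_bracket:
                return False
            in_bracket = True
        elif ch == "]":
            if not in_bracket:
                return False
            in_bracket = False
        elif in_bracket:
            continue       # digits here are isotopes, H counts, or charges
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
        elif ch.isdigit():
            open_rings ^= {ch}   # toggle: first use opens a ring, second closes it
    return bool(smiles) and depth == 0 and not open_rings and not in_bracket


def valid_fraction(samples):
    """Share of generated strings that pass the syntactic check."""
    if not samples:
        return 0.0
    return sum(looks_like_valid_smiles(s) for s in samples) / len(samples)


# Benzene and acetic acid, each next to a truncated variant of itself.
generated = ["c1ccccc1", "c1ccccc", "CC(=O)O", "CC(=O"]
print(valid_fraction(generated))
```

A generator that emits one character at a time has to learn these long-range pairing rules implicitly. A graph-based generator sidesteps this class of syntax error, although chemical validity still has to be learned.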

Another area to highlight is molecular generative models built on the Conditional Variational Autoencoder (CVAE) framework, which enforces particular molecular properties or attributes on the model. A CVAE is a generative model that can impose conditions during both the encoding and decoding processes. The desired molecular properties are set in a condition vector, so they can be embedded in a target molecular structure, improving efficiency.
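
The role of the condition vector can be illustrated with a small data-flow sketch. It assumes the common CVAE arrangement in which the condition is concatenated to the encoder input and again to the sampled latent vector before decoding. The encoder and decoder networks themselves are omitted, and the fingerprint bits, property targets, and function names are hypothetical placeholders.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, the usual VAE sampling step."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def with_condition(vector, condition):
    """Append the condition vector; applied on both the encoder and decoder side."""
    return list(vector) + list(condition)


rng = random.Random(0)

fingerprint = [1, 0, 0, 1, 1, 0]   # toy molecular representation
condition = [0.35, 0.60]           # hypothetical normalized property targets

# Encoder side: the network would see the molecule together with its properties.
enc_in = with_condition(fingerprint, condition)

# Pretend encoder outputs (mean and log-variance of a 3-dimensional latent).
mu, log_var = [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]
z = reparameterize(mu, log_var, rng)

# Decoder side: the same latent point can be decoded under different targets.
dec_in = with_condition(z, condition)
dec_in_new = with_condition(z, [0.80, 0.60])

print(len(enc_in), len(dec_in), dec_in_new[-2:])
```

Because the properties travel in their own vector, changing the target at generation time only requires swapping the condition, not retraining or searching the latent space again.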

CVAE frameworks have been shown to generate molecular fingerprints that encapsulate the desired molecular properties. Additionally, CVAE has been shown to have promising results in optimizing the search space.

Recent developments in explicit 3D molecular generative models have garnered interest, given their main advantage of optimizing a molecule’s 3D properties. While this provides advantages over traditional 1D/2D models using QSAR, such as accounting for polarizability and bioactivity, it comes at a computational cost: 25 seconds per molecule for 3D generation, versus 10,000 SMILES strings per second for string-based generation.
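
The size of that gap is easier to appreciate with a quick calculation. The sketch below uses only the two throughput figures quoted above and assumes sequential generation with no parallelism. The library size of one million molecules is an arbitrary example.

```python
SECONDS_PER_3D_MOLECULE = 25    # explicit 3D generation, per the figure above
SMILES_PER_SECOND = 10_000      # string-based generation, per the figure above
SECONDS_PER_DAY = 86_400


def seconds_for_3d(n_molecules: int) -> float:
    """Wall-clock seconds to generate n molecules with a 3D model."""
    return n_molecules * SECONDS_PER_3D_MOLECULE


def seconds_for_smiles(n_molecules: int) -> float:
    """Wall-clock seconds to generate n SMILES strings."""
    return n_molecules / SMILES_PER_SECOND


library_size = 1_000_000        # arbitrary example size
t_3d = seconds_for_3d(library_size)
t_smiles = seconds_for_smiles(library_size)
speed_ratio = t_3d / t_smiles

print(f"3D: {t_3d / SECONDS_PER_DAY:.1f} days, SMILES: {t_smiles:.0f} s, "
      f"ratio: {speed_ratio:,.0f}x")
```

Under these assumptions a one-million-molecule library takes about 289 days with the 3D model against 100 seconds for SMILES, a factor of 250,000.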

Clearly, we are in the midst of an innovative chapter in drug research, leveraging generative AI. However, it's crucial to emphasize that further dedicated research is imperative to unlock the full potential and establish effective paradigms in this exciting intersection of artificial intelligence and pharmaceutical innovation.

About Modak

Modak is a solutions company dedicated to empowering enterprises in effectively managing and harnessing their data landscape. They offer a technology, cloud, and vendor-agnostic approach to customer datafication initiatives. Leveraging machine learning (ML) techniques, Modak revolutionizes the way both structured and unstructured data are processed, utilized, and shared. 

Modak has led multiple customers in reducing their time to value by 5x through Modak’s unique combination of data accelerators, deep data engineering expertise, and delivery methodology to enable multi-year digital transformation. To learn more visit or follow us on LinkedIn and Twitter

David Paget Brown
Senior Vice President, Head of Operations, North America at Modak
