WSJ - Google DeepMind’s AI Model Scours Our Genes to Guess Who Might Get Sick

Post by tcphillips »

Google DeepMind’s AI Model Scours Our Genes to Guess Who Might Get Sick

By Jo Craven McGinty

Updated Sept. 19, 2023 1:12 pm ET
A machine-learning model evaluated 71 million variations in human proteins for their likelihood to cause disease
One of the greatest challenges biologists face is figuring out which of the myriad variations in a person’s genetic code might make them sick. Artificial intelligence is helping them solve the problem.

A machine-learning model developed by DeepMind Technologies, a subsidiary of Google parent Alphabet, has cataloged 71 million genetic mutations in the structure of proteins that could cause disease in the human body.

Proteins make a critical contribution to the function of human tissues and organs. Each has a unique structure based on a sequence of amino acids that determines what it does and how it works. Often no harm comes from variations in a protein’s structure, but some mutations lead to diseases.

An abnormal form of hemoglobin, a protein that carries oxygen in the blood, causes sickle-cell anemia. Cystic fibrosis is caused by mutations in the protein that is responsible for regulating the flow of salt and fluids in and out of the cells.

AlphaMissense, DeepMind’s AI model, evaluates structural variations in proteins and predicts the likelihood that a mutation will cause harm. The model looks for “missense” mutations in which a protein’s composition varies by a single amino acid.

“This is the most frequent type of variance you see,” said Jun Cheng, research scientist and project lead at Google DeepMind, and co-author of the study published Tuesday in the journal Science.

The model evaluated 216 million possible single amino-acid changes across more than 19,000 human proteins and predicted 71 million missense variations. Relying on patterns in biological data, the model predicted the probability of a variant being able to cause disease. The researchers found 32% of the variants were likely to cause disease and 57% were likely to be benign.
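For a rough sense of scale, the percentages above can be turned into approximate counts. This is only a back-of-the-envelope sketch using the figures quoted in the article; the function name and the "neither" label are illustrative, and it assumes the two reported categories do not overlap.

```python
# Figures as quoted in the article; the rounding and labels are ours.
TOTAL_VARIANTS = 71_000_000
PCT_LIKELY_DISEASE = 32
PCT_LIKELY_BENIGN = 57


def breakdown(total, pct_disease, pct_benign):
    """Split a variant count into the two reported categories plus the remainder."""
    pct_other = 100 - pct_disease - pct_benign  # share in neither category
    return {
        "likely_disease": total * pct_disease // 100,
        "likely_benign": total * pct_benign // 100,
        "neither": total * pct_other // 100,
    }


counts = breakdown(TOTAL_VARIANTS, PCT_LIKELY_DISEASE, PCT_LIKELY_BENIGN)
for label, n in counts.items():
    print(f"{label}: about {n / 1e6:.1f} million")
```

By this arithmetic, roughly 11% of the 71 million variants fall into neither of the two categories.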

In comparison, of the four million missense variants that had been directly observed in humans, 2% had been classified as either benign or capable of causing disease. The remainder were unclassified.

AlphaMissense builds upon previous research in which DeepMind scientists used artificial intelligence to predict the structure of proteins. That project, AlphaFold, catalogs the three-dimensional structures of more than 200 million proteins based on the sequence of their amino acids.

With AlphaMissense, the researchers set out to assess the potential effect of changes in these structures. Pushmeet Kohli, vice president of research for Google DeepMind and one of the study’s co-authors, compared the process to choosing the right words for a sentence.

“If you substitute a word from an English sentence, you can immediately see if the word substitute changes the meaning of the sentence,” he said.

The researchers tested their model against four benchmarks, including a database curated by experts and experimental tests that measure the effects of genetic mutations, approaches that are expensive and labor-intensive. Their model, they said, showed strong agreement with these benchmarks and performed better than other similar AI tools.

DeepMind is making its catalog of missense mutations publicly available to help molecular biologists, geneticists and doctors improve rare-disease diagnosis and develop treatments that target the genetic causes of these diseases.

In a related article in Science, Joseph A. Marsh, chair of computational protein biology at the University of Edinburgh, and Sarah A. Teichmann, head of cellular genetics at the Wellcome Sanger Institute, who weren’t involved with the project, applauded the work but said its current utility is minimal.

“Current computational predictors are not considered reliable enough to be used by themselves for genetic diagnosis,” Marsh said.

1 x i5 - stock GTX 1070 ti + RTX 3070
1 x e3550 - stock 1 x GTX 750 + 1 x GTX 950

...recovering engineer.