Overweighed Europeans

People of European descent have become increasingly overrepresented in genome study data (left chart), compared to the world's population (right bar).

Alicia Martin

Comparing variation in human genomes, or complete sets of individuals’ DNA, allows researchers to learn which variations are linked to certain traits, such as height, and to risks for certain diseases, including breast cancer and type 2 diabetes. These genome-wide association studies inform calculations known as polygenic risk scores (PRS), which use an individual’s genetic data to predict risks for particular diseases. While PRS could help tailor medical care, a new report has highlighted a major limitation: the scores are much better at forecasting disease risks among those of European descent, who are overrepresented in many genome studies.

An international team led by scientists at Massachusetts General Hospital and the Broad Institute of MIT and Harvard reported that the problem of European overrepresentation has been worsening since 2014. About 79 percent of participants in genome studies from 2008 to the present have European ancestry, even though people of European ancestry make up only 16 percent of the world’s population. As a result, insights from those studies and the risk scores they produce could be less useful for people of other ancestries.

To test this point, the research team developed genetic prediction scores for seventeen traits and five diseases using two large biobanks of genomic data, one from the UK and one from Japan. When genetic data were drawn solely from the UK Biobank, PRS were twice as accurate for people of European descent as for people of East Asian descent. PRS for people of East Asian descent improved significantly when data from BioBank Japan were used instead of data from the UK Biobank. Risk scores for individuals of African descent, however, were at best marginally better than random chance, regardless of whether UK or Japanese biobank data were used.

The researchers note that early attempts to diversify the populations of genome studies are yielding promising results, even on much smaller scales. According to geneticist and lead author Alicia R. Martin, additional data can “improve genetic prediction accuracy for everyone, and most rapidly for underrepresented populations.” (Nature Genetics)