Newswise – The most common analytical method in population genetics is deeply flawed, according to a new study from Lund University in Sweden. This may have led to incorrect results and misconceptions about ethnicity and genetic relationships. The method has been used in hundreds of thousands of studies, affecting results in medical genetics and even commercial ancestry testing. The study is published in Scientific Reports.

The rate at which scientific data can be collected is increasing exponentially, leading to massive and highly complex datasets, a development dubbed the “Big Data Revolution”. To make this data more manageable, researchers use statistical methods that aim to compact and simplify the data while retaining most of the key information. Perhaps the most widely used method is called PCA (Principal Component Analysis). By analogy, think of PCA as an oven with flour, sugar, and eggs as input data. The oven may always do the same thing, but the result, a cake, depends mainly on the proportions of the ingredients and how they are combined.

“This method is expected to give decent results because it is so frequently used. But this is neither a guarantee of reliability nor of statistically robust conclusions,” says Dr. Eran Elhaik, associate professor of molecular cell biology at Lund University.

According to Elhaik, the method helped create old perceptions about race and ethnicity. It plays a role in shaping historical narratives about who people are and where they came from, not only within the scientific community but also among commercial ancestry companies. A famous example is when a prominent American politician took an ancestry test ahead of the 2020 presidential campaign to support his ancestral claims. Another example is the misconception, driven by PCA results, that Ashkenazi Jews are an isolated race or group.

“This study demonstrates that these results were not reliable,” says Eran Elhaik.

PCA is used in many fields of science, but Elhaik’s study focuses on its use in population genetics, where the explosion in the size of datasets is particularly acute, due to the reduced costs of DNA sequencing.

The field of paleogenomics, where we want to learn more about ancient peoples and individuals such as Copper Age Europeans, relies heavily on PCA. PCA is used to create a genetic map that positions the unknown sample alongside known reference samples. Until now, unknown samples have been assumed to be related to the reference population with which they overlap or lie closest on the map.

However, Elhaik discovered that the unknown sample could be made to lie near virtually any reference population simply by changing the number and types of the reference samples (see illustration), generating practically endless historical versions, all mathematically “correct”, but only one of which may be biologically correct.
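
The effect described above can be illustrated with a deliberately tiny sketch. It is not the study's analysis: the three populations, the two "markers", the sample counts, and the names `POPULATIONS`, `UNKNOWN`, `first_pc` and `nearest_population` are all invented for illustration, and only the first principal component is kept. Real analyses use many thousands of markers and usually two or more components. The point is only that the same unknown sample can land next to a different reference population when the mix of reference samples changes.

```python
import math

# Hypothetical toy data: each population sits at one point in a two-marker space.
POPULATIONS = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}
# Hypothetical unknown sample; it never moves between the two analyses below.
UNKNOWN = (4.0, 7.0)


def first_pc(points):
    """Return the mean and the unit direction of the first principal component."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Closed-form orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def score(point, mean, direction):
    """Project a point onto the first principal component."""
    return ((point[0] - mean[0]) * direction[0]
            + (point[1] - mean[1]) * direction[1])


def nearest_population(counts):
    """Fit PC1 on the reference panel only, then report which population
    lies closest to the unknown sample along that single axis."""
    reference = [POPULATIONS[name]
                 for name, k in counts.items()
                 for _ in range(k)]
    mean, direction = first_pc(reference)
    unknown_score = score(UNKNOWN, mean, direction)
    return min(
        counts,
        key=lambda name: abs(score(POPULATIONS[name], mean, direction)
                             - unknown_score),
    )


# Same populations, same unknown sample; only the reference proportions differ.
print(nearest_population({"A": 10, "B": 10, "C": 1}))
print(nearest_population({"A": 10, "B": 1, "C": 10}))
```

In this toy setup, the first reference panel places the unknown sample closest to population A and the second places it closest to population C, even though the sample itself is unchanged. Both maps are valid PCA outputs; they differ only because the reference panel was composed differently.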
