Eric Choi, Seoul International School, Gyeonggi-do, Seongnam-si, South Korea
Next-generation sequencing (NGS) produces massive amounts of sequencing data with ultra-high throughput and speed (Slatko et al., 2018). This revolutionary technology has provided the large-scale genomic data needed to identify single nucleotide polymorphisms (SNPs) associated with many traits and diseases (Kumar et al., 2012). However, most genetic studies are based on people of European descent who contributed their DNA for research purposes. Although genomic datasets have recently been expanded to include many different populations, individuals of European ancestry still account for approximately 80% of the data in genome-wide association studies (GWAS). People of European ancestry represent only a small fraction of the total human population, so the rest of the population is underrepresented. Consequently, the lack of diversity in genomic studies limits the scientific understanding of human diseases; this may lead to health inequities and a failure to understand disease prevention and treatment.
The lack of ethnic diversity may mislead clinical practice that is based on genetic research. Neil Risch, a professor at the University of California San Francisco School of Medicine, tried to find a rare gene variant in a child with an undiagnosed disease. However, when he compared the child’s DNA sequence data against a rare-variant database built mostly from people of European ancestry, he had difficulty identifying the genetic variants that cause symptoms in patients without European ancestry (Risch et al., 2002). If a disease is caused by a novel variant, that variant is usually not annotated as pathogenic, which makes it hard to link it to the disease symptoms. If genetic information is limited to European ancestry, not all groups are positioned to benefit from new findings about the variants associated with many diseases. To solve this problem, a background library of genetic variation drawn from diverse populations needs to be established.
A genetic variant that explains most cases of a disease in one population does not always do so in other populations. For example, 70% of Europeans with cystic fibrosis (CF), a genetic disease that causes mucus to build up in the lungs and digestive system, carry F508del, a pathogenic variant in the CFTR gene. However, this variant was identified in only 29% of cases in an African population (Wang et al., 1996). Because genetic information on non-Europeans is limited, CF diagnostic testing does not sufficiently cover the variants present in non-European populations (Schrijver et al., 2016). This delays diagnosis, which in turn leads to postponed treatment and clinical deterioration. To improve the health of all patients with CF, more of the genetic factors that cause CF in non-European populations must be identified.
Homo sapiens emerged in Africa between 300,000 and 200,000 years ago. As humans spread out of Africa and scattered to other places, this burst of migration allowed ethnic diversity to flourish worldwide. Every time a small group of humans moved to another location, they carried only a small portion of the diversity in the original gene pool. As a result, populations that branched off from the stem tend to have larger blocks of the genome linked together than the population they left. Because these linkage patterns differ, a genetic marker that tracks a disease variant in one population may not track it in another, which makes comparison across populations difficult. The lack of diversity in genomic data therefore causes misunderstanding of human disease, and it may misdirect the development of novel drugs that are meant to benefit all human populations. Researchers need to understand the value of studying diverse populations. If genetic markers are well identified across diverse populations, this will lead to more accurate diagnoses of diseases and enable the best treatment for patients.
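The loss of diversity at each migration step can be illustrated with a toy simulation. The following Python sketch is only an illustration under strong simplifying assumptions: alleles are represented by integer labels, there is no mutation, selection, or mixing between groups, and all function names and parameter values are hypothetical choices made for this example rather than values taken from any study.

```python
import random


def count_distinct_alleles(population):
    """Number of different allele labels present in a population."""
    return len(set(population))


def serial_founder_effect(source, n_founders, n_migrations, rng):
    """Track allele diversity across repeated small-group migrations.

    At each step a small founder group is drawn from the current
    population, and the new population is rebuilt only from the
    founders' alleles (assumption: no new mutations arise).
    """
    diversity = [count_distinct_alleles(source)]
    current = source
    for _ in range(n_migrations):
        # A small group leaves, carrying only a sample of the alleles.
        founders = rng.sample(current, n_founders)
        # The new population grows back to full size from the founders.
        current = [rng.choice(founders) for _ in range(len(source))]
        diversity.append(count_distinct_alleles(current))
    return diversity


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical source population: 5,000 allele copies, up to 200 variants.
source = [rng.randrange(200) for _ in range(5000)]
diversity = serial_founder_effect(source, n_founders=50, n_migrations=4, rng=rng)
print(diversity)  # distinct alleles in the source and after each migration
```

In this model the number of distinct alleles can never increase from one migration to the next, because each new population is built entirely from a subset of the previous one. Real populations also gain variation through mutation and admixture, so the sketch shows only the direction of the effect, not its size.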
In recent years, efforts to collect multi-ethnic data have increased in order to address health inequities in genomic studies. For example, UK Biobank, a collection of genomic data from British people that is open to researchers, makes more than 35,000 DNA samples from non-European participants available (Sudlow et al., 2015). The use of diverse ethnic data resources is also increasing among researchers, who are taking on the challenge of analyzing data from ancestrally diverse samples (Wojcik et al., 2019). As more studies analyze data from diverse ethnic data resources, they may provide clearer guidance and allow other researchers to carry out follow-up research that clarifies genetic diseases in non-European samples.
All scientists, teachers, students, genetic counselors, and policymakers should work together to reduce the current disparities and underrepresentation. The community must raise awareness of the need for large-scale genetic epidemiology studies in underrepresented groups. Educators should reach out to all populations to explain the importance of advances in genomic medicine for the care of non-European groups, especially those in under-served areas. Governments should increase research funding so that all underrepresented groups can easily participate in genetic epidemiology studies. Together, these partnerships may help remedy health disparities in genomic medicine.
References
Kumar, S., Banks, T. W., & Cloutier, S. (2012). SNP Discovery through Next-Generation Sequencing and Its Applications. International Journal of Plant Genomics, 2012, 831460. https://doi.org/10.1155/2012/831460
Risch, N., Burchard, E., Ziv, E., & Tang, H. (2002). Categorization of humans in biomedical research: genes, race and disease. Genome Biology, 3(7), comment2007. https://doi.org/10.1186/gb-2002-3-7-comment2007
Schrijver, I., Pique, L., Graham, S., Pearl, M., Cherry, A., & Kharrazi, M. (2016). The spectrum of CFTR variants in nonwhite cystic fibrosis patients: implications for molecular diagnostic testing. The Journal of Molecular Diagnostics, 18(1), 39–50. https://doi.org/10.1016/j.jmoldx.2015.07.005
Slatko, B. E., Gardner, A. F., & Ausubel, F. M. (2018). Overview of Next-Generation Sequencing Technologies. Current Protocols in Molecular Biology, 122(1), e59. https://doi.org/10.1002/cpmb.59
Sudlow, C., Gallacher, J., Allen, N., Beral, V., Burton, P., Danesh, J., Downey, P., Elliott, P., Green, J., Landray, M., Liu, B., Matthews, P., Ong, G., Pell, J., Silman, A., Young, A., Sprosen, T., Peakman, T., & Collins, R. (2015). UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Medicine, 12(3), e1001779. https://doi.org/10.1371/journal.pmed.1001779
Wang, W., Okayama, H., & Shirato, K. (1996). [Genotypes of cystic fibrosis (CF) reported in the world and polymorphisms of cystic fibrosis transmembrane conductance regulator (CFTR) gene in Japanese]. Nippon Rinsho = Japanese Journal of Clinical Medicine, 54(2), 525–532.
Wojcik, G. L., Graff, M., Nishimura, K. K., Tao, R., Haessler, J., Gignoux, C. R., Highland, H. M., Patel, Y. M., Sorokin, E. P., Avery, C. L., Belbin, G. M., Bien, S. A., Cheng, I., Cullina, S., Hodonsky, C. J., Hu, Y., Huckins, L. M., Jeff, J., Justice, A. E., … Carlson, C. S. (2019). Genetic analyses of diverse populations improves discovery for complex traits. Nature, 570(7762), 514–518. https://doi.org/10.1038/s41586-019-1310-4