2013 IEEE 13th International Conference on Data Mining Workshops (2012)
Brussels, Belgium
Dec. 10, 2012
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/ICDMW.2012.74
Variation in the Human Leukocyte Antigen (HLA) gene system matters both clinically and scientifically. The region is one of the most polymorphic in the human genome and one of the most extensively studied, owing to its association with autoimmune, infectious, and inflammatory diseases such as rheumatoid arthritis, celiac disease, multiple sclerosis, and Type I diabetes. The HLA gene system also plays a crucial role in hematopoietic stem cell transplantation, where patients and donors are matched on their HLA genes to maximize the chances of a successful transplant. Complete HLA data is therefore of great use to clinicians and researchers. However, because the region is so polymorphic, obtaining such data is prohibitively time-consuming and expensive. Genome-wide association studies that find strong associations within the HLA region would ideally identify the exact HLA alleles responsible, in order to determine the causal genes or variants. Here we propose a method to infer HLA alleles from widely available and affordable SNP genotype data. Our method takes into account the high linkage disequilibrium that exists in the region. We demonstrate that this additional information is an important asset in the HLA prediction problem.
Sociology, Frequency estimation, Bioinformatics, Biological cells, Genomics, Humans, SNP data, Human Leukocyte Antigen, HLA imputation, multi-label prediction
Vanja Paunic, Michael Steinbach, Vipin Kumar, Martin Maiers, "Prediction of HLA Genes from SNP Data and HLA Haplotype Frequencies", 2013 IEEE 13th International Conference on Data Mining Workshops, vol. 00, pp. 964-971, 2012, doi:10.1109/ICDMW.2012.74