New Multiethnic Parameter Estimates

Data

For the following parameter estimates, published by Degenhardt and Wendorff et al. (1), two different data collections were used:

1.) IKMB dataset: A new collection of 312 African American, 162 German, 140 Chinese, 143 Indian, 132 Iranian, 189 Japanese, 122 South Korean and 160 Maltese samples. All samples except the Maltese ones were genotyped with the Illumina Immunochip (196,524 markers targeting immune-relevant genes); the Maltese samples were genotyped with the Illumina Infinium ImmunoArray-24 (253,702 markers).

Full-context four-digit information for all classical HLA class I and class II genes (HLA‑A, ‑B, ‑C, ‑DQA1, ‑DQB1, ‑DPA1, ‑DPB1, ‑DRB1 and ‑DRB3/4/5) was available through NGS-based typing as published by Wittig et al. (2, 3).

Table 1: Dataset IKMB (4-digit, full context)

Populations | AA  | CHN | GER | IND | IRN | JPN | KOR | MLT | Σ
# samples   | 312 | 140 | 162 | 143 | 132 | 189 | 122 | 160 | 1360
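The totals in Table 1 can be cross-checked with a few lines of Python. This is only an illustrative sketch: the counts are copied from the table, and the variable names (`ikmb_samples`, `ikmb_total`, `ikmb_share`) are hypothetical and not part of any released file.

```python
# Per-population sample counts copied from Table 1 (IKMB dataset).
# Variable names are illustrative only.
ikmb_samples = {
    "AA": 312, "CHN": 140, "GER": 162, "IND": 143,
    "IRN": 132, "JPN": 189, "KOR": 122, "MLT": 160,
}

# Sum over all populations; this should reproduce the Σ column.
ikmb_total = sum(ikmb_samples.values())

# Share of each population in the panel, in percent (one decimal).
ikmb_share = {
    pop: round(100 * n / ikmb_total, 1) for pop, n in ikmb_samples.items()
}

print("total samples:", ikmb_total)
for pop, pct in ikmb_share.items():
    print(f"{pop}: {ikmb_samples[pop]} samples ({pct} %)")
```

The computed total matches the Σ column of Table 1, and the per-population shares show that the African American samples form the largest single group in the panel.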

2.) 1KG dataset: The Phase 3 (version 20130502) 1000 Genomes reference dataset with 174,538 phased SNPs that overlap with the Illumina Immunochip (162 samples of African ancestry (AFR), 193 samples of admixed American ancestry (AMR), 260 samples of East Asian ancestry (EAS) and 322 samples of European ancestry (EUR)). Allele information for this dataset is publicly available only at the 4-digit G group level and does not include HLA‑DPA1, ‑DPB1, ‑DQA1 or ‑DRB3/4/5 allele calls.

(Side note: The HapMap data is a part of the 1000 Genomes data set.)
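The difference in locus coverage between the two datasets can be made explicit with a small Python sketch. The locus lists are taken from the descriptions above; the variable names are hypothetical. Whether the combined model is restricted to exactly these shared loci is not stated here, so the result should be read as dataset coverage only.

```python
# Loci with NGS-based allele calls in the IKMB dataset (from the text above).
ikmb_loci = ["A", "B", "C", "DQA1", "DQB1", "DPA1", "DPB1", "DRB1", "DRB3/4/5"]

# Loci for which the public 1KG allele information has no calls.
missing_in_1kg = {"DPA1", "DPB1", "DQA1", "DRB3/4/5"}

# Loci covered by both datasets, keeping the original order.
shared_loci = [locus for locus in ikmb_loci if locus not in missing_in_1kg]

print("HLA loci with allele calls in both datasets:")
for locus in shared_loci:
    print("  HLA-" + locus)
```

Under these assumptions, five loci (HLA-A, -B, -C, -DQB1 and -DRB1) have allele calls in both datasets.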

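Table 2: Dataset 1KG (4-digit, G groups)

Populations | AFR | AMR | EAS | EUR | Σ
# samples   | 162 | 193 | 260 | 322 | 937

Populations    | AFR | AFR | AFR | AMR | AMR | AMR | EAS | EAS | EAS | EUR | EUR | EUR | EUR |
Subpopulations | ASW | LWK | YRI | CLM | MXL | PUR | CHB | CHS | JPT | CEU | FIN | GBR | TSI | Σ
# samples      | 41  | 75  | 46  | 67  | 56  | 70  | 82  | 92  | 86  | 52  | 95  | 86  | 89  | 937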
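The subpopulation counts in Table 2 can be checked against the population totals with a short Python sketch. The counts are copied from the table and grouped according to the standard 1000 Genomes population codes; the variable names are hypothetical.

```python
# Subpopulation sample counts copied from Table 2, grouped by population.
# Variable names are illustrative only.
kg_subpops = {
    "AFR": {"ASW": 41, "LWK": 75, "YRI": 46},
    "AMR": {"CLM": 67, "MXL": 56, "PUR": 70},
    "EAS": {"CHB": 82, "CHS": 92, "JPT": 86},
    "EUR": {"CEU": 52, "FIN": 95, "GBR": 86, "TSI": 89},
}

# Per-population totals, obtained by summing the subpopulations.
kg_totals = {pop: sum(counts.values()) for pop, counts in kg_subpops.items()}

# Overall number of 1KG samples.
kg_total = sum(kg_totals.values())

for pop, n in kg_totals.items():
    print(f"{pop}: {n} samples")
print("total samples:", kg_total)
```

The per-population sums reproduce the first row of sample counts in Table 2, and both rows add up to the same overall total.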

Models

(i) Primary model: Multi-ethnic reference panel in full four-digit context (multiethnic_IKMB.RData)

(ii) Multi-ethnic reference panel combined with the 1000 Genomes data set on G group level (multiethnic_IKMB_1KG.RData)

(iii) Multi-ethnic reference panel on G group level (multiethnic_IKMB_g.RData)

For quality control of the datasets and further details, please see (1): (i) Figure 3, Table 2 and Table 3; (ii) Supplementary Material, Fig. S2 and Table S1; (iii) Supplementary Material, Fig. S3 and Table S2.

References:

1) Degenhardt, F., Wendorff, M., Wittig, M., Ellinghaus, E., Datta, L.W., Schembri, J., Ng, S.C., Rosati, E., Hubenthal, M., Ellinghaus, D. et al. (2018) Construction and benchmarking of a multi-ethnic reference panel for the imputation of HLA class I and II alleles. Hum Mol Genet, in press.

2) Wittig, M., Anmarkrud, J.A., Kassens, J.C., Koch, S., Forster, M., Ellinghaus, E., Hov, J.R., Sauer, S., Schimmler, M., Ziemann, M. et al. (2015) Development of a high-resolution NGS-based HLA-typing and analysis pipeline. Nucleic Acids Res, 43, e70.

3) Wittig, M., Juzenas, S., Vollstedt, M. and Franke, A. (2018) High-Resolution HLA-Typing by Next-Generation Sequencing of Randomly Fragmented Target DNA. Methods Mol Biol, 1802, 63-88.