Imputation-based HLA typing with GWAS SNPs: Model training

Introduction
Materials
Prepare SNP data for a specific genotyping platform
Build HIBAG models with default settings
GPU computing
Compute clusters
Execution time for model building
Session information

Introduction

SNP-based imputation approaches for human leukocyte antigen (HLA) typing take advantage of the extended haplotype structure within the major histocompatibility complex (MHC) to predict HLA classical alleles using dense SNP genotypes, such as those available on array-based panels in genome-wide association study (GWAS). These methods enable HLA analyses of classical alleles on existing SNP datasets genotyped in GWAS at no extra cost. Here, we describe the workflow of HIBAG, an imputation method with attribute bagging, for obtaining a sample’s HLA classical alleles using SNP data.

In this exercise, the 1000 Genomes SNP and HLA data are used as a training set to build an HIBAG model for HLA–A. Model training is a time-consuming process, and we will discuss various approaches to accelerate these calculations.

Figure 1: Computational workflow for HLA imputation using SNP data: model building with a training dataset of SNPs and HLA classical alleles, where multiple cores and graphics processing units (GPU) can be used.

Materials

To impute the HLA classical alleles of a sample in a GWAS, the installation of the HIBAG package and supplementary software tools on a computer is necessary. The pre-built classifiers tailored for a GWAS genotyping platform can be downloaded from the HIBAG website (https://hibag.s3.amazonaws.com/index.html).

1. Hardware

The HIBAG method is publicly available in an R/Bioconductor package, and it can be installed and executed on multiple operating systems, e.g., Linux, MacOS and Windows. To utilize the GPU functionalities, an OpenCL-compatible hardware accelerator should be installed on your laptop, desktop, or workstation, e.g., Apple’s M1 Max chip and NVIDIA Tesla V100 for general-purpose computing on graphics processing units (GPGPU). Internet access is not required if the datasets and pre-built models have been downloaded.

2. Software dependencies

The R programming environment (http://www.r-project.org) should be installed and R_v4.0 or a newer version is recommended. HIBAG v1.38.0 on Bioconductor is required to complete the exercises in this chapter (https://bioconductor.org/packages/HIBAG). The kernel of the HIBAG package is coded in C++. To install the package from the source codes, a C++ compiler should be available on your operating system, e.g., GNU g++ compiler. To enable a GPU application, it is necessary to install an appropriate driver, which should include the OpenCL headers and libraries. These components are typically available within the OpenCL SDK provided by your preferred vendor. The HIBAG.gpu package (v0.99.1) can be downloaded and installed from github (https://github.com/zhengxwen/HIBAG.gpu).

3. HLA and SNP datasets

The HLA classical alleles in the 1000 Genomes Project are publicly available in the two recent studies (Gourraud et al., 2014 & Abi-Rached et al., 2018). We harmonized these two sets of HLA classical alleles (n=2963) according to the P groups (two-field and protein level) defined by IPD-IMGT/HLA 3.54.0 (https://www.ebi.ac.uk/ipd/imgt/hla/). The SNP genotypes of 2702 individuals in the extended MHC region were called from the Illumina NovaSeq 6000 sequencing with a targeted depth of 30X (# of SNVs=98,427). Both data files are included in the data file “data_package.zip”.

Prepare SNP data for a specific genotyping platform

Download and unzip the file “data_package.zip” from the HIBAG website (https://hibag.s3.amazonaws.com/2024_tutorial_for_mimb/index.html), which contains the harmonized HLA classical alleles defined according to the P groups, and the SNP genotypes of the 1000 Genomes project in the extended MHC region (xMHC) from the recent Illumina high-coverage sequencing.

library(HIBAG)

## HIBAG (HLA Genotype Imputation with Attribute Bagging)

## Kernel Version: v1.5 (64-bit)

library(SeqArray)

## Loading required package: gdsfmt

# HLA classical alleles (P groups)
hla <- read.csv("1000g_hla_harmonized_pcode_alleles.csv")
nrow(hla)  # number of samples: 2963

## [1] 2963

head(hla)

##   Region Population Sample.ID HLA.A.1 HLA.A.2 HLA.B.1 HLA.B.2 HLA.C.1 HLA.C.2 HLA.DQB1.1 HLA.DQB1.2
## 1    EUR        GBR   HG00096  01:01P  29:02P  08:01P  44:03P  07:01P  16:01P     02:01P     02:01P
## 2    EUR        GBR   HG00097  03:01P  24:02P  07:02P  07:02P  07:02P  07:02P     03:01P     06:02P
## 3    EUR        GBR   HG00098  01:01P  02:01P  08:01P  44:02P  05:01P  07:01P     02:01P     03:01P
## 4    EUR        GBR   HG00099  01:01P  68:01P  08:01P  44:02P  07:01P  07:04P     02:01P     03:01P
## 5    EUR        GBR   HG00100  01:01P  01:01P  08:01P  57:01P  06:02P  07:01P     02:01P     03:03P
## 6    EUR        GBR   HG00101  11:01P  32:01P  27:05P  57:01P  02:02P  06:02P     03:02P     06:02P
##   HLA.DRB1.1 HLA.DRB1.2
## 1     03:01P     07:01P
## 2     13:03P     15:01P
## 3     04:01P     03:01P
## 4     03:01P     11:01P
## 5     03:01P     07:01P
## 6     04:04P     15:01P

table(hla$Region)  # counts of each geographic region

## 
## AFR AMR EAS EUR SAS 
## 798 402 632 588 543

# MHC SNP genotypes of 1000 Genomes on hg38
gds_fn <- "1kGP_HC_Illumina.chr6.SNV.xMHC.hg38.gds"
# load SNP genotypes from a GDS file
mhc_snp <- hlaGDS2Geno(gds_fn, assembly="hg38")

## Open '1kGP_HC_Illumina.chr6.SNV.xMHC.hg38.gds'
## Import 77036 SNPs within the xMHC region on chromosome 6
## 
[..................................................]  0%, ETC: ---    
[==================================================] 100%, completed, 5s

mhc_snp

## SNP genotypes: 
##     2702 samples X 77036 SNPs
##     SNPs range from 28723668bp to 33400443bp on hg38
## Missing rate per SNP:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Missing rate per sample:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Minor allele frequency:
##     min: 0.00185048, max: 0.5, mean: 0.120753, median: 0.0555144, sd: 0.138105
## Allelic information:
##   A/G   T/C   G/A   C/T   A/C   T/G   C/G   G/C   C/A   G/T   T/A   A/T 
## 15062 14871 10962 10898  3757  3626  3507  3374  2828  2797  2703  2651

In the example, the platform that we are working on is the Illumina Infinium Multi-Ethnic Global. The manifest file can be downloaded from https://support.illumina.com. The SNP list on the build 38 of human genome reference is also available in our data package.

mhc_illum <- read.csv("Illumina_Infinium_MEG_v1.0_MHC.csv.xz", comment="#")
head(mhc_illum, n=3)

##                            IlmnID           Name IlmnStrand   SNP AddressA_ID
## 1 1KG_6_29012322-0_P_F_2115831830 1KG_6_29012322       PLUS [I/D]     5654494
## 2 1KG_6_29054783-0_M_R_2115818292 1KG_6_29054783      MINUS [I/D]    97703910
## 3 1KG_6_29599174-0_P_F_2255362572 1KG_6_29599174       PLUS [D/I]    88770989
##                                     AlleleA_ProbeSeq AddressB_ID AlleleB_ProbeSeq GenomeBuild Chr  MapInfo
## 1 TCAGCACAGCTTTGGCAATGTAGCCATAGGATATAAGAATAAGGATGAGA          NA                           38   6 29044545
## 2 AATCTCTCCATCTTAGATCTCTGCTATACCACAACTACAGTCCCTCATAT          NA                           38   6 29087006
## 3 GAAAAGGAATAGTCTTGAAGAGGGGAGGGGCTTCCGAGGCTACTCACCAC          NA                           38   6 29631397
##    Ploidy      Species  Source SourceVersion SourceStrand
## 1 diploid Homo sapiens Unknown             0         PLUS
## 2 diploid Homo sapiens unknown             0         PLUS
## 3 diploid Homo sapiens Unknown             0         PLUS
##                                                                                                                                                                                                         SourceSeq
## 1  TgagaTCcacaGGTATTCATTGCttttCGCTGgcttgcttTTGACTTCGTtctcAGcacaGCTTTGGCAATGTAGCCATAGGatatAAGAATAAGGATGAG[-/A]AGgtgtGAGGACAATtataATGCCTAAAGCGaaaaCAGACATTTCAACtgttgtGgtgtCTacacAAGCTATCTTGACCagagCTGGCAACTcacaCAAG
## 2 GAAGgagaCActctGTAGCACCTAGGGCCAGGAAgatgatGAGGTGGGCcacacagccagcATAGCTGATGGTCttttTGTTGCAACCAatatTTACCAAC[-/AT]atatGAGGGACTGTAGTtgtgGtataGCagagATCTAAGATGgagaGATTAGTgagaAAGAAATACATGGGAGTATGAAGTTTGGGATCCAGAATGcaca
## 3  ATACTGGATTTAAAGTGCTGACTTGCAAGCAGTAATGCTAAGTTTGCGGCAGGaaaaGGAATAGTCTTGAagagGGGAggggCTTCCGAGGCTACTCACCA[-/C]CAGCGGCTGGgtgtGTCCatatCTGTCCAGGAGCCGTTGGCCAGGCACTTGCGGACCTTGGGccccaccaccTcgcgCTccccccGGcacaCATACTCAA
##                                                                                                                                                                                                     TopGenomicSeq
## 1  TgagaTCcacaGGTATTCATTGCttttCGCTGgcttgcttTTGACTTCGTtctcAGcacaGCTTTGGCAATGTAGCCATAGGatatAAGAATAAGGATGAG[-/A]AGgtgtGAGGACAATtataATGCCTAAAGCGaaaaCAGACATTTCAACtgttgtGgtgtCTacacAAGCTATCTTGACCagagCTGGCAACTcacaCAAG
## 2 GAAGgagaCActctGTAGCACCTAGGGCCAGGAAgatgatGAGGTGGGCcacacagccagcATAGCTGATGGTCttttTGTTGCAACCAatatTTACCAAC[-/AT]atatGAGGGACTGTAGTtgtgGtataGCagagATCTAAGATGgagaGATTAGTgagaAAGAAATACATGGGAGTATGAAGTTTGGGATCCAGAATGcaca
## 3  ATACTGGATTTAAAGTGCTGACTTGCAAGCAGTAATGCTAAGTTTGCGGCAGGaaaaGGAATAGTCTTGAagagGGGAggggCTTCCGAGGCTACTCACCA[-/C]CAGCGGCTGGgtgtGTCCatatCTGTCCAGGAGCCGTTGGCCAGGCACTTGCGGACCTTGGGccccaccaccTcgcgCTccccccGGcacaCATACTCAA
##   BeadSetID Exp_Clusters RefStrand
## 1      1124            3         +
## 2      1152            3         -
## 3      1152            3         +

Find an intersect of SNP data between the 1000 Genomes and the Infinium Multi-Ethnic Global platform.

# the SNP intersect
snp <- intersect(mhc_snp$snp.position, mhc_illum$MapInfo)
length(snp)

## [1] 11898

# subset the SNP data
mhc_snp_sub <- hlaGenoSubset(mhc_snp, snp.sel=match(snp, mhc_snp$snp.position))
mhc_snp_sub

## SNP genotypes: 
##     2702 samples X 11898 SNPs
##     SNPs range from 28724551bp to 33399495bp on hg38
## Missing rate per SNP:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Missing rate per sample:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Minor allele frequency:
##     min: 0.00185048, max: 0.5, mean: 0.175568, median: 0.140266, sd: 0.143722
## Allelic information:
##  A/G  T/C  C/T  G/A  A/C  T/G  C/A  G/T  C/G  G/C  A/T  T/A 
## 2443 2358 1914 1902  561  524  461  449  386  345  280  275

# save the SNP genotypes for this platform
saveRDS(mhc_snp_sub, "1kGP_HC_Illumina_Infinium_MEG.rds")

Build HIBAG models with default settings

1. Load SNP and HLA genotypes

Start an R session and import the SNP and HLA genotypes, which consist of 2962 individuals in the 1000 Genomes project. In the exercise, we build an ethnicity-specific model using the samples of European ancestry.

# load SNP data
mhc_snp_sub <- readRDS("1kGP_HC_Illumina_Infinium_MEG.rds")

# use European ancestry as an example
hla <- hla[hla$Region=="EUR", ]
# HLA-A classical alleles (using hg38 coordinate)
hla_a <- hlaAllele(hla$Sample.ID, hla$HLA.A.1, hla$HLA.A.2, locus="A", assembly="hg38")
hla_a

## List of 5
##  $ locus    : chr "A"
##  $ pos.start: int 29942470
##  $ pos.end  : int 29945884
##  $ value    :'data.frame':   588 obs. of  3 variables:
##   ..$ sample.id: chr [1:588] "HG00096" "HG00097" "HG00098" "HG00099" ...
##   ..$ allele1  : chr [1:588] "01:01P" "03:01P" "01:01P" "01:01P" ...
##   ..$ allele2  : chr [1:588] "29:02P" "24:02P" "02:01P" "68:01P" ...
##  $ assembly : chr "hg38"
##  - attr(*, "class")= chr "hlaAlleleClass"

2. Flanking SNP data

The flanking SNPs are used in the model training, and the size of 500kb on each side is recommended. It is important to specify the human reference genome “hg38” according to the SNP data, since the location of the specified HLA gene is required to determine the SNPs in the flanking region.

# SNP IDs in the flanking region
snpid <- hlaFlankingSNP(mhc_snp_sub$snp.id, mhc_snp_sub$snp.position, "A", flank.bp=500*1000, assembly="hg38")

# SNP genotypes to be used in model traning
train_geno <- hlaGenoSubset(mhc_snp_sub, snp.id=snpid)
train_geno

## SNP genotypes: 
##     2702 samples X 3293 SNPs
##     SNPs range from 29442924bp to 30445755bp on hg38
## Missing rate per SNP:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Missing rate per sample:
##     min: 0, max: 0, mean: 0, median: 0, sd: 0
## Minor allele frequency:
##     min: 0.00185048, max: 0.5, mean: 0.161904, median: 0.134345, sd: 0.134621
## Allelic information:
## A/G T/C C/T G/A T/G A/C G/T C/A G/C C/G T/A A/T 
## 678 666 563 480 155 154 124 120 110  98  77  68

3. Run the model building

The function hlaAttrBagging() is used to fit the model parameters. By default, 100 individual classifiers will be generated to build an ensemble model, however this process is very time-consuming and will take ~8 hours to run on a single core (Intel Xeon Gold 6248 CPU @2.50GHz). Multithreading can be used to speed up the calculations, and it takes 30 minutes when 16 threads are used.

To complete this exercise and prevent lengthy wait times, you could try building just a classifier (nclassifier=1) instead of 100. The approaches for accelerating the calculation will be discussed in the following subheadings.

# don't forget to set a seed since HIBAG is a randomized algorithm
set.seed(100)

# fit the model using 16 threads
model <- hlaAttrBagging(hla_a, train_geno, nclassifier=100, nthread=16)

## Build a HIBAG model with 100 individual classifiers:
##     MAF threshold: NaN
##     excluding 141 monomorphic SNPs
##     # of SNPs randomly sampled as candidates for each selection: 57
##     # of SNPs: 3152
##     # of samples: 559
##     # of unique HLA alleles: 30
## CPU flags: 64-bit, AVX512BW
## # of threads: 16
## [-] 2023-11-27 20:28:58
## === building individual classifier 1, out-of-bag (218/39.0%) ===
## [1] 2023-11-27 20:29:11, oob acc: 97.48%, # of SNPs: 42, # of haplo: 268
=== building individual classifier 2, out-of-bag (204/36.5%) ===
## ...
## Accuracy with training data: 99.91%
## Out-of-bag accuracy: 98.77%

model  # display the model

## Gene: HLA-A
## Training dataset: 559 samples X 3152 SNPs
##     # of HLA alleles: 30
##     # of individual classifiers: 100
##     total # of SNPs used: 1914
##     avg. # of SNPs in an individual classifier: 50.61
##         (sd: 8.36, min: 33, max: 71, median: 50.00)
##     avg. # of haplotypes in an individual classifier: 378.50
##         (sd: 95.40, min: 180, max: 669, median: 364.50)
##     avg. out-of-bag accuracy: 98.77%
##         (sd: 0.61%, min: 97.16%, max: 99.76%, median: 98.83%)
## Matching proportion:
##         Min.     0.1% Qu.       1% Qu.      1st Qu.       Median      3rd Qu.         Max.         Mean 
## 5.499657e-27 1.450782e-24 1.145472e-21 1.265863e-11 1.234457e-06 4.764091e-05 9.080884e-03 1.850363e-04 
##           SD 
## 7.075602e-04 
## Genome assembly: hg38

4. Model saving and visualization

After the model training completes, we save this model into an R object file for future uses. The HIBAG model can be visualized in a scatterplot showing the relationship between the frequency of SNP uses in an individual classifier and genome coordinate.

mobj <- hlaModelToObj(model)
saveRDS(mobj, file="hla_a_model.rds")

plot(model)  # visualization

GPU computing

To complete the exercise in this section, a graphics processing unit should be installed on your computer as well as the OpenCL headers and libraries. After installing the HIBAG.gpu package, we start a new R session and run the following scripts:

library(HIBAG.gpu)

Here is a typical welcome message that appears after loading the package:

## Available OpenCL device(s):
##     Dev #1: NVIDIA Corporation, Tesla V100-SXM2-32GB, OpenCL 3.0 CUDA
##
## Using Device #1: NVIDIA Corporation, Tesla V100-SXM2-32GB
##     Device Version: OpenCL 3.0 CUDA (>= v1.2: YES)
##     Driver Version: 535.129.03
##     EXTENSION cl_khr_global_int32_base_atomics: YES
##     EXTENSION cl_khr_fp64: YES
##     EXTENSION cl_khr_int64_base_atomics: YES
##     EXTENSION cl_khr_global_int64_base_atomics: NO
##     CL_DEVICE_GLOBAL_MEM_SIZE: 34,079,899,648
##     CL_DEVICE_MAX_MEM_ALLOC_SIZE: 8,519,974,912
##     CL_DEVICE_MAX_COMPUTE_UNITS: 80
##     CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
##     CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
##     CL_DEVICE_MAX_WORK_ITEM_SIZES: 1024,1024,64
##     CL_DEVICE_LOCAL_MEM_SIZE: 49152
##     CL_DEVICE_ADDRESS_BITS: 64
##     CL_KERNEL_WORK_GROUP_SIZE: 256
##     CL_KERNEL_PREFERRED_WORK_GROUP_SIZE_MULTIPLE: 32
##     local work size: 256 (Dim1), 16x16 (Dim2)
##     atom_cmpxchg (enable cl_khr_int64_base_atomics): OK
## GPU device supports double-precision floating-point numbers
## Training uses a mixed precision between half and float in GPU
## By default, prediction uses 64-bit floating-point numbers (since EXTENSION cl_khr_int64_base_atomics: YES).

If there are more than one GPU devices available in the same machine, we can switch to the other device, e.g., run hlaGPU_Init(2) to use the second GPU device.

When the computing equipment is ready, the model training is as simple as calling the function hlaAttrBagging_gpu() in the HIBAG.gpu pacakge:

model <- hlaAttrBagging_gpu(hla_a, train.geno, nclassifier=100)
## Accuracy with training data: 99.91%
## Out-of-bag accuracy: 98.73%

The GPU computing process takes 6.7 minutes. It is about 50 times faster than the single-threaded implementation, and 4.6 times faster than the 16-threaded implementation. Further, hlaAttrBagging_MultiGPU() can be used to leverage the GPU implementation when there are more than one GPU devices installed. The multiple GPU devices are not required to be homogeneous, and it could run with an unbalanced workload.

Compute clusters

The HIBAG ensemble model consists of independent individual classifiers, therefore the parallel implementation is straightforward, i.e., building individual classifiers independently on different nodes. The job-level parallelism on loosely coupled compute clusters is supported via communication over sockets or MPI (Message Passing Interface). Using the framework implemented in the R package “parallel”, a compute cluster object can be passed to the first argument of hlaParallelAttrBagging().

For example,

# when OpenMPI is installed in the cluster and run with mpirun
library(Rmpi)
cl <- snow::makeMPIcluster(50)

model <- hlaParallelAttrBagging(cl, hla_a, train.geno, nclassifier=100)

Execution time for model building

In the benchmark, we used the samples in the 1000 Genomes project (n=2702) to build platform-specific HIBAG models. The SNP markers were selected according to the Illumina Infinium Multi-Ethnic Global platform. The HLA classical alleles were harmonized from two data sources according to the P groups (two-field and protein level). The SNP genotypes in the extended MHC region were called from the Illumina NovaSeq 6000 sequencing with a targeted depth of 30X. Both data files are included in our data package “data_package.zip”.

As shown in Table 1, the single-threaded implementation took about 18 days on average to build a model for a classical HLA gene. It is important to take advantage of multithreading, multiple cores and/or GPUs to speed up the calculations. The speed of the multithreading implementation nearly increases in proportion to the number of threads. In the GPU settings, we tested 3 GPU devices which were released in the past 5 years. Nvidia A100 is the fastest device among the 3 GPUs, while Apple M1 Max is the slowest. The GPU implementation with Apple M1 Max is ~170 times faster than the single-threaded process, and Nvidia A100 is even faster (~250x for an A100 device). The calculation can be further accelerated using multiple GPU devices simultaneously, e.g., the performance of 4 Nvidia V100s is ~3.5 faster than that of one V100. Compared with the price of Nvidia A100, Apple’s M-series processors are much more affordable.

Table 1: Computation time for building multi-ethnic HIBAG models of 100 individual classifiers with the 1000 Genomes data \(^{1,2}\).

	HLA-A	HLA-B	HLA-C	HLA-DRB1	HLA-DQB1	Speed-up
# of samples	2701	2695	2699	2700	2698
# of SNPs	3293	2750	2858	3722	3837
# of unique alleles	80	153	52	69	25
Elapsed time (hour):
1 thread	329.5	588.3	288.3	509.0	428.7	1
16 threads	20.9	37.3	18.3	32.3	27.3	15.7
1x Nvidia V100	2.02	2.01	1.66	2.19	2.33	210.0
4x Nvidia V100	0.54	0.56	0.46	0.61	0.65	760.2
1x Nvidia A100	1.91	1.74	1.53	1.96	2.02	234.0
4x Nvidia A100	0.50	0.51	0.42	0.54	0.56	847.4
Apple M1 Max (32-core GPU)	2.09	3.15	1.93	2.76	2.53	172.1

\(^1\): the SNP markers were restricted to those available on the Illumina Infinium Multi-Ethnic Global platform.

\(^2\): CPU: Intel Xeon Gold 6248 CPU @2.50GHz; CPU memory usage: less than 1GB; per-GPU memory usage: less than 128MB.

Session information

sessionInfo()

## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.1
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Chicago
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ggplot2_3.4.4   SeqArray_1.41.4 gdsfmt_1.36.1   HIBAG_1.36.4    setwidth_1.0-4 
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.7              utf8_1.2.4              generics_0.1.3          bitops_1.0-7           
##  [5] digest_0.6.33           magrittr_2.0.3          evaluate_0.22           grid_4.3.2             
##  [9] fastmap_1.1.1           jsonlite_1.8.7          GenomeInfoDb_1.36.4     fansi_1.0.5            
## [13] scales_1.2.1            Biostrings_2.68.1       textshaping_0.3.7       jquerylib_0.1.4        
## [17] cli_3.6.1               rlang_1.1.1             crayon_1.5.2            XVector_0.40.0         
## [21] munsell_0.5.0           withr_2.5.2             cachem_1.0.8            yaml_2.3.7             
## [25] tools_4.3.2             parallel_4.3.2          dplyr_1.1.3             colorspace_2.1-0       
## [29] GenomeInfoDbData_1.2.10 BiocGenerics_0.46.0     vctrs_0.6.4             R6_2.5.1               
## [33] stats4_4.3.2            lifecycle_1.0.3         zlibbioc_1.46.0         S4Vectors_0.38.2       
## [37] IRanges_2.34.1          ragg_1.2.6              pkgconfig_2.0.3         RcppParallel_5.1.7     
## [41] bslib_0.5.1             pillar_1.9.0            gtable_0.3.4            glue_1.6.2             
## [45] systemfonts_1.0.5       highr_0.10              xfun_0.40               tibble_3.2.1           
## [49] GenomicRanges_1.52.1    tidyselect_1.2.0        rstudioapi_0.15.0       knitr_1.45             
## [53] farver_2.1.1            htmltools_0.5.6.1       rmarkdown_2.25          labeling_0.4.3         
## [57] compiler_4.3.2          RCurl_1.98-1.12