bHIVE

B-cell Hybrid Immune Variant Engine

Overview

bHIVE is an R package implementing a modular Artificial Immune System (AIS) framework for clustering and classification. Built on AI-Net (de Castro & Von Zuben 2001), bHIVE extends the classical algorithm with biologically-grounded modules drawn from modern immunology: somatic hypermutation, idiotypic network regulation, germinal center selection, and microenvironment-driven adaptation.

Performance-critical operations (affinity/distance matrices, clonal selection, network suppression, mutation) are implemented in C++ via RcppArmadillo, with parallelization support through BiocParallel.

Key Features

Two tasks – clustering and classification on numeric matrices
C++ backend – BLAS-optimized bulk affinity/distance computation
Two APIs – functional (bHIVE()) for quick use and R6 (AINet$new()) for full module composition
Multilayer architecture – honeycombHIVE() for hierarchical prototype refinement across layers
Hyperparameter tuning – swarmbHIVE() with grid search and BiocParallel
Gradient refinement – refineB() post-processing with 5 optimizers and several classification-aware loss functions
Composable immune modules – mix and match biological mechanisms via dependency injection
caret compatible – bHIVEmodel and honeycombHIVEmodel for cross-validation workflows

Installation

devtools::install_github("BorchLab/bHIVE")

Quick Start

Functional API

The functional API is the simplest way to use bHIVE and works like any R modeling function:

library(bHIVE)
data(iris)
X <- as.matrix(iris[, 1:4])

# Clustering
res <- bHIVE(X, task = "clustering", nAntibodies = 30, maxIter = 20)
table(res$assignments)

# Classification
res <- bHIVE(X, y = iris$Species, task = "classification",
             nAntibodies = 30, maxIter = 20)
table(Predicted = res$assignments, Actual = iris$Species)

R6 API with Modules

For full control, compose an AINet with any combination of immune modules. Each module is optional and slots into the fit loop at the appropriate biological stage:

# Adaptive mutation + idiotypic network regulation
model <- AINet$new(
  nAntibodies = 20,
  maxIter = 30,
  shm = SHMEngine$new(method = "adaptive", base_rate = 0.1),
  idiotypic = IdiotypicNetwork$new(theta_low = 0.01, theta_high = 0.5),
  verbose = FALSE
)
model$fit(X, iris$Species, task = "classification")
table(model$result$assignments)

# Predict on new data
preds <- model$predict(X[1:10, ])

A richer composition uses microenvironment-aware exploration, Tfh-mediated quality selection, density-driven isotype switching, and a persistent memory pool:

me  <- Microenvironment$new()                              # zone classification
cs  <- ClassSwitcher$new(alpha_IgM = 0.1, alpha_IgG = 5)   # zone -> kernel width
gc  <- GerminalCenter$new(nTfh = 8, selectionPressure = 0.6)
mp  <- MemoryPool$new(archive_threshold = 0.05)            # persists across fits

model <- AINet$new(
  nAntibodies = 30, maxIter = 20,
  shm              = SHMEngine$new(method = "hotspot"),
  microenvironment = me,
  classSwitcher    = cs,            # requires microenvironment
  germinalCenter   = gc,
  memory           = mp,
  verbose          = FALSE
)
model$fit(X, iris$Species, task = "classification")

# Memory persists -- a second fit will recall relevant cells (clustering only)
# and continue archiving high-affinity antibodies.
mp$size()

Architecture

Algorithm

bHIVE evolves a population of antibody vectors to represent structure in data. Each iteration runs a subset of the following stages; modules attach to specific stages and are skipped when absent:

Initialization (once) – sample from data, random generation, kmeans++, or V(D)J combinatorial assembly via VDJLibrary. If a MemoryPool carries cells from a prior fit, relevant memory is recalled and merged into the starting repertoire (clustering only).
Activation gating – ActivationGate sets aside antibodies in over-dense neighborhoods or below an affinity floor so clonal selection runs on the sparse subset; gated antibodies rejoin after.
Clonal selection + SHM – top-k antibodies cloned per data point. Mutation dispatches through SHMEngine (uniform, airs, hotspot, energy, adaptive); the adaptive strategy threads per-antibody Adam-style moment matrices across iterations.
Germinal center selection – GerminalCenter runs Tfh-mediated quality selection; survivors weighted by clustering compactness or classification purity.
Microenvironment & class switching – Microenvironment classifies each antibody into stable / explore / boundary zones and applies density-dependent jitter. ClassSwitcher binds each zone to an isotype (IgM/IgG/IgA) and sets the next iteration’s kernel width.
Idiotypic regulation – IdiotypicNetwork runs bell-shaped Ab-Ab dynamics that cull both over-clumped and isolated antibodies. Falls back to a top-population safety net if ill-tuned thresholds would kill the repertoire.
Network suppression – removes near-duplicate antibodies within an epsilon-ball under the chosen distance metric.
Orphan pruning + final assignment (once) – antibodies that bind no training point are dropped; remaining cells produce cluster IDs or class predictions. MemoryPool archives high-affinity survivors back into the pool.
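
As a rough illustration of how the core stages fit together, the following base-R sketch runs a stripped-down loop: sampling initialization, clonal selection of the single best antibody per data point, affinity-scaled mutation, greedy network suppression, and orphan pruning. It is not bHIVE's implementation (which runs in C++ and supports the modules listed here). The helper names affinity() and suppress(), the mutation rule, and the parameter values are assumptions chosen for readability.

```r
set.seed(1)
X <- as.matrix(iris[, 1:4])

# Gaussian affinity between every data point (rows) and antibody (columns).
affinity <- function(X, A, alpha = 1) {
  d2 <- outer(rowSums(X^2), rowSums(A^2), "+") - 2 * X %*% t(A)
  exp(-alpha * pmax(d2, 0))
}

# Greedy network suppression: keep an antibody only if it lies farther than
# epsilon (Euclidean) from every antibody already kept.
suppress <- function(A, epsilon) {
  keep <- 1L
  for (i in seq_len(nrow(A))[-1]) {
    diffs <- sweep(A[keep, , drop = FALSE], 2, A[i, ])
    if (all(sqrt(rowSums(diffs^2)) > epsilon)) keep <- c(keep, i)
  }
  A[keep, , drop = FALSE]
}

# Initialization (once): sample antibodies from the data.
A <- X[sample(nrow(X), 10), , drop = FALSE]

for (iter in 1:5) {
  aff  <- affinity(X, A)
  best <- apply(aff, 1, which.max)               # top-1 antibody per data point
  rate <- 1 - aff[cbind(seq_len(nrow(X)), best)] # low affinity -> larger step
  # Clone each winner and mutate it toward its data point
  # (rate is recycled so that rate[i] scales row i).
  clones <- A[best, , drop = FALSE]
  clones <- clones + rate * (X - clones)
  A <- suppress(rbind(A, clones), epsilon = 0.5)
}

# Orphan pruning + final assignment (once).
nearest     <- apply(affinity(X, A), 1, which.max)
used        <- sort(unique(nearest))
A           <- A[used, , drop = FALSE]
assignments <- match(nearest, used)

nrow(A)             # number of surviving antibodies
table(assignments)  # cluster sizes
```

Cloning only the top-1 antibody keeps the sketch short; the stage description above clones the top-k per data point.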

Immune Modules

Each module is an R6 class that can be injected into AINet via its constructor. All modules are optional – use only what you need.

Module	Biological Basis	What It Does
`SHMEngine`	Somatic hypermutation	5 mutation strategies: uniform, airs, hotspot, energy, adaptive
`IdiotypicNetwork`	Ab-Ab network regulation	Bell-shaped activation dynamics replacing epsilon-threshold suppression
`GerminalCenter`	Tfh-B cell interaction	Task-aware quality selection with resource competition
`Microenvironment`	Tissue microenvironment cues	Density-dependent zone classification and mutation rate modulation
`VDJLibrary`	V(D)J recombination	Combinatorial gene library initialization (PCA, cluster, random partition)
`ActivationGate`	Two-signal activation	Costimulatory filtering (density, danger signal, or label entropy)
`MemoryPool`	Immunological memory	Archive high-affinity antibodies and recall on distribution shift
`ClassSwitcher`	Isotype class switching	IgM (broad) / IgG (specific) / IgA (boundary) kernel width modulation
`ConvergentSelector`	Public clonotypes	Cross-repertoire consensus for ensemble methods
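
To make the "bell-shaped" idea behind IdiotypicNetwork concrete, here is a minimal base-R sketch of one plausible reading: an antibody survives only if its mean affinity to the other antibodies falls inside a window, so both over-clumped and isolated antibodies are culled. The function idiotypic_cull(), its min_keep safety net, and the use of mean Gaussian affinity as the stimulation signal are assumptions for illustration; only the names theta_low and theta_high come from the examples above.

```r
# Illustrative bell-shaped culling; the rule itself is an assumption,
# not the package's internal dynamics.
idiotypic_cull <- function(A, theta_low = 0.01, theta_high = 0.5,
                           alpha = 1, min_keep = 2) {
  S <- exp(-alpha * as.matrix(dist(A))^2)   # Ab-Ab Gaussian affinity
  diag(S) <- 0
  stim <- rowSums(S) / (nrow(A) - 1)        # mean stimulation from the others
  keep <- stim >= theta_low & stim <= theta_high
  if (sum(keep) < min_keep) {
    # Safety net: keep the min_keep antibodies closest to the window midpoint.
    mid  <- (theta_low + theta_high) / 2
    keep <- rank(abs(stim - mid), ties.method = "first") <= min_keep
  }
  A[keep, , drop = FALSE]
}

A <- rbind(
  c(0, 0), c(0.01, 0), c(0, 0.01), c(0.01, 0.01),  # over-clumped
  c(1.5, 0), c(0, 1.5),                            # moderately connected
  c(50, 50)                                        # isolated
)
idiotypic_cull(A)   # keeps only the two moderately connected rows
```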

Multilayer & Tuning

# honeycombHIVE: hierarchical prototype refinement
res <- honeycombHIVE(X, y = iris$Species, task = "classification",
                     layers = 3, nAntibodies = 30,
                     refine = TRUE, refineOptimizer = "adam")

# swarmbHIVE: hyperparameter grid search (parallelizable)
grid <- expand.grid(nAntibodies = c(15, 30), beta = c(3, 5), epsilon = c(0.01, 0.1))
best <- swarmbHIVE(X, y = iris$Species, task = "classification",
                   grid = grid, metric = "accuracy", maxIter = 20)
best$best_params
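
Conceptually, a grid search of this kind scores one model per row of the grid and keeps the best row. The serial base-R sketch below shows that loop with a stand-in model (per-class k-means prototypes with nearest-prototype prediction) so it runs without bHIVE. The function score_fit() and its parameters k and nstart are hypothetical and unrelated to swarmbHIVE()'s actual arguments, and the score is training accuracy only.

```r
set.seed(42)
X <- as.matrix(iris[, 1:4])
y <- iris$Species

# Hypothetical stand-in for a model fit: k prototypes per class via k-means,
# then nearest-prototype prediction; returns training accuracy.
score_fit <- function(k, nstart) {
  protos <- do.call(rbind, lapply(levels(y), function(cl) {
    kmeans(X[y == cl, , drop = FALSE], centers = k, nstart = nstart)$centers
  }))
  labels <- rep(levels(y), each = k)
  d2 <- outer(rowSums(X^2), rowSums(protos^2), "+") - 2 * X %*% t(protos)
  mean(labels[apply(d2, 1, which.min)] == y)
}

# One row per parameter combination; score each row and keep the best.
grid <- expand.grid(k = c(1, 3), nstart = c(1, 5))
grid$accuracy <- mapply(score_fit, grid$k, grid$nstart)
best <- grid[which.max(grid$accuracy), ]
best
```

Because each row is scored independently, the mapply() call is the step a parallel backend would distribute.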

Fine-tune antibody positions after training with refineB():

res <- bHIVE(X, y = iris$Species, task = "classification",
             nAntibodies = 20, maxIter = 20)

# Adam-based refinement with cross-entropy loss
A_refined <- refineB(res$antibodies, X, y = iris$Species,
                     assignments = res$assignments,
                     task = "classification",
                     loss = "categorical_crossentropy",
                     optimizer = "adam", steps = 10, lr = 0.01)
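
For intuition about what gradient refinement does to prototype positions, the sketch below applies plain gradient descent (swapped in for Adam) to a mean-squared-error loss (swapped in for cross-entropy) with fixed assignments. It is a simplified stand-in, not refineB() itself; refine_sketch(), mse_loss(), and the integer prototype index idx are names assumed for this example.

```r
set.seed(7)
X <- as.matrix(iris[, 1:4])

# Mean squared distance from each point to its assigned prototype.
mse_loss <- function(A, X, idx) mean(rowSums((X - A[idx, , drop = FALSE])^2))

refine_sketch <- function(A, X, idx, steps = 10, lr = 0.1) {
  for (s in seq_len(steps)) {
    for (j in unique(idx)) {
      members <- X[idx == j, , drop = FALSE]
      # Gradient of mean ||x_i - a_j||^2 over members is 2 * (a_j - mean(x_i)).
      grad <- 2 * (A[j, ] - colMeans(members))
      A[j, ] <- A[j, ] - lr * grad
    }
  }
  A
}

A0  <- X[sample(nrow(X), 3), , drop = FALSE]   # three starting prototypes
d2  <- outer(rowSums(X^2), rowSums(A0^2), "+") - 2 * X %*% t(A0)
idx <- apply(d2, 1, which.min)                 # fixed prototype index per point
A1  <- refine_sketch(A0, X, idx)

c(before = mse_loss(A0, X, idx), after = mse_loss(A1, X, idx))
```

With assignments held fixed, each step shrinks a prototype's offset from its cluster mean by a constant factor, so the loss decreases monotonically.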

Affinity & Distance Functions

Affinity	Formula	Use Case
`gaussian`	exp(-alpha ||x - a||^2)	General purpose (default)
`laplace`	exp(-alpha ||x - a||)	Heavier tails than Gaussian
`polynomial`	(x . a + c)^p	Non-Euclidean similarity
`cosine`	(x . a) / (||x|| ||a||)	Direction-based similarity
`hamming`	1 - (mismatches / d)	Categorical/binary features
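
The affinity formulas can be written directly in base R for a single pair of vectors, which is handy for sanity-checking values by hand. This is a sketch of the formulas only, not the package's C++ kernels; the function names and the parameter names alpha, c0, and p are assumptions, and the Laplace norm is taken to be Euclidean.

```r
# x and a are numeric vectors of equal length.
aff_gaussian   <- function(x, a, alpha = 1) exp(-alpha * sum((x - a)^2))
aff_laplace    <- function(x, a, alpha = 1) exp(-alpha * sqrt(sum((x - a)^2)))
aff_polynomial <- function(x, a, c0 = 1, p = 2) (sum(x * a) + c0)^p
aff_cosine     <- function(x, a) sum(x * a) / (sqrt(sum(x^2)) * sqrt(sum(a^2)))
aff_hamming    <- function(x, a) 1 - mean(x != a)

x <- c(1, 0, 1, 1)
a <- c(1, 1, 1, 0)

aff_gaussian(x, a)    # exp(-2): squared distance is 2
aff_laplace(x, a)     # exp(-sqrt(2))
aff_polynomial(x, a)  # (2 + 1)^2 = 9
aff_cosine(x, a)      # 2 / 3
aff_hamming(x, a)     # 0.5: two of four positions differ
```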

Distance	Notes
`euclidean`	Default; L2 norm
`manhattan`	L1 norm
`minkowski`	Generalized Lp (parameter p)
`cosine`	1 - cosine similarity
`mahalanobis`	Accounts for feature covariance (requires Sigma)
`hamming`	Count of differing features
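
The distance metrics can be sketched the same way in base R. The function names are assumptions for illustration. Note that stats::mahalanobis() returns the squared distance, so the sketch takes its square root; whether bHIVE reports the squared or unsquared form is not stated here.

```r
dist_euclidean <- function(x, a) sqrt(sum((x - a)^2))
dist_manhattan <- function(x, a) sum(abs(x - a))
dist_minkowski <- function(x, a, p = 3) sum(abs(x - a)^p)^(1 / p)
dist_cosine    <- function(x, a) 1 - sum(x * a) / sqrt(sum(x^2) * sum(a^2))
dist_hamming   <- function(x, a) sum(x != a)
# Sigma is the feature covariance matrix.
dist_mahalanobis <- function(x, a, Sigma) {
  sqrt(stats::mahalanobis(x, center = a, cov = Sigma))
}

x <- c(3, 4)
a <- c(0, 0)

dist_euclidean(x, a)             # 5
dist_manhattan(x, a)             # 7
dist_minkowski(x, a, p = 2)      # 5, same as Euclidean
dist_mahalanobis(x, a, diag(2))  # 5, identity covariance reduces to Euclidean
dist_cosine(c(1, 0), c(0, 1))    # 1, orthogonal vectors
dist_hamming(c(1, 0, 1), c(1, 1, 0))  # 2
```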

Bug Reports / Feature Requests

If you run into a bug or other problem, please submit a GitHub issue describing it and, if possible, include a reproducible example. Requests for new features or enhancements can also be submitted as GitHub issues.

Contributing

We welcome contributions to the bHIVE project! To contribute:

Fork the repository.
Create a feature branch (git checkout -b feature-branch).
Commit your changes (git commit -m "Add new feature").
Push to the branch (git push origin feature-branch).
Open a pull request.