Performs hyperparameter tuning for the bHIVE algorithm over a grid of hyperparameter values or an externally provided data frame of parameter combinations. Each combination is evaluated with a task-specific metric.
Usage
swarmbHIVE(
  X,
  y = NULL,
  task = c("clustering", "classification", "regression"),
  grid,
  metric = NULL,
  maxIter = 50,
  BPPARAM = SerialParam(),
  verbose = TRUE
)
Arguments
- X
A numeric matrix or data frame of input features (rows = observations, columns = features).
- y
Optional. A target vector: factor for classification, numeric for regression. If NULL, clustering is performed.
- task
Character. One of "clustering", "classification", or "regression".
- grid
A data frame specifying the hyperparameter combinations. Should have columns nAntibodies, beta, and epsilon. (Optionally more if you want to pass other arguments to bHIVE().)
- metric
Character. Name of the evaluation metric. Options:
Classification: "accuracy", "balanced_accuracy", "f1", "kappa"
Regression: "rmse", "mae", "r2"
Clustering: "silhouette", "davies_bouldin", "calinski_harabasz"
- maxIter
Integer. Maximum iterations for each bHIVE run (default 50).
- BPPARAM
A BiocParallel parameter object, such as one returned by BiocParallel::bpparam(), used for parallelization. The function supports SerialParam, MulticoreParam, BatchtoolsParam, and SnowParam.
- verbose
Logical. If TRUE, prints progress messages.
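Because grid may be any data frame of parameter combinations, it does not have to come from expand.grid(). The sketch below builds a grid by hand and passes a parallel backend. The grid values and the worker count are arbitrary illustrations, and the guarded call assumes the function is exported from a package named bHIVE and that BiocParallel is installed.

```r
# A hand-written grid: one row per combination, no Cartesian product.
# The values are illustrative only.
grid <- data.frame(
  nAntibodies = c(10, 20, 40),
  beta        = c(3, 5, 5),
  epsilon     = c(0.01, 0.01, 0.05)
)

# The three columns the grid argument is documented to need.
required <- c("nAntibodies", "beta", "epsilon")
stopifnot(all(required %in% names(grid)))

# Run only when both packages are available (an assumption about your setup).
if (requireNamespace("bHIVE", quietly = TRUE) &&
    requireNamespace("BiocParallel", quietly = TRUE)) {
  X <- as.matrix(iris[, 1:4])
  res <- bHIVE::swarmbHIVE(
    X = X, y = iris$Species, task = "classification",
    grid = grid, metric = "accuracy", maxIter = 10,
    BPPARAM = BiocParallel::SnowParam(workers = 2),  # worker count is arbitrary
    verbose = FALSE
  )
  print(res$best_params)
}
```

A hand-built grid like this is useful when only a few specific combinations are worth evaluating, since the number of bHIVE runs equals the number of rows.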
Value
A list:
- best_params: A list (row) of the best hyperparameters.
- results: A data frame with the full grid search results, including the metric_value for each combination.
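Since results holds one metric_value per combination, it can be ranked to compare the runner-up settings as well as the best one. The following is a minimal sketch on a mock table: the metric values are made-up placeholders rather than swarmbHIVE() output, and rank_results() is a hypothetical helper, not part of the package. For error-type metrics such as "rmse", "mae", or "davies_bouldin", lower values are better, so the sort direction is flipped.

```r
# Mock results table with the documented columns. The metric values are
# placeholders for illustration, not output from swarmbHIVE().
results <- data.frame(
  nAntibodies  = c(10, 20, 10, 20),
  beta         = c(3, 3, 5, 5),
  epsilon      = c(0.01, 0.01, 0.05, 0.05),
  metric_value = c(0.42, 0.35, 0.51, 0.38)
)

# Hypothetical helper: order combinations from best to worst.
# Set lower_is_better = TRUE for error-type metrics.
rank_results <- function(results, lower_is_better = FALSE) {
  ord <- order(results$metric_value, decreasing = !lower_is_better)
  results[ord, , drop = FALSE]
}

ranked <- rank_results(results, lower_is_better = TRUE)  # e.g. for "rmse"
head(ranked, 3)
```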
Details
- Classification: "accuracy", "balanced_accuracy", "f1", "kappa"
- Regression: "rmse", "mae", "r2"
- Clustering: "silhouette", "davies_bouldin", or "calinski_harabasz"
Note: Some metrics require additional packages or rest on specific assumptions; for example, "f1" for multi-class classification is calculated as a macro-average.
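To make the macro-average concrete, here is a minimal base R sketch: F1 is computed separately for each class and the per-class scores are averaged with equal weight. The function name macro_f1() is hypothetical, and treating a class with no true or predicted members as F1 = 0 is an assumption of this sketch; the package's internal computation may differ.

```r
# Hypothetical helper: macro-averaged F1 over the classes present in `truth`.
macro_f1 <- function(truth, pred) {
  truth <- as.character(truth)
  pred <- as.character(pred)
  classes <- sort(unique(truth))
  per_class <- vapply(classes, function(cl) {
    tp <- sum(pred == cl & truth == cl)  # true positives
    fp <- sum(pred == cl & truth != cl)  # false positives
    fn <- sum(pred != cl & truth == cl)  # false negatives
    denom <- 2 * tp + fp + fn
    # Assumption: an empty denominator counts as F1 = 0.
    if (denom == 0) 0 else 2 * tp / denom
  }, numeric(1))
  mean(per_class)  # every class contributes equally
}

truth <- c("a", "a", "b", "b", "c", "c")
pred  <- c("a", "b", "b", "b", "c", "a")
macro_f1(truth, pred)
```

Because every class has the same weight, a poorly predicted rare class lowers the score as much as a poorly predicted common one.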
Examples
data(iris)
X <- as.matrix(iris[, 1:4])
y <- iris$Species # classification
# Define hyperparameter grid
grid <- expand.grid(
  nAntibodies = c(10, 20),
  beta = c(3, 5),
  epsilon = c(0.01, 0.05)
)
# Perform hyperparameter tuning for classification
tuning_results <- swarmbHIVE(X = X,
                             y = y,
                             task = "classification",
                             grid = grid,
                             metric = "balanced_accuracy",
                             maxIter = 10)
#> Starting swarmbHIVE with 8 parameter combinations (task=classification, metric=balanced_accuracy).
#> Evaluating combo 1/8: nAntibodies=10, beta=3, epsilon=0.010
#> Evaluating combo 2/8: nAntibodies=20, beta=3, epsilon=0.010
#> Evaluating combo 3/8: nAntibodies=10, beta=5, epsilon=0.010
#> Evaluating combo 4/8: nAntibodies=20, beta=5, epsilon=0.010
#> Evaluating combo 5/8: nAntibodies=10, beta=3, epsilon=0.050
#> Evaluating combo 6/8: nAntibodies=20, beta=3, epsilon=0.050
#> Evaluating combo 7/8: nAntibodies=10, beta=5, epsilon=0.050
#> Evaluating combo 8/8: nAntibodies=20, beta=5, epsilon=0.050
#> Best parameters found:
#>   nAntibodies beta epsilon metric_value
#> 8          20    5    0.05    0.8733333
# For clustering with silhouette
set.seed(42)
X_clust <- matrix(rnorm(100 * 5), ncol = 5)
grid_clust <- expand.grid(nAntibodies = c(5, 10),
                          beta = c(3, 5),
                          epsilon = c(0.01, 0.05))
res_clust <- swarmbHIVE(X_clust,
                        task = "clustering",
                        grid = grid_clust,
                        metric = "silhouette")
#> Precomputing distance matrix for clustering metrics.
#> Starting swarmbHIVE with 8 parameter combinations (task=clustering, metric=silhouette).
#> Evaluating combo 1/8: nAntibodies=5, beta=3, epsilon=0.010
#> Evaluating combo 2/8: nAntibodies=10, beta=3, epsilon=0.010
#> Evaluating combo 3/8: nAntibodies=5, beta=5, epsilon=0.010
#> Evaluating combo 4/8: nAntibodies=10, beta=5, epsilon=0.010
#> Evaluating combo 5/8: nAntibodies=5, beta=3, epsilon=0.050
#> Evaluating combo 6/8: nAntibodies=10, beta=3, epsilon=0.050
#> Evaluating combo 7/8: nAntibodies=5, beta=5, epsilon=0.050
#> Evaluating combo 8/8: nAntibodies=10, beta=5, epsilon=0.050
#> Best parameters found:
#>   nAntibodies beta epsilon metric_value
#> 3           5    5    0.01    0.1047978
res_clust$best_params
#>   nAntibodies beta epsilon metric_value
#> 3           5    5    0.01    0.1047978
