How to build a soft-sizing entry gate from a logistic regression in 200 lines
Annotated walkthrough of vectra-ml's entry classifier. 10 features, 6.4k training samples, sklearn-style fit/predict, exported to JSON for the live engine. No frameworks.
Vectra's entry classifier is a logistic regression. 10 features, 6.4k training samples, ~200 lines of Rust including the export path the live engine consumes. No frameworks, no Python, no scikit-learn dependency at runtime.
This post walks through every step of building one. By the end you can train, evaluate, and ship a comparable model end-to-end in a Rust workspace.
The 10 features
The features describe the bar at which the rule chain produced an Open(side) decision. Each is normalised to roughly mean-0 std-1 across the training corpus.
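The post does not show the normalisation step, so here is a minimal sketch of one way to get roughly mean-0 std-1 features. The `Scaler` type and its methods are hypothetical names, not part of vectra-ml. The statistics are fitted on training rows only and then reused unchanged at inference.

```rust
/// Per-feature mean and standard deviation, fitted on the training corpus only.
/// Hypothetical helper for illustration; not taken from vectra-ml.
pub struct Scaler {
    pub mean: Vec<f64>,
    pub std: Vec<f64>,
}

impl Scaler {
    /// Fit on row-major feature vectors; every row must have the same length.
    pub fn fit(rows: &[Vec<f64>]) -> Scaler {
        assert!(!rows.is_empty(), "need at least one row");
        let n = rows.len() as f64;
        let d = rows[0].len();
        let mut mean = vec![0.0; d];
        for r in rows {
            for j in 0..d {
                mean[j] += r[j] / n;
            }
        }
        let mut var = vec![0.0; d];
        for r in rows {
            for j in 0..d {
                var[j] += (r[j] - mean[j]).powi(2) / n;
            }
        }
        // Guard constant columns so the division in `transform` never hits zero.
        let std: Vec<f64> = var.iter().map(|v| v.sqrt().max(1e-12)).collect();
        Scaler { mean, std }
    }

    /// Apply the stored statistics to one feature row.
    pub fn transform(&self, row: &[f64]) -> Vec<f64> {
        row.iter()
            .zip(self.mean.iter().zip(&self.std))
            .map(|(x, (m, s))| (x - m) / s)
            .collect()
    }
}
```

The live engine would need the same `mean` and `std` values the model was trained with, so in practice they would have to travel with the exported weights.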
pub struct VolMomEntryFeatures {
    pub signal_zscore: f64,       // effective signal at entry
    pub consecutive_signal: f64,  // bars in same-sign streak
    pub vol_scalar: f64,          // vol-target sizing multiplier
    pub breadth_count: f64,       // bullish symbols this bar
    pub regime_bull: f64,         // 0/1 from heuristic classifier
    pub regime_bear: f64,
    pub above_sma: f64,           // 0/1: price > 100-bar SMA
    pub funding_rate: f64,        // current 8h funding rate
    pub bar_in_session_pct: f64,  // 0..1 progress through UTC day
    pub recent_winrate_5: f64,    // last 5 trades on this symbol
}
Training
Standard binary cross-entropy with L2 regularisation. The label is whether the trade closed positive (1) or negative (0) — measured against the rule chain's actual execution, not a simulated optimum.
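The training loop below calls `sigmoid` and `dot` without showing them. The versions here are assumptions about what those helpers look like, not the actual vectra-ml code: a sigmoid that avoids overflow for large inputs, and a dot product over slices.

```rust
/// Numerically stable logistic function: it never exponentiates a large
/// positive number, so extreme logits return 0.0 or 1.0 instead of NaN.
pub fn sigmoid(z: f64) -> f64 {
    if z >= 0.0 {
        1.0 / (1.0 + (-z).exp())
    } else {
        let e = z.exp();
        e / (1.0 + e)
    }
}

/// Inner product of two slices. `zip` stops at the shorter slice, so passing
/// a weight vector with a trailing bias entry simply ignores the bias.
pub fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}
```

With this `dot`, the bias has to be added separately, which is what the `+ w[FEATURES]` term in the loop does.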
pub fn train_lr(
    samples: &[Sample],
    learning_rate: f64,
    epochs: usize,
    l2: f64,
) -> Weights {
    let mut w = vec![0.0; FEATURES + 1]; // +1 for bias, stored last
    for _ in 0..epochs {
        // Full-batch gradient of the cross-entropy loss.
        let mut grad = vec![0.0; w.len()];
        for s in samples {
            // Slice off the bias so both operands of `dot` have FEATURES entries.
            let z = dot(&w[..FEATURES], &s.features) + w[FEATURES];
            let p = sigmoid(z);
            let err = p - s.label as f64; // d(loss)/dz for one sample
            for i in 0..FEATURES {
                grad[i] += err * s.features[i];
            }
            grad[FEATURES] += err;
        }
        for i in 0..w.len() {
            grad[i] /= samples.len() as f64; // mean over the batch
            grad[i] += l2 * w[i]; // L2 penalty, applied to every weight including the bias
            w[i] -= learning_rate * grad[i];
        }
    }
    Weights { w }
}
Hyperparameters that mattered
- learning_rate = 0.05, epochs = 400, l2 = 0.01. Found via 5-fold CV on the first 80% of the training data; validated on the last 20% before going to the OOS test.
- Don't standardise the bias. Easy bug: feed a feature vector that includes the bias column through the same scaler as the data and you'll inadvertently shift the intercept. We keep the bias as a separate weight, unscaled.
- Class weighting. Our 6.4k sample set is 53/47 win/loss. Adding inverse-frequency class weights moved OOS AUC from 0.523 to 0.529. Small but real.
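The exact weighting formula is not given in the post. One common inverse-frequency scheme, the one scikit-learn calls "balanced", sets each class weight to n / (2 · n_class). A sketch under that assumption, with `class_weights` as a hypothetical helper name:

```rust
/// Inverse-frequency class weights for binary labels.
/// Returns (weight_for_label_0, weight_for_label_1) = n / (2 * n_class).
/// Assumed formula for illustration; the post does not state which one was used.
pub fn class_weights(labels: &[u8]) -> (f64, f64) {
    let n = labels.len() as f64;
    let n1 = labels.iter().filter(|&&y| y == 1).count() as f64;
    let n0 = n - n1;
    assert!(n0 > 0.0 && n1 > 0.0, "both classes must be present");
    (n / (2.0 * n0), n / (2.0 * n1))
}
```

For a 53/47 win/loss split this gives roughly 0.943 for wins and 1.064 for losses. In a loop like `train_lr`, the per-sample `err` would be multiplied by the weight for that sample's label. The weights sum to n across the dataset, so the effective learning rate stays on the same scale.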
Evaluation
We evaluate via 5-fold time-series CV (no shuffling: leakage in time-ordered data is brutal). The headline metric is OOS Brier score; AUC and accuracy are secondary because we don't make a binary decision in production (see the soft-sizing post).
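The post does not spell out how the folds are cut. A minimal sketch of one common unshuffled scheme is an expanding window, where each fold trains on everything before its validation block. `time_series_folds` is a hypothetical helper, and samples are assumed to be sorted oldest-first.

```rust
use std::ops::Range;

/// Expanding-window splits over `n` samples sorted oldest-first.
/// Returns `k` pairs of (train, validation) index ranges; every validation
/// range starts exactly where its training range ends, so no future data
/// leaks into training.
pub fn time_series_folds(n: usize, k: usize) -> Vec<(Range<usize>, Range<usize>)> {
    assert!(k >= 1 && n >= k + 1, "need at least k + 1 samples");
    let block = n / (k + 1);
    (1..=k)
        .map(|i| {
            let train_end = i * block;
            // The last fold absorbs the remainder so no sample is dropped.
            let val_end = if i == k { n } else { train_end + block };
            (0..train_end, train_end..val_end)
        })
        .collect()
}
```

With n = 12 and k = 5, the first fold trains on indices 0..2 and validates on 2..4; the last trains on 0..10 and validates on 10..12.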
pub fn brier(preds: &[f64], labels: &[u8]) -> f64 {
    let n = preds.len() as f64;
    preds.iter().zip(labels)
        .map(|(p, &y)| (p - y as f64).powi(2))
        .sum::<f64>() / n
}
Our v6 model lands at Brier 0.247 on OOS, versus 0.250 for a constant 0.5 prediction. Small, defensible, ships.
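To see where the 0.250 baseline comes from: a constant prediction of 0.5 contributes (0.5 − y)² = 0.25 for every sample, whatever the label. The sketch below checks that and expresses the gap as a Brier skill score. `constant_baseline` and `skill` are hypothetical helper names, and `brier` is repeated only so the block compiles on its own.

```rust
/// Mean squared error between predicted probabilities and 0/1 labels.
fn brier(preds: &[f64], labels: &[u8]) -> f64 {
    let n = preds.len() as f64;
    preds.iter().zip(labels)
        .map(|(p, &y)| (p - y as f64).powi(2))
        .sum::<f64>() / n
}

/// Brier score of predicting the same probability `c` for every sample.
pub fn constant_baseline(c: f64, labels: &[u8]) -> f64 {
    let preds = vec![c; labels.len()];
    brier(&preds, labels)
}

/// Brier skill score: the fraction of baseline error the model removes.
/// 0 means no better than the baseline; 1 would be a perfect forecast.
pub fn skill(model: f64, baseline: f64) -> f64 {
    1.0 - model / baseline
}
```

Plugging in the reported numbers, `skill(0.247, 0.250)` is about 0.012. If the OOS set had the same 53/47 split as the training data, a constant prediction at the base rate would score 0.53 × 0.47 ≈ 0.2491, which is a slightly stiffer baseline than 0.250.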
Export
The live engine reads model weights from a JSON file at startup (VECTRA_VOL_MOM_ENTRY_MODEL=...path...). Export is a single serde write:
use serde::{Deserialize, Serialize};
use std::path::Path;
// `Result` is assumed to be an alias such as anyhow::Result, so that `?`
// can convert both serde_json and std::io errors.

#[derive(Serialize, Deserialize)]
pub struct ExportedModel {
    pub weights: Vec<f64>,
    pub feature_names: Vec<String>,
    pub trained_at: String,
    pub training_n: usize,
    pub oos_brier: f64,
}

impl ExportedModel {
    pub fn save(&self, path: &Path) -> Result<()> {
        let json = serde_json::to_string_pretty(self)?;
        std::fs::write(path, json)?;
        Ok(())
    }
}
The runtime side uses the same struct with Deserialize. No version negotiation; if the JSON schema changes, old models fail loudly at startup with a serde error. We prefer that to silent miscalibration.
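Deserialisation catches missing fields and wrong types, but a `Vec<f64>` of any length parses cleanly, and so does a reordered `feature_names` list. A startup check along the following lines could close that gap. It is a sketch, not the engine's actual code: `LoadedModel` is a std-only stand-in for the deserialised struct, and `validate` is a hypothetical name.

```rust
/// Stand-in for the fields of the exported model that this check needs.
/// Hypothetical type; kept free of serde so the sketch is std-only.
pub struct LoadedModel {
    pub weights: Vec<f64>,
    pub feature_names: Vec<String>,
}

/// Returns Err with a readable reason if the model does not match the
/// feature list the engine was compiled with.
pub fn validate(m: &LoadedModel, expected: &[&str]) -> Result<(), String> {
    // One weight per feature plus the trailing bias.
    if m.weights.len() != expected.len() + 1 {
        return Err(format!(
            "expected {} weights (features + bias), got {}",
            expected.len() + 1,
            m.weights.len()
        ));
    }
    // Names must match in both content and order.
    let names_match = m.feature_names.len() == expected.len()
        && m.feature_names.iter().zip(expected).all(|(a, b)| a.as_str() == *b);
    if !names_match {
        return Err("feature names or order differ from the engine's".to_string());
    }
    if m.weights.iter().any(|w| !w.is_finite()) {
        return Err("non-finite weight".to_string());
    }
    Ok(())
}
```

Calling this right after loading keeps the same fail-loudly behaviour for mismatches that serde alone would let through.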
Why not a deep model
We've been asked. The honest answer is: 6.4k samples, 10 features, and we're already nowhere near the Bayes-optimal predictor on this feature set. A deep model wouldn't help — feature quality would. And our deep-model regression test (a 32-unit MLP, same features, same training setup) lifted Brier by 0.002. Within noise. Not worth the runtime weight.
When we have a fundamentally richer feature surface — order-book microstructure, cross-asset signals, on-chain — we'll revisit the architecture decision. Until then, logistic regression is honest and fast and we know how to debug it.
Published by Floris V. · Vectra operator
April 8, 2026