ByteDance ML: Binary Logistic Regression (NumPy)
Question Description
You are asked to implement a binary logistic regression classifier from scratch using NumPy only. The model should accept X (n, d) and binary labels y (n,) and expose learned parameters as public attributes (for example, coef_ and intercept_). The task covers building a class-based API with a training loop that performs a forward pass, computes binary cross-entropy (BCE) loss, backpropagates to compute gradients, and updates parameters via gradient descent across epochs.
The interview flow typically starts with the model signature and data shapes, then moves to implementing fit() with the BCE objective, deriving and coding gradients for weights and bias, and finally writing predict_proba() and predict() that use the learned coef_ and intercept_. You should demonstrate correct vectorized NumPy operations (no external ML libraries), numerical stability in the sigmoid/log loss (e.g., clipping or log-sum-exp style care), and clear public attributes (coef_ shape (d,), intercept_ scalar).
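The flow above can be sketched end to end. This is a minimal illustrative implementation under assumed hyperparameter names (`lr`, `epochs`) and a made-up class name `LogisticRegressionGD`; it uses full-batch gradient descent and clips the sigmoid input for numerical safety:

```python
import numpy as np

class LogisticRegressionGD:
    """Binary logistic regression trained with full-batch gradient descent."""

    def __init__(self, lr=0.1, epochs=1000):
        self.lr = lr
        self.epochs = epochs
        self.coef_ = None       # learned weights, shape (d,)
        self.intercept_ = 0.0   # learned bias, scalar

    @staticmethod
    def _sigmoid(z):
        # Clip z so exp never overflows for large-magnitude logits.
        return 1.0 / (1.0 + np.exp(-np.clip(z, -500, 500)))

    def fit(self, X, y):
        n, d = X.shape
        self.coef_ = np.zeros(d)
        self.intercept_ = 0.0
        for _ in range(self.epochs):
            # Forward pass: p = sigmoid(Xw + b).
            p = self._sigmoid(X @ self.coef_ + self.intercept_)
            # Gradient of mean BCE w.r.t. the logits is (p - y) / n.
            err = (p - y) / n
            grad_w = X.T @ err          # shape (d,)
            grad_b = err.sum()          # scalar
            self.coef_ -= self.lr * grad_w
            self.intercept_ -= self.lr * grad_b
        return self

    def predict_proba(self, X):
        return self._sigmoid(X @ self.coef_ + self.intercept_)

    def predict(self, X, threshold=0.5):
        return (self.predict_proba(X) >= threshold).astype(int)
```

The key identity that makes the gradient code short is that for BCE with a sigmoid output, the derivative of the loss with respect to the pre-activation logit simplifies to `p - y`, so no explicit sigmoid derivative is needed.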
Skill signals interviewers look for: solid understanding of logistic regression and binary cross-entropy, ability to derive and implement gradients, NumPy vectorization and numerical stability, training-loop correctness (learning rate, epochs, convergence checks), and clear API design (fit, predict_proba, predict). Be ready to discuss extensions: regularization, mini-batch or stochastic gradient descent, handling class imbalance, and multiclass generalization via softmax.
Common Follow-up Questions
- How would you add L2 (ridge) regularization to the loss and update the gradient calculations? Show the modified loss and gradient expressions.
- Implement mini-batch or stochastic gradient descent versions of fit. How does batch size affect convergence and runtime?
- How would you make the implementation numerically stable (avoid log(0) or overflow in sigmoid)? What specific code changes would you make?
- How would you extend this implementation to multiclass classification (softmax) and train with cross-entropy?
- If labels are imbalanced, how would you modify the loss or training to improve minority-class performance (class weights, focal loss, threshold tuning)?
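For the L2 regularization follow-up: adding a penalty of λ/(2n)·‖w‖² to the mean BCE adds (λ/n)·w to the weight gradient, while the bias is conventionally left unregularized. A hedged sketch (the function name and `lam` parameter are illustrative):

```python
import numpy as np

def bce_l2_gradients(X, y, w, b, lam):
    """Gradients of mean BCE plus (lam / (2n)) * ||w||^2; bias unregularized."""
    n = X.shape[0]
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -500, 500)))
    err = (p - y) / n
    grad_w = X.T @ err + (lam / n) * w  # L2 penalty contributes only to weights
    grad_b = err.sum()
    return grad_w, grad_b
```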
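For the mini-batch follow-up: the same gradient expressions apply, computed over a shuffled slice of the data per step. Smaller batches give noisier but cheaper updates; larger batches approach full-batch behavior. A sketch under assumed parameter names (`batch_size`, `seed`):

```python
import numpy as np

def fit_minibatch(X, y, lr=0.1, epochs=100, batch_size=32, seed=0):
    """Mini-batch gradient descent for logistic regression; reshuffles each epoch."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        idx = rng.permutation(n)  # new random order every epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            Xb, yb = X[batch], y[batch]
            p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w + b, -500, 500)))
            err = (p - yb) / len(batch)  # mean-BCE gradient over the batch
            w -= lr * (Xb.T @ err)
            b -= lr * err.sum()
    return w, b
```

Setting `batch_size=1` turns this into plain stochastic gradient descent; `batch_size=n` recovers full-batch training.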
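For the numerical-stability follow-up, two standard fixes are: evaluate the sigmoid piecewise so `exp` is only ever called on non-positive arguments, and clip predicted probabilities away from 0 and 1 before taking logs. A sketch:

```python
import numpy as np

def stable_sigmoid(z):
    """Overflow-safe sigmoid: exp is only applied to non-positive values."""
    out = np.empty_like(z, dtype=float)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

def bce_loss(y, p, eps=1e-12):
    """Mean BCE with probabilities clipped to [eps, 1 - eps] to avoid log(0)."""
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
```

A naive `1 / (1 + np.exp(-z))` overflows for large negative `z`; the piecewise form avoids that without changing the result.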
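For the multiclass follow-up: replacing the sigmoid with a row-wise softmax and BCE with categorical cross-entropy keeps the same gradient structure, since the gradient with respect to the logits is again `probabilities - targets`. A sketch with assumed shapes (`W` of shape `(d, k)`, integer labels in `y`):

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with max-subtraction for numerical stability."""
    Z = Z - Z.max(axis=1, keepdims=True)
    e = np.exp(Z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_regression_step(X, y, W, b, lr=0.1):
    """One full-batch gradient step for multiclass cross-entropy.

    X: (n, d); y: integer labels (n,); W: (d, k); b: (k,).
    """
    n, k = X.shape[0], W.shape[1]
    P = softmax(X @ W + b)       # (n, k) class probabilities
    Y = np.eye(k)[y]             # one-hot targets, (n, k)
    G = (P - Y) / n              # gradient w.r.t. the logits
    W -= lr * (X.T @ G)
    b -= lr * G.sum(axis=0)
    return W, b
```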
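For the class-imbalance follow-up, the simplest change is per-sample weights in the BCE: up-weighting positives by some factor scales their gradient contribution by the same factor. A sketch (the `pos_weight` name mirrors common convention but is an assumption here):

```python
import numpy as np

def weighted_bce_gradients(X, y, w, b, pos_weight=1.0):
    """Gradients of mean BCE where positive examples are scaled by pos_weight."""
    n = X.shape[0]
    p = 1.0 / (1.0 + np.exp(-np.clip(X @ w + b, -500, 500)))
    # Per-sample weight: pos_weight for y=1, 1.0 for y=0.
    sw = np.where(y == 1, pos_weight, 1.0)
    err = sw * (p - y) / n
    return X.T @ err, err.sum()
```

A common starting point is `pos_weight = n_negative / n_positive`, which balances the total gradient mass of the two classes; threshold tuning on `predict_proba` is a complementary, training-free alternative.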