JAX experiments on tiny character-level language models with clean baselines.

Goal

Study how capacity and training choices affect small character-level language models, using a consistent pipeline and reproducible baselines.

My Contributions

  • End-to-end script - Built a single runner to execute all experiments and log results
  • Data pipeline - Implemented fixed-length context/target sampling from a character corpus
  • Report - Produced plots, timing tables, and sample generations in the project report
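The fixed-length context/target sampling mentioned above can be sketched as follows. This is a minimal illustration, not the project's actual implementation; the names (`sample_batch`, `block_size`) and the toy corpus are assumptions.

```python
import numpy as np

def sample_batch(encoded, rng, batch_size=4, block_size=8):
    """Draw random fixed-length windows from an encoded character corpus.

    Inputs are `block_size` characters; targets are the same window
    shifted one character to the right (next-character prediction).
    """
    starts = rng.integers(0, len(encoded) - block_size - 1, size=batch_size)
    x = np.stack([encoded[s : s + block_size] for s in starts])
    y = np.stack([encoded[s + 1 : s + 1 + block_size] for s in starts])
    return x, y

# Toy corpus and character-level encoding (illustrative only).
corpus = "hello world, hello jax, hello tiny language models"
vocab = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(vocab)}
encoded = np.array([stoi[ch] for ch in corpus], dtype=np.int32)

rng = np.random.default_rng(0)
x, y = sample_batch(encoded, rng)
```

Because the target is just the input shifted by one position, `x[:, 1:]` and `y[:, :-1]` always agree, which is a cheap sanity check for the pipeline.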

Experiments

  • Regression warm-up - Quadratic fit with signed gradient descent
  • Baselines - Constant and linear context models with cross-entropy loss
  • Nonlinear models - Single-hidden-layer MLP and a two-layer ReLU variant
  • Generation - Sampled text from each model to compare structure and coherence
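The regression warm-up can be sketched in JAX as below: fit a quadratic by stepping each parameter a fixed amount in the direction of the negative gradient sign. The target coefficients, step size, and iteration count here are illustrative assumptions, not the project's exact settings.

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    # Mean squared error of a quadratic model a*x^2 + b*x + c.
    a, b, c = params
    pred = a * x**2 + b * x + c
    return jnp.mean((pred - y) ** 2)

x = jnp.linspace(-1.0, 1.0, 64)
y = 3.0 * x**2 - 2.0 * x + 0.5  # ground-truth quadratic (illustrative)

params = jnp.zeros(3)
grad_fn = jax.jit(jax.grad(loss))
step = 0.01
for _ in range(2000):
    g = grad_fn(params, x, y)
    params = params - step * jnp.sign(g)  # signed update: fixed-size steps
# params should approach (3, -2, 0.5), oscillating within ~`step` of it
```

Unlike plain gradient descent, the signed update ignores gradient magnitude, so the iterate takes constant-size steps and ends up oscillating in a small neighborhood of the optimum rather than converging exactly.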

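
A baseline of the kind listed above might look like the following sketch, assuming a "linear context model" means logits computed as a linear map of the one-hot-encoded context window; `vocab_size` and `block_size` are hypothetical values.

```python
import jax
import jax.numpy as jnp

vocab_size, block_size = 5, 4
key = jax.random.PRNGKey(0)
# Small random init so initial predictions are near-uniform.
W = jax.random.normal(key, (block_size * vocab_size, vocab_size)) * 0.01

def cross_entropy(W, x, y):
    """x: (batch, block_size) int context; y: (batch,) int next-char target."""
    onehot = jax.nn.one_hot(x, vocab_size).reshape(x.shape[0], -1)
    logits = onehot @ W
    logp = jax.nn.log_softmax(logits)
    return -jnp.mean(jnp.take_along_axis(logp, y[:, None], axis=1))

x = jnp.zeros((2, block_size), dtype=jnp.int32)
y = jnp.zeros((2,), dtype=jnp.int32)
init_loss = cross_entropy(W, x, y)  # ~ log(vocab_size) at init
```

A useful check for any such baseline is that the initial cross-entropy sits near log(vocab_size), i.e. the loss of a uniform predictor.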
Technical Stack

  • Python, Jupyter - Experimentation and tooling
  • JAX - Model training
  • NumPy, Matplotlib - Data handling and plots