ueaj

Machine Learning & AI Research

Major AI/ML Projects

Multiscale Muon¹

successful

Optimizer with multiple momentum buffers at different timescales. Inspired by multi-temporal memory in humans and by further evidence from other groups for hierarchical structures like those initially explored in 'Hyperbolic Space'.
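To illustrate what "multiple momentum buffers at different timescales" can mean, here is a minimal sketch for a single scalar parameter. It is not the project's implementation: the function names, the decay rates, and the fixed mixing weights are all hypothetical, and the orthogonalization step that Muon applies to matrix-shaped updates is left out entirely.

```python
def update_buffers(buffers, grad, betas):
    """Update each momentum buffer as an exponential moving average of grad.

    A buffer with decay beta averages over roughly 1 / (1 - beta) steps,
    so different betas give different timescales.
    """
    return [beta * m + (1.0 - beta) * grad for m, beta in zip(buffers, betas)]


def combined_step(param, buffers, weights, lr):
    """Apply one update built from a weighted sum of the buffers.

    The fixed weights are a hypothetical choice made for this sketch.
    """
    update = sum(w * m for w, m in zip(weights, buffers))
    return param - lr * update


# Feed a constant gradient of 1.0 for ten steps.
betas = [0.5, 0.9, 0.99]  # fast, medium, and slow buffers (assumed values)
buffers = [0.0, 0.0, 0.0]
for _ in range(10):
    buffers = update_buffers(buffers, 1.0, betas)

# After t steps of a constant unit gradient, each buffer equals 1 - beta**t,
# so the fast buffer has almost caught up while the slow one has barely moved.
param = combined_step(1.0, buffers, weights=[0.5, 0.3, 0.2], lr=0.1)
```

The constant-gradient loop makes the timescales visible: the same signal is tracked quickly by the low-beta buffer and slowly by the high-beta one.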

Hierarchical Marginalization

theory

Mathematical proof of the exactness of sub-graph partitioned marginalization in factor graphs using 'port nodes'. Proves existence of exact message passing algorithms through hierarchical cutset conditioning for probabilistic inference.

¹MacroGPT-JAX

completed

My own pretraining script written from scratch, specialized for hackability and OOD research. Reaches tokens-per-second (tps) parity with llm.c.

Hyperbolic Space

theory successful

Evidence that neural networks learn multiscale hierarchical structures, supported by statistical analysis of model weights that points to possible multiscale structures in MLP matrices.

BPTT got hands¹

null theory

Exploration of the consequences of the backpropagation through time (BPTT) "impossible triangle" for linear sequence modeling. Evaluates a novel surrogate gradient method for TTT modules. Partial success: the results demonstrate the validity of the theory, but the method is insufficient for training.

Associative Transformer¹

null

Attempted local learning rule where each transformer layer acts as its own momentum buffer, treating layers as compressions of past gradients. Though theoretically interesting, it failed to converge empirically.

Education

M.S. in ML/AI - Northeastern University (Expected: May 2027)

B.S. in Computer Science - University of Texas at Dallas (May 2025)

IBM Professional Data Science Certification - September 2023

Relevant Coursework: Advanced Calculus, Intro to Machine Learning, Intelligent Systems Analysis/Design, Extragalactic Astrophysics, Probability, Quantum Physics for Programmers and Engineers, Big Data Management and Analytics, Undergraduate Research, Artificial Intelligence

smaller projects

Who art thou, 2.5pro

philosophy

Philosophical exploration of AI consciousness and identity through dialogue with language models.

Low Precision Low LR

null theory

Investigation into overcoming quantization barriers in low-precision training through collective precision methods.

ScMoE

theory highlight

A highlight of a paper that beat me to an idea I had for architectures specialized for low memory bandwidth hardware (e.g. Mac, CPU): using speculative expert decoding (moving the router to before the attention mechanism) to preload experts.

ParaRNN and beyond

highlight theory pedagogy

Highlights a new paper that introduces a new abstraction, detailing how I extracted this abstraction from the paper and how it can be applied in novel ways. Also partially replicates the paper in JAX, in a modular way that can be reused for other projects.

Prompt Ensembles

theory null

Testing whether randomized persona prompts can increase LLM output diversity and reduce entropy collapse. Benchmarked across major model families using coin flips and dice rolls. Works best for Anthropic models and is less effective elsewhere, suggesting architectural approaches may be needed.
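One simple way to quantify output diversity on a coin-flip or dice-roll task is the empirical Shannon entropy of the sampled answers. The sketch below assumes that metric; the write-up's actual measurement may differ, and the sample lists here are made-up illustrations, not benchmark results.

```python
import math
from collections import Counter


def empirical_entropy_bits(samples):
    """Shannon entropy (in bits) of the empirical distribution of samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Illustrative inputs only: a model that almost always answers "heads"
# versus one whose answers are evenly split.
collapsed = ["heads"] * 9 + ["tails"]
balanced = ["heads"] * 5 + ["tails"] * 5

collapsed_bits = empirical_entropy_bits(collapsed)
balanced_bits = empirical_entropy_bits(balanced)
```

A fair coin has one bit of entropy, so the gap between a model's empirical entropy and that ceiling gives a rough measure of how far its answers have collapsed.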

Optimal Configuration Code

pedagogy

The objectively perfect configuration system for Python, balancing specificity with readability and re-use for research code.

Fast Min-P

theory

Mathematical derivation of a faster Min-P sampling algorithm that avoids full softmax computation through clever use of log-space operations.
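The core identity that makes a softmax-free filter possible is standard: Min-P keeps token i when p_i >= min_p * p_max, and because the softmax normalizer cancels on both sides, this is equivalent to logit_i >= logit_max + log(min_p). The sketch below shows only that identity and checks it against a full-softmax reference. It is not necessarily the derivation in the write-up, and the function names are hypothetical.

```python
import math


def min_p_mask(logits, min_p):
    """Keep-mask for Min-P using only comparisons on raw logits."""
    # p_i >= min_p * p_max  <=>  logit_i >= logit_max + log(min_p)
    threshold = max(logits) + math.log(min_p)
    return [x >= threshold for x in logits]


def min_p_mask_reference(logits, min_p):
    """Reference version that computes the full softmax first."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    p_max = max(probs)
    return [p >= min_p * p_max for p in probs]


logits = [2.0, 1.0, 0.5, -1.0, -3.0]
mask = min_p_mask(logits, min_p=0.1)
# mask is [True, True, True, False, False]
```

Only the surviving tokens then need to be exponentiated and normalized for sampling, which is where the savings over a full-vocabulary softmax come from.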

Cyberspace Evolutionary Automata

proposal

Hyperspace cellular automata adapted to GPU cluster geometry, enabling evolution of computationally efficient organisms through local learning rules.

Scaling Laws

pedagogy theory

Comprehensive analysis of neural scaling laws and their implications for model performance and efficiency.

Parameter Scaling

pedagogy theory

Deep dive into how model performance scales with parameter count across different architectures.

regular software dev

Jerraria

incomplete complex

Proof-of-concept multithreaded Terraria clone written from scratch in Java. Features highly concurrent programming, distributed computing design, and custom graphics with LWJGL/OpenGL.

ARRP

completed 🚀 impact

Advanced Runtime Resource Packs for Minecraft modding. Over 7 million downloads, enabling dynamic resource generation at runtime.

OpenEasterEggsLib

completed complex

Novel Minecraft mod concept featuring encrypted recipes for developers. Implements public-key cryptography, hashing, and creative cybersecurity solutions.

Fabric Transfer API

completed 🚀 impact

Essential contributor to a 2-year design discussion for the FabricMC modloader API. Led theoretical development and initial implementations, culminating in the final production API.

Amalgamation

completed

Gradle plugin for setting up FabricMC development environments in record time. Features highly concurrent programming and high-performance I/O optimizations.