Chess AI with Deep Reinforcement Learning

Category: AI/Machine Learning
Client: Personal Project
Duration: January 2026 - Present
Year: 2026
A neural network-based chess engine built with PPO and adaptive optimization, trained through 27 iterative versions to achieve strategic play against Stockfish.
Architecture:
SE-Residual Network with 6 blocks and 128 filters
Dual-head design: Policy Head (move selection) + Value Head (position evaluation)
12-channel board encoding including castling rights and turn indicator
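The architecture above can be sketched in PyTorch. This is a minimal illustration, not the project's exact implementation: the policy output size of 4672 (AlphaZero-style move planes), the SE reduction ratio, and the head layouts are assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation: reweight channels by global context."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))    # squeeze: global average pool
        return x * w[:, :, None, None]     # excite: per-channel rescale

class SEResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.se = SEBlock(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.se(self.bn2(self.conv2(out)))
        return torch.relu(out + x)

class ChessNet(nn.Module):
    """6 SE-residual blocks with 128 filters, feeding a dual head:
    policy logits over moves and a scalar value in [-1, 1]."""
    def __init__(self, blocks=6, filters=128, policy_size=4672):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(12, filters, 3, padding=1, bias=False),  # 12 input planes
            nn.BatchNorm2d(filters), nn.ReLU())
        self.tower = nn.Sequential(*[SEResidualBlock(filters) for _ in range(blocks)])
        self.policy_head = nn.Sequential(
            nn.Conv2d(filters, 2, 1), nn.Flatten(), nn.Linear(2 * 64, policy_size))
        self.value_head = nn.Sequential(
            nn.Conv2d(filters, 1, 1), nn.Flatten(),
            nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1), nn.Tanh())

    def forward(self, x):
        h = self.tower(self.stem(x))
        return self.policy_head(h), self.value_head(h)
```

A forward pass on a batch of encoded boards, `net(torch.zeros(2, 12, 8, 8))`, yields policy logits of shape `(2, 4672)` and values of shape `(2, 1)`.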
Training Pipeline:
Supervised pre-training on Lichess games dataset
Self-play reinforcement learning with PPO
Stockfish-guided learning with depth 10 evaluations
Policy distillation from Stockfish best moves
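The Stockfish-guided stage can be sketched as a combined loss: cross-entropy toward Stockfish's best-move index (policy distillation) plus value regression against the depth-10 evaluation. The equal weighting and the centipawn squashing scale of 600 are assumptions, not the project's actual hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(policy_logits, stockfish_best, value_pred, stockfish_cp):
    """Sketch of a Stockfish-guided loss (weighting and scale are assumptions).

    policy_logits: (B, num_moves) raw network logits
    stockfish_best: (B,) index of Stockfish's best move
    value_pred: (B, 1) network value head output in [-1, 1]
    stockfish_cp: (B,) Stockfish centipawn evaluation
    """
    # Policy distillation: push probability mass onto Stockfish's choice.
    policy_loss = F.cross_entropy(policy_logits, stockfish_best)
    # Value target: squash centipawns into the value head's [-1, 1] range.
    target_value = torch.tanh(stockfish_cp / 600.0)
    value_loss = F.mse_loss(value_pred.squeeze(-1), target_value)
    return policy_loss + value_loss
```

With uniform logits the policy term reduces to `log(num_moves)`, which makes the function easy to sanity-check before wiring it into the training loop.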
Adaptive Optimization Features:
Learning rate warmup & cosine annealing
Dynamic gradient clipping (global norm, per-parameter, adaptive)
Entropy scheduling for exploration-exploitation balance
Auto-freeze mechanism to prevent model collapse
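The warmup/annealing and entropy schedules can be sketched as plain functions. All hyperparameter values here (base LR, warmup length, entropy bounds) are illustrative assumptions.

```python
import math

def lr_schedule(step, total_steps, base_lr=3e-4, warmup_steps=1000, min_lr=1e-6):
    """Linear warmup to base_lr, then cosine annealing down to min_lr."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))

def entropy_coef(step, total_steps, start=0.02, end=0.001):
    """Linearly anneal the PPO entropy bonus: explore early, exploit late."""
    frac = min(1.0, step / total_steps)
    return start + frac * (end - start)
```

The schedule peaks exactly at the end of warmup and decays smoothly to the floor, which avoids the abrupt LR drops that can destabilize PPO updates.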
Key Technical Challenges Solved:
Fixed BatchNorm issues causing policy degradation during RL training
Solved draw loops with asymmetric self-play strategy
Improved checkmate execution using tactic puzzles dataset
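One common remedy for BatchNorm-induced policy degradation under RL's shifting data distribution is to freeze the normalization layers; whether this matches the project's exact fix is an assumption, but it illustrates the failure mode.

```python
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> nn.Module:
    """Stop BatchNorm from drifting during RL fine-tuning: switch BN layers
    to eval mode (running statistics no longer update) and freeze their
    affine parameters, while leaving the rest of the network trainable."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.eval()
            for p in m.parameters():
                p.requires_grad_(False)
    return model
```

Because self-play batches are highly correlated, BN's batch statistics diverge from those seen during supervised pre-training; freezing them keeps the policy's activations stable while the convolutional weights continue to learn.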
Evaluation:
Win rate vs random opponents: 93-99%
Win rate vs Stockfish depth 0: 15-35%
Deployment:
Flask-based web interface for human vs AI gameplay
Deployed to Hugging Face Spaces as backend API
Mobile-responsive UI with click-to-move support
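A backend API of this kind typically exposes a single move endpoint: the client posts the position, the server replies with the engine's move. The route name, payload fields, and placeholder move below are assumptions for illustration, not the project's actual API.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/move", methods=["POST"])
def move():
    """Hypothetical endpoint: receive the board as a FEN string,
    return the engine's chosen move in UCI notation."""
    payload = request.get_json(force=True)
    fen = payload["fen"]
    # Placeholder: the real server encodes the board, runs the network,
    # and decodes the policy head's top legal move here.
    best_move = "e2e4"
    return jsonify({"fen": fen, "move": best_move})
```

Flask's built-in test client makes the endpoint easy to exercise without a running server, which is convenient before deploying to Hugging Face Spaces.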
Tech Stack: Python, PyTorch, python-chess, Flask, Stockfish, Matplotlib, TensorBoard



