🧑🏫 60 implementations/tutorials of deep learning papers with side-by-side notes 📝; including Transformers (original, XL, Switch, Feedback, ViT, ...), optimizers (Adam, AdaBelief, Sophia, ...), GANs (CycleGAN, StyleGAN2, ...), 🎮 reinforcement learning (PPO, DQN), CapsNet, distillation, ... 🧠
RWKV is an RNN with transformer-level LLM performance. It can be trained in parallel like a GPT, so it combines the best of RNNs and transformers: strong performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings.
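The RNN/transformer duality that RWKV exploits comes from using a linear recurrence: the same computation can run step-by-step with O(1) state (cheap RNN-style inference) or as one matrix product over the whole sequence (parallel GPT-style training). A minimal sketch of this principle, using a toy exponential-decay recurrence rather than RWKV's actual time-mixing formula:

```python
import numpy as np

# Toy linear recurrence (NOT RWKV's real formula; illustrates the duality):
#   sequential form (inference): h_t = w * h_{t-1} + x_t
#   parallel form   (training):  h_t = sum_{i<=t} w^(t-i) * x_i

def sequential(x, w):
    """RNN-style: one step at a time, constant-size state."""
    h, out = 0.0, []
    for xt in x:
        h = w * h + xt
        out.append(h)
    return np.array(out)

def parallel(x, w):
    """GPT-style: the whole sequence at once via a lower-triangular
    decay matrix W[t, i] = w^(t-i) for i <= t."""
    t = np.arange(len(x))
    W = np.tril(w ** (t[:, None] - t[None, :]))
    return W @ x

x = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(sequential(x, 0.5), parallel(x, 0.5))
```

Because both forms compute the same values, such a model can be trained in parallel over long sequences and then deployed as a recurrent network whose memory cost does not grow with context length.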
Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.
A collection of important graph embedding, classification and representation learning papers with implementations.
A comprehensive paper list on Vision Transformers/attention, including papers, code, and related websites.