ai-algorithms

First-principles implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks, accompanied by supporting research papers.

Stars: 90

README:

AI Algorithms

This repo is a work in progress containing first-principles implementations of groundbreaking AI algorithms using a wide range of deep learning frameworks. Each implementation is accompanied by its supporting research paper(s). The goal is to provide comprehensive educational resources for understanding and implementing foundational AI algorithms from scratch.

Implementations

  • mnist_self_compressing_nns - PyTorch implementation of "Self-Compressing Neural Networks". The paper shows dynamic neural network compression during training: it reduces the size of the weight and activation tensors and the number of bits required to represent the weights.
  • mnist_ijepa - Simplified image-based implementation of JEPA (Joint-Embedding Predictive Architecture) - an alternative to auto-regressive LLM architectures pioneered by Prof. Yann LeCun. I-JEPA predicts image segment representations (Target) based on representations of other segments within the same image (Context).
  • nns_are_decision_trees - Simplified implementation of “Neural Networks are Decision Trees”, which shows that any neural network with any activation function can be represented as a decision tree. Since decision trees are inherently interpretable, this equivalence helps us understand how the network makes decisions.
  • mnist_the_forward_forward_algorithm - Implements the Forward-Forward Algorithm proposed by AI godfather Geoffrey Hinton. The algorithm replaces the forward and backward passes in backpropagation with two forward passes on different data with opposite objectives. The positive pass uses real data and adjusts weights to increase goodness in every hidden layer. The negative pass uses "negative data" and decreases goodness.
  • sigmoid_attention - Implements the newly introduced Sigmoid Self-Attention by Apple, which replaces the softmax in attention with an element-wise sigmoid.
  • DIFF_Transformer - Lightweight implementation of the newly introduced “Differential Transformer”. The paper proposes a differential attention mechanism that computes attention scores as the difference between two separate softmax attention maps, thereby reducing noise in attention blocks. Paper by Microsoft.
  • triton_nanoGPT.ipynb - Implements custom Triton kernels for training Karpathy's nanoGPT (more improvements needed).
  • generating_texts_with_rnns.ipynb - Implements "Generating Text with Recurrent Neural Networks": trains a character-level multiplicative recurrent neural network model (~250k params) for 1000 epochs on 2Pac's "Hit 'Em Up". The sample was fun :)
  • deep_pcr.ipynb - Implements "DeepPCR: Parallelizing Sequential Operations in Neural Networks" - a novel algorithm which parallelizes typically sequential operations in order to speed up inference and training of neural networks.
  • seq2seq_with_nns - Lightweight implementation of the seminal paper “Sequence to Sequence Learning with Neural Networks”. Built, trained and evaluated a 2-layer seq2seq LSTM-based model (~10M params) on the German-English corpus of the Multi30K dataset. In honor of Ilya Sutskever et al. for winning this year’s NeurIPS Test of Time paper award.
  • discrete_flow_matching - Implements from first principles a discrete flow matching (DFM) model for code generation. In particular, we trained a small 2D DFM model on two variations of code for binary search. DFM is a non-autoregressive generative modeling framework recently introduced in this paper by Meta.
  • byte_latent_transformer - Here we implement a character-level BLT (Byte Latent Transformer) model from scratch in under 500 lines of code. The Byte Latent Transformer is a tokenizer-free architecture that learns from raw byte data, recently introduced in this paper by Meta.
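
To make the Forward-Forward description above concrete, here is a minimal sketch of one layer trained with a positive and a negative pass. It is not the code from mnist_the_forward_forward_algorithm. It assumes goodness is the sum of squared activations compared against a threshold theta, and it uses plain Python lists instead of a deep learning framework. The names ff_step, theta, lr and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def layer_forward(weights, x):
    # One fully connected layer followed by ReLU; weights[j] is the row for unit j.
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def goodness(h):
    # Goodness of a layer: sum of its squared activations.
    return sum(a * a for a in h)


def ff_step(weights, x, positive, theta=1.0, lr=0.05):
    """One local update. No gradient flows to or from any other layer."""
    h = layer_forward(weights, x)
    g = goodness(h)
    # Loss is softplus(theta - g) for positive data and softplus(g - theta)
    # for negative data; dloss_dg is its derivative with respect to g.
    dloss_dg = -sigmoid(theta - g) if positive else sigmoid(g - theta)
    for j, row in enumerate(weights):
        for i, xi in enumerate(x):
            # dg/dW[j][i] = 2 * h[j] * x[i] (zero when the ReLU unit is inactive).
            row[i] -= lr * dloss_dg * 2.0 * h[j] * xi
    return g


# Hypothetical toy data: one "real" vector and one "negative" vector.
x_pos, x_neg = [1.0, 0.0], [0.0, 1.0]
weights = [[0.5, 0.5], [0.3, 0.4], [-0.2, 0.6]]

for _ in range(200):
    ff_step(weights, x_pos, positive=True)   # positive pass: raise goodness
    ff_step(weights, x_neg, positive=False)  # negative pass: lower goodness

g_pos = goodness(layer_forward(weights, x_pos))
g_neg = goodness(layer_forward(weights, x_neg))
print(f"goodness on positive data: {g_pos:.3f}, on negative data: {g_neg:.3f}")
```

Each call to ff_step only touches the weights of its own layer, which is the point of the algorithm: a deeper network would repeat the same local update per layer rather than backpropagate a global error. Details such as normalizing activations between layers and how negative data is constructed are left out of this sketch.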
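
The differential attention idea from the DIFF_Transformer entry can be sketched in a few lines. This is a simplified single-head illustration, not the repo's implementation: it treats lam as a fixed scalar (an assumption made here for brevity), skips any normalization or projection layers, and uses plain Python lists so it runs without dependencies. The helper names are hypothetical.

```python
import math


def softmax(row):
    # Softmax over one row, shifted by the max for numerical stability.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    # Plain matrix product of two lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention_map(q, k):
    # Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)).
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    return [softmax([s / math.sqrt(d) for s in row]) for row in scores]


def differential_attention(q1, k1, q2, k2, v, lam=0.5):
    # Difference of two softmax maps, then the usual weighted sum over V.
    a1 = attention_map(q1, k1)
    a2 = attention_map(q2, k2)
    diff = [[x - lam * y for x, y in zip(r1, r2)] for r1, r2 in zip(a1, a2)]
    return matmul(diff, v), diff


# Toy inputs (hypothetical): 2 tokens, head dimension 2.
q1 = [[1.0, 0.0], [0.0, 1.0]]
k1 = [[1.0, 0.0], [0.0, 1.0]]
q2 = [[0.5, 1.0], [1.0, -0.5]]
k2 = [[0.2, 0.1], [0.4, 0.3]]
v = [[1.0, 2.0], [3.0, 4.0]]

output, diff_map = differential_attention(q1, k1, q2, k2, v, lam=0.5)
print(output)
```

Whatever both maps attend to in common is subtracted out, which is the noise-cancelling intuition described above. With lam set to 0 the function reduces to ordinary softmax attention on the first query/key pair.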
