Timeline: 6th December, 2024 - Present
Description: This documentation site serves as a personal repository of notes and practical implementations from my journey of learning and building GPT models, inspired by Andrej Karpathy's Neural Networks: Zero to Hero series. It is organized into two sections: Set-1 focuses on foundational concepts like backpropagation and language modeling, while Set-2 explores advanced topics such as transformer architectures, tokenizers, and GPT-2 reproduction. Designed as a resource for both revision and inspiration, it is also open for others to reference and learn from.
Timeline: 11th - 21st February, 2025
Description: This project is a ground-up implementation of a GPT-style transformer, following Andrej Karpathy’s tutorial. It begins with a naive bigram language model and gradually evolves into a full transformer architecture with multi-head self-attention, feedforward layers, residual connections, and layer normalization. The implementation provides hands-on insight into self-attention, positional encodings, and the key building blocks of modern large language models. By the end, the model can generate text based on learned patterns, demonstrating the power of transformers in natural language processing.
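For a flavour of what gets built, here is a minimal sketch of one causal self-attention head in PyTorch. The class and parameter names (Head, n_embd, head_size, block_size) follow common conventions around this tutorial and are illustrative, not a verbatim excerpt of the project.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Head(nn.Module):
    """One head of causal self-attention (illustrative sketch)."""
    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position attends only to the past
        self.register_buffer('tril', torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k = self.key(x)    # (B, T, head_size)
        q = self.query(x)  # (B, T, head_size)
        # scaled dot-product attention scores
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5  # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float('-inf'))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)  # (B, T, head_size)
        return wei @ v     # (B, T, head_size)
```

In the full model, several such heads run in parallel and their outputs are concatenated, with feedforward layers, residual connections, and layer normalization wrapped around each block.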
Timeline: 8th - 9th February, 2025
Description: In this phase, we transform a basic 2-layer MLP into a deeper, tree-like architecture inspired by DeepMind's WaveNet (2016), using a convolution-like structure to capture hierarchical patterns in the data. The implementation shifts from plain fully connected layers to a more structured network, delving into the inner workings of torch.nn in PyTorch and highlighting the iterative deep learning development process.
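A rough sketch of the tree-like idea, assuming a FlattenConsecutive-style layer that merges consecutive embeddings so each linear layer fuses two neighbours at a time; all names and sizes here are illustrative.

```python
import torch
import torch.nn as nn

class FlattenConsecutive(nn.Module):
    """Merge n consecutive embeddings so the next Linear fuses them together."""
    def __init__(self, n):
        super().__init__()
        self.n = n
    def forward(self, x):
        B, T, C = x.shape
        x = x.view(B, T // self.n, C * self.n)
        return x.squeeze(1) if x.shape[1] == 1 else x

# Illustrative tree-like model: 8 context characters fused 2 at a time,
# giving a hierarchy of depth 3 instead of one wide fully connected layer.
n_embd, n_hidden, vocab_size = 24, 128, 27  # assumed sizes
model = nn.Sequential(
    nn.Embedding(vocab_size, n_embd),
    FlattenConsecutive(2), nn.Linear(n_embd * 2, n_hidden), nn.Tanh(),
    FlattenConsecutive(2), nn.Linear(n_hidden * 2, n_hidden), nn.Tanh(),
    FlattenConsecutive(2), nn.Linear(n_hidden * 2, n_hidden), nn.Tanh(),
    nn.Linear(n_hidden, vocab_size),
)
logits = model(torch.randint(0, vocab_size, (4, 8)))  # (4, 27)
```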
Timeline: 15th January - 6th February, 2025
Description: In this project, we take the MLP built in the Language Model-3 project and backpropagate through it manually, without using PyTorch autograd's loss.backward(). The aim is to build a strong intuitive understanding of how gradients flow backwards through the compute graph, at the level of efficient tensors rather than the individual scalars of the Micrograd project.
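A tiny sketch of the core exercise, assuming a single tanh layer with made-up shapes: derive the gradients by hand with the chain rule, then verify them against what autograd computes.

```python
import torch

# forward pass through a single tanh layer (illustrative shapes)
x = torch.randn(32, 10)
W = torch.randn(10, 20, requires_grad=True)
b = torch.randn(20, requires_grad=True)
h = torch.tanh(x @ W + b)
loss = h.sum()
loss.backward()  # PyTorch's gradients, used here only for verification

# manual backward pass over tensors, mirroring the chain rule
dh = torch.ones_like(h)   # dloss/dh for a sum() loss
dpre = dh * (1 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
dW = x.T @ dpre           # gradient w.r.t. the weights
db = dpre.sum(0)          # bias gradient, summed over the broadcast batch dim

print(torch.allclose(dW, W.grad), torch.allclose(db, b.grad))  # True True
```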
Timeline: 6th - 14th January, 2025
Description: Focused on implementing Batch Normalization within a neural network framework, emphasizing its role in stabilizing activations and gradients during training. Covered techniques like Kaiming initialization to scale weights properly and prevent saturation of activation functions, and analysed the effects of Batch Normalization on convergence speed and overall model performance. Visualizations were also used to monitor activations and gradients, providing valuable insights into the training dynamics.
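A condensed sketch of the two ideas together, with illustrative sizes: Kaiming-scaled weights for a tanh layer, followed by the training-mode core of Batch Normalization.

```python
import torch

fan_in, fan_out = 200, 100
# Kaiming initialization for a tanh layer: gain 5/3 over sqrt(fan_in)
# keeps the pre-activation standard deviation near 1 and avoids saturation
W = torch.randn(fan_in, fan_out) * (5 / 3) / fan_in ** 0.5

x = torch.randn(32, fan_in)
hpre = x @ W

# Batch Normalization (training-mode core): standardize each unit over
# the batch, then rescale and shift with a learnable gain and bias
bngain = torch.ones(1, fan_out)
bnbias = torch.zeros(1, fan_out)
mean = hpre.mean(0, keepdim=True)
var = hpre.var(0, keepdim=True)
h = torch.tanh(bngain * (hpre - mean) / torch.sqrt(var + 1e-5) + bnbias)
print(h.std().item())  # activations stay well-scaled
```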
Timeline: 26th November - 11th December, 2024
Description: Implemented an MLP language model from the Bengio et al. 2003 research paper, but for character-level prediction, following Andrej Karpathy's approach, and even slightly improved on the final loss value.
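A minimal sketch of the Bengio-style architecture at character level, with illustrative sizes (a 27-character vocabulary and a context of 3 characters): an embedding lookup, a tanh hidden layer over the concatenated context, and cross-entropy over the output logits.

```python
import torch
import torch.nn.functional as F

# illustrative sizes for the character-level setup
vocab_size, block_size, n_embd, n_hidden = 27, 3, 10, 200
g = torch.Generator().manual_seed(42)
C  = torch.randn((vocab_size, n_embd), generator=g)  # embedding table
W1 = torch.randn((block_size * n_embd, n_hidden), generator=g)
b1 = torch.randn(n_hidden, generator=g)
W2 = torch.randn((n_hidden, vocab_size), generator=g)
b2 = torch.randn(vocab_size, generator=g)

X = torch.randint(0, vocab_size, (32, block_size))  # batch of 3-char contexts
Y = torch.randint(0, vocab_size, (32,))             # next-character targets

emb = C[X]                                  # (32, 3, 10) embedding lookup
h = torch.tanh(emb.view(32, -1) @ W1 + b1)  # hidden layer over the context
logits = h @ W2 + b2
loss = F.cross_entropy(logits, Y)           # negative log likelihood
```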
Timeline: 4th - 22nd November, 2024
Description: Worked on implementing a character-level bigram language model from scratch to generate text, exploring key concepts in natural language processing such as normalization, probability distributions, sampling new words, and evaluating the model by its negative log likelihood. Also recast the same bigram problem as a neural network that produces similar output, this time using gradient-based optimization to tune the parameters of the network, following Andrej Karpathy's methodology.
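A compact sketch of the counting approach, using a tiny stand-in word list instead of the real dataset: count bigrams, normalize each row into a probability distribution, and score the model by average negative log likelihood.

```python
import torch

words = ["emma", "olivia", "ava"]  # illustrative stand-in for the dataset
chars = sorted(set("".join(words)))
stoi = {s: i + 1 for i, s in enumerate(chars)}
stoi['.'] = 0  # '.' marks both the start and the end of a word

# count every bigram, then normalize rows into probability distributions
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    seq = ['.'] + list(w) + ['.']
    for a, b in zip(seq, seq[1:]):
        N[stoi[a], stoi[b]] += 1
P = (N + 1).float()  # add-one smoothing avoids log(0)
P /= P.sum(1, keepdim=True)

# evaluate with the average negative log likelihood
nll, n = 0.0, 0
for w in words:
    seq = ['.'] + list(w) + ['.']
    for a, b in zip(seq, seq[1:]):
        nll -= torch.log(P[stoi[a], stoi[b]])
        n += 1
print((nll / n).item())  # lower is better
```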
Timeline: 2nd - 27th October, 2024
Description: Built a neural network from scratch by developing a micrograd library, implementing core concepts like backpropagation and gradient descent, following Andrej Karpathy's methodology.
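A condensed sketch of the central idea: a scalar Value object that records the compute graph and applies the chain rule in reverse. Only add and mul are shown here; the real library covers more operations.

```python
class Value:
    """Scalar with autograd, in the spirit of micrograd (condensed sketch)."""
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad   # d(a+b)/da = 1
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # d(a*b)/da = b
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # topologically sort the graph, then apply the chain rule in reverse
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for c in v._prev:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a, b = Value(2.0), Value(-3.0)
L = a * b + a
L.backward()
print(a.grad, b.grad)  # -2.0 2.0
```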
Timeline: 8th - 24th September, 2024
Description: Developed a portfolio site using Flask, HTML, CSS, and JavaScript, along with a functioning chatbot built using RASA.
Timeline: 21st August, 2024 - Present
Description: A project outlet for everything I am learning from DeepLearning.AI's short courses.
Timeline: 2nd - 18th August, 2024
Description: PrivateGPT is an open-source project available on the internet. The purpose here was to set up, run, and use my own private AI without worrying about my data being leaked; after setup, it can even be used without an internet connection.
Timeline: 7th - 24th July, 2024
Description: A project outlet for the Angular Specialization Course (temporarily archived).
Timeline: Nov 2022 - June 2023
Description: Group project done as part of my Final Year Project during my UG course. We developed an intelligent solution for creating custom question papers using custom-designed machine learning algorithms; the website is designed to make generating question papers quick, efficient, and hassle-free.
Timeline: Sept 2022 - Oct 2022
Description: Developed a Stack Overflow clone with an alternative design theme, built as a practice project while learning the MERN stack. MongoDB Atlas was used as the cloud database.
Timeline: Dec 2021 - Feb 2022
Description: Developed a video streaming platform (similar to Netflix) as part of a subject-based project during my UG course. The site has a high-quality, responsive design along with custom-made posters, and was developed using HTML, CSS, and JavaScript, with PHP for server-side scripting.
Timeline: June 2022 - Aug 2022
Description: Developed an app in Java using Android Studio as part of a mini-project in the pre-final year of my UG course. The app detects and lists any audio files on the device; the user can play/pause and change songs with an active SeekBar (implemented using threads) for an interactive experience.