Timeline: 6th December, 2024 - PRESENT
Description: This documentation site serves as a personal repository of notes and practical implementations from my journey of learning and building GPT models, inspired by Andrej Karpathy's Neural Networks: Zero to Hero series. It is organized into two sections: Set-1 focuses on foundational concepts like backpropagation and language modeling, while Set-2 explores advanced topics such as transformer architectures, tokenizers, and GPT-2 reproduction. Designed as a resource for both revision and inspiration, it is also open for others to reference and learn from.
View Project Visit Site
Timeline: 15th January, 2025 - PRESENT
Description: This project takes the MLP implemented in the Language Model-3 project and backpropagates through it manually, without using PyTorch autograd's loss.backward(). The aim is to build a strong intuitive understanding of how gradients flow backwards through the compute graph at the level of efficient tensors, not just individual scalars as in the Micrograd project.
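As a rough illustration of the manual-backpropagation idea, here is a minimal sketch that derives the gradient of a softmax cross-entropy loss by hand and checks it against finite differences. It is not the project's code: it uses plain Python lists instead of PyTorch tensors to stay dependency-free, and the function names (`manual_grad`, `numerical_grad`) and the example logits are assumptions made for illustration.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log probability assigned to the correct class.
    return -math.log(softmax(logits)[target])

def manual_grad(logits, target):
    # Hand-derived result: dL/dlogits = softmax(logits) - one_hot(target).
    p = softmax(logits)
    return [p_i - (1.0 if i == target else 0.0) for i, p_i in enumerate(p)]

def numerical_grad(logits, target, eps=1e-6):
    # Central finite differences, used only to check the manual gradient.
    grads = []
    for i in range(len(logits)):
        up = list(logits)
        up[i] += eps
        down = list(logits)
        down[i] -= eps
        grads.append((cross_entropy(up, target) - cross_entropy(down, target)) / (2 * eps))
    return grads

logits = [2.0, -1.0, 0.5]  # hypothetical example values
target = 0
analytic = manual_grad(logits, target)
numeric = numerical_grad(logits, target)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print("largest difference between manual and numerical gradient:", max_diff)
```

The same check (compare a hand-written gradient with a reference) carries over to tensors, where the reference would typically be the gradient produced by autograd.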
Timeline: 6th - 14th January, 2025
Description: Focused on implementing Batch Normalization within a neural network framework, emphasizing its role in stabilizing activations and gradients during training. Covered techniques like Kaiming initialization to enhance weight scaling and prevent saturation of activation functions. Also analysed the effects of Batch Normalization on convergence speed and overall model performance. Visualizations have also been used to monitor activations and gradients, providing valuable insights into the training dynamics.
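A minimal sketch of the two ideas mentioned above, under stated assumptions: weights are drawn with a Kaiming-style standard deviation of gain / sqrt(fan_in) (the 5/3 gain is the value commonly used for tanh), and a single feature is batch-normalised using the biased batch variance. The helper names, layer size and batch size are hypothetical and are not taken from the project.

```python
import math
import random

def kaiming_std(fan_in, gain=5 / 3):
    # Kaiming-style scaling: standard deviation = gain / sqrt(fan_in).
    return gain / math.sqrt(fan_in)

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalise one feature across the batch, then scale and shift.
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n  # biased batch variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

rng = random.Random(0)
fan_in = 30  # hypothetical layer width
weights = [rng.gauss(0.0, kaiming_std(fan_in)) for _ in range(fan_in)]

# Pre-activations of one hidden unit for a batch of 64 random inputs.
batch = [sum(w * rng.gauss(0.0, 1.0) for w in weights) for _ in range(64)]
normed = batch_norm(batch)

normed_mean = sum(normed) / len(normed)
normed_var = sum((x - normed_mean) ** 2 for x in normed) / len(normed)
print("mean after batch norm:", normed_mean)
print("variance after batch norm:", normed_var)
```

After normalisation the feature has roughly zero mean and unit variance across the batch, which is what keeps the inputs to a saturating activation such as tanh in its active range.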
View Project View Road-to-GPT
Timeline: 26th November - 11th December, 2024
Description: Implemented an MLP language model from the 'Bengio et al. 2003' research paper, adapted to character-level prediction following Andrej Karpathy's approach, and even slightly improved the final loss value.
View Project View Road-to-GPT
Timeline: 4th - 22nd November, 2024
Description: Implemented a bigram character-level language model from scratch to generate text, exploring key concepts in natural language processing such as normalisation, probability distributions, sampling new words and evaluating the model using the negative log likelihood. Also recast the same bigram problem as a neural network that produces a similar output, using gradient-based optimization to tune the parameters of the network, following Andrej Karpathy’s methodology.
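The counting-based half of this can be sketched as follows. This is an illustrative sketch rather than the project's code: the four-word list stands in for a real dataset, "." is assumed as the start/end marker, and add-one smoothing is an assumption made so that every transition has a non-zero probability.

```python
import math
import random
from collections import defaultdict

# Toy word list standing in for the real dataset (hypothetical).
words = ["emma", "olivia", "ava", "mia"]

# Count character bigrams, using "." as a start/end marker.
counts = defaultdict(lambda: defaultdict(int))
for w in words:
    chars = ["."] + list(w) + ["."]
    for a, b in zip(chars, chars[1:]):
        counts[a][b] += 1

vocab = sorted({c for w in words for c in w} | {"."})

def probs(a, smoothing=1.0):
    # Normalise the smoothed counts for row `a` into a probability distribution.
    row = [counts[a][b] + smoothing for b in vocab]
    total = sum(row)
    return [r / total for r in row]

def average_nll(ws):
    # Average negative log likelihood over all bigrams in `ws`.
    log_likelihood, n = 0.0, 0
    for w in ws:
        chars = ["."] + list(w) + ["."]
        for a, b in zip(chars, chars[1:]):
            log_likelihood += math.log(probs(a)[vocab.index(b)])
            n += 1
    return -log_likelihood / n

def sample_word(rng):
    # Sample one character at a time until the end marker is drawn.
    out, a = [], "."
    while True:
        a = rng.choices(vocab, weights=probs(a))[0]
        if a == ".":
            return "".join(out)
        out.append(a)

rng = random.Random(0)
nll = average_nll(words)
samples = [sample_word(rng) for _ in range(3)]
print("average NLL:", nll)
print("samples:", samples)
```

A lower average negative log likelihood means the model assigns higher probability to the data, which is why it serves as the evaluation metric.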
View Project View Road-to-GPT
Timeline: 2nd - 27th October, 2024
Description: Built a neural network from scratch by developing a micrograd library, implementing core concepts like backpropagation and gradient descent following Andrej Karpathy’s methodology.
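For a sense of what a micrograd-style library involves, here is a minimal sketch of a scalar `Value` class that records the compute graph and backpropagates through it. It is a simplified illustration in the spirit of that approach, not the project's code; only addition, multiplication and tanh are supported, and the example inputs and learning rate are hypothetical.

```python
import math

class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))
        def _backward():
            # Addition passes the upstream gradient through unchanged.
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))
        def _backward():
            # Product rule: each input receives the other's value times upstream.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def _backward():
            # d/dx tanh(x) = 1 - tanh(x)^2
            self.grad += (1 - t * t) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# A single neuron: y = tanh(x * w + b), with hypothetical values.
x, w, b = Value(2.0), Value(-3.0), Value(1.0)
y = (x * w + b).tanh()
y.backward()
print("dy/dx:", x.grad, "dy/dw:", w.grad, "dy/db:", b.grad)

# One gradient descent step on the parameters (hypothetical learning rate).
learning_rate = 0.1
for p in (w, b):
    p.data -= learning_rate * p.grad
```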
View Project View Road-to-GPT
Timeline: 8th - 24th September, 2024
Description: Developed a portfolio site using Flask, HTML, CSS, and JavaScript, along with a functioning chatbot built using RASA.
View Project Visit Website
Timeline: 21st August, 2024 - Present
Description: Project outlet for everything that I am learning from DeepLearning.AI's short courses.
View Project
Timeline: 2nd - 18th August, 2024
Description: PrivateGPT is an open-source project available on the internet. The purpose of this project was to set up, run and use my own private AI without worrying about my data being leaked. Once set up, it can even be used without an internet connection.
View Project
Timeline: 7th - 24th July, 2024
Description: Project outlet for the Angular Specialization Course (Temporarily Archived)
View Project
Timeline: Nov 2022 - June 2023
Description: Group project completed as my Final Year Project during my UG course. We developed an intelligent solution for creating custom question papers using custom-designed machine learning algorithms. Our website is designed to make the process of generating question papers quick, efficient and hassle-free.
View Project Visit Website View Research Paper
Timeline: Sept 2022 - Oct 2022
Description: Developed a Stack Overflow website clone with an alternative design theme, built as a hands-on implementation project while learning the MERN stack. A MongoDB Atlas cloud database was used.
View Project Visit Website
Timeline: Dec 2021 - Feb 2022
Description: Developed a video streaming platform (similar to Netflix) as part of a subject-based project during my UG course. The site has a high-quality, responsive design along with custom-made posters. Developed using HTML, CSS and JavaScript, with PHP for server-side scripting.
View Project
Timeline: June 2022 - Aug 2022
Description: Developed an app using Java in Android Studio as part of a mini-project during the pre-final year of my UG course. The app can detect and list audio files of any type on the mobile device. The user can play, pause and change songs, with an active SeekBar (implemented using threads) for an interactive experience.
View Project