Machine Learning Engineer
Current
After preprocessing a 1.5 M research paper dataset through the inner workings of the Nougat model, we used it to train a custom GPT-2 tokenizer with a vocabulary of 80,365 tokens. Using this tokenizer, we then fine-tuned our own custom GPT-2 small model, reaching a loss of 1.67.
A meta-llama/Meta-Llama-3-70B 16-bit model was fine-tuned on our dataset to create a mindfulness coach chatbot. The fine-tuned model was then saved on Hugging Face so that it could be loaded as a pretrained model for inference.
A new style of prompting was also integrated, called MEMWALKER (also referred to as MEMORY MAZE or MEMORY NAVIGATOR), taken from a research paper published by Meta AI. In this method we create an agent that decides whether or not the model has completed a specific task.
Two databases, MongoDB and Firebase, were used to store the required data. Firebase was mainly used for authentication, and MongoDB (via MongoDB Compass) was used to store chats. Python APIs were created to send messages from the server to the app and to store data.