Hi there, I'm Sunit!
I'm a Machine Learning Engineer with 7 years of full-time work experience in Machine Learning, Natural Language Processing and Data Science. I graduated with my Master's in Computer Science from Ohio State University in 2024, where my research focus was conversational AI. My highlight projects during grad school were in Distributed Systems and Advanced Database Systems.
I have demonstrable experience in Natural Language Processing and Content Recommendation Systems. With the advent of LLMs and their use in resource-constrained environments, I focused on parameter-efficient fine-tuning (PEFT) techniques in LLMs for downstream tasks in dialogue systems, and on representation learning. This made it possible to apply the domain-specific knowledge of LLMs efficiently in single-task and multi-task settings.
Check out my Master's Thesis on Task-Oriented Dialogue Systems here.
Selected Experience
Machine Learning Engineer at Orbit Systems: Created the NER-Engine to classify and annotate news articles with entities indicating useful leads for JobsOhio to approach businesses looking to invest and expand operations. Designed and developed an asynchronous architecture using Apache Kafka to prompt a large language model (GPT-4o) in a parallelized event micro-batching approach to tag a continuous stream of scraped articles.
Research Associate at Ohio State University: Co-led the university's team in the Amazon Alexa Taskbot Challenge. We developed a versatile and personalizable dialogue agent, hosted on AWS and deployed as an Alexa Skill, to assist users with Cooking & DIY tasks. We published our approach at the ACL/SIGDIAL 2023 conference. Publication, GitHub
Senior Machine Learning Engineer at Compass: Created the Similar Homes Recommendation System at Compass to recommend house listings to users based on their preferences. I developed a scorer model that was used in a Learning-to-Rank style pipeline to generate house recommendations.
Senior Data Scientist at Salesken: Created a Sales Conversations Auto-Pilot tool to help sales agents drive sales conversations in a productive and streamlined manner. Blog
Projects
Knowledge Distillation for Dialogue QA - Trained a specialist LLM using knowledge distillation from GPT-3.5 (ChatGPT) on a task-related QA dataset. Built RAG pipelines using the Parent Document Retriever and the Contextual Compression Retriever, and performed TruLens evaluation for the RAG Triad (context relevance, groundedness and answer relevance). GitHub
RLHF summary generation - Created a Reinforcement Learning from Human Feedback (RLHF) based text summarizer to remove toxic (offensive/biased) content from summaries. GitHub Architected a novel cross-attention based network to generate summaries focusing on questions asked from the passage. GitHub
LLM-Prompting/ElasticSearch based Recommendation System - Created an end-to-end property recommendation system using LLM prompting for dialogue with the user, and ElasticSearch to generate recommendations based on user preferences captured through that dialogue. GitHub
Comprehensive Study of Modern Data Analytics and Storage Engines - Performed a detailed comparative study of 10 state-of-the-art data analytics and storage engines used as commercial big data warehousing and data processing solutions. Report
Courses and Certifications
Check out my certifications! I am always learning!!
AWS Cloud Technical Essentials - AWS Certificate
Generative AI with Large Language Models - DeepLearning.ai Certificate
Introduction to Retrieval Augmented Generation (RAG) - Duke University Certificate
Building and Evaluating Advanced RAG - Short course by DeepLearning.ai Accomplishment
Natural Language Processing in TensorFlow - DeepLearning.ai Certificate
Deep Learning Specialization - DeepLearning.ai Certificate
Neural Networks for Machine Learning - University of Toronto Certificate
Transfer Learning for NLP with TensorFlow Hub - Coursera Certificate
Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning - DeepLearning.ai Certificate
Machine Learning Foundations: A Case Study Approach - University of Washington Certificate
Siamese Network with Triplet Loss in Keras - Coursera Certificate
Sentiment Analysis with Deep Learning using BERT - Coursera Certificate
