Boris Zubarev

Machine Learning Consultant @ beSirius
Tbilisi, Georgia
About Boris Zubarev

As a seasoned professional with over six years in Conversational AI, Deep Learning, and Natural Language Processing, I excel in roles as a Machine Learning Engineer, Research Engineer, Team Lead, and Mentor. I am intrigued by the potential of Large Language Models (LLMs) and am looking for opportunities to apply my skills in this area. Although I'm based in Armenia, I am globally mobile and prepared for remote work arrangements.

My achievements:
- Repeatedly achieved significant success in product development: increased customer satisfaction rates in an open-domain dialog system (from 20% to 75%), reduced translation costs for various B2B clients (by 15% to 45%), and improved machine translation quality (by up to 50%).
- Developed an open-domain dialog system that outperformed competitors under significantly limited resources. It was integrated into VK, reaching 86% of Russian users (101.7 million).
- Built various projects from scratch: machine translation quality estimation, automatic error correction in translation, and 4 open-domain and goal-oriented dialog assistants.
- Managed a total of 15 people in different companies (the largest team is 4 people).
- Trained, distilled, and deployed LLMs (7B, 13B, 30B) in production using novel approaches such as LoRA.
- Taught a course that won awards for 4 consecutive years and graduated 4 cohorts of students (more than 80 people). Many of the students now hold Team Lead, Senior, Middle, and PhD positions in leading technology companies and universities in Russia and beyond.
- Led and actively participated in student research. Master's thesis projects: LLM Dialog Model Capable of Understanding and Sending Images (2023); Semantic Model Distillation (2020); Using Memory in Dialog Models (2023); Augmentation Methods NN Models (2020); Multi-domain Sentence Embeddings (2020).
- Pioneered the development of one of the first banking chatbots in Russia.
- Experienced in developing solutions using the GPT-family API, including dataset generation using advanced methods.

Tools: Python, PyTorch, transformers, Docker, FastAPI, GCP, LLMs, ONNX, Triton Inference Server, unit tests, Git, MongoDB, Redis, Celery, Amplitude, GitHub Actions, CircleCI, Sentry, codecov, mypy, black, pre-commit hooks, faiss, Notion, Agile, Scrum, prompt engineering, CoT, data labeling and collection using LLMs.

Contact me directly at bobazooba@gmail.com

Boris Zubarev's Current Company Details
beSirius

Machine Learning Consultant
Tbilisi, Georgia
Website: besirius.com
Employees: 33
Boris Zubarev Work Experience Details
  • Besirius
    Machine Learning Consultant
    Besirius
    Tbilisi, Georgia
  • Aiphoria
    Machine Learning Engineer
    Aiphoria Nov 2023 - Present
    Cyprus
  • Bird Speak
    Founder
    Bird Speak Oct 2022 - Present
    Your Personal AI Native Speaker
    20,000+ happy users
    Previously Papaya AI
  • Komplete Ai
    Machine Learning Consultant
    Komplete Ai Jul 2023 - Nov 2023
    Consulting projects and development of my own and open-source projects.
    - Consulting project on summarizing multiple news articles into one structured document. The main metric, the percentage of summaries approved by moderators, increased from 69% to 78%. The work also significantly sped up the content team, which had previously written summaries by hand. The primary model for generating the final document was Llama 2 13B, served with Hugging Face's TGI. Several models were also developed to evaluate individual sections of the summarized document.
    - X—LLM: an open-source library (350+ stars) for easy LLM training with advanced methods (QLoRA, Flash Attention 2, gradient checkpointing, bitsandbytes, DeepSpeed, FSDP, GPTQ). Developed independently from scratch.
    - Papaya AI: an AI-powered English practice chatbot with 15,000+ users and subscribers. Built an entirely original product from the ground up that provides a fun and easy way for individuals to practice English. Developed the backend (FastAPI, ChatGPT, MongoDB, Redis, Celery, DigitalOcean). https://papaya-ai.com
    - Developed an open-source LLM for open-domain conversation, based on Mistral 7B. For training I used the xllm library, QLoRA, gradient checkpointing, and DeepSpeed for multi-GPU training. The model was trained on 1,112,000 dialogs for 10,000 steps (334 million tokens in total) with a batch size of 128. The maximum sequence length during training was 2048 tokens.
      Original: https://huggingface.co/BobaZooba/Shurale7B-v1
      GPTQ quantized version: https://huggingface.co/BobaZooba/Shurale7B-v1-GPTQ
    - Tale Quest: an interactive text-based game with dynamic AI characters. Developed the backend using FastAPI, ChatGPT, MongoDB, Redis, Celery, and DigitalOcean, and the LLMOps using TGI with a GPTQ-quantized model. https://t.me/talequestbot
  • Modelfront
    Machine Learning Tech Lead
    Modelfront Apr 2022 - Jul 2023
    Yerevan, Armenia
    Startup in the machine translation quality estimation field. Making quality translation radically cheaper.
    Worked on a full-cycle machine translation quality estimation system.
    - Reduced translation costs for various B2B clients in different domains by up to 35%
    - Developed custom classification metrics from which the business value of a model can be calculated directly. As a result, the time-to-market of new models dropped from weeks to a day
    - Developed from scratch an internal AutoML service for training translation quality estimation models at scale for each client
    - Developed an inference system for serving multiple models (30+ on a single A100 40GB) using ONNX and Triton Inference Server
    - Trained with LoRA and deployed to production a translation error correction model (T5 3.7B)
    - Increased the quality of automatic translations by 37% through a translation error correction system
    - Responsible for machine learning at the company; team of 2 people
    - Set up the company's management, project management, and communication processes
  • National Research University — Higher School Of Economics
    Research Advisor
    National Research University — Higher School Of Economics Oct 2022 - Jun 2023
    Supervised three master's students in Computational Linguistics. My tasks included managing the process, drafting a document with a detailed idea, planning experiments, helping implement ideas, tracking students' progress, helping conduct experiments, and providing any other assistance.
    Theses:
    - Generative Dialog Model Capable of Understanding and Sending Images
    - Using Memory in Generative Dialog Models
  • National Research University — Higher School Of Economics
    Lecturer
    National Research University — Higher School Of Economics Sep 2019 - Jun 2023
    A public research university that has been recognized as the best in Russia many times.
    Taught the Deep Learning in Natural Language Processing course for Computational Linguistics MSc students. Graduated 4 cohorts of students, more than 80 machine learning engineers in total, who work in large companies in Russia and worldwide. Graduates have gone on to work at Yandex, SberDevices, Sberbank, Skoltech, the Higher School of Economics, and others. Many of them have become team leads, senior engineers, and PhDs.
    Official awards by student vote:
    - Category "Best Course for Career Development"
    - Category "Best Course for Broadening Horizons and Diversity of Knowledge and Skills"
    - Category "Best Course for New Knowledge and Skills"
  • National Research University — Higher School Of Economics
    Research Advisor
    National Research University — Higher School Of Economics Dec 2019 - Jul 2020
    Supervised three master's students in Computational Linguistics. All students successfully completed their studies and received high marks. My tasks included drafting a document with a detailed idea, planning experiments, helping implement ideas, tracking students' progress, helping conduct experiments, and providing any other assistance.
    Theses:
    - Semantic distillation from cross-encoder to bi-encoder
    - Multi-domain sentence embeddings
    - Transfer learning using NN-based augmentation techniques
  • Sberdevices
    Senior Research Engineer
    Sberdevices Aug 2021 - Mar 2022
    Moscow, Moscow City, Russia
    One of the largest Russian companies developing machine learning systems. Main product: devices with a family of voice assistants.
    Developed generative models for open-domain dialog.
    - Improved generative model responses (SSA increased by 7), which led to a significant improvement in the user experience
    - Developed an original evaluation methodology and an automatic dialog model evaluation system, which significantly reduced time-to-market, improved the user experience, and increased dialog length
  • Mail.Ru Group
    Research And Development Team Lead
    Mail.Ru Group Jan 2021 - Aug 2021
    Moscow, Moscow City, Russia
    One of the largest Russian IT companies, serving 86% of Russian residents (101.7 million users).
    Worked on improving the chit-chat skill in the voice assistant Marusia. This skill has the highest load and the largest number of users (MAU is 90% of all users, and it handles 25% of all messages to Marusia).
    - Built an open-domain dialogue system that outperformed competitors under significantly limited resources (money spent, team size, and training and deployment resources were up to 10 times smaller)
    - Increased customer satisfaction with the open-domain dialogue system from 20% to 75%
    - Successfully integrated Marusia into the social network VK for all users (101.7 million, 86% of users in Russia)
    - Reduced time-to-market from months to weeks by completely overhauling the data collection, model training, testing, and production deployment process
    - Established an iterative process for labeling and training models, in part based on my own research in active learning. This significantly increased labeling efficiency and cut the time needed to reach target metrics from a month to a week
    - Headed the chit-chat research team of 4 people
  • Mail.Ru Group
    Senior Research Engineer
    Mail.Ru Group Jan 2020 - Jan 2021
    Moscow, Moscow City, Russia
    Dialog Systems Group, Natural Language Understanding Team.
    Worked on the intent classifier and chit-chat skill in the voice assistant Marusia:
    - Trained and fine-tuned metric learning models for intent classification and retrieval chit-chat
    - Developed and distilled intent classifier metric learning models
    - Distilled retrieval and generative chit-chat models
    - Developed custom models (also implemented papers)
    - Optimized the labeling process for the chit-chat corpus using models (active learning)
  • Sberbank
    Middle Data Scientist
    Sberbank Jan 2019 - Jan 2020
    Moscow, Moscow City, Russia
    Worked on a voice assistant for corporate clients:
    - Developed and deployed from scratch a system that classifies user intent in voice interactions with a bank
    - Text classification on a small dataset using neural networks and various augmentation techniques
    - Applying transfer learning and domain adaptation
    - Training and using pre-trained language models for transfer learning
    - Implementing state-of-the-art neural networks
    - Training metric learning models
    - Building a multi-stage training pipeline for effective domain-to-domain learning
    - Distilling a BERT-based NER model
    - Deploying neural networks to production
  • Modulbank
    Middle Data Scientist
    Modulbank Sep 2017 - Oct 2018
    Ufa, Bashkortostan, Russia
    Launched one of the first dialogue assistants for banks in Russia.
    Projects:
    - Chatbot (prototyping, development of intent classification, dialog manager, NER, microservice backend as an API, production deployment, and part of product management and business development, such as interaction design and conversational UI)
    - Recommender system for acquiring clients (also microservice development and product management)
    - Scoring (EDA, NLP and geo data preprocessing, feature engineering, and training, validation, and optimization of models such as gradient boosting and random forest using LightGBM, XGBoost, and scikit-learn)
    - Clustering (topic modeling on client acquiring data using BigARTM)

Boris Zubarev Skills

Deep Learning, Natural Language Processing, Goal Oriented Dialog System, Machine Learning, Data Analysis, Python, Data Mining, Programming, Software Development, PyTorch, pandas, Keras, Git, Word Embeddings, Seq2seq, scikit-learn, seaborn, NumPy, BigARTM, NLTK, Matplotlib, Plotly, DSSM

Boris Zubarev Education Details

Ufa State Petroleum Technical University

Frequently Asked Questions about Boris Zubarev

What company does Boris Zubarev work for?

Boris Zubarev works for Besirius.

What is Boris Zubarev's role at the current company?

Boris Zubarev's current role is Machine Learning Consultant.

What schools did Boris Zubarev attend?

Boris Zubarev attended Ufa State Petroleum Technical University.

What skills is Boris Zubarev known for?

Boris Zubarev has skills like Deep Learning, Natural Language Processing, Goal Oriented Dialog System, Machine Learning, Data Analysis, Python, Data Mining, Programming, Software Development, Pytorch, Pandas, Keras.
