Hello all
I hope this post finds you in good health and spirits. In this post we will talk about Small Language Models (SLMs). Never heard of them? No problem, let’s start from the beginning. 🙂
ChatGPT has been the talk of the town ever since it was introduced. It crossed the million-user mark within five days of release, and within two months it skyrocketed to 100 million active users. For comparison, TikTok took nine months to reach 100 million monthly users, and Instagram about 2.5 years. As ChatGPT made headlines across industries, people started digging into the technology, and that’s how they came across another buzzword: Language Model, or more precisely, Large Language Model (LLM). ChatGPT, Gemini and similar chatbots are all powered by LLMs.
What are language models and LLMs?
A language model is a computer program that has been trained to understand and process human language. It uses Machine Learning (ML) techniques to understand the meaning and context of what a user is saying and then generate text, images, or video as output. For example, when you say “The sky is _______”, it will understand the context and fill in the blank with “blue”. Generating this kind of response is complex and resource-intensive work. It requires the model to be trained on a “large” dataset of books, articles, code, videos, etc. Language models that are trained on large datasets and have billions of parameters are called Large Language Models, or LLMs.
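To make the “fill in the blank” idea concrete, here is a deliberately tiny sketch of next-word prediction using bigram counts. This is not how an LLM works internally (LLMs use neural networks trained on billions of words), and the mini “corpus” below is invented purely for illustration, but it shows the core idea: predict the most likely next word from what the model has seen before.

```python
from collections import Counter, defaultdict

# Toy "training corpus" (invented for illustration) -- real language
# models train on billions of words, not a handful of sentences.
corpus = (
    "the sky is blue . the sky is clear . "
    "the grass is green . the sky is blue ."
).split()

# Count bigrams: how often each word follows another.
next_word_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word`, or None if unseen."""
    counts = next_word_counts[word]
    return counts.most_common(1)[0][0] if counts else None

# "The sky is ____" -> the model fills the blank with the
# statistically most common continuation in its training data.
print(predict_next("is"))  # "blue" follows "is" most often in this corpus
```

An LLM does something similar in spirit, but with context windows of thousands of tokens and billions of learned parameters instead of a simple count table.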
So how large is “large”? To give you an idea, ChatGPT is powered by GPT-4, which is reported to have over a trillion parameters. Gemini, Google’s chatbot (formerly known as Bard), is powered by the Gemini family of LLMs, which are also estimated to have hundreds of billions of parameters.
While LLMs have created exciting new opportunities, their size requires significant computing resources to operate. So what about edge devices such as mobile phones, or devices without internet access? Where will the complex computation run for them?
Enter Small Language Models (SLMs)
SLMs are the smaller counterparts of LLMs. They have a few million to a few billion parameters, against LLMs whose parameters range into the trillions, as we discussed earlier. This offers significant advantages:
- They need less computing power, making them suitable for smaller and edge devices
- They can be easily fine-tuned for a specific industry or use case
- They can be deployed in places where LLMs are not feasible due to their computation requirements
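To get a feel for why the parameter count matters so much on edge devices, here is a rough back-of-the-envelope calculation. It assumes 2 bytes per parameter (as with 16-bit weights) and counts only the memory needed to hold the weights; real memory use varies with the number format and runtime overhead, so treat these as order-of-magnitude figures.

```python
def model_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory (in GB) just to hold the model weights,
    assuming 16-bit (2-byte) parameters by default."""
    return num_params * bytes_per_param / 1e9

# A ~3-billion-parameter SLM vs a ~1-trillion-parameter LLM.
slm_gb = model_memory_gb(3e9)   # ~6 GB: conceivable on a high-end phone
llm_gb = model_memory_gb(1e12)  # ~2000 GB: needs a cluster of accelerators
print(f"SLM: ~{slm_gb:.0f} GB, LLM: ~{llm_gb:.0f} GB")
```

Even before considering the compute needed to run inference, the weights alone put trillion-parameter models far beyond what a phone or sensor can store in memory.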
Below is a comparison between SLMs and LLMs:

| Parameter | SLM | LLM |
| --- | --- | --- |
| Size | Millions to a few billions | Billions to trillions |
| Computation requirement | Minimal | Higher |
| Performance | Simple tasks | Complex tasks |
| Cost | Cost-effective | Expensive |
| Customization | Easier | Complex |
SLM use cases
SLMs can be deployed for domain-specific tasks, simple use cases, edge devices, or places that are not connected to the cloud. Here are a few use cases:
- Chatbots to answer customer queries or provide services
- Edge devices such as smartphones, traffic lights, car computers, smart sensors, etc.
- Sentiment analysis, to gain insights from customer feedback and respond to it
- Personal assistants with voice recognition, text prediction, etc.
- Real-time language translation
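To give a feel for the sentiment-analysis use case, here is a deliberately tiny lexicon-based sketch. A real SLM such as DistilBERT learns these word–sentiment associations from labeled data rather than a hand-written word list; the lexicon and example sentences below are invented for illustration only.

```python
# Hand-written word lists (invented for illustration); a real SLM
# learns sentiment from labeled examples rather than a fixed lexicon.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"slow", "broken", "terrible", "hate", "bad"}

def sentiment(text):
    """Classify text as 'positive', 'negative' or 'neutral' by word counts."""
    words = text.lower().split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentiment("The delivery was fast and the support was helpful"))  # positive
print(sentiment("The app is slow and the checkout is broken"))         # negative
```

The appeal of an SLM here is that this kind of classification can run entirely on-device, so customer feedback never has to leave the phone or the store’s local hardware.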
A few popular SLMs
- Phi-3 by Microsoft
- Gemma by Google
- DistilBERT by Hugging Face
- MobileBERT
Final words
So will SLMs replace LLMs? No; both have their advantages, limitations, and unique use cases. While LLMs will still be used for complex tasks, SLMs have their place for computation at the edge and on-device, allowing tasks to be performed without heavy computation. I suggest you check the Microsoft Phi-3 page for more insight into its capabilities: https://azure.microsoft.com/en-us/products/phi-3
Phew! That was all for this post. I will see you soon with some other interesting stuff. Till then, bye-bye 🙂