Friday, March 28, 2025

Large Language Model [LLM] - Introduction

 


LLM stands for Large Language Model. It is a deep learning model trained on massive amounts of text data to understand and generate human language, enabling tasks such as text generation and translation. LLMs typically use "Transformer" architectures, neural networks that can model relationships between words in language.
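The core mechanism that lets Transformers model relationships within language is self-attention. A minimal sketch with NumPy (a toy illustration, not a real model) shows each token's output becoming a weighted blend of every token in the sequence:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    """Scaled dot-product self-attention over a sequence of token vectors.

    Each output vector is a weighted mix of all input vectors, which is
    how Transformers capture relationships between words in a sentence.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)       # pairwise token similarities
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ X                  # blend tokens by attention weight

# Toy "sentence" of 4 tokens, each an 8-dim embedding.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = self_attention(tokens)
print(out.shape)  # (4, 8): one context-aware vector per token
```

Real models add learned query/key/value projections, multiple heads, and stacked layers on top of this basic operation.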


Reasoning LLMs


Traditional LLM workflow



In the traditional LLM workflow, a large dataset is first refined and used for pretraining. The pretrained model is then fine-tuned on curated data to produce more precise outputs. Finally, human feedback is collected, and the model is corrected whenever its outputs do not match what the fine-tuning stage intended.
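The pipeline above (pretraining, then fine-tuning, then human-feedback correction) can be sketched as a sequence of stages. The function names below are illustrative placeholders, not a real training API:

```python
# Hypothetical sketch of the traditional LLM training pipeline:
# pretraining -> supervised fine-tuning -> human feedback corrections.

def pretrain(corpus):
    # Learn general language patterns from raw text.
    return {"stage": "pretrained", "seen": len(corpus)}

def fine_tune(model, labeled_examples):
    # Specialize the pretrained model on curated input/output pairs.
    return dict(model, stage="fine-tuned", examples=len(labeled_examples))

def apply_human_feedback(model, feedback):
    # Correct mismatches flagged by human reviewers (RLHF-style step).
    return dict(model, stage="aligned", corrections=len(feedback))

corpus = ["raw web text", "books", "articles"]
labeled = [("question", "answer")]
feedback = ["fix incorrect answer"]

model = pretrain(corpus)
model = fine_tune(model, labeled)
model = apply_human_feedback(model, feedback)
print(model["stage"])  # aligned
```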

Traditional LLMs:
  • Direct pattern-based prediction
  • Quick, but less reliable on complex tasks
  • No explicit reasoning steps

Reasoning LLMs:
  • Language models designed for complex, multi-step problems
  • Break down tasks into logical sub-tasks
  • Generate intermediate reasoning steps ("thought processes")
Key Capabilities of Reasoning LLMs:
1) Chain-of-Thought Reasoning
        Internal-dialogue approach
        Step-by-step problem solving
2) Self-Consistency
        Verifies its own answers
        Revisits problematic solutions
3) Structured Outputs
        Organized reasoning steps
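The chain-of-thought idea can be illustrated by contrasting a direct prompt with a step-by-step prompt. The model call below is simulated, and the prompt phrasing is an assumption, but it shows the shape of the technique:

```python
# Illustrative sketch: a chain-of-thought prompt vs. a direct prompt.
# simulated_reasoning() stands in for a real reasoning LLM call.

question = "A shop sells pens at 3 for $2. How much do 12 pens cost?"

direct_prompt = f"{question}\nAnswer:"
cot_prompt = f"{question}\nLet's think step by step:"

def simulated_reasoning(question):
    # Stand-in for a reasoning LLM: break the task into logical sub-steps.
    steps = [
        "12 pens is 12 / 3 = 4 groups of 3 pens.",
        "Each group costs $2, so 4 * 2 = $8.",
    ]
    return steps, "$8"

steps, answer = simulated_reasoning(question)
for step in steps:
    print("thought:", step)
print("answer:", answer)
```

A real reasoning model generates the intermediate "thought" lines itself; exposing them makes the final answer easier to verify.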

Practical Applications of Reasoning LLMs

Data Analysis
  • Medical diagnostics
  • Complex data interpretation
  • Anomaly detection

Background Processing
  • Batch processing workflows
  • Overnight analysis jobs

Evaluation Tasks
  • LLM as judge
  • Quality assessment
  • Verification workflows
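The "LLM as judge" pattern asks one model to grade another model's answers against a rubric. A hedged sketch, where judge() is a hypothetical stand-in for a real model call and the keyword rubric is invented for illustration:

```python
# Sketch of an "LLM as judge" evaluation loop. A real system would send
# a grading prompt to an LLM; here the score is simulated with a trivial
# keyword check so the example is self-contained.

def judge(question, candidate_answer):
    rubric_keywords = {"because", "therefore"}
    reasoning_shown = any(k in candidate_answer.lower() for k in rubric_keywords)
    return {"score": 5 if reasoning_shown else 2,
            "verdict": "pass" if reasoning_shown else "needs work"}

answers = [
    "12 pens cost $8 because 12/3 = 4 groups at $2 each.",
    "$8.",
]
results = [judge("How much do 12 pens cost?", a) for a in answers]
for r in results:
    print(r["verdict"], r["score"])
```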
Limitations of Reasoning LLMs
Performance Trade-offs
* Increased latency: the extended thinking process leads to significantly longer response times
* Higher resource requirements: reasoning models often require more computational resources
* Cost implications: more tokens and processing time translate to higher operational costs
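The cost implication follows directly from token counts. A back-of-envelope comparison, where the price and token counts are made-up assumptions for illustration, not real provider rates:

```python
# Hypothetical $/1K output tokens; real provider pricing varies.
PRICE_PER_1K_TOKENS = 0.01

def cost(output_tokens):
    return output_tokens / 1000 * PRICE_PER_1K_TOKENS

direct_tokens = 50        # short, direct answer
reasoning_tokens = 1200   # answer plus a long chain of intermediate "thoughts"

print(f"direct:    ${cost(direct_tokens):.4f}")
print(f"reasoning: ${cost(reasoning_tokens):.4f}")
print(f"ratio: {reasoning_tokens / direct_tokens:.0f}x more tokens")
```

The same multiplier applies to latency: generating 24x the tokens takes roughly 24x the decoding time.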
DeepSeek:

    DeepSeek applied supervised fine-tuning to refine the models' capabilities. This involved training on datasets containing reasoning and non-reasoning tasks. Notably, reasoning data was generated by specialized "expert models" trained for specific domains such as mathematics, programming, and logic. These expert models were developed through supervised fine-tuning on both original responses and synthetic data generated by internal models like DeepSeek-R1-Lite. The use of expert models allowed DeepSeek to generate high-quality synthetic reasoning data to enhance the primary model's performance.
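The expert-model pipeline described above can be sketched in outline: domain experts generate synthetic reasoning traces, which are collected into a supervised fine-tuning dataset. All function names here are hypothetical placeholders, not DeepSeek's actual code:

```python
# Sketch: expert models per domain produce synthetic reasoning data,
# which is pooled into an SFT dataset for the primary model.

def expert_generate(domain, problem):
    # Stand-in for a domain expert model (math, programming, logic)
    # producing a reasoning trace plus a final answer.
    return {"domain": domain, "problem": problem,
            "trace": f"[{domain} reasoning steps for: {problem}]",
            "answer": "[answer]"}

def build_sft_dataset(problems_by_domain):
    dataset = []
    for domain, problems in problems_by_domain.items():
        for p in problems:
            dataset.append(expert_generate(domain, p))
    return dataset

problems = {
    "math": ["Solve 2x + 3 = 11"],
    "programming": ["Reverse a linked list"],
}
sft_data = build_sft_dataset(problems)
print(len(sft_data))  # 2
```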





