Guide to Generative AI

RAG, Agentic AI, Multi-Agent Systems and Advanced Prompting
Estimated reading time: 15 minutes
Author: Hailey Quach
Source: IBM Skills Network

Introduction

Generative AI (GenAI) has evolved from simple text and image generation to complex systems such as AI agents, enterprise automation and reasoning engines. This guide covers the core concepts, tools and frameworks for developing modern GenAI applications, including RAG, multi-agent systems, prompt engineering and innovative libraries such as LangGraph.

Whether you are building chatbots, automation workflows or knowledge systems, this IBM guide provides an overview of the essential developments to help you navigate the GenAI landscape.

Core Concepts and GenAI Terminology
Fundamental Concepts
  • LLM: A type of AI model trained on vast amounts of text data to understand and generate human-like language. Examples: GPT-4o, Claude, LLaMA.
  • Prompting: A technique for designing input instructions to control LLM outputs. Examples: "Write a summary in 3 sentences", "Respond as a cybersecurity expert".
  • Prompt Templates: Reusable, structured prompts with placeholders for dynamic inputs. Example: "Explain {concept} as if I were 5 years old."
  • RAG (Retrieval-Augmented Generation): Combines retrieval from external knowledge sources with LLM generation to improve factual accuracy. Example: answering questions with real-time data (see the original RAG paper).
  • Retriever: A system component designed to retrieve relevant information from a dataset or database. Examples: vector similarity search with FAISS, Elasticsearch.
  • Agent: An autonomous AI system that can plan, reason and execute tasks using tools. Examples: AutoGPT, LangChain agents.
  • Multi-Agent System: A framework in which multiple AI agents collaborate to solve complex tasks. Examples: Microsoft AutoGen, CrewAI.
  • Chain-of-Thought: A prompting technique that encourages models to break down problems into intermediate steps. Example: "Let's work through this step by step ..."
  • Hallucination Mitigation: Strategies to reduce false or fabricated outputs from LLMs. Examples: RAG, fine-tuning, prompt constraints.
  • Vector Database: A database optimised for storing and querying vector embeddings. Examples: Pinecone, Chroma, Weaviate.
  • Orchestration: Tools for managing and coordinating workflows involving multiple AI components. Examples: LangChain, LlamaIndex.
  • Fine-tuning: Adapting pre-trained models for specific tasks using domain-specific data. Examples: LoRA (Low-Rank Adaptation), QLoRA (quantised fine-tuning).
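
The prompt-template idea above can be illustrated in a few lines of plain Python (no framework assumed); the template string and helper name are just for this sketch:

```python
# A minimal prompt template: a reusable string with a named placeholder.
TEMPLATE = "Explain {concept} as if I were 5 years old."

def render_prompt(concept: str) -> str:
    """Fill the {concept} placeholder to produce a concrete prompt."""
    return TEMPLATE.format(concept=concept)

prompt = render_prompt("retrieval-augmented generation")
print(prompt)
```

Libraries such as LangChain wrap this same pattern in a `PromptTemplate` class, adding input validation and composition.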
Tools and Frameworks
Model Development and Deployment
  • Hugging Face: A platform hosting pre-trained models and datasets for NLP tasks. Examples: access to GPT-2, BERT, Stable Diffusion.
  • LangChain: A framework for building applications with LLMs, agents and tools. Example: building chatbots with memory and web search.
  • AutoGen: A library for building multi-agent conversational systems. Example: simulating debates between AI agents.
  • CrewAI: A framework for assembling collaborative AI agents with role-based tasks. Example: task automation with specialised agents.
  • BeeAI: A lightweight framework for building production-ready multi-agent systems. Example: distributed problem-solving systems.
  • LlamaIndex: A tool for connecting LLMs with structured or unstructured data sources. Example: building Q&A systems for private documents.
  • LangGraph: A library for building stateful multi-actor applications with LLMs. Examples: cyclical workflows, agent simulations.
Retrieval and Infrastructure
  • FAISS: A library for efficient similarity search across vector collections. Example: retrieving top-k documents for RAG.
  • Pinecone: A managed cloud service for vector database operations. Example: storing embeddings for real-time retrieval.
  • Haystack: An end-to-end framework for building RAG pipelines. Example: deploying enterprise search systems.
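
The retrievers above all build on vector similarity search. The core idea can be sketched in plain Python with a toy in-memory index (this is an illustration of the concept, not the FAISS or Pinecone API; the vectors are made up):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, index, k=2):
    """Return the k document ids whose vectors are most similar to the query."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Toy "embeddings": in practice these come from an embedding model.
index = {
    "doc_rag":   [0.9, 0.1, 0.0],
    "doc_agent": [0.1, 0.9, 0.1],
    "doc_sql":   [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.0, 0.0], index, k=2))  # doc_rag ranks first
```

FAISS performs the same ranking over millions of vectors using optimised index structures instead of a brute-force scan.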
Advanced Prompting Techniques
  • Few-Shot Prompting: Providing examples in the prompt to guide the model's output format. Example: "Translate into French: 'Hello' → 'Bonjour'; 'Goodbye' → __"
  • Zero-Shot Prompting: Directly prompting the model to perform a task without examples. Example: "Classify this tweet as positive, neutral or negative: {tweet}"
  • Chain-of-Thought: Encouraging step-by-step reasoning. Example: "First calculate X. Then compare it with Y. Final answer: ___"
  • Prompt Chaining: Breaking complex tasks into smaller, sequentially executed prompts. Example: Prompt 1: extract keywords → Prompt 2: generate a summary from the keywords.
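
Prompt chaining can be sketched without any framework: the output of one prompt becomes the input of the next. Here `call_llm` is a hypothetical stand-in for a real model call:

```python
def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM API call; echoes a canned reply for this sketch."""
    return f"[model reply to: {prompt}]"

def chain(text: str) -> str:
    # Prompt 1: extract keywords from the input text.
    keywords = call_llm(f"Extract 3 keywords from: {text}")
    # Prompt 2: feed the first step's output into the next prompt.
    return call_llm(f"Write a one-sentence summary using: {keywords}")

print(chain("RAG combines retrieval with generation."))
```

Each step stays small and testable, which is the main appeal of chaining over one monolithic prompt.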
Key Architectures and Workflows

RAG Pipeline

  1. Retrieval: Query the vector database (e.g. Pinecone) for context.
  2. Augmentation: Combine the context with the user prompt.
  3. Generation: The LLM (e.g. GPT-4) produces the final output.
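
The three steps above can be sketched end to end in plain Python; `retrieve` and `generate` are hypothetical stand-ins for a vector-store query and an LLM call, and the one-entry corpus is invented for the example:

```python
def retrieve(query: str) -> list:
    """Step 1, retrieval: stand-in for a vector-database top-k search."""
    corpus = {"rag": "RAG grounds LLM answers in retrieved documents."}
    return [text for key, text in corpus.items() if key in query.lower()]

def generate(prompt: str) -> str:
    """Step 3, generation: stand-in for an LLM call."""
    return f"[answer based on prompt: {prompt}]"

def rag_answer(query: str) -> str:
    context = "\n".join(retrieve(query))                   # 1. Retrieval
    prompt = f"Context:\n{context}\n\nQuestion: {query}"   # 2. Augmentation
    return generate(prompt)                                # 3. Generation

print(rag_answer("What is RAG?"))
```

The augmentation step is just string assembly: the retrieved context is prepended to the user's question so the model answers from it rather than from memory alone.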

Multi-Agent System

  • Agents: Specialised roles (e.g. researcher, writer, critic).
  • Orchestration: LangGraph for cyclical workflows, AutoGen for conversations.
  • Tools: Web search, code execution, API integrations.
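
A minimal, library-free sketch of the role pattern above (researcher → writer → critic), where each agent is a function over shared state; the state keys and canned outputs are invented for illustration:

```python
def researcher(state: dict) -> dict:
    """Gather material on the topic (stand-in for search/tool use)."""
    state["notes"] = f"facts about {state['topic']}"
    return state

def writer(state: dict) -> dict:
    """Turn the researcher's notes into a draft."""
    state["draft"] = f"Draft using {state['notes']}."
    return state

def critic(state: dict) -> dict:
    """Approve the draft only if it is grounded in the notes."""
    state["approved"] = "facts" in state["draft"]
    return state

def run_pipeline(topic: str) -> dict:
    """Run the agents in sequence over a shared state dict."""
    state = {"topic": topic}
    for agent in (researcher, writer, critic):
        state = agent(state)
    return state

print(run_pipeline("RAG"))
```

Frameworks like LangGraph and AutoGen build on this idea, adding cyclical control flow (e.g. critic sends the draft back to the writer), persistence and tool calls.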