Building Large Language Models with Python: A Developer's Guide to Sparse MoE, 1-Bit Quantization, Reasoning Systems and Multimodal AI

Author: Samuel Reynolds
Publisher: Independently Published
ISBN: 9798259295339

Pages: 142
Publication Date: 28 April 2026
Format: Paperback
Availability: Available To Order

We have confirmation that this item is in stock with the supplier. It will be ordered in for you and dispatched immediately.

Our Price: $68.61

Overview

Most LLM books teach you how to call an API. This one teaches you how to build what's behind it. As frontier AI shifts toward efficiency, sparsity, and on-device deployment, the engineers who understand the architecture, not just the interface, are the ones defining what comes next. Building Large Language Models with Python gives you that understanding, from the mathematics of attention to the deployment of a quantized, reasoning-capable model on local hardware.

Written from hard-won production experience, each chapter pairs rigorous theory with complete Python implementations: not toy examples, but the kind of code that holds up under the demands of real training runs and live inference pipelines.

What you'll build:
- A Grouped-Query Attention module with KV cache support
- A Top-K sparse MoE layer with load-balancing auxiliary loss
- A BitLinear layer implementing ternary {-1, 0, 1} weights from scratch
- A Vision Transformer encoder with a multimodal projection layer
- A Process Reward Model for step-level reasoning verification
- A full DPO and GRPO training loop for alignment
- A local-first MCP server for agentic tool use
- A speculative decoding pipeline using a draft model

Topics covered include:
- Rotary Positional Embeddings (RoPE)
- FlashAttention-3 concepts
- Quantization-Aware Training vs. Post-Training Quantization
- Expert parallelism and All-to-All communication
- FSDP vs. DDP distributed training
- PagedAttention and KV cache optimization
- On-device LoRA fine-tuning
- Chain-of-thought reasoning architecture

This book is for you if:
- You're a software engineer or ML practitioner comfortable with Python and PyTorch
- You understand how a basic transformer works and want to go significantly deeper
- You want to move beyond using models to building and owning them
- You're building for edge deployment, private AI, or resource-constrained environments

The field is moving fast. This book is written for engineers who intend to move faster.

Full Product Details

Author: Samuel Reynolds
Publisher: Independently Published
Imprint: Independently Published
Dimensions: Width: 17.80 cm, Height: 0.80 cm, Length: 25.40 cm
Weight: 0.259 kg
ISBN: 9798259295339

Pages: 142
Publication Date: 28 April 2026
Audience: General/trade, General
Format: Paperback
Publisher's Status: Active
Availability: Available To Order

Countries Available: All regions
