Overview

Can efficient AI be powerful without requiring massive compute resources or costly cloud subscriptions? Engineering with Small Language Models answers this question by showing how Small Language Models (SLMs) deliver high-performance natural language processing in resource-constrained environments. While large language models dominate headlines, SLMs offer a compelling alternative: fast inference, low memory usage, and flexible deployment on CPUs, mobile devices, edge hardware, and affordable GPUs. With tools like Hugging Face, PyTorch, and advanced techniques such as quantization and federated learning, you can build production-ready AI systems that are lightweight, secure, and scalable.

This comprehensive guide takes you through the entire SLM lifecycle, from design and training to optimization and deployment. Written for developers, AI engineers, and data scientists, it provides clear, practical workflows backed by real-world code and case studies. You'll learn how to fine-tune models with parameter-efficient methods like LoRA, compress them using 4-bit quantization and pruning, and deploy them on devices like Raspberry Pi or smartphones. The book also addresses critical topics like privacy, bias mitigation, and compliance, ensuring your AI systems are ethical and production-ready.

What's Inside:
- Setting up and running SLMs with Hugging Face and PyTorch
- Fine-tuning with LoRA, QLoRA, and adapters for domain-specific tasks
- Compression techniques: 4-bit/8-bit quantization, GPTQ, AWQ, and pruning
- Exporting models to ONNX, TensorFlow Lite, and Core ML for edge deployment
- On-device inference for Raspberry Pi, Android, iOS, and IoT devices
- Federated learning and differential privacy for secure, privacy-preserving AI
- Building scalable inference APIs with FastAPI and TorchServe
- Kubernetes, serverless, and autoscaling strategies for cloud deployment
- Ethical AI: bias mitigation, interpretability, and accessibility best practices
- Case studies in chatbots, healthcare, finance, and IoT
- CI/CD pipelines, monitoring, and performance optimization workflows
- Appendices with scripts, datasets, and troubleshooting guides

About the Reader:

This book is for developers, AI engineers, data scientists, and advanced learners who want to build efficient, scalable NLP systems without relying on massive infrastructure. A working knowledge of Python and basic familiarity with machine learning concepts are all you need to get started. Whether you're a startup founder integrating AI into a mobile app, a researcher optimizing models for edge devices, or an engineer deploying secure APIs, this book equips you with practical tools and insights.

SLMs are transforming AI by making it faster, lighter, and more accessible. From fine-tuning on a laptop to deploying on constrained IoT devices, Engineering with Small Language Models is your definitive resource for creating impactful AI solutions. Get your copy today and start building smarter, more efficient systems, one small model at a time.

Full Product Details

Author: Cal Rowe
Publisher: Independently Published
Imprint: Independently Published
Dimensions: Width 17.80cm, Height 1.20cm, Length 25.40cm
Weight: 0.413kg
ISBN: 9798298559843
Pages: 234
Publication Date: 17 August 2025
Audience: General/trade, General
Format: Paperback
Publisher's Status: Active
Availability: Available To Order. We have confirmation that this item is in stock with the supplier. It will be ordered in for you and dispatched immediately.