Generative AI

Industrial insight and articles from Winder.AI, focusing on the topic Generative AI

Early Adopter Release of Kodit: MCP server to index external repositories

Published: Jun 9, 2025
Author: Dr. Phil Winder
CEO

A lot of my work today is assisted by AI, from editing blog posts to developing proposals. But the number one use case is AI-assisted coding. Tools like Cursor, Cline, Roo, Aider, Claude Code, etc. have disrupted software engineering to levels I hadn’t anticipated. But still, it’s not perfect.

This post is about one particular set of problems and a new open source tool I developed to alleviate them.

Scaling GenAI to Production: Strategies for Enterprise-Grade AI Deployment

Published: Feb 17, 2025
Author: Natalia Kuzminykh
Associate Data Science Content Editor

The article examines the challenges of moving GenAI from prototypes to production. It highlights issues such as resource constraints, performance monitoring, cost management, and security, and suggests strategies for efficient scaling, robust guardrails, and continuous monitoring to ensure sustainable enterprise-grade deployments.

Best LLMOps Tools: Comparison of Open-Source LLM Production Frameworks

Published: Oct 21, 2024
Author: Natalia Kuzminykh
Associate Data Science Content Editor

Discover how to deploy open-source LLMs using LLM agent frameworks, orchestration frameworks, and LLMOps platforms. Learn about serving frameworks like vLLM and Ollama, and explore LLMOps tools that enhance language model performance in production environments.

A Comparison of Open Source LLM Frameworks for Pipelining

Published: Aug 1, 2024
Author: Natalia Kuzminykh
Associate Data Science Content Editor

Discover top open source LLM frameworks and orchestration tools. Explore popular LLM projects, including LangChain and LlamaIndex, for seamless integration. Learn about Python LLM libraries, LLM agent frameworks, and the best tools for LLM development and orchestration.

Testing and Evaluating Large Language Models in AI Applications

Published: Jul 17, 2024
Author: Dr. Phil Winder
CEO

With the rapidly expanding use of large language models (LLMs) in downstream products, ensuring performance and reliability is crucial. But with random outputs and non-deterministic behaviour, how do you know if your application performs, or works at all? This webinar offers a comprehensive, vendor-agnostic exploration of techniques and best practices for testing and evaluating LLMs, ensuring they meet the desired success criteria and perform effectively across varied scenarios.

Retrieval-Augmented Generation (RAG) Examples and Use Cases

Published: Jun 26, 2024
Author: Dr. Phil Winder
CEO

Watch our webinar to explore Retrieval-Augmented Generation (RAG) and its integration with Large Language Models (LLMs). Learn about RAG use cases, advanced LLM architectures, and techniques to enhance AI applications. Ideal for professionals utilizing or interested in RAG and LLM-powered systems.

Build a Voice-Based Chatbot with OpenAI, Vocode, and ElevenLabs

Published: Jun 7, 2024
Author: Natalia Kuzminykh
Associate Data Science Content Editor

Why might we want to make an LLM talk? The concept of having a human-like conversation with an advanced AI model is an interesting idea that has many practical applications.

Voice-based models are transforming how we interact with technology, making interactions more natural and intuitive. By enabling AI to talk, we open the door to numerous practical applications, from accessibility to enhanced human-machine interactions. This guide explores how to create a voice-based chatbot using OpenAI, Vocode and ElevenLabs.

Revolutionizing IVR Systems: Attaching Voice Models to LLMs

Published: May 22, 2024
Author: Dr. Phil Winder
CEO

Discover how integrating voice models with large language models (LLMs) can revolutionize IVR systems and enhance the user experience. Learn about the data and security concerns, as well as potential implementations.

LLM Architecture: RAG Implementation and Design Patterns

Published: Apr 25, 2024
Author: Dr. Phil Winder
CEO

This presentation investigates several common production-ready architectures for RAG and discusses the pros and cons of each. By the end of this talk you will be able to help design RAG-augmented LLM architectures that best fit your use case.

Scaling StableAudio.com Generative Models Globally with NVIDIA Triton & SageMaker

Published: Apr 10, 2024
Author: Enrico Rotundo
Associate Data Scientist

In this session, Enrico Rotundo explores the approach taken to scale StableAudio globally. The presentation focuses on how NVIDIA Triton and AWS SageMaker work together.