GetStream/Vision-Agents
AI Agent Development Framework
7.7k 2026-05-01

Build low-latency, multi-modal AI agents that process real-time video and audio using various LLMs and vision models.

Core Features

Real-time Multi-modal AI (Video & Voice)
Ultra-low Latency via Stream's Edge Network
Pluggable Video Processing Pipeline (YOLO, Roboflow)
Native LLM API Integrations (OpenAI, Gemini, Claude)
Tool Calling, RAG, and Persistent Memory

Quick Start

uv add vision-agents

Detailed Introduction

Vision Agents by Stream is an open-source framework for building low-latency, multi-modal AI agents that can watch, listen to, and understand real-time video. It provides building blocks for integrating LLMs (such as OpenAI and Gemini) and vision models (such as YOLO and Roboflow) with Stream's ultra-low-latency edge network. Its features include real-time WebRTC transport, pluggable video processing, tool calling, RAG, and memory, which let developers build interactive AI experiences for applications such as sports coaching, drone monitoring, and physical therapy.
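
To make the idea of pluggable video processing concrete, here is a minimal, generic sketch in plain Python. It does not use or reproduce the Vision Agents API: `Pipeline`, `fake_detector`, and `summarize` are hypothetical names invented for illustration, and a frame is modelled as a plain dict instead of real video data. It only shows the general pattern of chaining interchangeable processors so that a detector's output can feed a later step, such as building context for an LLM prompt.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

# Assumption: a frame is a plain dict so the sketch needs no video libraries.
Frame = Dict[str, Any]
Processor = Callable[[Frame], Frame]


@dataclass
class Pipeline:
    """Hypothetical pipeline: runs processors over a frame in registration order."""

    processors: List[Processor] = field(default_factory=list)

    def add(self, processor: Processor) -> "Pipeline":
        self.processors.append(processor)
        return self  # returning self allows chained .add() calls

    def run(self, frame: Frame) -> Frame:
        for processor in self.processors:
            frame = processor(frame)
        return frame


def fake_detector(frame: Frame) -> Frame:
    """Hypothetical stand-in for an object detector such as YOLO."""
    return {**frame, "detections": ["person"]}


def summarize(frame: Frame) -> Frame:
    """Turns detections into a text hint that an LLM prompt could include."""
    labels = ", ".join(frame.get("detections", []))
    return {**frame, "summary": f"frame {frame['index']}: {labels}"}


pipeline = Pipeline().add(fake_detector).add(summarize)
result = pipeline.run({"index": 0})
print(result["summary"])
```

Because each processor is just a function from frame to frame, swapping the stand-in detector for a real model would only change one `add` call in this sketch; how the actual framework wires processors together should be checked against its own documentation.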

© 2026 OSS Alternative. hotgithub.com - All rights reserved.