Tags: #computer-vision - OSS Alternative - Discover Top Open Source Alternatives to Popular Software

AI Framework
159.9k

huggingface/transformers

A comprehensive library providing state-of-the-art pre-trained models for various machine learning tasks across text, vision, audio, and multimodal domains, facilitating both inference and training.

AI-powered UI Automation Framework
Node.js
12.8k

web-infra-dev/midscene

An AI-powered, vision-driven UI automation framework that enables natural language control and scripting across web, mobile, and custom interfaces.

AI-powered Web Automation Platform
Python
21.4k

Skyvern-AI/skyvern

Automates complex browser-based workflows using LLMs and computer vision, providing a resilient and adaptive solution for web interaction.

Multimodal AI Chatbot Framework
3.3k

OpenGVLab/Ask-Anything

An advanced multimodal AI chatbot framework that enables conversational interaction and deep understanding of video and image content, integrating various large language models.

Educational Code Repository
OpenCV
22.9k

spmallick/learnopencv

A comprehensive repository offering C++ and Python code examples for computer vision, deep learning, and AI research articles from LearnOpenCV.com.

Machine Learning Data Library
Python
21.5k

huggingface/datasets

A lightweight library providing one-line dataloaders and efficient pre-processing tools for a vast hub of AI datasets, supporting various ML frameworks.

AI Agent / Computer Automation Model
Python
4.9k

microsoft/fara

A compact 7B-parameter AI agent from Microsoft that automates multi-step computer tasks through visual perception and direct interface interaction.

AI/ML Model Fine-tuning Tool
conda
2.0k

adobe-research/custom-diffusion

Enables fast and efficient multi-concept customization of text-to-image diffusion models like Stable Diffusion using a few images.

Deep Learning Adaptation Framework
Python
1.5k

tianrun-chen/SAM-Adapter-PyTorch

A PyTorch-based framework to adapt Meta AI's Segment Anything Model (SAM) for improved performance on challenging downstream computer vision tasks using adapters and prompts.

Data Labeling and Annotation Platform
Docker
1.2k

xtreme1-io/xtreme1

An all-in-one open-source platform for multimodal data labeling and annotation, supporting 3D LiDAR, image, and LLM training data with AI-assisted tools.

AI Utility / Prompt Engineering Tool
Python
3.7k

Hunyuan-PromptEnhancer/PromptEnhancer

A prompt rewriting tool that refines user prompts into clearer, structured versions to enhance the quality of text-to-image generation and image-to-image editing.

AI/ML Foundation Model
2.3k

OpenGVLab/InternVideo

A series of video foundation models and large-scale datasets designed for comprehensive multimodal video understanding and generation.

Deep Learning Library / Computer Vision Library
PaddlePaddle
3.6k

ZhaoJ9014/face.evoLVe

A high-performance, comprehensive face recognition library built on PaddlePaddle and PyTorch.

Resource Collection / Awesome List
2.4k

Yutong-Zhou-cv/Awesome-Text-to-Image

A comprehensive curated list of resources, papers, datasets, and projects related to text-to-image generation and manipulation.

Foundation Model Research Hub
22.1k

microsoft/unilm

A comprehensive research hub for large-scale self-supervised pre-training of foundation models across diverse tasks, languages, and modalities.

Machine Learning Automation Framework
Python
2.7k

autodistill/autodistill

Autodistill uses large foundation models to label unlabeled images automatically, then trains small, fast supervised models on those labels, removing the need for manual data labeling.

AI Model / Research Project
2.5k

X-PLUG/mPLUG-Owl

A family of powerful multi-modal large language models (MLLMs) designed to advance AI's understanding and generation capabilities across various data types.

AI Inference Toolkit
MNN
4.4k

xlite-dev/lite.ai.toolkit

A lightweight C++ toolkit for deploying over 100 AI models across various inference engines.

Curated Research List
2.2k

wangkai930418/awesome-diffusion-categorized

A meticulously categorized collection of research papers on diffusion models, spanning various subareas from visual illusions to image restoration and text-guided editing.

AI-powered Image Restoration and Upscaling Tool
Python
5.5k

Fanghua-Yu/SUPIR

SUPIR is an AI-driven project focused on developing practical algorithms for photo-realistic image restoration and upscaling in real-world scenarios.

Multimodal AI Model Suite
HuggingFace
10.0k

OpenGVLab/InternVL

An open-source family of multimodal large language models that aims to match or exceed the performance of commercial models such as GPT-4o and GPT-5.

Machine Learning Framework
PyTorch
3.8k

open-mmlab/mmpretrain

MMPreTrain is an OpenMMLab project providing a comprehensive, open-source PyTorch-based toolbox for pre-training and benchmarking various computer vision and multi-modal models.

© 2026 OSS Alternative. hotgithub.com - All rights reserved.