Ecosystem & Stack: vmware
AI Agent Benchmarking Platform
python
2.8k
xlang-ai/OSWorld
OSWorld is a benchmark and environment for evaluating multimodal AI agents on open-ended tasks within real computer operating systems.