AI-powered File Type Detection Tool
16.6k 2026-04-24
google/magika
Magika is a fast, highly accurate AI-powered tool for identifying file content types, a capability used in security scanning and content routing.
Core Features
Utilizes a custom, highly optimized deep learning model (few MBs) for efficient detection.
Achieves approximately 99% accuracy across 200+ content types, outperforming existing methods.
Provides rapid inference speeds of about 5ms per file on a single CPU, with near-constant time regardless of file size.
Offers a command-line interface (written in Rust) and APIs for Python, Rust, JavaScript/TypeScript, and Go.
Supports recursive directory scanning and configurable prediction confidence modes.
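The near-constant detection time listed above is what one would expect if only a bounded slice of each file is inspected rather than the whole file. The sketch below illustrates that idea in Python. It is not Magika's actual feature extraction: the function name `sample_head_and_tail` and the 1024-byte block size are assumptions chosen for illustration.

```python
import os

# Hypothetical block size, chosen for illustration only.
# It is not taken from Magika's model configuration.
BLOCK_SIZE = 1024


def sample_head_and_tail(path: str, block_size: int = BLOCK_SIZE) -> tuple[bytes, bytes]:
    """Read at most `block_size` bytes from each end of the file.

    The amount of data read is bounded by 2 * block_size, so the cost
    does not grow with the size of the file.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(block_size)
        # Clamp the offset so the tail never starts before byte 0.
        f.seek(max(size - block_size, 0))
        tail = f.read(block_size)
    return head, tail
```

For a file smaller than the block size, both samples simply contain the whole file. A detector built this way does the same amount of work for a 10 KB file as for a 10 GB one.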
Quick Start
pipx install magika
Detailed Introduction
Magika identifies file content types with a compact, optimized deep learning model that returns a result within milliseconds, even on a single CPU. The model was trained on roughly 100 million samples covering more than 200 content types and reaches about 99% accuracy. Accurate type detection supports user safety because it lets files be routed to the appropriate security and content policy scanners. Magika is deployed at large scale at Google and is integrated with platforms such as VirusTotal.
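To make the routing use case concrete, here is a minimal Python sketch of dispatching a file to a scanner based on a detected content type and a confidence score. Everything in it is hypothetical: the `Detection` class, the scanner names in `ROUTES`, the `min_score` threshold, and `stub_detect`, which stands in for a real detector so the example runs without Magika installed.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Detection:
    label: str    # detected content type, e.g. "pdf"
    score: float  # detector confidence in the range [0, 1]


# Hypothetical mapping from content type to scanner name.
ROUTES = {
    "pdf": "document_scanner",
    "javascript": "script_scanner",
    "zip": "archive_scanner",
}
DEFAULT_ROUTE = "generic_scanner"


def route(
    data: bytes,
    detect: Callable[[bytes], Detection],
    min_score: float = 0.9,
) -> str:
    """Pick a scanner for `data`, falling back when confidence is low."""
    detection = detect(data)
    if detection.score < min_score:
        # Low-confidence predictions are not trusted for routing.
        return DEFAULT_ROUTE
    return ROUTES.get(detection.label, DEFAULT_ROUTE)


def stub_detect(data: bytes) -> Detection:
    """Stand-in detector based on one magic prefix; not an AI model."""
    if data.startswith(b"%PDF-"):
        return Detection("pdf", 0.99)
    return Detection("unknown", 0.10)


print(route(b"%PDF-1.7\n", stub_detect))   # document_scanner
print(route(b"plain notes", stub_detect))  # generic_scanner
```

With the magika Python package installed, `detect` could instead wrap a call to the library's identification API. The name of the result field holding the label has differed between releases, so check the documentation for the installed version before relying on it.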