Tags: #data-processing
qax-os/excelize
A pure Go library for programmatically reading, writing, and manipulating Microsoft Excel spreadsheet files (XLAM, XLSM, XLSX, XLTM, XLTX).
datawhalechina/all-in-rag
A comprehensive, full-stack guide to Retrieval-Augmented Generation (RAG) technology, covering theory, practice, and engineering best practices for building LLM applications.
argoproj/argo-workflows
A container-native workflow engine for orchestrating parallel jobs and multi-step tasks on Kubernetes.
huggingface/datasets
A lightweight library providing access to a large hub of ready-to-use datasets, plus efficient tools for data manipulation in AI and machine learning workflows.
ConardLi/easy-dataset
An application for generating high-quality datasets for LLM fine-tuning, RAG, and evaluation, featuring intelligent document processing and a comprehensive evaluation system.
Eventual-Inc/Daft
A high-performance data engine for AI and multimodal workloads, processing diverse data types at scale with Python and Rust.
vortex-data/vortex
An extensible columnar file format and toolkit for high-performance data processing and storage.
pditommaso/awesome-pipeline
A curated list of pipeline toolkits and workflow management systems for data processing and automation.
mbloch/mapshaper
A JavaScript-based tool for editing and transforming geospatial data formats like Shapefile, GeoJSON, and TopoJSON, offering both command-line and interactive web interfaces.