Alibaba Cloud OSS: From Object Storage to AI-Native Data Infrastructure with Vector Bucket & Metaquery

Summary: This article provides an in-depth look at how OSS builds an AI-native storage foundation for key scenarios including Retrieval-Augmented Generation (RAG), enterprise search, and AI-powered content management, helping you efficiently build next-generation intelligent applications.

By Justin See


As data volumes explode and AI becomes central to enterprise competitiveness, traditional object storage must evolve. Alibaba Cloud Object Storage Service (OSS) is leading this transformation—moving from a passive data carrier to an intelligent, AI-powered knowledge provider.

This blog explores OSS's technical foundation and its evolution toward AI-native workloads, then takes a deep dive into two major innovations: Vector Bucket and Metaquery.

In the AI era, vector data and the ability to query it are the foundation of AI applications. The diagram below illustrates how diverse unstructured data types (documents, images, videos) are transformed into high-dimensional vector representations through embedding. These vectors serve as the backbone for advanced AI capabilities such as content awareness and semantic retrieval. By converting raw data into structured vector formats, OSS enables intelligent indexing, search, and discovery across massive datasets, forming the basis for applications like retrieval-augmented generation (RAG), enterprise search, and AI content management.

image.png

1. OSS Architecture and Core Capabilities


Alibaba Cloud OSS is a distributed, highly durable object storage system designed for massive-scale unstructured data. It supports workloads ranging from web hosting and backup to AI training and semantic retrieval.

OSS Architecture Layers

Layer | Components
Application Interfaces | API, SDKs, CLI
OSS Service | Control Layer, Access Layer, Metadata Indexing
Storage Infrastructure | Erasure Coding, Zone-Redundant Storage
AI Integration | MCP Server, AI Assistant, Semantic Retrieval Engine

Storage Classes and Cost Optimization

Class | Use Case | Retrieval Time
Standard | Hot data | Instant
IA | Infrequent access | Instant
Archive | Cold data | Minutes
Cold Archive | Deep archive | Hours
Deep Cold Archive | Long-term retention | Hours

Lifecycle policies and OSS Inventory now support predictive tiering and DryRun simulations to prevent accidental deletions.
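
To make the idea of tiering plus a dry run concrete, here is a minimal sketch of an age-based policy evaluated in "plan only" mode. The age thresholds, the `ObjectInfo` type, and the function names are illustrative assumptions, not OSS defaults or OSS APIs; the point is only that a dry run lists the transitions and deletions a policy would perform without applying them.

```python
from dataclasses import dataclass

# Hypothetical age thresholds (days since last modification) -> target class.
# These numbers are illustrative only, not OSS defaults.
TIER_RULES = [
    (365, "Deep Cold Archive"),
    (180, "Cold Archive"),
    (90, "Archive"),
    (30, "IA"),
]


@dataclass
class ObjectInfo:
    key: str
    age_days: int
    storage_class: str = "Standard"


def target_class(age_days):
    """Return the coldest class whose age threshold the object has reached."""
    for min_age, storage_class in TIER_RULES:
        if age_days >= min_age:
            return storage_class
    return "Standard"


def plan_transitions(objects, expire_after_days=None):
    """Dry run: list the actions a policy would take without applying them."""
    plan = []
    for obj in objects:
        if expire_after_days is not None and obj.age_days >= expire_after_days:
            plan.append((obj.key, "DELETE"))
            continue
        new_class = target_class(obj.age_days)
        if new_class != obj.storage_class:
            plan.append((obj.key, f"TRANSITION to {new_class}"))
    return plan


objects = [
    ObjectInfo("logs/today.log", 1),
    ObjectInfo("logs/last-quarter.log", 100),
    ObjectInfo("backups/2019.tar", 2000),
]

# Review the plan before enabling the rule for real.
for key, action in plan_transitions(objects, expire_after_days=1825):
    print(f"{key}: {action}")
```

Reviewing the printed plan first is what guards against an overly broad expiration rule deleting data that was meant to be kept.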

Performance and QoS

OSS Resource Pool QoS enables:

● Shared throughput across buckets

● Priority-based scheduling (P1–P4)

● Guaranteed minimum bandwidth

● Real-time monitoring and dynamic allocation

This ensures stable performance across mixed workloads—online services, batch jobs, AI training, and data migration.

2. OSS Vector Bucket: AI-Native Storage for Embeddings


Vector data is the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). OSS Vector Bucket introduces native support for storing, indexing, and querying high-dimensional vectors.

image.png

Architecture

● Raw data stored in OSS Bucket

● Embedding service generates vectorized data

● Vectors stored in OSS Vector Bucket

● MCP Server enables semantic retrieval for AI Content Awareness

● AI Agent integrates with RAG and AI Semantic Retrieval
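
The flow above (raw object, embedding, vector storage, semantic retrieval) can be sketched in a few lines. Everything here is a stand-in: `toy_embed` is a hashed bag-of-words function rather than a real embedding model, and `ToyVectorIndex` is an in-memory brute-force index rather than the Vector Bucket API. The sketch only shows how the pieces connect.

```python
import hashlib
import math

DIM = 64  # illustrative dimensionality; real embedding models use far more


def toy_embed(text):
    """Stand-in for an embedding service: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


class ToyVectorIndex:
    """In-memory stand-in for a vector index (brute-force search)."""

    def __init__(self):
        self._items = {}  # key -> (vector, metadata)

    def put(self, key, vector, metadata=None):
        self._items[key] = (vector, metadata or {})

    def query(self, vector, top_k=3):
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(vector, stored)), key)
            for key, (stored, _) in self._items.items()
        ]
        scored.sort(reverse=True)
        return [(key, round(score, 3)) for score, key in scored[:top_k]]


# 1. Raw data lives in a regular bucket (here: a dict of key -> text).
raw_objects = {
    "docs/faq.txt": "how to reset your account password",
    "docs/pricing.txt": "storage pricing and billing overview",
    "docs/setup.txt": "install the command line tool and configure credentials",
}

# 2-3. Embed each object and store the vector next to its source key.
index = ToyVectorIndex()
for key, text in raw_objects.items():
    index.put(key, toy_embed(text), metadata={"source": key})

# 4. Retrieval: embed the question and return the nearest stored vectors.
print(index.query(toy_embed("reset password"), top_k=2))
```

In a RAG application, the keys returned by the query step would be used to fetch the original objects and pass their contents to the model as context.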

Differences from Traditional Vector Databases

As enterprise AI workloads scale, the volume of vectorized data grows exponentially, driving up infrastructure costs and straining traditional storage architectures. OSS Vector Bucket offers a cost-effective alternative by decoupling compute from storage, allowing vector queries to be executed directly on OSS without relying on tightly coupled service nodes. Most AI use cases can tolerate retrieval latencies in the hundreds of milliseconds, so OSS Vector Bucket trades somewhat higher latency than a dedicated vector database for significantly lower operational costs, with pay-as-you-go pricing for both storage and query scans. By migrating to OSS Vector Bucket, enterprises can build retrieval-augmented generation (RAG) applications that are not only scalable and performant, but also financially sustainable.

Use Cases

● RAG Applications

● AI Agent Retrieval

● AI-powered Content Management Platform for Social Media, E-commerce, Media

Key Features

● Supports hundreds of billions of vectors per account

● Native OSS API with SDK, CLI, and console access

● Integrated with Tablestore for high-performance workloads

● Pay-as-you-go pricing: storage capacity, query volume

● Unified permission management via OSS bucket policies

Key Benefits

● Lower vector database costs

● Automatic scaling & limitless elasticity

● Reduces data silos and management overhead

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Vector Bucket Documentation

3. OSS Metaquery: Content Awareness & Semantic Search


OSS Metaquery transforms raw unstructured data into an intelligent, searchable knowledge layer by automatically generating embeddings for every newly added object, eliminating the need for manual preprocessing or external vector pipelines. Once embedded, the data becomes immediately accessible through semantic search, allowing users to query their OSS buckets using natural language rather than rigid keywords or file paths. The system combines scalar filtering—such as metadata conditions on size, time, or tags—with vector-based similarity search to deliver highly relevant results through a hybrid retrieval engine. In production deployments, this approach has demonstrated precision‑recall rates of up to 85 percent, significantly outperforming traditional self‑built search solutions and enabling enterprises to build powerful retrieval‑augmented and content‑aware applications directly on top of OSS.

image.png
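
The hybrid retrieval described above, scalar filtering followed by vector similarity ranking, can be illustrated with a small in-memory sketch. The catalog, its hand-written 3-dimensional vectors, and the `hybrid_search` function are hypothetical and do not reflect the Metaquery API; real embeddings are generated by the service and have far more dimensions.

```python
import math
from datetime import date


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical object catalog: metadata plus a tiny hand-written vector.
catalog = [
    {"key": "img/cat.jpg", "size": 120_000, "modified": date(2025, 3, 1),
     "tags": {"team": "media"}, "vector": [0.9, 0.1, 0.0]},
    {"key": "img/dog.jpg", "size": 4_500_000, "modified": date(2025, 4, 2),
     "tags": {"team": "media"}, "vector": [0.8, 0.2, 0.1]},
    {"key": "doc/invoice.pdf", "size": 80_000, "modified": date(2024, 1, 15),
     "tags": {"team": "finance"}, "vector": [0.0, 0.1, 0.95]},
    {"key": "img/kitten.png", "size": 300_000, "modified": date(2025, 5, 20),
     "tags": {"team": "media"}, "vector": [0.95, 0.05, 0.0]},
]


def hybrid_search(query_vector, max_size=None, modified_after=None,
                  tags=None, top_k=3):
    """Apply scalar filters first, then rank the survivors by similarity."""
    candidates = []
    for obj in catalog:
        if max_size is not None and obj["size"] > max_size:
            continue
        if modified_after is not None and obj["modified"] <= modified_after:
            continue
        if tags and any(obj["tags"].get(k) != v for k, v in tags.items()):
            continue
        candidates.append(obj)
    ranked = sorted(candidates,
                    key=lambda o: cosine(query_vector, o["vector"]),
                    reverse=True)
    return [o["key"] for o in ranked[:top_k]]


# "Cat-like" query vector, restricted to small, recent media objects.
print(hybrid_search([1.0, 0.0, 0.0], max_size=1_000_000,
                    modified_after=date(2025, 1, 1), tags={"team": "media"}))
# -> ['img/kitten.png', 'img/cat.jpg']
```

Filtering before ranking keeps the similarity computation limited to objects that already satisfy the metadata conditions, which is one common way to combine the two kinds of criteria.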

Use Cases

● Intelligent Enterprise Search

● RAG applications

● AI-powered Content Management Platform

Key Features

● One-click enablement of automatic embedding generation

● Semantic search using natural language

● No extra management overhead

Key Benefits

● Significantly shorter time to market, with no need to build your own vector embedding pipeline or natural language search engine

● Higher search accuracy with multi-channel recall

● Lower costs by leveraging cost-effective object storage, with no additional components to build or manage

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Metaquery Documentation

4. OSS Accelerator and Other Tooling Upgrades


To support high-throughput AI workloads, OSS also introduced:

OSS Accelerator (EFC Cache) Upgrade

The OSS Accelerator introduces a high‑performance, compute‑proximate caching layer designed to dramatically improve data access speeds for AI, analytics, and real‑time workloads running on Alibaba Cloud. By deploying NVMe‑based cache nodes in the same zone as compute resources, the accelerator reduces read latency to single‑digit milliseconds and supports burst throughput of up to 100 GB/s with more than 100,000 QPS. It requires no application changes: all OSS clients, SDKs, and tools automatically benefit from acceleration through a unified namespace, with strong consistency maintained between cached data and the underlying OSS objects. Multiple caching strategies—including on‑read prefetch, manual prefetch, and synchronized prefetch—ensure that hot data is always available at high speed, while an LRU eviction policy manages cache capacity efficiently. This upgrade is particularly impactful for workloads such as model loading, BI hot table queries, and high‑frequency inference, enabling organizations to achieve low‑latency performance while keeping the majority of their data stored cost‑effectively across OSS Standard, IA, Archive, and Cold Archive tiers.
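
As a rough illustration of two of the ideas named above, read-through caching with LRU eviction and manual prefetch, here is a minimal in-memory sketch. The class and its dict-based backend are assumptions made for the example; it does not model the accelerator's consistency protocol, NVMe storage, or any actual OSS interface.

```python
from collections import OrderedDict


class ReadThroughLRUCache:
    """Byte-capacity LRU cache in front of a slower backend (a dict here)."""

    def __init__(self, backend, capacity_bytes):
        self.backend = backend            # key -> bytes; stands in for OSS
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._cache = OrderedDict()       # oldest entry first
        self.hits = 0
        self.misses = 0

    def _insert(self, key, data):
        if len(data) > self.capacity_bytes:
            return  # too large to cache; the caller still gets the data
        if key in self._cache:
            self.used_bytes -= len(self._cache.pop(key))
        # Evict least recently used entries until the new object fits.
        while self.used_bytes + len(data) > self.capacity_bytes:
            _, evicted = self._cache.popitem(last=False)
            self.used_bytes -= len(evicted)
        self._cache[key] = data
        self.used_bytes += len(data)

    def get(self, key):
        """On-read caching: serve from cache, or load from the backend."""
        if key in self._cache:
            self.hits += 1
            self._cache.move_to_end(key)  # mark as most recently used
            return self._cache[key]
        self.misses += 1
        data = self.backend[key]
        self._insert(key, data)
        return data

    def prefetch(self, keys):
        """Manual prefetch: warm the cache before the workload starts."""
        for key in keys:
            if key not in self._cache:
                self._insert(key, self.backend[key])

    def cached_keys(self):
        return list(self._cache)


backend = {"a": b"x" * 4, "b": b"y" * 4, "c": b"z" * 4}
cache = ReadThroughLRUCache(backend, capacity_bytes=8)
cache.prefetch(["a", "b"])   # cache now holds a and b (8 bytes)
cache.get("a")               # hit; a becomes most recently used
cache.get("c")               # miss; b is evicted to make room
print(cache.cached_keys(), cache.hits, cache.misses)
# -> ['a', 'c'] 1 1
```

Prefetching model weights or hot tables before a job starts turns the first read of each object from a miss into a hit, which is where the latency benefit comes from.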

OSSFS V2 Upgrade

OSSFS V2 is a next-generation, high-performance mount tool designed to make OSS behave like a local file system for AI, analytics, and containerized workloads. Built with a lightweight protocol and a deeply optimized I/O path, OSSFS V2 delivers substantial performance improvements over previous versions, achieving up to 3 GB/s single-thread read throughput and significantly reducing CPU and memory overhead. It introduces a negative metadata cache to minimize redundant lookups, improving responsiveness for workloads that perform frequent directory scans or small-file access. OSSFS V2 is fully compatible with Kubernetes environments through CSI integration, enabling seamless mounting of OSS buckets as persistent volumes in ACK and ACS clusters. This makes it ideal for applications that cannot be easily modified to use the OSS SDK directly, such as legacy data processing pipelines, distributed training frameworks, and containerized AI workloads. With strong consistency, elastic scalability, and support for high-concurrency access, OSSFS V2 allows organizations to use OSS as a high-throughput data layer across the entire AI lifecycle, from data ingestion and preprocessing to model training and inference.
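
A negative metadata cache remembers that a path did not exist, so repeated lookups for the same missing path do not each trigger a remote request. The sketch below shows the general technique with a time-to-live; the class, the TTL value, and the injected clock are assumptions for illustration and say nothing about how OSSFS V2 implements it internally.

```python
import time


class NegativeLookupCache:
    """Remembers keys that recently did not exist, to skip repeat lookups."""

    def __init__(self, exists_fn, ttl_seconds=5.0, clock=time.monotonic):
        self._exists_fn = exists_fn   # the expensive remote existence check
        self._ttl = ttl_seconds
        self._clock = clock
        self._missing = {}            # key -> expiry timestamp
        self.backend_calls = 0

    def exists(self, key):
        expiry = self._missing.get(key)
        if expiry is not None:
            if self._clock() < expiry:
                return False          # answered from the negative cache
            del self._missing[key]    # entry expired; ask the backend again
        self.backend_calls += 1
        found = self._exists_fn(key)
        if not found:
            self._missing[key] = self._clock() + self._ttl
        return found

    def invalidate(self, key):
        """Call after creating `key` so a stale negative entry is dropped."""
        self._missing.pop(key, None)


stored = {"data/train.csv"}
now = [0.0]  # fake clock so the example is deterministic
cache = NegativeLookupCache(lambda k: k in stored, ttl_seconds=5.0,
                            clock=lambda: now[0])

first = cache.exists("data/missing.csv")    # backend call, not found
second = cache.exists("data/missing.csv")   # served from negative cache
calls_after_repeat = cache.backend_calls

stored.add("data/missing.csv")
now[0] = 6.0                                 # TTL has passed
third = cache.exists("data/missing.csv")     # backend is consulted again
print(first, second, third, calls_after_repeat, cache.backend_calls)
# -> False False True 1 2
```

The TTL bounds how long a newly created object can appear missing, which is the usual trade-off of this kind of cache.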

OSS Connector for Hadoop V2 Upgrade

The OSS Connector for Hadoop V2 delivers a major performance and efficiency leap for data lake and big data analytics workloads running on Hadoop and Spark. This new version introduces an adaptive prefetching mechanism that eliminates redundant metadata operations, significantly improving read throughput—up to 5.8× faster in benchmark tests—and reducing end‑to‑end SQL query time by 28.5 percent. It also integrates seamlessly with OSS Accelerator, enabling hot data to be cached on NVMe storage close to compute nodes, which can further reduce query latency by up to 40 percent. Built on top of the next‑generation OSS Java SDK V2, the connector adopts default V4 authentication for stronger security and improved performance. These enhancements make OSS Connector for Hadoop V2 a high‑performance, cloud‑native storage interface for AI data lakes, supporting large‑scale ETL, interactive analytics, and machine learning pipelines with significantly lower overhead and higher throughput.
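
The source does not describe the connector's prefetching algorithm, so the following is only a generic readahead heuristic of the kind often called adaptive prefetching: the prefetch window grows while reads are sequential and shrinks back after a random seek. The class name and window sizes are assumptions chosen for the example.

```python
class AdaptivePrefetcher:
    """Grow the readahead window on sequential reads, reset it on seeks."""

    def __init__(self, min_window=64 * 1024, max_window=8 * 1024 * 1024):
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self._expected_offset = None  # where the next sequential read starts

    def on_read(self, offset, length):
        """Return (start, size) of the byte range to prefetch after a read."""
        if offset == self._expected_offset:
            # Sequential access: double the window, up to the cap.
            self.window = min(self.window * 2, self.max_window)
        else:
            # First read or random seek: fall back to the smallest window.
            self.window = self.min_window
        self._expected_offset = offset + length
        return (self._expected_offset, self.window)


prefetcher = AdaptivePrefetcher(min_window=4, max_window=16)
print(prefetcher.on_read(0, 10))    # -> (10, 4)   first read
print(prefetcher.on_read(10, 10))   # -> (20, 8)   sequential, window doubles
print(prefetcher.on_read(20, 10))   # -> (30, 16)  sequential, capped at max
print(prefetcher.on_read(500, 10))  # -> (510, 4)  random seek, window resets
```

A scan-heavy SQL query benefits from the large window, while a point lookup avoids paying for bytes it will never read.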

OSS Resource Pool QoS Upgrade

The OSS Resource Pool QoS upgrade introduces a unified performance management framework that allows multiple buckets and workloads to share a common throughput pool while maintaining predictable service quality. Instead of each bucket operating in isolation, enterprises can now allocate and prioritize throughput across business units, job types, or RAM accounts. QoS policies support both priority‑based dynamic control—ensuring critical online services receive the bandwidth they need during peak hours—and minimum guaranteed throughput, which protects lower‑priority batch or analytics jobs from starvation. This fine‑grained control enables stable performance for mixed workloads such as AI training, data preprocessing, online inference, and large‑scale data migration, all while maximizing overall resource utilization. With throughput pools scaling to tens of terabits per second, OSS QoS becomes a foundational capability for storage‑compute separation architectures and unified AI data lakes.
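
One way to picture how guaranteed minimums and priorities can coexist in a shared pool is a two-pass allocation: first every workload receives its guaranteed minimum, then the remaining throughput is handed out in priority order. This is a simplified model under assumed names and numbers, not the scheduling algorithm OSS actually uses.

```python
def allocate_bandwidth(pool_gbps, workloads):
    """Split a shared pool across workloads.

    Each workload is a dict with: name, priority (1 = highest),
    min_gbps (guaranteed floor) and demand_gbps (what it currently wants).
    """
    floors = {w["name"]: min(w["min_gbps"], w["demand_gbps"]) for w in workloads}
    total_floor = sum(floors.values())
    if total_floor > pool_gbps:
        raise ValueError("guaranteed minimums exceed the pool capacity")

    # Pass 1: every workload receives its guaranteed minimum.
    allocation = dict(floors)
    remaining = pool_gbps - total_floor

    # Pass 2: leftover throughput goes to workloads in priority order.
    for w in sorted(workloads, key=lambda w: w["priority"]):
        extra = min(w["demand_gbps"] - allocation[w["name"]], remaining)
        allocation[w["name"]] += extra
        remaining -= extra
    return allocation


workloads = [
    {"name": "online-inference", "priority": 1, "min_gbps": 20, "demand_gbps": 60},
    {"name": "training", "priority": 2, "min_gbps": 20, "demand_gbps": 50},
    {"name": "migration", "priority": 4, "min_gbps": 10, "demand_gbps": 40},
]
print(allocate_bandwidth(100, workloads))
# -> {'online-inference': 60, 'training': 30, 'migration': 10}
```

In this example the P1 service is fully served, while the migration job still keeps its 10 Gbps floor instead of being starved during the peak.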

OSS SDK V2 Upgrade

The OSS SDK V2 upgrade delivers a comprehensive modernization of the OSS client experience, offering higher performance, stronger security, and broader language coverage for developers building AI, analytics, and cloud‑native applications. This new version introduces a fully asynchronous API architecture that significantly improves throughput for high‑concurrency workloads such as model training, data ingestion, and large‑scale ETL. It adopts default V4 authentication for enhanced security and more efficient request signing, reducing overhead for frequent or parallel operations. SDK V2 provides full language support—including Go, Python, PHP, .NET, Swift, and Java—ensuring consistent behavior and performance across diverse development environments. With improved error handling, streamlined configuration, and optimized network usage, SDK V2 enables developers to interact with OSS more efficiently while taking full advantage of the platform’s evolving AI‑native capabilities.
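
To show why an asynchronous API helps high-concurrency workloads, the sketch below issues many downloads concurrently while capping the number in flight. It uses only Python's standard `asyncio`; `fake_get_object` is a hypothetical stand-in that sleeps instead of calling OSS, so none of the names here are SDK V2 functions.

```python
import asyncio


async def fake_get_object(key):
    """Hypothetical stand-in for an asynchronous object download."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"contents of {key}".encode("utf-8")


async def fetch_all(keys, max_in_flight=8):
    """Download many objects concurrently with a bounded number in flight."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def fetch_one(key):
        async with semaphore:
            return key, await fake_get_object(key)

    results = await asyncio.gather(*(fetch_one(k) for k in keys))
    return dict(results)


keys = [f"dataset/part-{i:04d}.parquet" for i in range(32)]
objects = asyncio.run(fetch_all(keys))
print(len(objects), "objects fetched")
```

Because the requests overlap while waiting on the network, total time is governed by the concurrency limit rather than by the sum of individual latencies; the semaphore keeps that concurrency from overwhelming the client or exceeding a throughput quota.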

These upgrades enable OSS to serve as the backbone for AI data lakes and inference platforms.

Conclusion

Alibaba Cloud OSS is no longer just object storage—it’s a full-stack, AI-native data infrastructure. With Vector Bucket, Metaquery, Content Awareness, Semantic Retrieval, OSS Accelerator and other upgrades, OSS supports:

● AI training and inference

● Intelligent data discovery

● Semantic search and retrieval

● Cost-effective RAG applications

● AI-powered Content Management Platform

● Unified data lake architectures

Whether you're building AI Agents, managing enterprise digital assets, or scaling AI workloads, OSS is ready to power your next-generation data strategy.
