Alibaba Cloud OSS: From Object Storage to AI-Native Data Infrastructure with Vector Bucket & Metaquery

Summary: This article provides an in-depth look at how OSS builds an AI-native storage foundation to empower key scenarios including Retrieval-Augmented Generation (RAG), enterprise search, and AI-powered content management—helping you efficiently build next-generation intelligent applications.

By Justin See


As data volumes explode and AI becomes central to enterprise competitiveness, traditional object storage must evolve. Alibaba Cloud Object Storage Service (OSS) is leading this transformation—moving from a passive data carrier to an intelligent, AI-powered knowledge provider.

This blog explores OSS's technical foundation and its evolution toward AI-native workloads, then dives deep into two major innovations: Vector Bucket and Metaquery.

In the AI era, vector data, and the ability to query it, is the foundation of AI applications. The diagram below illustrates how diverse unstructured data types—documents, images, videos—are transformed into high-dimensional vector representations through embedding. These vectors serve as the backbone for advanced AI capabilities such as content awareness and semantic retrieval. By converting raw data into structured vector formats, OSS enables intelligent indexing, search, and discovery across massive datasets, forming the basis for applications like retrieval-augmented generation (RAG), enterprise search, and AI content management.

[Diagram: unstructured data (documents, images, videos) embedded into vector representations for semantic retrieval]

1. OSS Architecture and Core Capabilities


Alibaba Cloud OSS is a distributed, highly durable object storage system designed for massive-scale unstructured data. It supports workloads ranging from web hosting and backup to AI training and semantic retrieval.

OSS Architecture Layers

| Layer | Components |
| --- | --- |
| Application Interfaces | API, SDKs, CLI |
| OSS Service | Control Layer, Access Layer, Metadata Indexing |
| Storage Infrastructure | Erasure Coding, Zone-Redundant Storage |
| AI Integration | MCP Server, AI Assistant, Semantic Retrieval Engine |

Storage Classes and Cost Optimization

| Class | Use Case | Retrieval Time |
| --- | --- | --- |
| Standard | Hot data | Instant |
| IA | Infrequent access | Instant |
| Archive | Cold data | Minutes |
| Cold Archive | Deep archive | Hours |
| Deep Cold Archive | Long-term retention | Hours |

Lifecycle policies and OSS Inventory now support predictive tiering and DryRun simulations to prevent accidental deletions.
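To make the idea of predictive tiering concrete, here is a minimal sketch of age-based tier selection in plain Python. The thresholds and class names are illustrative only—real tiering is configured through bucket lifecycle rules, not application code:

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: days-since-last-access below which each class applies.
TIER_THRESHOLDS = [
    (30, "Standard"),
    (90, "IA"),
    (365, "Archive"),
    (1825, "ColdArchive"),
]

def suggest_storage_class(last_accessed: datetime, now: datetime) -> str:
    """Return a suggested storage class for an object based on its access age."""
    age_days = (now - last_accessed).days
    for threshold, storage_class in TIER_THRESHOLDS:
        if age_days < threshold:
            return storage_class
    return "DeepColdArchive"  # older than every threshold

now = datetime(2025, 1, 1)
print(suggest_storage_class(now - timedelta(days=10), now))   # recently accessed
print(suggest_storage_class(now - timedelta(days=400), now))  # long untouched
```

A DryRun simulation would run logic like this against an inventory report and show which objects *would* transition or expire, before any rule takes effect.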

Performance and QoS

OSS Resource Pool QoS enables:

● Shared throughput across buckets

● Priority-based scheduling (P1–P4)

● Guaranteed minimum bandwidth

● Real-time monitoring and dynamic allocation

This ensures stable performance across mixed workloads—online services, batch jobs, AI training, and data migration.
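The interaction between guaranteed minimums and priority-based scheduling can be pictured with a toy allocator. The numbers, workload names, and policy below are invented for illustration and are not OSS internals—they only show the shape of the behavior: minimums are satisfied first, then leftover bandwidth flows to higher priorities:

```python
def allocate_bandwidth(pool_gbps, workloads):
    """Split a shared throughput pool across workloads.

    Each workload is (name, priority, min_gbps, demand_gbps), where
    priority 1 is highest. Guaranteed minimums are granted first; the
    remainder is handed out in priority order up to each demand.
    """
    alloc = {name: min(min_gbps, demand) for name, _, min_gbps, demand in workloads}
    remaining = pool_gbps - sum(alloc.values())
    for name, _, _, demand in sorted(workloads, key=lambda w: w[1]):
        extra = min(demand - alloc[name], remaining)
        if extra > 0:
            alloc[name] += extra
            remaining -= extra
    return alloc

workloads = [
    ("online-serving", 1, 10, 40),  # latency-sensitive, highest priority
    ("ai-training",    2, 20, 60),  # high sustained demand
    ("batch-etl",      4,  5, 30),  # opportunistic, lowest priority
]
print(allocate_bandwidth(100, workloads))
```

Note how the P4 batch job still receives its guaranteed minimum even when higher-priority workloads could absorb the whole pool—this is the starvation protection described above.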

2. OSS Vector Bucket: AI-Native Storage for Embeddings


Vector data is the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). OSS Vector Bucket introduces native support for storing, indexing, and querying high-dimensional vectors.

[Diagram: OSS Vector Bucket architecture—raw data, embedding service, vector storage, MCP Server, AI Agent]

Architecture

● Raw data stored in OSS Bucket

● Embedding service generates vectorized data

● Vectors stored in OSS Vector Bucket

● MCP Server enables semantic retrieval for AI Content Awareness

● AI Agent integrates with RAG and AI Semantic Retrieval

Difference with Traditional Vector Database

As enterprise AI workloads scale, the volume of vectorized data grows exponentially—driving up infrastructure costs and straining traditional storage architectures. OSS Vector Bucket offers a cost-effective alternative by decoupling compute from storage, allowing vector queries to be executed directly on OSS without relying on tightly coupled service nodes. Most AI use cases can tolerate higher retrieval latencies, on the order of hundreds of milliseconds; OSS Vector Bucket accepts this latency trade-off in exchange for significantly lower operational costs, with pay-as-you-go pricing for both storage and query scans. By migrating to OSS Vector Bucket, enterprises can build retrieval-augmented generation (RAG) applications that are not only scalable and performant, but also financially sustainable.
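Conceptually, a vector query ranks stored embeddings by similarity to a query embedding. The sketch below is a toy, brute-force version in plain Python—the object keys and three-dimensional vectors are made up, and the real service performs indexed, server-side retrieval at far larger scale:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Rank every stored vector against the query and return the k best keys."""
    scored = sorted(store.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [key for key, _ in scored[:k]]

# Toy embeddings standing in for real model output.
store = {
    "docs/contract.pdf": [0.9, 0.1, 0.0],
    "imgs/cat.png":      [0.0, 0.2, 0.9],
    "docs/invoice.pdf":  [0.8, 0.3, 0.1],
}
print(top_k([1.0, 0.0, 0.0], store, k=2))
```

The two document embeddings point in roughly the same direction as the query, so they rank ahead of the image—exactly the semantic-neighborhood behavior RAG retrieval relies on.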

Use Cases

● RAG Applications

● AI Agent Retrieval

● AI-powered Content Management Platform for Social Media, E-commerce, Media

Key Features

● Supports hundreds of billions of vectors per account

● Native OSS API with SDK, CLI, and console access

● Integrated with Tablestore for high-performance workloads

● Pay-as-you-go pricing for storage capacity and query volume

● Unified permission management via OSS bucket policies

Key Benefits

● Lower vector database costs

● Automatic scaling & limitless elasticity

● Reduces data silos and management overhead

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Vector Bucket Documentation

3. OSS Metaquery: Content Awareness & Semantic Search


OSS Metaquery transforms raw unstructured data into an intelligent, searchable knowledge layer by automatically generating embeddings for every newly added object, eliminating the need for manual preprocessing or external vector pipelines. Once embedded, the data becomes immediately accessible through semantic search, allowing users to query their OSS buckets using natural language rather than rigid keywords or file paths. The system combines scalar filtering—such as metadata conditions on size, time, or tags—with vector-based similarity search to deliver highly relevant results through a hybrid retrieval engine. In production deployments, this approach has demonstrated precision‑recall rates of up to 85 percent, significantly outperforming traditional self‑built search solutions and enabling enterprises to build powerful retrieval‑augmented and content‑aware applications directly on top of OSS.
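The hybrid retrieval flow described above—scalar filtering followed by vector similarity ranking—can be sketched as a two-stage pipeline. Everything below (field names, embeddings, the filter predicate) is invented for illustration; the real engine applies this pattern server-side with multi-channel recall:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_search(objects, query_vec, scalar_filter, k=3):
    """Stage 1: keep only objects passing the scalar metadata predicate.
    Stage 2: rank the survivors by embedding similarity to the query."""
    candidates = [o for o in objects if scalar_filter(o)]
    candidates.sort(key=lambda o: cosine(query_vec, o["embedding"]), reverse=True)
    return [o["key"] for o in candidates[:k]]

objects = [
    {"key": "a.jpg", "size": 2_000_000, "tag": "photo", "embedding": [0.9, 0.1]},
    {"key": "b.jpg", "size": 500_000,   "tag": "photo", "embedding": [0.7, 0.7]},
    {"key": "c.mp4", "size": 9_000_000, "tag": "video", "embedding": [0.95, 0.0]},
]
# "photos larger than 1 MB that look like the query" in one call:
hits = hybrid_search(objects, [1.0, 0.0],
                     lambda o: o["tag"] == "photo" and o["size"] > 1_000_000)
print(hits)
```

The scalar stage prunes the candidate set cheaply before the more expensive similarity ranking runs—the same reason Metaquery combines metadata conditions with vector search rather than using either alone.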

[Diagram: OSS Metaquery—automatic embedding, scalar filtering, and semantic search over bucket contents]

Use cases

● Intelligent Enterprise Search

● RAG applications

● AI-powered Content Management Platform

Key Features

● One-click setup for automatic embeddings

● Semantic search using natural language

● No extra management overheads

Key Benefits

● Significantly shorter time to market, with no need to build your own vector embedding and natural language search engine

● Higher accuracy search with multi-channel recall

● Lower costs by leveraging cost-effective object storage, with no additional components to build or manage

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Metaquery Documentation

4. OSS Accelerator and Other Tooling Upgrades


To support high-throughput AI workloads, OSS also introduced:

OSS Accelerator (EFC Cache) Upgrade

The OSS Accelerator introduces a high‑performance, compute‑proximate caching layer designed to dramatically improve data access speeds for AI, analytics, and real‑time workloads running on Alibaba Cloud. By deploying NVMe‑based cache nodes in the same zone as compute resources, the accelerator reduces read latency to single‑digit milliseconds and supports burst throughput of up to 100 GB/s with more than 100,000 QPS. It requires no application changes: all OSS clients, SDKs, and tools automatically benefit from acceleration through a unified namespace, with strong consistency maintained between cached data and the underlying OSS objects. Multiple caching strategies—including on‑read prefetch, manual prefetch, and synchronized prefetch—ensure that hot data is always available at high speed, while an LRU eviction policy manages cache capacity efficiently. This upgrade is particularly impactful for workloads such as model loading, BI hot table queries, and high‑frequency inference, enabling organizations to achieve low‑latency performance while keeping the majority of their data stored cost‑effectively across OSS Standard, IA, Archive, and Cold Archive tiers.
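The LRU eviction policy mentioned above can be sketched in a few lines. This is a minimal illustration of the caching idea only—keys and capacities are hypothetical, and real cache nodes additionally handle prefetch strategies and strong consistency with the underlying objects:

```python
from collections import OrderedDict

class LRUCache:
    """Byte-capacity cache that evicts the least recently used entry first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.entries = OrderedDict()  # key -> cached size in bytes

    def get(self, key):
        if key not in self.entries:
            return None  # miss: caller fetches from OSS, then calls put()
        self.entries.move_to_end(key)  # mark as most recently used
        return self.entries[key]

    def put(self, key, size):
        if key in self.entries:
            self.used -= self.entries.pop(key)
        while self.entries and self.used + size > self.capacity:
            _, evicted_size = self.entries.popitem(last=False)  # evict coldest
            self.used -= evicted_size
        self.entries[key] = size
        self.used += size

cache = LRUCache(capacity_bytes=100)
cache.put("model.bin", 60)
cache.put("index.idx", 30)
cache.get("model.bin")          # touch so the model stays hot
cache.put("table.parquet", 40)  # forces eviction of the coldest entry
print(list(cache.entries))
```

Because `model.bin` was touched last, the eviction falls on `index.idx`—which is why frequently loaded models and hot BI tables stay resident while colder data falls back to the OSS tiers.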

OSSFS V2 Upgrade

OSSFS V2 is a next‑generation, high‑performance mount tool designed to make OSS behave like a local file system for AI, analytics, and containerized workloads. Built with a lightweight protocol and deeply optimized I/O path, OSSFS V2 delivers substantial performance improvements over previous versions—achieving up to 3 GB/s single‑thread read throughput and significantly reducing CPU and memory overhead. It introduces a negative metadata cache to minimize redundant lookups, improving responsiveness for workloads that perform frequent directory scans or small‑file access. OSSFS V2 is fully compatible with Kubernetes environments through CSI integration, enabling seamless mounting of OSS buckets as persistent volumes in ACK and ACS clusters. This makes it ideal for applications that cannot be easily modified to use the OSS SDK directly, such as legacy data processing pipelines, distributed training frameworks, and containerized AI workloads. With strong consistency, elastic scalability, and support for high‑concurrency access, OSSFS V2 allows organizations to use OSS as a high‑throughput data layer across the entire AI lifecycle—from data ingestion and preprocessing to model training and inference.

OSS Connector for Hadoop V2 Upgrade

The OSS Connector for Hadoop V2 delivers a major performance and efficiency leap for data lake and big data analytics workloads running on Hadoop and Spark. This new version introduces an adaptive prefetching mechanism that eliminates redundant metadata operations, significantly improving read throughput—up to 5.8× faster in benchmark tests—and reducing end‑to‑end SQL query time by 28.5 percent. It also integrates seamlessly with OSS Accelerator, enabling hot data to be cached on NVMe storage close to compute nodes, which can further reduce query latency by up to 40 percent. Built on top of the next‑generation OSS Java SDK V2, the connector adopts default V4 authentication for stronger security and improved performance. These enhancements make OSS Connector for Hadoop V2 a high‑performance, cloud‑native storage interface for AI data lakes, supporting large‑scale ETL, interactive analytics, and machine learning pipelines with significantly lower overhead and higher throughput.

OSS Resource Pool QoS Upgrade

The OSS Resource Pool QoS upgrade introduces a unified performance management framework that allows multiple buckets and workloads to share a common throughput pool while maintaining predictable service quality. Instead of each bucket operating in isolation, enterprises can now allocate and prioritize throughput across business units, job types, or RAM accounts. QoS policies support both priority‑based dynamic control—ensuring critical online services receive the bandwidth they need during peak hours—and minimum guaranteed throughput, which protects lower‑priority batch or analytics jobs from starvation. This fine‑grained control enables stable performance for mixed workloads such as AI training, data preprocessing, online inference, and large‑scale data migration, all while maximizing overall resource utilization. With throughput pools scaling to tens of terabits per second, OSS QoS becomes a foundational capability for storage‑compute separation architectures and unified AI data lakes.

OSS SDK V2 Upgrade

The OSS SDK V2 upgrade delivers a comprehensive modernization of the OSS client experience, offering higher performance, stronger security, and broader language coverage for developers building AI, analytics, and cloud‑native applications. This new version introduces a fully asynchronous API architecture that significantly improves throughput for high‑concurrency workloads such as model training, data ingestion, and large‑scale ETL. It adopts default V4 authentication for enhanced security and more efficient request signing, reducing overhead for frequent or parallel operations. SDK V2 provides full language support—including Go, Python, PHP, .NET, Swift, and Java—ensuring consistent behavior and performance across diverse development environments. With improved error handling, streamlined configuration, and optimized network usage, SDK V2 enables developers to interact with OSS more efficiently while taking full advantage of the platform’s evolving AI‑native capabilities.
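The benefit of a fully asynchronous API is concurrency without thread-per-request overhead. The sketch below illustrates the pattern with `asyncio` and a stub client—`upload` is a stand-in for an async put-object call, not the actual SDK V2 interface:

```python
import asyncio

async def upload(client, key, data):
    """Stand-in for an asynchronous put-object call against a stub client."""
    await asyncio.sleep(0)  # simulates a network round-trip yielding control
    client[key] = data
    return key

async def upload_batch(client, items):
    # Launch all uploads concurrently instead of blocking on each one in turn:
    # the core pattern an async API enables for high-concurrency ingestion.
    return await asyncio.gather(*(upload(client, k, v) for k, v in items))

client = {}
keys = asyncio.run(upload_batch(client, [
    ("a.txt", b"1"),
    ("b.txt", b"2"),
    ("c.txt", b"3"),
]))
print(keys)
```

With a synchronous client, the same batch would pay one full round-trip per object; with the async pattern the round-trips overlap, which is where the throughput gain for model training and large-scale ETL comes from.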

These upgrades enable OSS to serve as the backbone for AI data lakes and inference platforms.

Conclusion

Alibaba Cloud OSS is no longer just object storage—it’s a full-stack, AI-native data infrastructure. With Vector Bucket, Metaquery, Content Awareness, Semantic Retrieval, OSS Accelerator and other upgrades, OSS supports:

● AI training and inference

● Intelligent data discovery

● Semantic search and retrieval

● Cost-effective RAG applications

● AI-powered Content Management Platform

● Unified data lake architectures

Whether you're building AI Agents, managing enterprise digital assets, or scaling AI workloads, OSS is ready to power your next-generation data strategy.
