Alibaba Cloud OSS: From Object Storage to AI-Native Data Infrastructure with Vector Bucket & Metaquery

Summary: This article takes an in-depth look at how OSS builds an AI-native storage foundation to power key scenarios including retrieval-augmented generation (RAG), enterprise search, and AI-powered content management, helping you efficiently build next-generation intelligent applications.

By Justin See


As data volumes explode and AI becomes central to enterprise competitiveness, traditional object storage must evolve. Alibaba Cloud Object Storage Service (OSS) is leading this transformation—moving from a passive data carrier to an intelligent, AI-powered knowledge provider.

This blog explores OSS's technical foundation, its evolution toward AI-native workloads, and dives deep into two major innovations: Vector Bucket and Metaquery.

In the AI era, vector data, and the ability to query it, form the foundation of AI applications. This diagram illustrates how diverse unstructured data types (documents, images, videos) are transformed into high-dimensional vector representations through embedding. These vectors serve as the backbone for advanced AI capabilities such as content awareness and semantic retrieval. By converting raw data into structured vector formats, OSS enables intelligent indexing, search, and discovery across massive datasets, forming the basis for applications like retrieval-augmented generation (RAG), enterprise search, and AI content management.
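To make the embed-then-retrieve pattern concrete, here is a minimal sketch. It uses a toy bag-of-words function as a stand-in for a real embedding model and ranks objects by cosine similarity; every name here is illustrative, not an OSS API.

```python
import math
from collections import Counter

# Toy "embedding": one dimension per vocabulary word. A real embedding model
# produces dense high-dimensional floats, but the retrieval pattern is the same.
VOCAB = ["storage", "vector", "search", "image", "video", "document"]

def embed(text):
    """Map text to a fixed-length vector: one dimension per vocab word."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "a.txt": "document search in object storage",
    "b.txt": "image and video assets",
}
index = {key: embed(text) for key, text in docs.items()}  # the "vector store"

query = embed("search a document")
best = max(index, key=lambda k: cosine(query, index[k]))
print(best)  # → a.txt, the semantically closest object
```

The same two steps (embed at ingest, rank by similarity at query time) underpin RAG, enterprise search, and content-awareness scenarios described below.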

[Figure: unstructured data transformed into vector embeddings for semantic retrieval]

1. OSS Architecture and Core Capabilities


Alibaba Cloud OSS is a distributed, highly durable object storage system designed for massive-scale unstructured data. It supports workloads ranging from web hosting and backup to AI training and semantic retrieval.

OSS Architecture Layers

| Layer | Components |
| --- | --- |
| Application Interfaces | API, SDKs, CLI |
| OSS Service | Control Layer, Access Layer, Metadata Indexing |
| Storage Infrastructure | Erasure Coding, Zone-Redundant Storage |
| AI Integration | MCP Server, AI Assistant, Semantic Retrieval Engine |

Storage Classes and Cost Optimization

| Class | Use Case | Retrieval Time |
| --- | --- | --- |
| Standard | Hot data | Instant |
| IA | Infrequent access | Instant |
| Archive | Cold data | Minutes |
| Cold Archive | Deep archive | Hours |
| Deep Cold Archive | Long-term retention | Hours |

Lifecycle policies and OSS Inventory now support predictive tiering and DryRun simulations to prevent accidental deletions.
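The tiering-with-dry-run idea can be sketched as follows. This is illustrative decision logic, not the OSS lifecycle API: the thresholds are assumptions, and only the storage class names mirror the table above.

```python
from datetime import datetime, timedelta

# Assumed age thresholds (days since last access) mapped to OSS storage classes.
RULES = [
    (365, "DeepColdArchive"),
    (180, "ColdArchive"),
    (90, "Archive"),
    (30, "IA"),
]

def plan_transition(last_access, now):
    """Pick the coldest tier whose age threshold the object has passed."""
    age = (now - last_access).days
    for min_days, storage_class in RULES:
        if age >= min_days:
            return storage_class
    return "Standard"

def apply_lifecycle(objects, now, dry_run=True):
    """Return planned transitions; mutate tiers only when dry_run is False."""
    plan = {key: plan_transition(meta["last_access"], now)
            for key, meta in objects.items()}
    if not dry_run:
        for key, target in plan.items():
            objects[key]["storage_class"] = target
    return plan

now = datetime(2025, 6, 1)
objs = {"logs/a": {"last_access": now - timedelta(days=200),
                   "storage_class": "Standard"}}
print(apply_lifecycle(objs, now))  # → {'logs/a': 'ColdArchive'}
```

Running with `dry_run=True` reports what would change without touching any object, which is exactly the safety property a DryRun simulation provides against accidental transitions or deletions.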

Performance and QoS

OSS Resource Pool QoS enables:

● Shared throughput across buckets

● Priority-based scheduling (P1–P4)

● Guaranteed minimum bandwidth

● Real-time monitoring and dynamic allocation

This ensures stable performance across mixed workloads—online services, batch jobs, AI training, and data migration.

2. OSS Vector Bucket: AI-Native Storage for Embeddings


Vector data is the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). OSS Vector Bucket introduces native support for storing, indexing, and querying high-dimensional vectors.

[Figure: OSS Vector Bucket architecture]

Architecture

● Raw data stored in OSS Bucket

● Embedding service generates vectorized data

● Vectors stored in OSS Vector Bucket

● MCP Server enables semantic retrieval for AI Content Awareness

● AI Agent integrates with RAG and AI Semantic Retrieval

Differences from a Traditional Vector Database

As enterprise AI workloads scale, the volume of vectorized data grows exponentially, driving up infrastructure costs and straining traditional storage architectures. OSS Vector Bucket offers a cost-effective alternative by decoupling compute from storage, allowing vector queries to be executed directly on OSS without relying on tightly coupled service nodes. In most AI use cases, customers can tolerate retrieval latencies of a few hundred milliseconds; OSS Vector Bucket trades this higher latency for significantly lower operational costs through pay-as-you-go pricing for both storage and query scans. By migrating to OSS Vector Bucket, enterprises can build retrieval-augmented generation (RAG) applications that are not only scalable and performant, but also financially sustainable.
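The query model this implies, scanning stored vectors and ranking a top-k result rather than serving from always-on index nodes, can be sketched in a few lines. The function and data names are illustrative, not the Vector Bucket API.

```python
import heapq
import math

def top_k(query, vectors, k=3):
    """Scan all stored vectors and keep the k most similar by cosine score."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0
    return heapq.nlargest(k, vectors.items(), key=lambda kv: cosine(query, kv[1]))

# Vectors held as plain data in storage; a query is a scan plus ranking.
store = {
    "doc-1": [1.0, 0.0, 0.0],
    "doc-2": [0.9, 0.1, 0.0],
    "doc-3": [0.0, 1.0, 0.0],
}
hits = top_k([1.0, 0.0, 0.0], store, k=2)
print([key for key, _ in hits])  # → ['doc-1', 'doc-2']
```

A scan like this costs only what it reads, which is where the pay-as-you-go economics come from; the price is latency in the hundreds-of-milliseconds range instead of the single-digit milliseconds of a dedicated vector database.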

Use Cases

● RAG Applications

● AI Agent Retrieval

● AI-powered Content Management Platform for Social Media, E-commerce, Media

Key Features

● Supports hundreds of billions of vectors per account

● Native OSS API with SDK, CLI, and console access

● Integrated with Tablestore for high-performance workloads

● Pay-as-you-go pricing: storage capacity, query volume

● Unified permission management via OSS bucket policies

Key Benefits

● Lower vector database costs

● Automatic scaling & limitless elasticity

● Reduces data silos and management overhead

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Vector Bucket Documentation

3. OSS Metaquery: Content Awareness & Semantic Search


OSS Metaquery transforms raw unstructured data into an intelligent, searchable knowledge layer by automatically generating embeddings for every newly added object, eliminating the need for manual preprocessing or external vector pipelines. Once embedded, the data becomes immediately accessible through semantic search, allowing users to query their OSS buckets using natural language rather than rigid keywords or file paths. The system combines scalar filtering—such as metadata conditions on size, time, or tags—with vector-based similarity search to deliver highly relevant results through a hybrid retrieval engine. In production deployments, this approach has demonstrated precision‑recall rates of up to 85 percent, significantly outperforming traditional self‑built search solutions and enabling enterprises to build powerful retrieval‑augmented and content‑aware applications directly on top of OSS.
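The hybrid retrieval described above, a scalar metadata filter followed by vector similarity ranking, can be sketched as two stages. Field names and the scoring here are illustrative assumptions, not the Metaquery API.

```python
import math

def hybrid_search(objects, query_vec, min_size=0, tag=None, k=2):
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0
    # Stage 1: scalar filter on metadata (size, tags, timestamps, ...).
    candidates = [o for o in objects
                  if o["size"] >= min_size and (tag is None or tag in o["tags"])]
    # Stage 2: vector similarity ranking on the survivors.
    candidates.sort(key=lambda o: cosine(query_vec, o["vec"]), reverse=True)
    return [o["key"] for o in candidates[:k]]

objects = [
    {"key": "a.pdf", "size": 500, "tags": ["report"], "vec": [1.0, 0.0]},
    {"key": "b.pdf", "size": 10,  "tags": ["report"], "vec": [1.0, 0.0]},
    {"key": "c.jpg", "size": 900, "tags": ["photo"],  "vec": [0.0, 1.0]},
]
print(hybrid_search(objects, [1.0, 0.0], min_size=100, tag="report"))
# → ['a.pdf']
```

Filtering first shrinks the candidate set cheaply, so the more expensive similarity ranking only runs over objects that already satisfy the metadata conditions.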

[Figure: OSS Metaquery content awareness and semantic search workflow]

Use cases

● Intelligent Enterprise Search

● RAG applications

● AI-powered Content Management Platform

Key Features

● One-click enablement of automatic embeddings

● Semantic search using natural language

● No extra management overheads

Key Benefits

● Significantly shorter time to market, with no need to build your own vector embedding and natural language search engine

● Higher accuracy search with multi-channel recall

● Lower costs by leveraging cost-effective object storage, with no additional components to build or manage

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Metaquery Documentation

4. OSS Accelerator and Other Tooling Upgrades


To support high-throughput AI workloads, OSS also introduced:

OSS Accelerator (EFC Cache) Upgrade

The OSS Accelerator introduces a high‑performance, compute‑proximate caching layer designed to dramatically improve data access speeds for AI, analytics, and real‑time workloads running on Alibaba Cloud. By deploying NVMe‑based cache nodes in the same zone as compute resources, the accelerator reduces read latency to single‑digit milliseconds and supports burst throughput of up to 100 GB/s with more than 100,000 QPS. It requires no application changes: all OSS clients, SDKs, and tools automatically benefit from acceleration through a unified namespace, with strong consistency maintained between cached data and the underlying OSS objects. Multiple caching strategies—including on‑read prefetch, manual prefetch, and synchronized prefetch—ensure that hot data is always available at high speed, while an LRU eviction policy manages cache capacity efficiently. This upgrade is particularly impactful for workloads such as model loading, BI hot table queries, and high‑frequency inference, enabling organizations to achieve low‑latency performance while keeping the majority of their data stored cost‑effectively across OSS Standard, IA, Archive, and Cold Archive tiers.
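The read-through-with-LRU-eviction behaviour described above can be modelled in a few lines. This is a minimal in-memory sketch of the caching pattern, not the accelerator's NVMe implementation; the capacity and workload are made up.

```python
from collections import OrderedDict

class ReadThroughCache:
    """Cache that fetches misses from a backing store and evicts by LRU."""

    def __init__(self, backend, capacity=2):
        self.backend, self.capacity = backend, capacity
        self.cache = OrderedDict()
        self.hits = self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)          # mark most-recently-used
        else:
            self.misses += 1
            self.cache[key] = self.backend[key]  # fetch from origin (OSS)
            if len(self.cache) > self.capacity:
                self.cache.popitem(last=False)   # evict least-recently-used
        return self.cache[key]

backend = {"model.bin": b"weights", "data.parquet": b"rows", "idx": b"meta"}
cache = ReadThroughCache(backend, capacity=2)
for key in ["model.bin", "model.bin", "data.parquet", "idx", "model.bin"]:
    cache.get(key)
print(cache.hits, cache.misses)  # → 1 4
```

Hot keys stay resident while cold ones fall out, which is why workloads with repeated access to the same objects (model loading, BI hot tables) benefit most from a compute-proximate cache.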

OSSFS V2 Upgrade

OSSFS V2 is a next‑generation, high‑performance mount tool designed to make OSS behave like a local file system for AI, analytics, and containerized workloads. Built with a lightweight protocol and deeply optimized I/O path, OSSFS V2 delivers substantial performance improvements over previous versions—achieving up to 3 GB/s single‑thread read throughput and significantly reducing CPU and memory overhead. It introduces a negative metadata cache to minimize redundant lookups, improving responsiveness for workloads that perform frequent directory scans or small‑file access. OSSFS V2 is fully compatible with Kubernetes environments through CSI integration, enabling seamless mounting of OSS buckets as persistent volumes in ACK and ACS clusters. This makes it ideal for applications that cannot be easily modified to use the OSS SDK directly, such as legacy data processing pipelines, distributed training frameworks, and containerized AI workloads. With strong consistency, elastic scalability, and support for high‑concurrency access, OSSFS V2 allows organizations to use OSS as a high‑throughput data layer across the entire AI lifecycle—from data ingestion and preprocessing to model training and inference.
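The negative metadata cache mentioned above is easy to illustrate: remembering that a path does not exist avoids repeated lookups against the object store. This is an assumed sketch of the idea, not the OSSFS implementation.

```python
class MetadataCache:
    """Metadata lookups with a negative cache for known-missing paths."""

    def __init__(self, store):
        self.store = store
        self.negative = set()   # paths known not to exist
        self.lookups = 0        # backend round-trips actually made

    def stat(self, path):
        if path in self.negative:
            return None                      # served from the negative cache
        self.lookups += 1
        meta = self.store.get(path)
        if meta is None:
            self.negative.add(path)          # remember the miss
        return meta

    def invalidate(self, path):
        self.negative.discard(path)          # e.g. after the file is created

store = {"data/train.csv": {"size": 1024}}
fs = MetadataCache(store)
for _ in range(3):
    fs.stat("data/missing.csv")  # only the first call hits the backend
print(fs.lookups)  # → 1
```

Directory scans and frameworks that probe for optional files (checkpoints, config overrides) issue many lookups for paths that do not exist, so caching the negative answer cuts most of that traffic.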

OSS Connector for Hadoop V2 Upgrade

The OSS Connector for Hadoop V2 delivers a major performance and efficiency leap for data lake and big data analytics workloads running on Hadoop and Spark. This new version introduces an adaptive prefetching mechanism that eliminates redundant metadata operations, significantly improving read throughput—up to 5.8× faster in benchmark tests—and reducing end‑to‑end SQL query time by 28.5 percent. It also integrates seamlessly with OSS Accelerator, enabling hot data to be cached on NVMe storage close to compute nodes, which can further reduce query latency by up to 40 percent. Built on top of the next‑generation OSS Java SDK V2, the connector adopts default V4 authentication for stronger security and improved performance. These enhancements make OSS Connector for Hadoop V2 a high‑performance, cloud‑native storage interface for AI data lakes, supporting large‑scale ETL, interactive analytics, and machine learning pipelines with significantly lower overhead and higher throughput.
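One common way to realize adaptive prefetching, sketched below purely as an illustration (the connector's actual algorithm is not documented here), is to grow the readahead window while access stays sequential and reset it on a random seek.

```python
class AdaptiveReader:
    """Readahead window that doubles on sequential reads, resets on seeks."""

    def __init__(self, min_window=1, max_window=16):
        self.min_window, self.max_window = min_window, max_window
        self.window = min_window
        self.next_offset = 0

    def read(self, offset):
        if offset == self.next_offset:
            # Sequential access: grow the prefetch window (capped).
            self.window = min(self.window * 2, self.max_window)
        else:
            # Random seek: shrink back to the minimum window.
            self.window = self.min_window
        self.next_offset = offset + self.window
        return self.window  # how many blocks to prefetch from this offset

r = AdaptiveReader()
print([r.read(o) for o in (0, 2, 6, 100)])  # → [2, 4, 8, 1]
```

Sequential scans (typical of ETL and columnar reads) quickly reach the full window and amortize round-trips, while point lookups avoid wasting bandwidth on data that will never be read.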

OSS Resource Pool QoS Upgrade

The OSS Resource Pool QoS upgrade introduces a unified performance management framework that allows multiple buckets and workloads to share a common throughput pool while maintaining predictable service quality. Instead of each bucket operating in isolation, enterprises can now allocate and prioritize throughput across business units, job types, or RAM accounts. QoS policies support both priority‑based dynamic control—ensuring critical online services receive the bandwidth they need during peak hours—and minimum guaranteed throughput, which protects lower‑priority batch or analytics jobs from starvation. This fine‑grained control enables stable performance for mixed workloads such as AI training, data preprocessing, online inference, and large‑scale data migration, all while maximizing overall resource utilization. With throughput pools scaling to tens of terabits per second, OSS QoS becomes a foundational capability for storage‑compute separation architectures and unified AI data lakes.
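The two QoS guarantees described above, guaranteed minimums plus priority-ordered sharing of the remainder, can be sketched as a simple allocation pass. The numbers and workload names are invented for illustration; this is not the OSS QoS implementation.

```python
def allocate(pool_gbps, workloads):
    """Allocate a shared throughput pool.

    workloads: list of dicts with name, priority (1 = highest), min, demand.
    Each workload first gets its guaranteed minimum (capped by demand),
    then leftover capacity is handed out in priority order.
    """
    alloc = {w["name"]: min(w["min"], w["demand"]) for w in workloads}
    remaining = pool_gbps - sum(alloc.values())
    for w in sorted(workloads, key=lambda w: w["priority"]):
        extra = min(w["demand"] - alloc[w["name"]], remaining)
        alloc[w["name"]] += extra
        remaining -= extra
    return alloc

workloads = [
    {"name": "online",   "priority": 1, "min": 10, "demand": 60},
    {"name": "training", "priority": 2, "min": 20, "demand": 50},
    {"name": "batch",    "priority": 4, "min": 5,  "demand": 40},
]
print(allocate(100, workloads))
# → {'online': 60, 'training': 35, 'batch': 5}
```

Note the batch job still receives its 5 Gbps floor even though higher-priority workloads consumed all remaining capacity, which is the starvation protection the minimum guarantee provides.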

OSS SDK V2 Upgrade

The OSS SDK V2 upgrade delivers a comprehensive modernization of the OSS client experience, offering higher performance, stronger security, and broader language coverage for developers building AI, analytics, and cloud‑native applications. This new version introduces a fully asynchronous API architecture that significantly improves throughput for high‑concurrency workloads such as model training, data ingestion, and large‑scale ETL. It adopts default V4 authentication for enhanced security and more efficient request signing, reducing overhead for frequent or parallel operations. SDK V2 provides full language support—including Go, Python, PHP, .NET, Swift, and Java—ensuring consistent behavior and performance across diverse development environments. With improved error handling, streamlined configuration, and optimized network usage, SDK V2 enables developers to interact with OSS more efficiently while taking full advantage of the platform’s evolving AI‑native capabilities.
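The throughput benefit of an asynchronous API comes from issuing many requests concurrently instead of serially. The sketch below shows that usage pattern with `asyncio`; `FakeClient` is a stand-in that simulates network latency, not the OSS SDK V2 interface.

```python
import asyncio

class FakeClient:
    """Stand-in for an async storage client; put_object simulates one RTT."""

    async def put_object(self, key, data):
        await asyncio.sleep(0.01)   # simulate network I/O
        return key

async def upload_all(client, items):
    # gather() runs all puts concurrently, so total wall time is roughly
    # one round-trip rather than len(items) round-trips.
    return await asyncio.gather(
        *(client.put_object(k, v) for k, v in items.items())
    )

items = {f"part-{i}": b"data" for i in range(5)}
done = asyncio.run(upload_all(FakeClient(), items))
print(sorted(done))  # → ['part-0', 'part-1', 'part-2', 'part-3', 'part-4']
```

For high-concurrency workloads such as parallel ingestion or multipart uploads, this pattern keeps the network pipe full without spawning a thread per request.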

These upgrades enable OSS to serve as the backbone for AI data lakes and inference platforms.

Conclusion

Alibaba Cloud OSS is no longer just object storage—it’s a full-stack, AI-native data infrastructure. With Vector Bucket, Metaquery, Content Awareness, Semantic Retrieval, OSS Accelerator and other upgrades, OSS supports:

● AI training and inference

● Intelligent data discovery

● Semantic search and retrieval

● Cost-effective RAG applications

● AI-powered Content Management Platform

● Unified data lake architectures

Whether you're building AI Agents, managing enterprise digital assets, or scaling AI workloads, OSS is ready to power your next-generation data strategy.
