Alibaba Cloud OSS: From Object Storage to AI-Native Data Infrastructure with Vector Bucket & Metaquery

Summary: This article provides an in-depth look at how OSS builds an AI-native storage foundation to empower key scenarios including Retrieval-Augmented Generation (RAG), enterprise search, and AI-powered content management, helping you efficiently build next-generation intelligent applications.

By Justin See


As data volumes explode and AI becomes central to enterprise competitiveness, traditional object storage must evolve. Alibaba Cloud Object Storage Service (OSS) is leading this transformation—moving from a passive data carrier to an intelligent, AI-powered knowledge provider.

This post explores OSS's technical foundation and its evolution toward AI-native workloads, then takes a deep dive into two major innovations: Vector Bucket and Metaquery.

In the AI era, vector data and the ability to query it form the foundation of AI applications. The diagram below illustrates how diverse unstructured data types (documents, images, and videos) are transformed into high-dimensional vector representations through embedding. These vectors serve as the backbone for advanced AI capabilities such as content awareness and semantic retrieval. By converting raw data into vectors, OSS enables intelligent indexing, search, and discovery across massive datasets, forming the basis for applications like retrieval-augmented generation (RAG), enterprise search, and AI content management.

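[Figure: unstructured data (documents, images, videos) transformed into vector representations through embedding]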

1. OSS Architecture and Core Capabilities


Alibaba Cloud OSS is a distributed, highly durable object storage system designed for massive-scale unstructured data. It supports workloads ranging from web hosting and backup to AI training and semantic retrieval.

OSS Architecture Layers

| Layer | Components |
|---|---|
| Application Interfaces | API, SDKs, CLI |
| OSS Service | Control Layer, Access Layer, Metadata Indexing |
| Storage Infrastructure | Erasure Coding, Zone-Redundant Storage |
| AI Integration | MCP Server, AI Assistant, Semantic Retrieval Engine |

Storage Classes and Cost Optimization

| Class | Use Case | Retrieval Time |
|---|---|---|
| Standard | Hot data | Instant |
| IA | Infrequent access | Instant |
| Archive | Cold data | Minutes |
| Cold Archive | Deep archive | Hours |
| Deep Cold Archive | Long-term retention | Hours |

Lifecycle policies and OSS Inventory now support predictive tiering and DryRun simulations to prevent accidental deletions.
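To make the idea of a DryRun simulation concrete, here is a minimal sketch in plain Python. It does not call the OSS API. The `StoredObject` and `LifecycleRule` types, the `dry_run` function, and the rule shape (key prefix plus minimum age since last modification) are all hypothetical simplifications used for illustration. The point is only that a rule can be evaluated against an object listing to produce a plan, without changing anything.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class StoredObject:
    key: str
    last_modified: date
    storage_class: str = "Standard"


@dataclass
class LifecycleRule:
    prefix: str        # rule applies to keys starting with this prefix
    min_age_days: int  # minimum days since last modification
    action: str        # "delete", or a target storage class such as "Archive"


def dry_run(rule: LifecycleRule, objects: list[StoredObject], today: date) -> list[tuple[str, str]]:
    """Return (key, planned_action) pairs without modifying any object."""
    plan = []
    for obj in objects:
        age_days = (today - obj.last_modified).days
        if obj.key.startswith(rule.prefix) and age_days >= rule.min_age_days:
            plan.append((obj.key, rule.action))
    return plan


# Hypothetical object listing and rule, for illustration only.
objects = [
    StoredObject("logs/a.log", date(2025, 1, 1)),
    StoredObject("logs/b.log", date(2025, 5, 20)),
    StoredObject("images/c.png", date(2024, 1, 1)),
]
rule = LifecycleRule(prefix="logs/", min_age_days=90, action="delete")
plan = dry_run(rule, objects, today=date(2025, 6, 1))
print(plan)
```

Reviewing such a plan before enabling a rule is what guards against accidental deletions: only `logs/a.log` matches here, because `logs/b.log` is too recent and `images/c.png` is outside the prefix.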

Performance and QoS

OSS Resource Pool QoS enables:

● Shared throughput across buckets

● Priority-based scheduling (P1–P4)

● Guaranteed minimum bandwidth

● Real-time monitoring and dynamic allocation

This ensures stable performance across mixed workloads—online services, batch jobs, AI training, and data migration.
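The interplay between guaranteed minimums and priority-based scheduling can be sketched as a two-pass allocation. This is an illustrative model only, not the actual OSS scheduler: the `Workload` type, the `allocate` function, and the Gbps figures are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int          # 1 = highest (P1), 4 = lowest (P4)
    demand_gbps: float     # throughput the workload is currently asking for
    min_gbps: float = 0.0  # guaranteed minimum bandwidth


def allocate(pool_gbps: float, workloads: list[Workload]) -> dict[str, float]:
    """Split a shared throughput pool across workloads."""
    # Pass 1: every workload gets its guaranteed minimum (capped by demand).
    grants = {w.name: min(w.min_gbps, w.demand_gbps) for w in workloads}
    remaining = pool_gbps - sum(grants.values())
    if remaining < 0:
        raise ValueError("guaranteed minimums exceed pool capacity")
    # Pass 2: hand out what is left in priority order, up to each demand.
    for w in sorted(workloads, key=lambda w: w.priority):
        extra = min(w.demand_gbps - grants[w.name], remaining)
        grants[w.name] += extra
        remaining -= extra
    return grants


grants = allocate(
    100.0,
    [
        Workload("online-service", priority=1, demand_gbps=60.0),
        Workload("ai-training", priority=2, demand_gbps=50.0),
        Workload("batch-migration", priority=4, demand_gbps=30.0, min_gbps=10.0),
    ],
)
print(grants)
```

In this run the P1 service receives its full 60 Gbps, the P2 training job receives the 30 Gbps that remain, and the P4 migration job still keeps its guaranteed 10 Gbps instead of being starved.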

2. OSS Vector Bucket: AI-Native Storage for Embeddings


Vector data is the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). OSS Vector Bucket introduces native support for storing, indexing, and querying high-dimensional vectors.

Architecture

● Raw data stored in OSS Bucket

● Embedding service generates vectorized data

● Vectors stored in OSS Vector Bucket

● MCP Server enables semantic retrieval for AI Content Awareness

● AI Agent integrates with RAG and AI Semantic Retrieval
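The flow above (embed, store, retrieve) can be sketched with a toy in-memory index. This is not the OSS Vector Bucket API: `ToyVectorIndex` is a hypothetical stand-in, the three-dimensional vectors are hand-written placeholders for real embedding output, and the search is a brute-force cosine similarity scan rather than a production index.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class ToyVectorIndex:
    """In-memory stand-in for a vector store, keyed by object key."""
    vectors: dict[str, list[float]] = field(default_factory=dict)

    def put(self, object_key: str, vector: list[float]) -> None:
        self.vectors[object_key] = vector

    def query(self, query_vector: list[float], top_k: int = 3) -> list[tuple[str, float]]:
        scored = [
            (key, cosine_similarity(query_vector, vec))
            for key, vec in self.vectors.items()
        ]
        scored.sort(key=lambda item: item[1], reverse=True)
        return scored[:top_k]


# Raw objects live in a regular bucket; only their vectors go into the index.
index = ToyVectorIndex()
index.put("docs/invoice.pdf", [0.9, 0.1, 0.0])
index.put("images/cat.jpg", [0.0, 0.2, 0.9])
index.put("docs/contract.pdf", [0.8, 0.3, 0.1])

# The query vector would come from embedding the user's question.
results = index.query([1.0, 0.2, 0.0], top_k=2)
print([key for key, _ in results])
```

An agent or RAG application would then fetch the matching raw objects by key and pass their content to the model as context.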

Differences from Traditional Vector Databases

As enterprise AI workloads scale, the volume of vectorized data grows exponentially, driving up infrastructure costs and straining traditional storage architectures. OSS Vector Bucket offers a cost-effective alternative by decoupling compute from storage, allowing vector queries to be executed directly on OSS without relying on tightly coupled service nodes. Most AI use cases can tolerate retrieval latencies in the hundreds of milliseconds. OSS Vector Bucket accepts higher latency than a dedicated vector database in exchange for significantly lower operational costs, through pay-as-you-go pricing for both storage and query scans. By migrating to OSS Vector Bucket, enterprises can build retrieval-augmented generation (RAG) applications that are not only scalable and performant, but also financially sustainable.

Use Cases

● RAG Applications

● AI Agent Retrieval

● AI-powered Content Management Platforms for Social Media, E-commerce, and Media

Key Features

● Supports hundreds of billions of vectors per account

● Native OSS API with SDK, CLI, and console access

● Integrated with Tablestore for high-performance workloads

● Pay-as-you-go pricing: storage capacity, query volume

● Unified permission management via OSS bucket policies

Key Benefits

● Lower vector database costs

● Automatic scaling & limitless elasticity

● Reduces data silos and management overhead

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Vector Bucket Documentation

3. OSS Metaquery: Content Awareness & Semantic Search


OSS Metaquery transforms raw unstructured data into an intelligent, searchable knowledge layer by automatically generating embeddings for every newly added object, eliminating the need for manual preprocessing or external vector pipelines. Once embedded, the data becomes immediately accessible through semantic search, allowing users to query their OSS buckets using natural language rather than rigid keywords or file paths. A hybrid retrieval engine combines scalar filtering (metadata conditions on size, time, or tags) with vector-based similarity search to deliver highly relevant results. In production deployments, this approach has demonstrated precision and recall rates of up to 85 percent, significantly outperforming traditional self-built search solutions and enabling enterprises to build powerful retrieval-augmented and content-aware applications directly on top of OSS.
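Hybrid retrieval of this kind can be illustrated with a small sketch: filter on scalar metadata first, then rank the survivors by vector similarity. The code below is a conceptual model, not the Metaquery API. `ObjectRecord`, `hybrid_search`, the tag names, and the two-dimensional embeddings are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectRecord:
    key: str
    size_bytes: int
    tags: dict[str, str]
    embedding: list[float]  # assumed to be generated automatically on upload


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(
    records: list[ObjectRecord],
    query_embedding: list[float],
    *,
    max_size_bytes: Optional[int] = None,
    required_tags: Optional[dict[str, str]] = None,
    top_k: int = 5,
) -> list[str]:
    """Apply scalar filters, then rank remaining records by similarity."""
    required_tags = required_tags or {}
    candidates = [
        r for r in records
        if (max_size_bytes is None or r.size_bytes <= max_size_bytes)
        and all(r.tags.get(k) == v for k, v in required_tags.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda r: cosine(query_embedding, r.embedding),
        reverse=True,
    )
    return [r.key for r in ranked[:top_k]]


records = [
    ObjectRecord("video/a.mp4", 500_000_000, {"team": "marketing"}, [0.9, 0.1]),
    ObjectRecord("video/b.mp4", 20_000_000, {"team": "marketing"}, [0.7, 0.7]),
    ObjectRecord("video/c.mp4", 10_000_000, {"team": "sales"}, [1.0, 0.0]),
    ObjectRecord("video/d.mp4", 15_000_000, {"team": "marketing"}, [0.1, 0.9]),
]
hits = hybrid_search(
    records,
    [1.0, 0.0],
    max_size_bytes=100_000_000,
    required_tags={"team": "marketing"},
)
print(hits)
```

Note that `video/c.mp4` is the closest vector match but is excluded by the tag filter, and `video/a.mp4` is excluded by size. Filtering before ranking keeps the similarity computation limited to objects the caller actually cares about.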

Use Cases

● Intelligent Enterprise Search

● RAG Applications

● AI-powered Content Management Platform

Key Features

● One-click enablement of automatic embedding generation

● Semantic search using natural language

● No extra management overhead

Key Benefits

● Significantly shorter time to market, with no need to build a vector embedding pipeline or a natural language search engine

● Higher search accuracy through multi-channel recall

● Lower costs by leveraging cost-effective object storage, with no additional components to build or manage

Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20

Documentation: Metaquery Documentation

4. OSS Accelerator and Other Tooling Upgrades


To support high-throughput AI workloads, OSS also introduced:

OSS Accelerator (EFC Cache) Upgrade

The OSS Accelerator introduces a high‑performance, compute‑proximate caching layer designed to dramatically improve data access speeds for AI, analytics, and real‑time workloads running on Alibaba Cloud. By deploying NVMe‑based cache nodes in the same zone as compute resources, the accelerator reduces read latency to single‑digit milliseconds and supports burst throughput of up to 100 GB/s with more than 100,000 QPS. It requires no application changes: all OSS clients, SDKs, and tools automatically benefit from acceleration through a unified namespace, with strong consistency maintained between cached data and the underlying OSS objects. Multiple caching strategies—including on‑read prefetch, manual prefetch, and synchronized prefetch—ensure that hot data is always available at high speed, while an LRU eviction policy manages cache capacity efficiently. This upgrade is particularly impactful for workloads such as model loading, BI hot table queries, and high‑frequency inference, enabling organizations to achieve low‑latency performance while keeping the majority of their data stored cost‑effectively across OSS Standard, IA, Archive, and Cold Archive tiers.
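The caching behavior described here (on-read caching, manual prefetch, LRU eviction) can be modeled in a few lines. This is a conceptual sketch, not the accelerator's implementation: `ReadThroughLRUCache` and the dictionary standing in for OSS are hypothetical, and the sketch omits the consistency checks a real cache needs when the underlying object changes.

```python
from collections import OrderedDict
from typing import Callable, Iterable


class ReadThroughLRUCache:
    """Toy byte-capacity LRU cache in front of a slower origin store."""

    def __init__(self, capacity_bytes: int, fetch_from_origin: Callable[[str], bytes]):
        self.capacity_bytes = capacity_bytes
        self._fetch = fetch_from_origin
        self._entries = OrderedDict()  # key -> bytes, oldest first
        self._used = 0
        self.hits = 0
        self.misses = 0

    def read(self, key: str) -> bytes:
        """On-read caching: serve from cache, or fetch and keep a copy."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        data = self._fetch(key)
        self._insert(key, data)
        return data

    def prefetch(self, keys: Iterable[str]) -> None:
        """Manual prefetch: warm the cache before the workload starts."""
        for key in keys:
            if key not in self._entries:
                self._insert(key, self._fetch(key))

    def cached_keys(self) -> list[str]:
        return list(self._entries)

    def _insert(self, key: str, data: bytes) -> None:
        if len(data) > self.capacity_bytes:
            return  # object larger than the whole cache: do not cache it
        while self._used + len(data) > self.capacity_bytes:
            _, evicted = self._entries.popitem(last=False)  # evict LRU entry
            self._used -= len(evicted)
        self._entries[key] = data
        self._used += len(data)


# A dict stands in for the origin bucket; each object is 4 bytes.
origin = {"a": b"xxxx", "b": b"yyyy", "c": b"zzzz"}
cache = ReadThroughLRUCache(capacity_bytes=8, fetch_from_origin=origin.__getitem__)

cache.read("a")  # miss
cache.read("b")  # miss
cache.read("a")  # hit, "a" becomes most recently used
cache.read("c")  # miss, evicts "b" (least recently used)
print(cache.cached_keys(), cache.hits, cache.misses)
```

Because `a` was read again before `c` arrived, the least recently used entry is `b`, which is the one evicted.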

OSSFS V2 Upgrade

OSSFS V2 is a next‑generation, high‑performance mount tool designed to make OSS behave like a local file system for AI, analytics, and containerized workloads. Built with a lightweight protocol and deeply optimized I/O path, OSSFS V2 delivers substantial performance improvements over previous versions—achieving up to 3 GB/s single‑thread read throughput and significantly reducing CPU and memory overhead. It introduces a negative metadata cache to minimize redundant lookups, improving responsiveness for workloads that perform frequent directory scans or small‑file access. OSSFS V2 is fully compatible with Kubernetes environments through CSI integration, enabling seamless mounting of OSS buckets as persistent volumes in ACK and ACS clusters. This makes it ideal for applications that cannot be easily modified to use the OSS SDK directly, such as legacy data processing pipelines, distributed training frameworks, and containerized AI workloads. With strong consistency, elastic scalability, and support for high‑concurrency access, OSSFS V2 allows organizations to use OSS as a high‑throughput data layer across the entire AI lifecycle—from data ingestion and preprocessing to model training and inference.
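A negative metadata cache remembers that a path does not exist, so repeated lookups for the same missing path do not each trigger a remote request. The sketch below shows the general technique under stated assumptions. It is not OSSFS code: `NegativeLookupCache`, the TTL value, and the dictionary standing in for the remote metadata service are hypothetical.

```python
import time
from typing import Callable, Optional


class NegativeLookupCache:
    """Caches 'not found' results for a limited time."""

    def __init__(
        self,
        stat_remote: Callable[[str], Optional[dict]],
        ttl_seconds: float = 5.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._stat_remote = stat_remote
        self._ttl = ttl_seconds
        self._clock = clock
        self._missing = {}  # path -> expiry timestamp
        self.remote_calls = 0

    def stat(self, path: str) -> Optional[dict]:
        expiry = self._missing.get(path)
        if expiry is not None:
            if self._clock() < expiry:
                return None  # still known to be missing: skip the remote call
            del self._missing[path]  # entry expired: check the remote again
        self.remote_calls += 1
        meta = self._stat_remote(path)
        if meta is None:
            self._missing[path] = self._clock() + self._ttl
        return meta

    def invalidate(self, path: str) -> None:
        """Call after creating a file so it is not reported missing."""
        self._missing.pop(path, None)


# A dict stands in for the remote metadata service.
remote = {"data/train.csv": {"size": 1024}}
cache = NegativeLookupCache(remote.get, ttl_seconds=60.0)

for _ in range(3):
    cache.stat("data/missing.csv")
print(cache.remote_calls)
```

Only the first of the three lookups reaches the remote side. The TTL and the explicit `invalidate` call are the two usual ways to keep a negative cache from hiding newly created files for too long.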

OSS Connector for Hadoop V2 Upgrade

The OSS Connector for Hadoop V2 delivers a major performance and efficiency leap for data lake and big data analytics workloads running on Hadoop and Spark. This new version introduces an adaptive prefetching mechanism that eliminates redundant metadata operations, significantly improving read throughput—up to 5.8× faster in benchmark tests—and reducing end‑to‑end SQL query time by 28.5 percent. It also integrates seamlessly with OSS Accelerator, enabling hot data to be cached on NVMe storage close to compute nodes, which can further reduce query latency by up to 40 percent. Built on top of the next‑generation OSS Java SDK V2, the connector adopts default V4 authentication for stronger security and improved performance. These enhancements make OSS Connector for Hadoop V2 a high‑performance, cloud‑native storage interface for AI data lakes, supporting large‑scale ETL, interactive analytics, and machine learning pipelines with significantly lower overhead and higher throughput.
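The article does not describe how the connector's adaptive prefetching works internally. As a generic illustration of the idea, the sketch below shows one common approach: a read-ahead window that grows while reads are sequential and resets when access becomes random. The `AdaptiveReadahead` class and its window sizes are assumptions for this example and should not be read as the connector's actual algorithm.

```python
class AdaptiveReadahead:
    """Grows the prefetch window on sequential reads, resets on random reads."""

    def __init__(self, min_window: int = 64 * 1024, max_window: int = 8 * 1024 * 1024):
        self.min_window = min_window
        self.max_window = max_window
        self.window = min_window
        self._next_expected = None  # offset where a sequential read would start

    def on_read(self, offset: int, length: int) -> tuple[int, int]:
        """Record a read and return the (start, size) range to prefetch next."""
        if self._next_expected is not None and offset == self._next_expected:
            # Sequential access: double the window, up to the maximum.
            self.window = min(self.window * 2, self.max_window)
        else:
            # First read or a seek: fall back to the smallest window.
            self.window = self.min_window
        self._next_expected = offset + length
        return (self._next_expected, self.window)


# Tiny window sizes make the behavior easy to follow.
readahead = AdaptiveReadahead(min_window=4, max_window=16)
plan = [
    readahead.on_read(0, 10),    # first read
    readahead.on_read(10, 10),   # sequential
    readahead.on_read(20, 10),   # sequential
    readahead.on_read(30, 10),   # sequential, window already at maximum
    readahead.on_read(500, 10),  # random seek
]
print(plan)
```

Large sequential scans, typical of columnar reads in Spark, benefit from the growing window, while point lookups avoid fetching data they will never use.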

OSS Resource Pool QoS Upgrade

The OSS Resource Pool QoS upgrade introduces a unified performance management framework that allows multiple buckets and workloads to share a common throughput pool while maintaining predictable service quality. Instead of each bucket operating in isolation, enterprises can now allocate and prioritize throughput across business units, job types, or RAM accounts. QoS policies support both priority‑based dynamic control—ensuring critical online services receive the bandwidth they need during peak hours—and minimum guaranteed throughput, which protects lower‑priority batch or analytics jobs from starvation. This fine‑grained control enables stable performance for mixed workloads such as AI training, data preprocessing, online inference, and large‑scale data migration, all while maximizing overall resource utilization. With throughput pools scaling to tens of terabits per second, OSS QoS becomes a foundational capability for storage‑compute separation architectures and unified AI data lakes.

OSS SDK V2 Upgrade

The OSS SDK V2 upgrade delivers a comprehensive modernization of the OSS client experience, offering higher performance, stronger security, and broader language coverage for developers building AI, analytics, and cloud‑native applications. This new version introduces a fully asynchronous API architecture that significantly improves throughput for high‑concurrency workloads such as model training, data ingestion, and large‑scale ETL. It adopts default V4 authentication for enhanced security and more efficient request signing, reducing overhead for frequent or parallel operations. SDK V2 provides full language support—including Go, Python, PHP, .NET, Swift, and Java—ensuring consistent behavior and performance across diverse development environments. With improved error handling, streamlined configuration, and optimized network usage, SDK V2 enables developers to interact with OSS more efficiently while taking full advantage of the platform’s evolving AI‑native capabilities.
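To show why an asynchronous client helps high-concurrency workloads, here is a minimal `asyncio` sketch that uploads many objects with a bounded number of requests in flight. It does not use the OSS SDK: `fake_put_object` is a hypothetical stand-in that simulates network latency, and `upload_all` is an illustrative helper, not an SDK function.

```python
import asyncio


async def fake_put_object(key: str, data: bytes) -> int:
    """Hypothetical stand-in for an async upload; returns bytes written."""
    await asyncio.sleep(0.01)  # simulated network round trip
    return len(data)


async def upload_all(objects: dict[str, bytes], max_concurrency: int = 4) -> dict[str, int]:
    """Upload all objects concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def upload_one(key: str, data: bytes) -> tuple[str, int]:
        async with semaphore:
            return key, await fake_put_object(key, data)

    pairs = await asyncio.gather(*(upload_one(k, v) for k, v in objects.items()))
    return dict(pairs)


# Ten objects of increasing size: shard-0.bin is empty, shard-9.bin is 9 bytes.
objects = {f"shard-{i}.bin": bytes(i) for i in range(10)}
uploaded = asyncio.run(upload_all(objects))
print(uploaded)
```

While one request waits on the network, others proceed on the same thread, so total time is closer to the number of batches than to the number of objects. The semaphore keeps the client from overwhelming the service or its own connection pool.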

These upgrades enable OSS to serve as the backbone for AI data lakes and inference platforms.

Conclusion

Alibaba Cloud OSS is no longer just object storage—it’s a full-stack, AI-native data infrastructure. With Vector Bucket, Metaquery, Content Awareness, Semantic Retrieval, OSS Accelerator and other upgrades, OSS supports:

● AI training and inference

● Intelligent data discovery

● Semantic search and retrieval

● Cost-effective RAG applications

● AI-powered Content Management Platform

● Unified data lake architectures

Whether you're building AI Agents, managing enterprise digital assets, or scaling AI workloads, OSS is ready to power your next-generation data strategy.
