By Justin See
As data volumes explode and AI becomes central to enterprise competitiveness, traditional object storage must evolve. Alibaba Cloud Object Storage Service (OSS) is leading this transformation—moving from a passive data carrier to an intelligent, AI-powered knowledge provider.
This blog explores OSS’s technical foundation and its evolution toward AI-native workloads, then dives deep into two major innovations: Vector Bucket and MetaQuery.
In the AI era, vector data, and the ability to query it, form the foundation of AI applications. Diverse unstructured data types (documents, images, videos) are transformed into high-dimensional vector representations through embedding. These vectors serve as the backbone for advanced AI capabilities such as content awareness and semantic retrieval. By converting raw data into structured vector formats, OSS enables intelligent indexing, search, and discovery across massive datasets, forming the basis for applications like retrieval-augmented generation (RAG), enterprise search, and AI content management.
1. OSS Architecture and Core Capabilities
Alibaba Cloud OSS is a distributed, highly durable object storage system designed for massive-scale unstructured data. It supports workloads ranging from web hosting and backup to AI training and semantic retrieval.
OSS Architecture Layers
| Layer | Components |
| --- | --- |
| Application Interfaces | API, SDKs, CLI |
| OSS Service | Control Layer, Access Layer, Metadata Indexing |
| Storage Infrastructure | Erasure Coding, Zone-Redundant Storage |
| AI Integration | MCP Server, AI Assistant, Semantic Retrieval Engine |
Storage Classes and Cost Optimization
| Class | Use Case | Retrieval Time |
| --- | --- | --- |
| Standard | Hot data | Instant |
| IA | Infrequent access | Instant |
| Archive | Cold data | Minutes |
| Cold Archive | Deep archive | Hours |
| Deep Cold Archive | Long-term retention | Hours |
Lifecycle policies and OSS Inventory now support predictive tiering and DryRun simulations to prevent accidental deletions.
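As a concrete illustration, the sketch below sets up basic tiering rules with the oss2 Python SDK. The endpoint, bucket name, key prefix, and day thresholds are placeholders; DryRun simulation and predictive tiering are separate options on lifecycle policies and are not shown here.

```python
# Minimal sketch: tier objects under logs/ to IA after 30 days and to
# Archive after 180 days, using the oss2 Python SDK (placeholders marked).
import oss2
from oss2.models import BucketLifecycle, LifecycleRule, StorageTransition

auth = oss2.Auth('<access_key_id>', '<access_key_secret>')  # placeholder credentials
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

rule = LifecycleRule(
    'tier-logs', 'logs/',               # rule id and key prefix (placeholders)
    status=LifecycleRule.ENABLED,
    storage_transitions=[
        StorageTransition(days=30, storage_class=oss2.BUCKET_STORAGE_CLASS_IA),
        StorageTransition(days=180, storage_class=oss2.BUCKET_STORAGE_CLASS_ARCHIVE),
    ],
)
bucket.put_bucket_lifecycle(BucketLifecycle([rule]))
```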
Performance and QoS
OSS Resource Pool QoS enables:
● Shared throughput across buckets
● Priority-based scheduling (P1–P4)
● Guaranteed minimum bandwidth
● Real-time monitoring and dynamic allocation
This ensures stable performance across mixed workloads—online services, batch jobs, AI training, and data migration.
2. OSS Vector Bucket: AI-Native Storage for Embeddings
Vector data is the foundation of semantic search, recommendation engines, and retrieval-augmented generation (RAG). OSS Vector Bucket introduces native support for storing, indexing, and querying high-dimensional vectors.
Architecture
● Raw data stored in OSS Bucket
● Embedding service generates vectorized data
● Vectors stored in OSS Vector Bucket
● MCP Server enables semantic retrieval for AI Content Awareness
● AI Agent integrates with RAG and AI Semantic Retrieval
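To make the flow concrete, here is a minimal, self-contained sketch of the embed-store-query loop. The `VectorStoreStub` class is an in-memory stand-in, not the published Vector Bucket API; in practice the store and query steps go through the OSS API/SDK, and the embeddings come from your embedding service.

```python
# Conceptual stand-in for the Vector Bucket flow: store embeddings, then
# answer top-k similarity queries. Not the real service interface.
import math

class VectorStoreStub:
    """In-memory stand-in for a vector bucket."""
    def __init__(self):
        self._vectors = {}

    def put_vector(self, key, vector):
        self._vectors[key] = vector

    def query(self, vector, top_k=3):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb)
        scored = [(key, cosine(vector, v)) for key, v in self._vectors.items()]
        return sorted(scored, key=lambda kv: kv[1], reverse=True)[:top_k]

store = VectorStoreStub()
store.put_vector("doc-001", [0.12, 0.80, 0.55])   # embeddings would come from
store.put_vector("doc-002", [0.90, 0.10, 0.20])   # your embedding service
print(store.query([0.10, 0.75, 0.60], top_k=1))   # doc-001 ranks first (~0.997)
```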
Differences from a Traditional Vector Database
As enterprise AI workloads scale, the volume of vectorized data grows exponentially, driving up infrastructure costs and straining traditional storage architectures. OSS Vector Bucket offers a cost-effective alternative by decoupling compute from storage, allowing vector queries to be executed directly on OSS without relying on tightly coupled service nodes. In most AI use cases, customers can tolerate retrieval latencies of hundreds of milliseconds; OSS Vector Bucket trades this higher latency for significantly lower operational costs through pay-as-you-go pricing for both storage and query scans. By migrating to OSS Vector Bucket, enterprises can build retrieval-augmented generation (RAG) applications that are not only scalable and performant but also financially sustainable.
Use Cases
● RAG Applications
● AI Agent Retrieval
● AI-powered Content Management Platform for Social Media, E-commerce, Media
Key Features
● Supports hundreds of billions of vectors per account
● Native OSS API with SDK, CLI, and console access
● Integrated with Tablestore for high-performance workloads
● Pay-as-you-go pricing: storage capacity, query volume
● Unified permission management via OSS bucket policies
Key Benefits
● Lower vector database costs
● Automatic scaling & limitless elasticity
● Reduces data silos and management overhead
Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20
Documentation: Vector Bucket Documentation
3. OSS MetaQuery: Content Awareness & Semantic Search
OSS MetaQuery transforms raw unstructured data into an intelligent, searchable knowledge layer by automatically generating embeddings for every newly added object, eliminating the need for manual preprocessing or external vector pipelines. Once embedded, the data becomes immediately accessible through semantic search, allowing users to query their OSS buckets in natural language rather than with rigid keywords or file paths. The system combines scalar filtering (metadata conditions on size, time, or tags) with vector-based similarity search to deliver highly relevant results through a hybrid retrieval engine. In production deployments, this approach has demonstrated precision-recall rates of up to 85 percent, significantly outperforming traditional self-built search solutions and enabling enterprises to build powerful retrieval-augmented and content-aware applications directly on top of OSS.
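The scalar side of this can be exercised through the oss2 SDK's meta query interface. The sketch below follows the SDK's documented pattern, but treat field names as version-dependent; see the MetaQuery documentation for the natural-language semantic mode.

```python
# Sketch of a scalar MetaQuery with the oss2 Python SDK; follows the SDK's
# documented sample, but field names may vary across versions.
import oss2
from oss2.models import MetaQuery

auth = oss2.Auth('<access_key_id>', '<access_key_secret>')  # placeholders
bucket = oss2.Bucket(auth, 'https://oss-cn-hangzhou.aliyuncs.com', 'example-bucket')

bucket.open_meta_query()  # one-time: the index builds asynchronously after this

# Scalar filter: objects larger than 1 MiB, largest first (query is a JSON string).
query = '{"Field": "Size", "Value": "1048576", "Operation": "gt"}'
result = bucket.do_meta_query(
    MetaQuery(max_results=20, query=query, sort='Size', order='desc'))
for f in result.files:
    print(f.file_name, f.size)
```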
Use Cases
● Intelligent Enterprise Search
● RAG applications
● AI-powered Content Management Platform
Key Features
● One-click enablement of automatic embeddings
● Semantic search using natural language
● No extra management overheads
Key Benefits
● Significantly shorter time to market, with no need to build your own vector embedding or natural language search engine
● Higher-accuracy search through multi-channel recall
● Lower costs by leveraging cost-effective object storage, with no additional components to build or manage
Watch Demo: https://www.youtube.com/watch?v=xgY7k8hVS20
Documentation: MetaQuery Documentation
4. OSS Accelerator and Other Tooling Upgrades
To support high-throughput AI workloads, OSS also introduced:
OSS Accelerator (EFC Cache) Upgrade
The OSS Accelerator introduces a high‑performance, compute‑proximate caching layer designed to dramatically improve data access speeds for AI, analytics, and real‑time workloads running on Alibaba Cloud. By deploying NVMe‑based cache nodes in the same zone as compute resources, the accelerator reduces read latency to single‑digit milliseconds and supports burst throughput of up to 100 GB/s with more than 100,000 QPS. It requires no application changes: all OSS clients, SDKs, and tools automatically benefit from acceleration through a unified namespace, with strong consistency maintained between cached data and the underlying OSS objects. Multiple caching strategies—including on‑read prefetch, manual prefetch, and synchronized prefetch—ensure that hot data is always available at high speed, while an LRU eviction policy manages cache capacity efficiently. This upgrade is particularly impactful for workloads such as model loading, BI hot table queries, and high‑frequency inference, enabling organizations to achieve low‑latency performance while keeping the majority of their data stored cost‑effectively across OSS Standard, IA, Archive, and Cold Archive tiers.
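To make the eviction behavior concrete, the toy below implements a read-through cache with LRU eviction in Python. It is a conceptual stand-in only, not the accelerator's implementation; the `fetch` callback stands in for a backend OSS read.

```python
# Conceptual stand-in: read-through cache with LRU eviction.
from collections import OrderedDict

class ReadThroughLRU:
    def __init__(self, capacity, fetch):
        self.capacity = capacity      # max number of cached objects
        self.fetch = fetch            # backend reader (stand-in for an OSS GET)
        self.cache = OrderedDict()    # key -> data, ordered by recency

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)       # cache hit: mark most recent
            return self.cache[key]
        data = self.fetch(key)                # cache miss: read through
        self.cache[key] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used
        return data

# Usage with a stubbed backend; real reads would hit OSS instead.
cache = ReadThroughLRU(capacity=2, fetch=lambda k: f"object-bytes-for-{k}")
cache.get("model.bin"); cache.get("table.parquet"); cache.get("model.bin")
cache.get("index.faiss")                      # evicts table.parquet
```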
OSSFS V2 Upgrade
OSSFS V2 is a next‑generation, high‑performance mount tool designed to make OSS behave like a local file system for AI, analytics, and containerized workloads. Built with a lightweight protocol and deeply optimized I/O path, OSSFS V2 delivers substantial performance improvements over previous versions—achieving up to 3 GB/s single‑thread read throughput and significantly reducing CPU and memory overhead. It introduces a negative metadata cache to minimize redundant lookups, improving responsiveness for workloads that perform frequent directory scans or small‑file access. OSSFS V2 is fully compatible with Kubernetes environments through CSI integration, enabling seamless mounting of OSS buckets as persistent volumes in ACK and ACS clusters. This makes it ideal for applications that cannot be easily modified to use the OSS SDK directly, such as legacy data processing pipelines, distributed training frameworks, and containerized AI workloads. With strong consistency, elastic scalability, and support for high‑concurrency access, OSSFS V2 allows organizations to use OSS as a high‑throughput data layer across the entire AI lifecycle—from data ingestion and preprocessing to model training and inference.
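Once a bucket is mounted (the mount point below is an assumed placeholder; see the OSSFS V2 documentation for the actual mount and CSI setup), applications use ordinary file I/O:

```python
# Plain file I/O against an OSSFS V2 mount point (path is a placeholder).
from pathlib import Path

MOUNT = Path("/mnt/oss")             # assumed mount point for the bucket

# Directory scan: the kind of workload that benefits from the negative
# metadata cache; reads stream from OSS through the mount.
for p in sorted(MOUNT.glob("datasets/train/*.jsonl")):
    with p.open("rb") as f:
        first_line = f.readline()
        print(p.name, len(first_line))
```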
OSS Connector for Hadoop V2 Upgrade
The OSS Connector for Hadoop V2 delivers a major performance and efficiency leap for data lake and big data analytics workloads running on Hadoop and Spark. This new version introduces an adaptive prefetching mechanism that eliminates redundant metadata operations, significantly improving read throughput—up to 5.8× faster in benchmark tests—and reducing end‑to‑end SQL query time by 28.5 percent. It also integrates seamlessly with OSS Accelerator, enabling hot data to be cached on NVMe storage close to compute nodes, which can further reduce query latency by up to 40 percent. Built on top of the next‑generation OSS Java SDK V2, the connector adopts default V4 authentication for stronger security and improved performance. These enhancements make OSS Connector for Hadoop V2 a high‑performance, cloud‑native storage interface for AI data lakes, supporting large‑scale ETL, interactive analytics, and machine learning pipelines with significantly lower overhead and higher throughput.
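As a quick illustration, a Spark job reads a Parquet table through the connector's oss:// scheme. The bucket and path are placeholders, and the endpoint and credential configuration keys are assumed to be set per the connector documentation:

```python
# Hedged PySpark sketch: reading a Parquet table through the OSS connector.
# Bucket/path are placeholders; connector configuration is assumed in place.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("oss-read-example").getOrCreate()

df = spark.read.parquet("oss://example-bucket/warehouse/events/")  # placeholder path
df.groupBy("event_type").count().show()
```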
OSS Resource Pool QoS Upgrade
The OSS Resource Pool QoS upgrade introduces a unified performance management framework that allows multiple buckets and workloads to share a common throughput pool while maintaining predictable service quality. Instead of each bucket operating in isolation, enterprises can now allocate and prioritize throughput across business units, job types, or RAM accounts. QoS policies support both priority‑based dynamic control—ensuring critical online services receive the bandwidth they need during peak hours—and minimum guaranteed throughput, which protects lower‑priority batch or analytics jobs from starvation. This fine‑grained control enables stable performance for mixed workloads such as AI training, data preprocessing, online inference, and large‑scale data migration, all while maximizing overall resource utilization. With throughput pools scaling to tens of terabits per second, OSS QoS becomes a foundational capability for storage‑compute separation architectures and unified AI data lakes.
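The sketch below is a toy model of how these two policy types compose: each workload keeps its guaranteed minimum, and the remaining pool throughput is split by priority weight. It is conceptual only, not the QoS implementation:

```python
# Toy model of resource-pool QoS: guaranteed minimums plus priority-weighted
# sharing of the leftover pool throughput (conceptual only).
def allocate(pool_gbps, workloads):
    # workloads: {name: (min_gbps, priority_weight)}
    mins = {n: m for n, (m, _) in workloads.items()}
    spare = pool_gbps - sum(mins.values())    # assumes the pool covers minimums
    total_w = sum(w for _, w in workloads.values())
    return {n: mins[n] + spare * w / total_w
            for n, (_, w) in workloads.items()}

print(allocate(100, {
    "online-inference": (20, 4),   # P1: highest weight, wins most spare capacity
    "ai-training":      (30, 2),   # P2
    "batch-etl":        (10, 1),   # P4: protected from starvation by its minimum
}))
# -> roughly {'online-inference': 42.9, 'ai-training': 41.4, 'batch-etl': 15.7}
```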
OSS SDK V2 Upgrade
The OSS SDK V2 upgrade delivers a comprehensive modernization of the OSS client experience, offering higher performance, stronger security, and broader language coverage for developers building AI, analytics, and cloud‑native applications. This new version introduces a fully asynchronous API architecture that significantly improves throughput for high‑concurrency workloads such as model training, data ingestion, and large‑scale ETL. It adopts default V4 authentication for enhanced security and more efficient request signing, reducing overhead for frequent or parallel operations. SDK V2 provides full language support—including Go, Python, PHP, .NET, Swift, and Java—ensuring consistent behavior and performance across diverse development environments. With improved error handling, streamlined configuration, and optimized network usage, SDK V2 enables developers to interact with OSS more efficiently while taking full advantage of the platform’s evolving AI‑native capabilities.
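For example, a basic upload with the V2 Python SDK follows the pattern below; it mirrors the SDK's published examples, but treat names and fields as a sketch that may differ between versions:

```python
# Sketch of an upload with the OSS Python SDK V2 (alibabacloud-oss-v2),
# which uses V4 signing by default. Region and bucket are placeholders.
import alibabacloud_oss_v2 as oss

cfg = oss.config.load_default()
cfg.credentials_provider = oss.credentials.EnvironmentVariableCredentialsProvider()
cfg.region = "cn-hangzhou"                 # placeholder region

client = oss.Client(cfg)
result = client.put_object(oss.PutObjectRequest(
    bucket="example-bucket",               # placeholder bucket
    key="hello.txt",
    body=b"Hello, OSS SDK V2!",
))
print(result.status_code, result.request_id)
```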
These upgrades enable OSS to serve as the backbone for AI data lakes and inference platforms.
Conclusion
Alibaba Cloud OSS is no longer just object storage; it is a full-stack, AI-native data infrastructure. With Vector Bucket, MetaQuery, Content Awareness, Semantic Retrieval, OSS Accelerator, and the other upgrades above, OSS supports:
● AI training and inference
● Intelligent data discovery
● Semantic search and retrieval
● Cost-effective RAG applications
● AI-powered Content Management Platform
● Unified data lake architectures
Whether you're building AI Agents, managing enterprise digital assets, or scaling AI workloads, OSS is ready to power your next-generation data strategy.