cassandra mongodb选择——cassandra:分布式扩展好,写性能强,以及可以预料的查询;mongodb:非事务,支持复杂查询,但是不适合报表-阿里云开发者社区

开发者社区> 数据库> 正文

cassandra mongodb选择——cassandra:分布式扩展好,写性能强,以及可以预料的查询;mongodb:非事务,支持复杂查询,但是不适合报表


Of course, like any technology MongoDB has its strengths and weaknesses. MongoDB is designed for OLTP workloads. It can do complex queries, but it’s not necessarily the best fit for reporting-style workloads. Or if you need complex transactions, it’s not going to be a good choice. However, MongoDB’s simplicity makes it a great place to start.


This ease of scaling, coupled with exceptional write performance (“All you’re doing is appending to the end of a log file”) and predictable query performance, add up to a high-performance workhorse in Cassandra.



Cassandra does not support Range based row-scans which may be limiting in certain use-cases. Cassandra is well suited for supporting single-row queries, or selecting multiple rows based on a Column-Value index.Cassandra supports secondary indexes on column families where the column name is known. Aggregations in Cassandra are not supported by the Cassandra nodes - client must provide aggregations. When the aggregation requirement spans multiple rows, Random Partitioning makes aggregations very difficult for the client. Recommendation is to use Storm or Hadoop for aggregations.





Comparison Of NoSQL Databases HBase, Cassandra & MongoDB:
Key characteristics:
· Distributed and scalable big data store
· Strong consistency
· Built on top of Hadoop HDFS
· CP on CAP

Good for:
· Optimized for read
· Well suited for range based scan
· Strict consistency
· Fast read and write with scalability

Not good for:
· Classic transactional applications or even relational analytics
· Applications need full table scan
· Data to be aggregated, rolled up, analyzed cross rows

Usage Case: Facebook message

Key characteristics:
· High availability
· Incremental scalability
· Eventually consistent
· Trade-offs between consistency and latency
· Minimal administration
· No SPF (Single point of failure) – all nodes are the same in Cassandra
· AP on CAP

Good for:
· Simple setup, maintenance code
· Fast random read/write
· Flexible parsing/wide column requirement
· No multiple secondary index needed

Not good for:
· Secondary index
· Relational data
· Transactional operations (Rollback, Commit)
· Primary & Financial record
· Stringent and authorization needed on data
· Dynamic queries/searching on column data
· Low latency

Usage Case: Twitter, Travel portal

Key characteristics:
· Schemas to change as applications evolve (Schema-free)
· Full index support for high performance
· Replication and failover for high availability
· Auto Sharding for easy Scalability
· Rich document based queries for easy readability
· Master-slave model
· CP on CAP

Good for:
· RDBMS replacement for web applications
· Semi-structured content management
· Real-time analytics and high-speed logging, caching and high scalability
· Web 2.0, Media, SAAS, Gaming

Not good for:
· Highly transactional system
· Applications with traditional database requirements such as foreign key constraints

Usage Case: Craigslist, Foursquare





For analytics, MongoDB provides a custom map/reduce implementation; Cassandra provides native Hadoop support, including for Hive (a SQL data warehouse built on Hadoop map/reduce) and Pig (a Hadoop-specific analysis language that many think is a better fit for map/reduce workloads than SQL).




+ 订阅