Businesses today are challenged by the ongoing explosion of data. Gartner predicts that data growth will exceed 650% over the next five years.
Organizations capture, track, analyze and store everything from mass quantities of transactional, online and mobile data to growing amounts of machine-generated data. In fact, machine-generated data, which ranges from web, telecom network and call-detail records to data from online gaming, social networks, sensors, computer logs, satellites, financial transaction feeds and more, represents the fastest-growing category of Big Data. High-volume web sites can generate billions of data entries every month. As volumes expand into the tens of terabytes and even the petabyte range, end users are pushing IT departments to provide enhanced analytics and reporting against these ever-increasing volumes of data. Managers need to understand this information quickly but, all too often, extracting useful intelligence is like finding the proverbial ‘needle in a haystack.’
Because traditional row-based databases were not designed to analyze this amount of data, IT managers typically try several approaches to mitigate plummeting response times. Unfortunately, each method has a significant adverse impact on analytic effectiveness, costs, or both.
There are various database players in the market. Here is a quick comparison of row-based, columnar and NoSQL databases.
Row-based
Description: Data is structured and stored in rows.
Common Use Case: Used in transaction processing, interactive transaction applications.
Strength: Robust, proven technology to capture intermediate transactions.
Weakness: Scalability and query processing time for huge data.
Size of DB: Several GB to TB.
Key Players: Sybase, Oracle, MySQL, DB2
Columnar
Description: Data is vertically partitioned and stored in columns.
Common Use Case: Historical data analysis, data warehousing and business intelligence.
Strength: Faster queries (especially ad hoc queries) on large data.
Weakness: Not suitable for transaction processing; slow import/export speed and heavy computing resource utilization.
Size of DB: Several GB to 50 TB.
Key Players: Infobright, Aster Data, Vertica, Sybase IQ, ParAccel
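To make the row-versus-column distinction concrete, here is a minimal Python sketch. It is only an illustration: the sample records and the `to_columns` helper are hypothetical, and real columnar engines add compression and on-disk layouts that are not shown here.

```python
# Hypothetical sample data: three sales records stored row by row.
rows = [
    {"id": 1, "region": "east", "amount": 120.0},
    {"id": 2, "region": "west", "amount": 80.0},
    {"id": 3, "region": "east", "amount": 200.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: the aggregate has to walk through every full record.
total_from_rows = sum(record["amount"] for record in rows)

# Column layout: the same aggregate reads a single list of values
# and never touches the "id" or "region" columns.
total_from_columns = sum(columns["amount"])
```

Both totals are the same; the difference is how much data has to be read to compute them, which is why analytic queries over a few columns tend to favor the columnar layout.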
NoSQL - Key-Value Store
Description: Data stored in memory with some persistent backup.
Common Use Case: Used in cache for storing frequently requested data in applications.
Strength: Scalable, faster retrieval of data, supports unstructured and partially structured data.
Weakness: All data should fit in memory; does not support complex queries.
Size of DB: Several GBs to several TBs.
Key Players: Amazon S3, Memcached, Redis, Voldemort
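The caching use case can be sketched in a few lines of Python. This is a toy illustration, not any product's API: `KeyValueCache`, `slow_lookup` and `get_user` are hypothetical names, and a plain dictionary stands in for the in-memory store.

```python
class KeyValueCache:
    """Toy in-memory key-value store; lookups are by exact key only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def slow_lookup(user_id):
    # Hypothetical stand-in for an expensive database query.
    return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, user_id):
    """Return (value, 'hit' or 'miss'), filling the cache on a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    value = slow_lookup(user_id)
    cache.set(key, value)
    return value, "miss"


cache = KeyValueCache()
first = get_user(cache, 42)   # not cached yet, falls through to slow_lookup
second = get_user(cache, 42)  # served from the cache
```

Note that the store can only answer "what is the value for this key?", which matches the weakness listed above: anything resembling a complex query has to be done by the application.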
NoSQL - Document Store
Description: Persistent storage of unstructured or semi-structured data along with some SQL querying functionality.
Common Use Case: Web applications, or any application that needs better performance and scalability without defining columns in an RDBMS.
Strength: Persistent store with scalability and better query support than a key-value store.
Weakness: Lack of sophisticated query capabilities.
Size of DB: Several TBs to PBs.
Key Players: MongoDB, CouchDB, SimpleDB
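As a rough illustration of the document model, the Python sketch below stores records with different fields side by side and filters them by field value. The sample documents and the `find` helper are hypothetical and do not reproduce the query API of any of the products listed.

```python
# Hypothetical collection: documents need not share the same fields.
documents = [
    {"_id": 1, "type": "article", "title": "Columnar basics", "tags": ["db"]},
    {"_id": 2, "type": "comment", "article_id": 1, "text": "Useful summary"},
    {"_id": 3, "type": "article", "title": "NoSQL overview"},
]


def find(collection, **criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


articles = find(documents, type="article")
titles = [doc["title"] for doc in articles]
```

No columns were declared up front, yet the data can still be queried by field, which is the middle ground between a key-value store and an RDBMS described above.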
NoSQL - Column Store
Description: Very large data store that supports MapReduce.
Common Use Case: Real-time data logging in finance and web analytics.
Strength: Very high throughput for Big Data, strong partitioning support, random read-write access.
Weakness: Complex queries, availability of APIs, response time.
Size of DB: Several TBs to PBs.
Key Players: HBase, Bigtable, Cassandra
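The MapReduce style of processing mentioned for column stores can be sketched on a single machine in Python. This is only a conceptual outline under simplifying assumptions: the log lines are hypothetical, and the `map_phase`, `shuffle` and `reduce_phase` functions are illustrative names, not part of HBase, Bigtable or Cassandra.

```python
from collections import defaultdict

# Hypothetical web log entries: "<path> <status>".
log_lines = [
    "/home 200",
    "/login 500",
    "/home 200",
]


def map_phase(line):
    """Emit one (path, 1) pair per log line."""
    path, _status = line.split()
    yield path, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


mapped = [pair for line in log_lines for pair in map_phase(line)]
hits = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
```

In a real deployment the map and reduce steps run in parallel across partitions of the data, which is where the throughput and partitioning strengths listed above come from.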