A NoSQL for Analytics and Transactions - TechMinfy


A Cloud-Based Enterprise NoSQL Database for Both Transactions and Analytics

MarkLogic is an Enterprise NoSQL database that handles big data with transactional consistency, security, and backup and recovery. It serves as both a transactional and an analytical database for running mission-critical applications. It is optimized for structured and unstructured data, letting you store, manage, query, and search across JSON, XML, RDF, geospatial data, text, and large binaries.

  • Integrates data by ingesting it as-is, which supports agile application development.
  • Provides transactions, analytics, and search on a single platform.
  • Extracts and loads data without requiring a transformation step.

MarkLogic is positioned as the only database that can natively store and rapidly query JSON, XML, RDF, geospatial data, text, and binaries, providing a single platform for all of your data. The document-centric data model is schema-agnostic, which gives you flexibility in modeling data. Because it reduces data transformation, you avoid the loss of fidelity and functionality that comes with data conversion and brittle ETL. This makes it much easier to load data from different sources and to adapt to changes over time.
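As a rough illustration of what schema-agnostic, as-is ingestion means, the sketch below keeps records of different shapes in a plain in-memory store and searches across them without first mapping them to a common schema. This is not MarkLogic code: `DocumentStore`, `insert`, `search`, and the sample records are hypothetical names and data used only to show the idea.

```python
import json
from typing import Any, Dict, Iterator, List


def walk_values(node: Any) -> Iterator[Any]:
    """Yield every leaf value in a nested dict/list structure."""
    if isinstance(node, dict):
        for value in node.values():
            yield from walk_values(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk_values(item)
    else:
        yield node


class DocumentStore:
    """Toy in-memory store (hypothetical): documents are kept as they arrive."""

    def __init__(self) -> None:
        self._docs: Dict[str, Any] = {}

    def insert(self, uri: str, raw_json: str) -> None:
        # No schema check and no reshaping: parse and keep the document as-is.
        self._docs[uri] = json.loads(raw_json)

    def search(self, term: str) -> List[str]:
        """Return sorted URIs of documents with a string value containing `term`."""
        needle = term.lower()
        return sorted(
            uri
            for uri, doc in self._docs.items()
            if any(isinstance(v, str) and needle in v.lower() for v in walk_values(doc))
        )


store = DocumentStore()
# Three sources, three different shapes, loaded without transformation.
store.insert("/crm/1.json", '{"name": "Acme Corp", "contacts": [{"email": "ops@acme.example"}]}')
store.insert("/billing/7.json", '{"customer": "ACME Corp", "invoice": {"total": 120.5}}')
store.insert("/hr/3.json", '{"employee": "Jane Doe", "dept": "Finance"}')

matches = store.search("acme")
print(matches)
```

The search finds the CRM and billing records even though they use different field names, which is the practical benefit of not forcing every source into one schema up front.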

A Document-Centric Database

[Figure: example of the document model]

It stores data using a compressed-tree, document-like model, as shown in the example above. This data model makes it easier to persist multiple types of relationships and hierarchical information within a single piece of complex data. It also reduces the transformation required when moving data between the database, the middle tier, and the front end of an application. This reduces the workload on the server and makes development much smoother.
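To make the point about hierarchy concrete, here is a minimal sketch of a single nested document that carries a nested relationship (customer and address) and a one-to-many relationship (order and items) together, and that can be handed to a middle tier or front end as JSON without reassembly. The order data and the `get_path` helper are hypothetical, made up for this illustration.

```python
import json
from typing import Any

# One document holds the whole order, including its nested parts.
order = {
    "orderId": "A-1001",
    "customer": {"name": "Jane Doe", "address": {"city": "Pune", "country": "IN"}},
    "items": [
        {"sku": "KB-01", "qty": 2, "price": 25.0},
        {"sku": "MS-07", "qty": 1, "price": 15.5},
    ],
}


def get_path(doc: Any, path: str) -> Any:
    """Follow a slash-separated path; numeric steps index into lists."""
    node = doc
    for step in path.strip("/").split("/"):
        node = node[int(step)] if isinstance(node, list) else node[step]
    return node


city = get_path(order, "customer/address/city")
second_sku = get_path(order, "items/1/sku")
total = sum(item["qty"] * item["price"] for item in order["items"])

# The same structure travels to the front end unchanged.
payload = json.dumps(order)
assert json.loads(payload) == order
```

In a row-and-column layout the same order would typically be spread over several tables and joined back together before it could be sent to the application.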

More Flexibility

| Data Type | Query Language | MarkLogic | Relational Databases | NoSQL Databases | Triple Stores* |
|---|---|---|---|---|---|
| JSON documents | JavaScript | Stores JSON documents natively. Production-proven indexing, data management, and security capabilities for the web's predominant data format. | Limited JSON compatibility, typically achieved by treating JSON documents as BLOBs or text blocks. Indexing and retrieval take a long time relative to relational data. | Some NoSQL databases store JSON documents natively. | Not designed to store and query both RDF and JSON in the same database. |
| XML documents | XQuery | Stores XML documents natively, compressing them and preserving their hierarchy. | As an add-on. | Do not offer XML support; some modern NoSQL databases lack flexibility. | Not designed to store and query both RDF and XML in the same database. |
| RDF triples | SPARQL | Stores RDF triples natively. | As an add-on. | No document stores, wide-column stores, or key-value stores can handle RDF triples. | Dedicated, single-purpose stores for RDF triples. |
| Geospatial data | XQuery, JavaScript | Integrates well with Google Maps, Google Earth, ESRI ArcGIS, and MS Bing Maps to visualize data. | The unstructured component of geospatial data becomes difficult to search or is lost. | Degree of indexing and search varies. | Meant for a single purpose; cannot handle RDF triples and geospatial data at the same time. |
| Relational data | SQL | Relational data can be ingested and modeled using the document model. Example: 600 RDBMS tools can be mapped to 13 XML Schema files. | Store data in rows and columns and query it using SQL. | Limited support for mapping relational data to a non-relational model. | Require a connector to move data between a relational database and a triple store. |
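The relational data row says relational data can be modeled using the document model. One way to picture that, assuming two hypothetical tables (`customers` and `orders`) linked by a foreign key, is to fold the child rows under their parent row so that each customer becomes one document. The table names, columns, and the `rows_to_documents` helper are illustrative assumptions, not part of any MarkLogic tooling.

```python
from collections import defaultdict
from typing import Any, Dict, List

# Hypothetical rows, as they might come out of two relational tables.
customers = [
    {"customer_id": 1, "name": "Acme Corp"},
    {"customer_id": 2, "name": "Globex"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "total": 120.5},
    {"order_id": 11, "customer_id": 1, "total": 80.0},
    {"order_id": 12, "customer_id": 2, "total": 42.0},
]


def rows_to_documents(
    parents: List[Dict[str, Any]], children: List[Dict[str, Any]], key: str
) -> List[Dict[str, Any]]:
    """Nest each child row under the parent row that shares `key`."""
    by_parent: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for child in children:
        # Drop the foreign key: nesting now expresses the relationship.
        by_parent[child[key]].append({k: v for k, v in child.items() if k != key})
    return [{**parent, "orders": by_parent[parent[key]]} for parent in parents]


documents = rows_to_documents(customers, orders, "customer_id")
print(documents[0])
```

Each resulting document is self-contained, so reading one customer with all of its orders no longer needs a join.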

It comes with Semantics, Tiered Storage, Elasticity, and Hadoop Direct Access. It runs only on 64-bit Amazon EC2 instances, as there is no 32-bit version of MarkLogic. Because it is a fully ACID-compliant transactional data store, it requires substantial input/output (I/O) throughput, which AWS is equipped to provision.

It supports Amazon S3 as a data store, but because Amazon S3 does not currently support “append,” S3 cannot be used as primary storage. AWS CloudFormation templates can be used in conjunction with DevOps tools such as Ansible, Puppet, or Chef to manage configuration and deployment.

Although storage-optimized instance types offer extremely good I/O throughput, their storage is ephemeral. Because of the risk of data loss should a host fail, we do not currently recommend using ephemeral volumes for forest data unless data loss is an acceptable risk for the application. The M3 general-purpose Amazon EC2 instance types are a good fit for MarkLogic deployments, as are the compute-optimized C3 instances. For applications that require large numbers of range indexes, or for semantics triple store use cases, the R3 memory-optimized instance types may also be appropriate.
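The instance guidance above can be condensed into a small lookup. The sketch below only restates the recommendations in this article as code. The `Workload` flags, the function names, and the order in which the checks are applied are assumptions made for illustration, not an AWS or MarkLogic API.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical workload flags used only for this illustration."""
    compute_heavy: bool = False
    many_range_indexes: bool = False
    uses_triple_store: bool = False


def suggest_instance_family(w: Workload) -> str:
    """Map a workload to the EC2 family named in the article."""
    if w.many_range_indexes or w.uses_triple_store:
        return "R3"  # memory-optimized
    if w.compute_heavy:
        return "C3"  # compute-optimized
    return "M3"  # general purpose


def forest_storage_ok(ephemeral: bool, data_loss_acceptable: bool) -> bool:
    """Ephemeral volumes suit forest data only if data loss is acceptable."""
    return (not ephemeral) or data_loss_acceptable


print(suggest_instance_family(Workload(uses_triple_store=True)))
print(forest_storage_ok(ephemeral=True, data_loss_acceptable=False))
```

Treat the result as a starting point for sizing; actual choices depend on measuring the workload.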

— Chakradhari Sharma (AWS Solution Architect), TechMinfy!!