

Note: All the posts are based on practical approach avoiding lengthy theory. All have been tested on some development servers. Please don’t test any post on production servers until you are sure.

Saturday, April 13, 2013

Big Data: Oracle NoSQL Database - Intro

"A DBA walks into a NoSQL bar, but turns and leaves because he couldn't find a table"

Oracle NoSQL Database provides multi-terabyte distributed key/value pair storage that offers scalable throughput and performance.
It services network requests to store and retrieve data which is organized into key-value pairs. It offers full Create, Read, Update and Delete (CRUD) operations with adjustable durability guarantees.

Oracle NoSQL Database is meant to be installed behind the application server, where it either takes the place of the back-end database or works alongside it. To make use of Oracle NoSQL Database, code must be written (using Java or C) that runs on the application server.

Architecture
The KVStore
An application makes use of Oracle NoSQL Database by performing network requests against Oracle NoSQL Database's key-value store, which is referred to as the KVStore.

Oracle NoSQL Database Driver
The requests are made using the Oracle NoSQL Database Driver, which is linked into your application as a Java library (.jar file), and then accessed using a series of Java APIs. By using the Oracle NoSQL Database APIs, the developer is able to perform create, read, update and delete operations on the data contained in the KVStore.
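
To make the four operations concrete, here is a minimal in-memory sketch of create, read, update and delete on key-value pairs. It is only a stand-in for illustration: the class name KeyValueCrudSketch and its method names are hypothetical and are not the Oracle NoSQL Database Driver API, and nothing here goes over the network.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for a key-value store.
// This is NOT the Oracle NoSQL Database driver; names are assumptions.
class KeyValueCrudSketch {
    private final Map<String, byte[]> pairs = new HashMap<>();

    // Create: store the pair only if the key is not already present.
    boolean create(String key, byte[] value) {
        return pairs.putIfAbsent(key, value.clone()) == null;
    }

    // Read: return a copy of the value, or null if the key is absent.
    byte[] read(String key) {
        byte[] value = pairs.get(key);
        return value == null ? null : value.clone();
    }

    // Update: replace the value only if the key already exists.
    boolean update(String key, byte[] value) {
        return pairs.replace(key, value.clone()) != null;
    }

    // Delete: remove the pair; returns true if something was removed.
    boolean delete(String key) {
        return pairs.remove(key) != null;
    }

    public static void main(String[] args) {
        KeyValueCrudSketch store = new KeyValueCrudSketch();
        byte[] v1 = "Ali".getBytes(StandardCharsets.UTF_8);
        byte[] v2 = "Ali Khan".getBytes(StandardCharsets.UTF_8);
        System.out.println("create: " + store.create("user/1", v1));
        System.out.println("read:   " + new String(store.read("user/1"), StandardCharsets.UTF_8));
        System.out.println("update: " + store.update("user/1", v2));
        System.out.println("delete: " + store.delete("user/1"));
    }
}
```

In a real application the same four steps would be calls through the driver to the KVStore rather than to a local map.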

Storage Node
A Storage Node is a physical (or virtual) machine with its own local storage. The machine is intended to be commodity hardware. The KVStore contains multiple Storage Nodes. Each Storage Node should be, but is not required to be, identical to all other Storage Nodes within the store. Each one also runs a storage node agent that monitors node behavior, reports it to the administration service, and handles configuration change requests.

Replication Nodes
A Storage Node hosts a set of Replication Nodes. Data is spread across the Replication Nodes. A Replication Node can be thought of as a single database which contains key-value pairs.

Shards
Replication Nodes are organized into shards. A shard contains a single Replication Node which is responsible for performing database writes, and which copies those writes to the other Replication Nodes in the shard. This is called the master node. All other Replication Nodes in the shard are used to service read-only operations. These are called the replicas.
Production KVStores should contain multiple shards.
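
The master/replica split can be sketched in a few lines of Java. This is a toy model under stated assumptions, not how Oracle NoSQL Database is implemented: the class ShardSketch is hypothetical, each node is just a map, the master copies every write to all replicas immediately, and reads are handed to the replicas in round-robin order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of one shard: node 0 is the master, the rest are replicas.
class ShardSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();
    private int nextReplica = 0; // round-robin cursor over the replicas

    ShardSketch(int nodeCount) {
        if (nodeCount < 1) {
            throw new IllegalArgumentException("a shard needs at least one node");
        }
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // Writes go to the master (index 0) and are then copied to every replica.
    // Assumption: copying is immediate. Returns the number of node-level writes.
    int write(String key, String value) {
        int copies = 0;
        for (Map<String, String> node : nodes) {
            node.put(key, value);
            copies++;
        }
        return copies;
    }

    // Reads are served by the replicas in turn; with no replicas, the master answers.
    String read(String key) {
        int replicaCount = nodes.size() - 1;
        if (replicaCount == 0) {
            return nodes.get(0).get(key);
        }
        int index = 1 + nextReplica;
        nextReplica = (nextReplica + 1) % replicaCount;
        return nodes.get(index).get(key);
    }

    public static void main(String[] args) {
        ShardSketch shard = new ShardSketch(3);
        System.out.println("node writes per put: " + shard.write("user/1", "Ali"));
        System.out.println("read from a replica: " + shard.read("user/1"));
    }
}
```

Notice that one logical write costs one node-level write per node in the shard, while reads are shared among the replicas.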

Monitoring Software
Each Storage Node contains monitoring software that ensures the Replication Nodes which it hosts are running and are otherwise healthy.

Replication Factor
The number of nodes belonging to a shard is called its Replication Factor. The larger a shard's Replication Factor, the faster its read performance (more replicas are available to service reads) but the slower its write performance (each write must be copied to more nodes).

Partitions
Each shard contains one or more partitions. Keys are assigned to a partition. Once a key is placed in a partition, it cannot be moved to a different partition. Oracle NoSQL Database automatically assigns keys evenly across all the available partitions.
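
The idea that a key always lands in the same partition can be sketched with a simple hash. Everything specific here is an assumption for illustration: the class PartitionSketch is hypothetical, CRC32 is used only because it ships with the JDK (it is not claimed to be the hash Oracle NoSQL Database uses), and partitions are dealt out to shards round-robin.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch of key -> partition -> shard placement.
class PartitionSketch {
    private final int partitionCount;
    private final int shardCount;

    PartitionSketch(int partitionCount, int shardCount) {
        if (partitionCount < 1 || shardCount < 1) {
            throw new IllegalArgumentException("counts must be positive");
        }
        this.partitionCount = partitionCount;
        this.shardCount = shardCount;
    }

    // The same key always hashes to the same partition, so it never moves.
    // Assumption: CRC32 stands in for whatever hash the real store uses.
    int partitionFor(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % partitionCount);
    }

    // Assumption: partitions are assigned to shards round-robin.
    int shardFor(int partitionId) {
        return partitionId % shardCount;
    }

    public static void main(String[] args) {
        PartitionSketch layout = new PartitionSketch(12, 3);
        int partition = layout.partitionFor("user/42");
        System.out.println("partition: " + partition);
        System.out.println("shard:     " + layout.shardFor(partition));
    }
}
```

Because the partition is computed from the key alone, spreading keys evenly comes down to the hash distributing them evenly across the partitions.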
  
Access and Security
Access to the KVStore and its data is performed in two different ways. Routine access to the data is performed using Java APIs, which the application developer uses to let the application interact with the Oracle NoSQL Database Driver. The driver in turn communicates with the store's Storage Nodes to perform whatever data access the application requires.

Administration Service
Administrative access to the store is performed through an administration service, using either a command line interface or a browser-based graphical user interface. This service supports core functionality such as the ability to configure, start, monitor, and stop a storage node, without requiring manual effort with configuration files, shell scripts, or explicit database operations. In addition to facilitating configuration changes, the administration service also collects and maintains performance statistics and logs important system events, providing online monitoring and input to performance tuning.


Topologies
A topology is the collection of Storage Nodes, Replication Nodes and administration services that make up an Oracle NoSQL Database store. A deployed store has one topology that describes its state at a given time. Topologies can be changed to achieve different performance characteristics, or in reaction to changes in the number or characteristics of the Storage Nodes. Changing and deploying a topology is an iterative process.
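
To see how the number of Storage Nodes shapes a topology, here is a small back-of-the-envelope sketch. It rests only on the definitions above (a shard has Replication Factor nodes, and Storage Nodes host Replication Nodes) plus one assumption: every Storage Node hosts the same number of Replication Nodes. The class TopologySketch and its parameter names are hypothetical, and this is not the planning logic of the administration service.

```java
// Hypothetical arithmetic sketch: how many complete shards a set of nodes can form.
class TopologySketch {

    // Total Replication Nodes divided by the Replication Factor, rounded down.
    // Assumption: every Storage Node hosts the same number of Replication Nodes.
    static int shardCount(int storageNodes, int replicationNodesPerStorageNode, int replicationFactor) {
        if (storageNodes < 1 || replicationNodesPerStorageNode < 1 || replicationFactor < 1) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        return (storageNodes * replicationNodesPerStorageNode) / replicationFactor;
    }

    public static void main(String[] args) {
        // 3 Storage Nodes, 1 Replication Node each, Replication Factor 3
        System.out.println(shardCount(3, 1, 3)); // prints 1
        // Adding 3 more Storage Nodes leaves room for a second shard
        System.out.println(shardCount(6, 1, 3)); // prints 2
    }
}
```

Under these assumptions, growing the store from three to six Storage Nodes is the kind of change that would call for a new topology with an extra shard.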

KVLite
KVLite is a simplified version of Oracle NoSQL Database. It provides a single-node store that is not replicated. It runs in a single process without requiring any administrative interface. It is intended for use by application developers who need to unit test their Oracle NoSQL Database application. It is not intended for production deployment, or for performance measurements.
KVLite is installed when you install KVStore.

Hadoop Integration
Oracle NoSQL Database can be integrated with Apache Hadoop systems using the oracle.kv.hadoop.KVInputFormat class. This class allows you to read data from Oracle NoSQL Database and then prepare it for insertion into a Hadoop system.

Related Posts:
Big Data: A Brief Intro
Big Data: Working with Oracle NoSQL (KVLite)  
