Digital Artisan
JSON and object support in NoSQL database
This blog describes one of many features of Oracle NoSQL Database -- how its JDBC driver can store and query plain, old Java objects as JSON.
Oracle NoSQL Database is one of the most high-performing, robust NoSQL databases. It is also the most well-kept secret in NoSQL world.
JSON has become lingua franca of data representation. Accordingly, Oracle NoSQL Database, originally a key-value store, natively supports JSON as a basic database type among other composite types such as RECORD, MAP or ARRAY.
Definition for for database
The core pre-requisite to represent objects in database is to define a table with a column of JSON type. The column would store JSON representation of object.
A database table, OBJECT_TABLE
, to store object is defined as:
CREATE TABLE OBJECT_TABLE (
id LONG,
object JSON,
PRIMARY KEY (id)
)
The noteworthy aspects of database definition are
- a primary key is defined separately.
- a field of JSON type is declared to store object
Definition required for persistent object
The Java objects to be stored are POJOs -- Plain, old Java objects. No framework to extend, no interface to implement, no annotation.
However, a object must declare a property by bean style gettter/setter method i.e. a getX()
and/or setX(X x)
method would implictly declare a persistent property x
.
Steps to store and query objects
The following code examples show steps to store an object and query to retrieve it using basic JDBC interface.Getting a connection
To use the driver with a database, use a connection string such as
jdbc:nosql://localhost:5000/nosqlstore
where localhost:5000
is
the host and listening port of a database server and nosqlstore
is the name
of the store.
import java.sql.*
String url = "jdbc:nosql://localhost:5000/nosqlstore";
Connection con = DriverManager.getConnection(url);
Inserting an object
Once a connection has been obtained, an object is inserted viaResultSet.insertRow()
method.
At first, create a result set by a simple query
Connection con = ...;// the connection obtained in previous step
String sql = "SELECT * FROM OBJECT_TABLE";
ResultSet rs = con.createStatement().executeQuery(sql);
Next, use the result set to insert an object
rs.moveToInsertRow();
// create a POJO and set its properties
SimplePOJO pojo = new SimplePOJO();
pojo.setName("A blog example");
pojo.setWordCount(5678);
// the id must be set separately
rs.updateLong("id", 1234);
// update the result set with the object
rs.updateObject("object", pojo);
// insert the row to the database
rs.insertRow();
The entire object state is stored in JSON field of a record in OBJECT_TABLE. However, the identifier of the record is declared and set separately from the object.
Query an object
Once the object has been stored, next step would be to retrieve the object by query.Query by identifier
The straightforward way is to query the object by record identifier which is separate from object state.
String sql = "SELECT * FROM OBJECT_TABLE WHERE id=1234"; // 1234 is the record identifer
ResultSet rs = getConnection().createStatement().executeQuery(sql);
rs.next();
SimplePojo pojo = rs.getObject("object",SimplePojo.class);
The object can be retrieved from result set column.
Runtime Type preservation
Notice the absence of any casting when the object is retrieved from result set column. No casting is necessary because Oracle NoSQL database stores the object with its runtime type. encoded in JSON representation.
Query by object property
The object is stored in its JSON representation. But it is important to note the object can be queried on its properties i.e. the database is aware of JSON structure and does not store it as a mere string or opaque blob.
The query on object property uses a navigational syntax using dot as separator. As the example shows wordCount
is a property of the stored object (though it has not been declared as a database field). So the query on the property succeeds with usual comparison predicate.
String sql = "SELECT * FROM OBJECT_TABLE t WHERE t.object.wordCount >= 5000";
ResultSet rs = getConnection().createStatement().executeQuery(sql);
rs.next();
SimplePojo pojo = rs.getObject("object",SimplePojo.class);
Object Graph Storage
The object used in the above example SimplePojo
is simple, may be too simple. But the driver can store a more complex object graph. The next blog explores how to store and retrieve a complex object graph in Oracle NoSQL database using JDBC driver.
JDBC hardly needs any introduction. JDBC spec was released 30 years ago, in 1977 with Java version 1.1. Since then, JDBC has been widely adopted. Mostly used with RDBMS databases. Every RDBMS provides a JDBC driver.
But a JDBC driver for Oracle NoSQL database?
Wait ! There are two questions:
What is Oracle NoSQL database?
JDBC is for SQL. Why JDBC driver for NoSQL database?
Oracle NoSQL Database is the best well-kept secret of NoSQL world. It is one of the best NoSQL database technical capability and performance wise. It is also least known. Though some serious, large-scale, global application runs on Oracle NoSQL database, it is hardly known widely as google or stackoverflow would show.
Oracle NoSQL is a distributed database. It is scalable, elastic. At its core lies robust and time-tested kernel of BerkeleyDB. Oracle NoSQL database overlays a network fabric over a set of BerkleyDB kernels to deliver a high-performance, low-latency distributed database that even provides transactional warranty in limited sense.
Oracle NoSQL database, though a key-value database at its core, already provides a tabular view. And supports a query language that is quite similar to SQL. So, it is a natural and logical to have a JDBC driver for Oracle NoSQL database. This blog introduces such a JDBC driver.
As I mentioned earlier, JDBC has been around and does not need to be repeated. Oracle JDBC driver is compliant to JDBC Spec version 4.2.
I would mention only one aspect of new JDBC driver. Its' powerful capacity to store, retrieve and query user-defined objects.
One of the major attractiveness of NoSQL database is flexible schema. Oracle database natively supports JSON data type to store tree like nested data structure. JDBC driver builds on top of that native JSON facility to support object support with polymorphic type preservation.
For a developer, it means that database remembers (databases typically have excellent memory) the runtime type of an object being stored. And later retrieve the object with the same type.
Let us assume an interface: Human. The interface is implemented by two concrete classes Male and Female.
Now, when an object is stored, its runtime type is not Human which is an interface but an instance of concrete type: Male or Female. Later the object can be realized in memory as a Male or Female type, not only as a Human.
Moreover, polymorphic relationship is also supported. Let us say, two humans are related by partner relationship. The relationship is declared at Human level. For example,
Human getPartner()
But when an object is stored, the partner object is either a Male or Female. Again, the object graph is retrieved with type information preserved. So if a Male who has a Female partner is stored, when retrieved, the partner would be an instance of a Female object.
C# Driver for Oracle NoSQL database
Applications written in JavaScript, C, Python can use Oracle NoSQL database. Thanks to the drivers for respective languages.
C# driver is a new addition. C# driver also supports asynchronous programming model.
Your application issue a request to search and return immediately. The search result, which can potentially be large in number, is returned via callback to an observer supplied with original search request.
The release candidate of the driver is available from NuGet gallery and can be integrated to any C# application easily from Visual Studio.
An accompanying assembly includes a set of samples with source code to demonstrate how to use the driver.
Using locks in NoSQL database
Locks in NoSQL database
In this article, we explore Read-Modify-Write (RMW) lock which is equivalent to SELECT FOR UPDATE lock often found in RDBMS.Commonly, NoSQL databases often do not use locking as in traditional RDBMS. But Oracle NoSQL database offers a set of advanced features that can be used to implement locking semantics.
Read-Modify-Write Lock
An RMW lock uses following simple APIRMWLock.update(KVStore store, PrimaryKey pk, Updater updater);where
- store is a database connection
- pk is the primary key of row being locked for update
- updater is a function that updates the row values pointed by pk
public interface Updater { void update(Row row); }It is noteworthy that RMWLock may attempt to update the row more than once. This can happen if another thread is updating the row at the same time. In such cases, RMWLock would invoke the updater function on every retry with the latest state of the updated row.
An use case for RMWLock
Once client A had updated the value from x to x+L, then only thread B will see the updated value x+L and then B can update the value again to a different value, say x+L+L.
A typical usage of RMW lock in this simple case would be:
new RMWClient().update(store, pk, new Updater() { @Override public void update(Row row) { int lastUpdatedValue = row.get("NEXT_VALUE").asInt().get(); row.put("NEXT_VALUE", lastUpdatedValue + 20); } };Here we use the updater function to increment NEXT_VALUE by 20. The RMWLock ensures that NEXT_VALUE is updated atomically and without any other thread updating at the same time.
Resources
The code for RMWLock is available here. This code can directly be used in an application.A more complete example of RMWLock to realize a persistent sequence is described in a separate blog.
TimeSeries Analysis with Apache Spark and Oracle NoSQL database
What is a Timeseries?
This article describes modelling and analysis of Timeseries data with Apache Spark and Oracle NoSQL database.
Timeseries model in NoSQL
Timeseries data characteristics are different from RDBMS record
TimeSeries data is different in characteristics than conventional RDBMS record. The information of a timeseries is in aggregate properties of its events.
Timeseries data tends to be immutable, large in volume, ordered by time, and is primarily aggregated for access.
why NoSQL?
A timeseries is modeled as a table in NoSQL database. The name of the timeseries is same as the table and hence usual restrictions on table name applies. A row in timeseries table represents a Time Slot.
A timeslot is block of events with some metadata attribute such as slot index, start and end time of the slot etc. A time slot is atomic unit of a timeseries and an entire timeseries is a row of time slots.
A timeslot, as we would see later, provides the basic partition of timeseries and forms the basis for integration with Apache Spark.
Apache Spark
Apache Spark is a powerful cluster computing framework. Apache Spark performs a task on a large dataset by computing over partitions of the dataset where each computation may execute on different hosts of a cluster. Spark framework controls how partial tasks are distributed across the cluster hosts and how to combine partial results.Moreover, Spark models a computational task as a graph of actions where actions are lazily evaluated only when a result producing action needs to be computed.
Unique, monotonic sequence in NoSQL using SELECT FOR UPDATE style lock
Persistent Sequence in NoSQL database
What is a Database Sequence?
Database Sequence in NoSQL
To generate a sequence, we need a mechanism that can acquire a lock on a record, update its value and ensures that no other thread/process can update the same row in the meantime. This is typical SELECT FOR UPDATE behavior found in in RDBMS. It is also know as Read-Modify-Write pattern in database programming.
We take a concrete example of persistence Sequence to demonstrate this behavior. The persistennt sequence would generate a series of values that are unique and monotonic>
Sequence seq = new SequenceBuider() .withStore(store) .withName("mysequence") .withIncrement(100) .build();
Once a sequence is obtained, an application can get the series of values by:
long val = seq.next()
In the code example above, the sequence is created to fetch 100 contiguous values from database at a time, and these values are cached in memory. But if seq.next() is called more than 100 times, next batch of values will be fetched from database transparently to user.
A sequence is monotonic across multiple threads, multiple application
NoSQL advanced features to support Database Sequence
- Row Versioning: every database record carries a version
- Conditional put operation: a put() operation can be executed conditionally on record version.
- Access to Previous State: previous state of a record is available
Row Versioning
Conditional put() Operation
Access to previous state
The server would populate the return row with the state of the row before the operation. You can also specify which aspect of previous state be returned: only the previous version or value or both.
READ-MODIFY-WRITE pattern for Database Sequence
- READ a record by name with ABSOLUTE read consistency option.
Because NoSQL maintains multiple copies of data for robustness, a piece of data can be read from any of the copies (termed as Replication Node). Reading with ABSOLUTE consistency ensures that database has confirmed the data to be consistent across all the copies.
Oracle NoSQL provides different consistency options to read data: NONE_REQUIRED and ABSOLUTE. You can know about them in more details in the NoSQL documentation. - MODIFY update record value by incrementing its current value X by L. This value X denotes the next value of the sequence that has can be used, or in other words, values before X has already been reserved for use.
- WRITE the value (X+L) back with putIfVersion(), supplying a return row. Again, as in the case of read, write the data COMMIT_SYNC durability warranty that would ensure that data is not only written to durable storage but was synchronized across copies to avoid any other update.
RMW pattern with version-based update
In code lies the truth...
Conclusion
-
JSON and object support in NoSQL database This blog describes one of many features of Oracle NoSQL Database -- how its JDBC driver can st...
-
JDBC Driver for Oracle NoSQL Database JDBC hardly needs any introduction. JDBC spec was released 30 years ago, in 1977 with Java version ...