MSU-DB is a fast, highly scalable, highly available, causally consistent distributed key-value store written in Java. Each node uses Berkeley DB to store and retrieve data and Netty for communication. MSU-DB guarantees causal consistency between replicas and relies on Hybrid Logical Clocks to provide fast, wait-free writes.


MSU-DB on GitHub: GitHub

What are Key-value Stores?

Key-value stores such as MSU-DB are a type of NoSQL storage system. Watch the following video by Dr. Gupta on key-value stores.

MSU-DB Quick Start:

Download

version 1.3

Config File

You need to describe the architecture of your distributed data store in a config file. MSU-DB config files have a simple and straightforward format as follows:

environment_address;
number_of_datacenters;
node_id : ip_address : port : parent_id;
node_id : ip_address : port : parent_id;
node_id : ip_address : port : parent_id;
node_id : ip_address : port : parent_id;
...

For example, the following configuration describes a system consisting of three datacenters (replicas), each with three servers. All the data store files are located in the ./DBs folder:

./DBs;
3;
0:ip1:2000:0;
1:ip2:2000:0;
2:ip3:2000:0;

0:ip4:2000:0;
1:ip5:2000:0;
2:ip6:2000:0;

0:ip7:2000:0;
1:ip8:2000:0;
2:ip9:2000:0;

 

Note:

  • Any node whose parent is itself is the root node of its datacenter (see the paper for details). Thus, in the above example, the nodes with id 0 in each datacenter are the root nodes.
  • MSU-DB uses the port specified for each node for server communication, and the next port number for client communication. Thus, in the example above, server and client communication go through ports 2000 and 2001, respectively.
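
The two notes above can be checked mechanically. The following is a minimal sketch, not MSU-DB's own parser: the class NodeEntry and its methods are hypothetical names chosen for illustration. It assumes that each node line has exactly four colon-separated fields, so the address is an IPv4 address or host name rather than an IPv6 literal.

```java
// Sketch only: NodeEntry is an illustrative name, not a class from MSU-DB.
final class NodeEntry {
    final int nodeId;
    final String ip;
    final int serverPort;
    final int parentId;

    NodeEntry(int nodeId, String ip, int serverPort, int parentId) {
        this.nodeId = nodeId;
        this.ip = ip;
        this.serverPort = serverPort;
        this.parentId = parentId;
    }

    /** Parses one config line of the form "node_id : ip_address : port : parent_id;". */
    static NodeEntry parse(String line) {
        String trimmed = line.trim();
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        // Assumption: the address contains no ':' (no IPv6 literals).
        String[] parts = trimmed.split(":");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + line);
        }
        return new NodeEntry(
                Integer.parseInt(parts[0].trim()),
                parts[1].trim(),
                Integer.parseInt(parts[2].trim()),
                Integer.parseInt(parts[3].trim()));
    }

    /** A node whose parent is itself is the root of its datacenter. */
    boolean isRoot() {
        return nodeId == parentId;
    }

    /** Clients use the port right after the server port. */
    int clientPort() {
        return serverPort + 1;
    }

    public static void main(String[] args) {
        NodeEntry n = NodeEntry.parse("1:ip2:2000:0;");
        // prints: root=false server=2000 client=2001
        System.out.println("root=" + n.isRoot()
                + " server=" + n.serverPort
                + " client=" + n.clientPort());
    }
}
```

A check like this can catch a wrong parent id or a port clash before the servers are started.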

Creating Directory Structure

Before running your data store, you need a folder named DBs in the directory where the jar file is located. DBs must contain one folder for each server, named in the format DBdcID_serverID. For example, DB0_0 is for server 0 in datacenter (replica) 0, and DB1_3 is for server 3 in datacenter 1.
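
The folders can be created by hand or with a small helper. The following is a minimal sketch: the class DbDirs and its methods are hypothetical names, not part of MSU-DB. It assumes every datacenter has the same number of servers and that ids start at 0.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Sketch only: DbDirs is an illustrative helper, not part of MSU-DB.
final class DbDirs {

    /** Builds the folder name for one server, e.g. DB1_3 for server 3 in datacenter 1. */
    static String folderName(int dcId, int serverId) {
        return "DB" + dcId + "_" + serverId;
    }

    /**
     * Creates base/DBs/DBdcID_serverID for every server.
     * Assumption: each datacenter has serversPerDc servers, with ids starting at 0.
     */
    static void create(Path base, int datacenters, int serversPerDc) throws IOException {
        Path dbs = base.resolve("DBs");
        for (int dc = 0; dc < datacenters; dc++) {
            for (int server = 0; server < serversPerDc; server++) {
                // createDirectories also creates DBs itself and does not fail if the folder exists.
                Files.createDirectories(dbs.resolve(folderName(dc, server)));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Run from the directory that contains the jar file.
        create(Paths.get("."), 3, 3);
    }
}
```

With the arguments shown, this creates DB0_0 through DB2_2 under ./DBs, which matches the three-datacenter example configuration.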

Running MSU-DB

After creating the directory structure, you can run MSU-DB on each node as follows:

java -jar msudb.jar config_file ip port

That's it! Your distributed key-value store is ready to use.

Querying MSU-DB

MSU-DB is a key-value store. The two main operations it supports are PUT and GET. They can be used as follows:

PUT:dependency_time:key:value

GET:gst:key

To understand dependency_time and gst, please refer to the paper. If you use 0 for both dependency_time and gst, you get an eventually consistent data store like Amazon Dynamo or Apache Cassandra.
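
A client has to assemble these request strings before sending them. The following is a minimal sketch of that step only: the class Requests and its methods are hypothetical names, not an MSU-DB client API. It assumes the dependency time and GST are integer timestamps, and it rejects keys containing ':' because this page does not say how such keys are handled. How the string is framed and sent to the client port is not covered here.

```java
// Sketch only: Requests is an illustrative helper, not an MSU-DB client API.
final class Requests {

    /** Builds a PUT request in the form PUT:dependency_time:key:value. */
    static String put(long dependencyTime, String key, String value) {
        checkKey(key);
        return "PUT:" + dependencyTime + ":" + key + ":" + value;
    }

    /** Builds a GET request in the form GET:gst:key. */
    static String get(long gst, String key) {
        checkKey(key);
        return "GET:" + gst + ":" + key;
    }

    // Assumption: ':' in a key would make the request ambiguous, so it is rejected here.
    private static void checkKey(String key) {
        if (key.indexOf(':') >= 0) {
            throw new IllegalArgumentException("Key must not contain ':': " + key);
        }
    }

    public static void main(String[] args) {
        // Using 0 for both timestamps gives eventually consistent behavior.
        System.out.println(put(0, "user42", "alice")); // prints: PUT:0:user42:alice
        System.out.println(get(0, "user42"));          // prints: GET:0:user42
    }
}
```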

Step-by-step Example: Running a data store with two replicas
