ScalarDB is a library published by Scalar that makes non-ACID distributed databases and storage ACID compliant. Running it on a distributed database such as Cassandra also provides linear scalability and high availability.
Its biggest feature is that **you can use strongly consistent ACID transactions**.
See ScalarDB docs for more information.
ScalarDB mainly consists of the following three layers.
[Figure: layer diagram of ScalarDB]
The ScalarDB data model is a multidimensional map based on the key-value format. A record consists of a Partition Key, a Clustering Key, and a set of values.
[Figure: ScalarDB data model]
Each value is uniquely mapped by a Primary Key consisting of a Partition Key, a Clustering Key and a value name.
Cassandra is a type of NoSQL Database and has the following features.
- High scalability and availability without a single point of failure
- SQL-like query language and search support through secondary indexes
- Flexible schema
However, it also has the following restrictions:
- Neither transactions nor JOINs are supported
- Foreign keys are not supported, and keys are immutable
- Keys must be unique
- Searching is complicated
The data model consists of the following elements. Cassandra distributes data across nodes at the expense of some data consistency.
- Keyspace: top-level namespace
- Column Family (Table): a container for a collection of columns, corresponding to an RDBMS table
- Partition Key: the key that determines which node the data is distributed to
- Values: column data other than the Partition Key and Clustering Key
From here on, I will run ScalarDB locally on Ubuntu 16.04 installed on Windows 10.
The required components are:
- Oracle JDK 8 (OpenJDK 8) or higher
- Cassandra 3.11.x (latest stable version at the time of writing)
- Golang 1.10 or above
- Gradle 4.10 or above
Let's install them right away.
Oracle JDK 8 (OpenJDK 8) (Reference: http://cassandra.apache.org/)
In its initial state, Ubuntu seems to have a Java 8 runtime environment but no development environment.
Update package list
$ sudo apt update
Install the Java 8 development environment
$ sudo apt install openjdk-8-jdk
Cassandra 3.11.x
Add the Apache repository for Cassandra (for version 3.11.x)
$ echo "deb http://www.apache.org/dist/cassandra/debian 311x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
Add Apache Cassandra repository key
$ curl https://www.apache.org/dist/cassandra/KEYS | sudo apt-key add -
Update repository
$ sudo apt-get update
Install Cassandra
$ sudo apt-get install cassandra
Start Cassandra
$ sudo service cassandra start
Confirm that Cassandra has started
$ cqlsh
Cassandra takes some time to start, so wait a while before running cqlsh.
Startup succeeded if cqlsh connects and shows its prompt.
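The waiting can be automated instead of guessed. The following is a minimal sketch, not part of the original procedure: `wait_for` is a hypothetical helper name, and the retry count and delay are arbitrary assumptions. It simply re-runs a command until it succeeds or the attempts run out.

```shell
# wait_for MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits with status 0, sleeping DELAY_SECONDS
# between attempts. Returns 1 if every attempt fails.
wait_for() {
  wf_tries=$1
  wf_delay=$2
  shift 2
  wf_i=1
  while [ "$wf_i" -le "$wf_tries" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    sleep "$wf_delay"
    wf_i=$((wf_i + 1))
  done
  return 1
}

# Assumed usage for this setup: up to 30 attempts, 5 seconds apart.
# cqlsh -e runs a single statement and exits, so it works as a probe.
# wait_for 30 5 cqlsh -e "DESCRIBE KEYSPACES" && echo "Cassandra is up"
```

The usage line is left commented out so the snippet can be pasted into a shell without contacting Cassandra right away.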
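If updating the repository fails with a GPG error like the following, the repository's public key is missing.
GPG error: http://www.apache.org 311x InRelease: The following signatures couldn't be verified because the public key is not available: NO_PUBKEY A278B781FE4B2BDA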
Add public key
$ sudo apt-key adv --keyserver pool.sks-keyservers.net --recv-key A278B781FE4B2BDA
Update repository again
$ sudo apt-get update
golang
Download the golang archive
$ curl -L https://dl.google.com/go/go1.10.5.linux-amd64.tar.gz > go1.10.5.linux-amd64.tar.gz
Extract the downloaded archive
$ sudo tar -C /usr/local -xzf go1.10.5.linux-amd64.tar.gz
Add Go to the PATH
$ vi ~/.bashrc
Add the following line to .bashrc
export PATH=$PATH:/usr/local/go/bin
Apply the PATH setting
$ source ~/.bashrc
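If you would rather not edit .bashrc by hand, the same line can be appended from the shell. This is only a sketch: `append_once` is a hypothetical helper, written so that running it twice does not duplicate the line.

```shell
# append_once LINE FILE
# Appends LINE to FILE only if FILE does not already contain that exact line.
append_once() {
  ao_line=$1
  ao_file=$2
  touch "$ao_file"
  # -x: match whole lines, -F: treat LINE as a fixed string, not a regex
  grep -qxF -- "$ao_line" "$ao_file" || printf '%s\n' "$ao_line" >> "$ao_file"
}

# Assumed usage for the Go path above (single quotes keep $PATH unexpanded):
# append_once 'export PATH=$PATH:/usr/local/go/bin' "$HOME/.bashrc"
```

After appending, `source ~/.bashrc` is still needed for the current shell to pick up the change.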
Gradle
Gradle is installed here through SDKMAN, and installing SDKMAN in turn requires zip and unzip.
Installation of zip and unzip
$ sudo apt install zip unzip
Install SDKMAN
$ curl -s "https://get.sdkman.io" | bash
See https://sdkman.io/install for details.
Initialize SDKMAN
$ source "/home/(your username)/.sdkman/bin/sdkman-init.sh"
Confirm the installation of SDKMAN
$ sdk version
It worked if something like sdkman 5.0.0+51 is displayed.
Install Gradle
$ sdk install gradle 4.10.2
See https://gradle.org/install/ for details.
Check the Gradle version
$ gradle --version
It worked if you see something like Gradle 4.10.2.
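The required components are stated as minimum versions (Golang 1.10 or above, Gradle 4.10 or above), so a small comparison helper can confirm what was installed. This is a sketch under assumptions: `version_ge` is a hypothetical name, and it relies on `sort -V` (version sort), which GNU coreutils on Ubuntu provides.

```shell
# version_ge ACTUAL MINIMUM
# Succeeds when ACTUAL is the same as or newer than MINIMUM in version order.
version_ge() {
  # Sort the two versions; if MINIMUM comes first (or they are equal),
  # then ACTUAL >= MINIMUM.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Plain string or numeric comparison would rank 4.9 above 4.10;
# version sort does not.
if version_ge 4.10.2 4.10; then
  echo "Gradle 4.10.2 satisfies the 4.10 minimum"
fi

# Assumed usage against the real tool, extracting the number from the
# "Gradle X.Y.Z" line of the version output:
# version_ge "$(gradle --version | awk '/^Gradle/ {print $2}')" 4.10
```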
Create the ScalarDB configuration file
$ sudo mkdir /etc/scalar/
$ sudo vi /etc/scalar/database.properties
database.properties
# Comma separated contact points
scalar.database.contact_points=localhost
# Port number for all the contact points. Default port number for each database is used if empty.
# scalar.database.contact_port=
# Credential information to access the database
scalar.database.username=cassandra
scalar.database.password=cassandra
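To double-check what was written to the file, the values can be read back from the shell. The helpers below are a rough sketch, not how ScalarDB itself parses the file: `get_property` and `list_contact_points` are hypothetical names, and only plain `key=value` lines without surrounding spaces are handled.

```shell
# get_property KEY FILE
# Prints the value of the first "KEY=value" line in FILE.
# Commented-out lines never match because their first field starts with "#".
get_property() {
  awk -F= -v k="$1" '$1 == k { sub(/^[^=]*=/, ""); print; exit }' "$2"
}

# list_contact_points FILE
# Prints one contact point per line by splitting the value on commas.
list_contact_points() {
  get_property scalar.database.contact_points "$1" | tr ',' '\n'
}

# Assumed usage with the file created above:
# list_contact_points /etc/scalar/database.properties
```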
Here the contact point is localhost. When connecting to multiple nodes, separate them with commas.
Create the directories for Cassandra's data
$ sudo mkdir -p /data/cassandra/data
$ sudo mkdir -p /data/cassandra/commitlog
$ sudo mkdir -p /data/cassandra/hints
$ sudo mkdir -p /data/cassandra/saved_caches
Change the owner of the /data/cassandra folder
$ sudo chown -R cassandra:cassandra /data/cassandra
Make a backup of the Cassandra configuration file and then edit it
$ sudo cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.copy
$ sudo vi /etc/cassandra/cassandra.yaml
Change the commitlog_directory setting (line 196)
Before the change
# commit log. when running on magnetic HDD, this should be a
# separate spindle than the data directories.
# If not set, the default directory is $CASSANDRA_HOME/data/commitlog.
commitlog_directory: /var/lib/cassandra/commitlog
After the change
# commit log. when running on magnetic HDD, this should be a
# separate spindle than the data directories.
# If not set, the default directory is $CASSANDRA_HOME/data/commitlog.
commitlog_directory: /data/cassandra/commitlog
Change the data_file_directories setting (line 191)
Before the change
# Directories where Cassandra should store data on disk. Cassandra
# will spread data evenly across them, subject to the granularity of
# the configured compaction strategy.
# If not set, the default directory is $CASSANDRA_HOME/data/data.
data_file_directories:
    - /var/lib/cassandra/data
After the change
# Directories where Cassandra should store data on disk. Cassandra
# will spread data evenly across them, subject to the granularity of
# the configured compaction strategy.
# If not set, the default directory is $CASSANDRA_HOME/data/data.
data_file_directories:
    - /data/cassandra/data
Change the hints_directory setting (line 71)
Before the change
# Directory where Cassandra should store hints.
# If not set, the default directory is $CASSANDRA_HOME/data/hints.
hints_directory: /var/lib/cassandra/hints
After the change
# Directory where Cassandra should store hints.
# If not set, the default directory is $CASSANDRA_HOME/data/hints.
hints_directory: /data/cassandra/hints
Change the saved_caches_directory setting (line 368)
Before the change
# saved caches
# If not set, the default directory is $CASSANDRA_HOME/data/saved_caches.
saved_caches_directory: /var/lib/cassandra/saved_caches
After the change
# saved caches
# If not set, the default directory is $CASSANDRA_HOME/data/saved_caches.
saved_caches_directory: /data/cassandra/saved_caches
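The four directory changes above all replace the /var/lib/cassandra prefix with /data/cassandra, so they can also be scripted rather than made one by one in vi. The function below is a sketch under assumptions: `relocate_cassandra_dirs` is a hypothetical name, it only touches non-comment lines that end in one of the four directory paths, and it should be tried on a copy of the file first.

```shell
# relocate_cassandra_dirs YAML_FILE
# Rewrites /var/lib/cassandra/<dir> to /data/cassandra/<dir> for the
# data, commitlog, hints and saved_caches directories. Comment lines
# (those starting with "#") are left untouched.
relocate_cassandra_dirs() {
  rc_yaml=$1
  sed -E '/^[[:space:]]*#/!s#/var/lib/cassandra/(data|commitlog|hints|saved_caches)[[:space:]]*$#/data/cassandra/\1#' \
    "$rc_yaml" > "$rc_yaml.tmp" && mv "$rc_yaml.tmp" "$rc_yaml"
}

# Assumed usage: edit a working copy, review the difference, then copy it
# back with sudo.
# cp /etc/cassandra/cassandra.yaml /tmp/cassandra.yaml
# relocate_cassandra_dirs /tmp/cassandra.yaml
# diff /etc/cassandra/cassandra.yaml /tmp/cassandra.yaml
```

Because the match is on the path rather than on the line numbers, it does not depend on the settings being at exactly the lines quoted above.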
Swap which commitlog_sync settings are commented out (lines 379, 380, 385, 386)
Before the change
# commitlog_sync_batch_window_in_ms milliseconds between fsyncs.
# This window should be kept short because the writer threads will
# be unable to do extra work while waiting. (You may need to increase
# concurrent_writes for the same reason.)
#
# commitlog_sync: batch
# commitlog_sync_batch_window_in_ms: 2
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
After the change
# commitlog_sync_batch_window_in_ms milliseconds between fsyncs.
# This window should be kept short because the writer threads will
# be unable to do extra work while waiting. (You may need to increase
# concurrent_writes for the same reason.)
#
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 2
#
# the other option is "periodic" where writes may be acked immediately
# and the CommitLog is simply synced every commitlog_sync_period_in_ms
# milliseconds.
# commitlog_sync: periodic
# commitlog_sync_period_in_ms: 10000
Restart Cassandra and confirm that it starts
$ sudo service cassandra restart
$ cqlsh
Startup succeeded if you can log in to the Cassandra console.
$ sudo service cassandra status
It worked if "* Cassandra is running" is displayed.
ScalarDB Schema Tool
This tool generates and loads database schemas for ScalarDB. It has two parts, a generator and a loader. The generator creates a schema definition file and metadata definitions specific to the storage implementation (e.g. Cassandra). The loader uses the generator to obtain the schema file and then creates the schema in the storage.
This eliminates the need to consider storage-specific schemas when modeling application data.
Clone ScalarDB's GitHub repository
$ cd ~/
$ git clone https://github.com/scalar-labs/scalardb.git
Set the path in a variable
$ SCALARDB_HOME=/home/(user name of your environment)/scalardb
$ cd $SCALARDB_HOME
Run build
$ sudo ./gradlew installDist
OK if BUILD SUCCESSFUL is displayed
Go to the Schema Tools directory and run make
$ cd tools/schema
$ sudo make
$ sudo vi emoney-storage.sdbql
emoney-storage.sdbql
REPLICATION FACTOR 1;
CREATE NAMESPACE emoney;
CREATE TABLE emoney.account (
id TEXT PARTITIONKEY,
balance INT,
);
Try running the generator.
$ sudo ./generator emoney-storage.sdbql emoney-storage.cql
Make sure that emoney-storage.cql is created.
That concludes the environment setup part. In the sample application part, I will actually build a sample application and check how a ScalarDB application behaves.
Try running ScalarDB on WSL Ubuntu (Sample application creation)