Earlier this month (May 2019), version 1.0.0-beta4 of DL4J was released. In this post I summarize the features that were added.
Click here for English release notes. https://deeplearning4j.org/release-notes#onezerozerobeta4
Deeplearning4J (DL4J) is an open source deep learning framework for JVM languages such as Java, Scala, and Kotlin. Development is led by Skymind, a startup headquartered in San Francisco.
Deep learning is usually associated with Python. However:
- Java is widely used in government and enterprise systems.
- Among programming languages, Java has the largest number of developers.
- DL4J natively supports Hadoop and Spark, the big data infrastructure.
- Commercial support is available from Skymind.
Against this background, Java actually holds a certain share of the deep learning world. Although the data is from 2018, DL4J sits around the middle when frameworks are ranked by their number of GitHub stars (the horizontal axis in the figure below).
(Image source: https://www.kdnuggets.com/2018/04/top-16-open-source-deep-learning-libraries.html)
Python frameworks such as TensorFlow, PyTorch, and Chainer are mainly intended for R&D and experimental use, and I think many of the spectacular research results we see every day are developed with them. My impression is that DL4J is trying to differentiate itself by focusing on commercial use (although it can be used for free, of course).
You can also import a model built with Keras into DL4J. Importing TensorFlow models and models in ONNX / PMML format is supported as well.
This makes it possible to use Python for development and Java for production operation.
Let's take a look at the highlights of the features added in the new version 1.0.0-beta4 of DL4J.
- Support for multiple data types in ND4J, the linear algebra library for the JVM
- Support for MKL-DNN, an acceleration library for Intel processors
- ND4J performance improvement from a change in memory management
- Attention layers added to DL4J
- BERT support in DL4J (limited)
We will look at the details of each below.
ND4J is a scientific computing library for the JVM. Think of it as a Java counterpart to Python's NumPy. There are also reports that ND4J is twice as fast as NumPy. Of course, it supports CUDA, so you can also use the GPU to speed up calculations.
In previous versions of ND4J, N-dimensional arrays (tensors) were limited to the float and double types. Starting with 1.0.0-beta4, tensors also support all of the following common data types:
In particular, support for FP16 and INT types can be expected to make neural networks smaller and faster.
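To get a feel for the size side of this, here is a back-of-the-envelope calculation in plain Java. It does not call ND4J at all; the class name `DTypeFootprint`, the helper `bytesFor`, and the 25 million parameter count are made up for illustration, and 8-bit integers are just one example of an INT type.

```java
// Plain-Java arithmetic only; no ND4J calls are used here.
class DTypeFootprint {
    // Bytes needed to store `elements` values at `bitsPerElement` bits each.
    static long bytesFor(long elements, int bitsPerElement) {
        return elements * bitsPerElement / 8;
    }

    public static void main(String[] args) {
        long params = 25_000_000L; // hypothetical parameter count
        System.out.println("FP32: " + bytesFor(params, 32) + " bytes");
        System.out.println("FP16: " + bytesFor(params, 16) + " bytes");
        System.out.println("INT8: " + bytesFor(params, 8) + " bytes");
    }
}
```

Halving the bits per element halves the raw storage for the parameters; any speedup depends on the hardware and the operations involved, which this sketch does not model.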
Support for MKL-DNN, a deep neural network (DNN) acceleration library for Intel processors, has also been added.
You can use MKL-DNN to speed up the following layers:
- ConvolutionLayer and Convolution1DLayer (and Conv2D ND4J ops)
- SubsamplingLayer and Subsampling1DLayer (and MaxPooling2D ND4J ops)
- BatchNormalization layer (and BatchNorm ND4J op)
- LocalResponseNormalization layer (and LocalResponseNormalization ND4J op)
- Convolution3D layer (and Conv3D ND4J ops)

Support for other layers, such as LSTM, is planned for a later release.
You can also disable MKL-DNN globally by adding the following line.
It is also possible to disable MKL-DNN only for specific operations by setting the environment variable ND4J_MKL_FALLBACK.
Previous releases of ND4J used periodic garbage collection for automatic memory release. (For garbage collection, Qiita's "Organize Java's GC mechanism" was easy to understand.)
However, since 1.0.0-alpha, ND4J has also offered an additional memory management model: the concept of a [workspace](https://deeplearning4j.org/docs/latest/deeplearning4j-config-workspaces). Memory in a workspace can be reused without the intervention of the garbage collector.
With workspaces, there should be essentially no memory left for the garbage collector to free while training a neural network with DL4J. As a result, running periodic garbage collection every few seconds only added performance overhead.
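The idea behind a workspace can be sketched in plain Java as a buffer that is allocated once and handed out again on every iteration. This is only a conceptual illustration, not ND4J's implementation or API; the class `WorkspaceSketch` and its methods are hypothetical names.

```java
// Conceptual sketch only: NOT the ND4J workspace API.
class WorkspaceSketch {
    private final float[] buffer; // allocated once, reused on every iteration
    private int offset = 0;       // next free position in the buffer

    WorkspaceSketch(int capacity) {
        buffer = new float[capacity];
    }

    // Reserves `length` floats in the shared buffer and returns the start index.
    int allocate(int length) {
        if (offset + length > buffer.length) {
            throw new IllegalStateException("workspace exhausted");
        }
        int start = offset;
        offset += length;
        return start;
    }

    // Marks the whole buffer as free again; nothing is handed to the GC.
    void reset() {
        offset = 0;
    }

    float[] data() {
        return buffer;
    }

    public static void main(String[] args) {
        WorkspaceSketch ws = new WorkspaceSketch(1024);
        for (int iter = 0; iter < 3; iter++) {
            int a = ws.allocate(256); // e.g. activations
            int b = ws.allocate(256); // e.g. gradients
            System.out.println("iteration " + iter + ": a=" + a + ", b=" + b);
            ws.reset(); // the same memory is reused in the next iteration
        }
    }
}
```

Because every iteration gets the same slices back after `reset()`, no new arrays are created inside the loop, so there is nothing for a periodic garbage collection to clean up.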
Starting with 1.0.0-beta4, periodic garbage collection is disabled by default. (You can also enable it as follows.)
The attention mechanism, whose popularity exploded around the time of the Transformer paper "Attention Is All You Need" published by Google in 2017, can now be used natively in DL4J.
Ryobot's blog gives an easy-to-understand explanation of attention.
(Image source: http://deeplearning.hatenablog.com/entry/transformer)
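For readers who want to see the core computation, below is a minimal plain-Java sketch of the scaled dot-product attention from "Attention Is All You Need", softmax(QK^T / sqrt(d)) V. It does not use DL4J's attention layers; the class `AttentionSketch` and its method names are hypothetical, and there is no masking, batching, or multi-head logic.

```java
import java.util.Arrays;

// Illustration of scaled dot-product attention; NOT DL4J's layer implementation.
class AttentionSketch {
    // Numerically stable softmax over one row of scores.
    static double[] softmax(double[] x) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : x) max = Math.max(max, v);
        double sum = 0.0;
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = Math.exp(x[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < x.length; i++) out[i] /= sum;
        return out;
    }

    // q: [nQ][d], k: [nK][d], v: [nK][dV]  ->  result: [nQ][dV]
    static double[][] attention(double[][] q, double[][] k, double[][] v) {
        int d = q[0].length;
        double scale = 1.0 / Math.sqrt(d);
        double[][] out = new double[q.length][v[0].length];
        for (int i = 0; i < q.length; i++) {
            // Similarity of query i with every key, scaled by 1/sqrt(d).
            double[] scores = new double[k.length];
            for (int j = 0; j < k.length; j++) {
                double dot = 0.0;
                for (int t = 0; t < d; t++) dot += q[i][t] * k[j][t];
                scores[j] = dot * scale;
            }
            // Attention weights sum to 1 and mix the value vectors.
            double[] w = softmax(scores);
            for (int j = 0; j < k.length; j++) {
                for (int c = 0; c < v[0].length; c++) {
                    out[i][c] += w[j] * v[j][c];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] q = {{0.0, 0.0}};              // one query
        double[][] k = {{1.0, 0.0}, {0.0, 1.0}};  // two keys
        double[][] v = {{2.0, 0.0}, {0.0, 4.0}};  // two values
        // Both scores are 0, so the weights are 0.5 each: prints [1.0, 2.0]
        System.out.println(Arrays.toString(attention(q, k, v)[0]));
    }
}
```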
BERT is a new method for pre-training language representations. It has received a great deal of attention for its SOTA (state-of-the-art) results on a wide range of natural language processing tasks.
(Image source: https://twitter.com/_Ryobot/status/1050925881894400000)
You can now try BERT on DL4J as well.
- Support for importing pre-trained BERT models
- Added the dataset iterator BertIterator for BERT training
- Added the BERT WordPiece Tokenizer
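As background on the last item, WordPiece tokenization as used by BERT splits a word greedily into the longest matching vocabulary entries, marking continuation pieces with a `##` prefix. The plain-Java sketch below shows that idea only; it is not DL4J's tokenizer, and the class `WordPieceSketch` and the three-entry vocabulary are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Greedy longest-match-first WordPiece sketch; NOT DL4J's tokenizer class.
class WordPieceSketch {
    static List<String> tokenize(String word, Set<String> vocab) {
        List<String> pieces = new ArrayList<>();
        int start = 0;
        while (start < word.length()) {
            int end = word.length();
            String match = null;
            // Try the longest remaining substring first, then shrink it.
            while (end > start) {
                String candidate = word.substring(start, end);
                if (start > 0) candidate = "##" + candidate; // continuation piece
                if (vocab.contains(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            // If any part cannot be matched, the whole word becomes unknown.
            if (match == null) return Collections.singletonList("[UNK]");
            pieces.add(match);
            start = end;
        }
        return pieces;
    }

    public static void main(String[] args) {
        // Hypothetical toy vocabulary.
        Set<String> vocab = new HashSet<>(Arrays.asList("un", "##aff", "##able"));
        System.out.println(tokenize("unaffable", vocab)); // prints [un, ##aff, ##able]
        System.out.println(tokenize("xyz", vocab));       // prints [[UNK]]
    }
}
```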
Some other interesting features, such as the following, have also been added quietly.
- Added [CapsuleLayer](https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/layers/CapsuleLayer.java) to DL4J (GPU acceleration will be supported in a later release)
- Added [NonMaxSuppression](https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/impl/image/NonMaxSuppression.java) to ND4J (NMS is used in object detection algorithms such as YOLO -implementing-yolo-in-less-than-30-lines-of-python-code-97fb9835bfd2)
- Added PythonTransform (/datavec-python/src/main/java/org/datavec/python/PythonTransform.java) to DataVec, the vectorization / ETL library (lets you use Python data preprocessing code from Java)
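To illustrate what the non-maximum suppression mentioned in the list above does, here is a minimal plain-Java sketch of the usual greedy variant: keep the highest-scoring box, drop every box that overlaps it too much, and repeat. It does not call the ND4J op; the class `NmsSketch`, the `{x1, y1, x2, y2}` box layout, and the sample boxes are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Greedy non-maximum suppression sketch; NOT the ND4J NonMaxSuppression op.
class NmsSketch {
    // Intersection over union of two boxes given as {x1, y1, x2, y2}.
    static double iou(double[] a, double[] b) {
        double ix = Math.max(0.0, Math.min(a[2], b[2]) - Math.max(a[0], b[0]));
        double iy = Math.max(0.0, Math.min(a[3], b[3]) - Math.max(a[1], b[1]));
        double inter = ix * iy;
        double areaA = (a[2] - a[0]) * (a[3] - a[1]);
        double areaB = (b[2] - b[0]) * (b[3] - b[1]);
        double union = areaA + areaB - inter;
        return union <= 0.0 ? 0.0 : inter / union;
    }

    // Returns the indices of the boxes that survive, highest score first.
    static List<Integer> nms(double[][] boxes, double[] scores, double iouThreshold) {
        Integer[] order = new Integer[boxes.length];
        for (int n = 0; n < order.length; n++) order[n] = n;
        // Sort box indices by descending score.
        Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));

        List<Integer> keep = new ArrayList<>();
        for (int idx : order) {
            boolean suppressed = false;
            for (int kept : keep) {
                if (iou(boxes[idx], boxes[kept]) > iouThreshold) {
                    suppressed = true; // overlaps a better box too much
                    break;
                }
            }
            if (!suppressed) keep.add(idx);
        }
        return keep;
    }

    public static void main(String[] args) {
        double[][] boxes = {
            {0, 0, 10, 10},   // best detection
            {1, 1, 11, 11},   // near-duplicate of the first box (IoU = 81/119)
            {20, 20, 30, 30}  // separate object
        };
        double[] scores = {0.9, 0.8, 0.7};
        System.out.println(nms(boxes, scores, 0.5)); // prints [0, 2]
    }
}
```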
There are many other feature additions and bug fixes. Please refer to the official release notes for everything that could not be covered in this article.
DL4J 1.0.0-beta4 adds native support for many of the techniques that have recently become standard, so why not give it a try?