Previously, I ran an ["image anomaly detection" benchmark](https://qiita.com/shinmura0/items/06d81c72601c7578c6d3#%E3%83%99%E3%83%B3%E3%83%81%E3%83%9E%E3%83%BC%E3%82%AF) using **deep metric learning**. Since then, **the latest deep metric learning method**, "AdaCos", has appeared.
In this article, I apply AdaCos to "anomaly detection" on the same simple benchmark. The entire code is here
By applying AdaCos to anomaly detection, we found the following:
AdaCos
Although AdaCos is called the latest deep metric learning method, as of January 2020 more than half a year has passed since the paper was published. Still, as far as I know, it remains SOTA within the frame of "pure deep metric learning".
AdaCos is a method built on ArcFace and its relatives, and it automatically determines the parameters that ArcFace and others require by hand. The accuracy of ArcFace changes dramatically depending on the parameter choice, so parameter tuning was very demanding. With AdaCos, the parameters are determined automatically, which frees you from this tuning work.
In addition, accuracy has been confirmed to improve as well, so it is a highly useful method that raises both work efficiency and accuracy. For details, refer to the following article.
Read AdaCos: Adaptively Scaling Cosine Logits for Effectively Learning Deep Face Representations
There are two variants of AdaCos, Fixed and Dynamic. The one used in this article is "Fixed AdaCos".
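Fixed AdaCos sets the scale once from the number of classes via the formula from the AdaCos paper, s = √2 · log(C − 1). A minimal sketch (my own illustration, not the article's code):

```python
import math

def fixed_adacos_scale(num_classes: int) -> float:
    """Fixed AdaCos scale: s = sqrt(2) * log(C - 1)."""
    return math.sqrt(2.0) * math.log(num_classes - 1)

# For the 9 classes used in this experiment:
print(fixed_adacos_scale(9))  # → about 2.94
```

With this, the scale that had to be hand-tuned in ArcFace is fixed by the class count alone.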
Also, AdaCos can only be applied when the number of classes is 3 or more. This becomes a bottleneck when combining it with self-supervised learning, but I will examine that in the next article. It is not a problem here, since this experiment uses 9 classes.
The experiment is performed under the same conditions as the [previous experiment](https://qiita.com/shinmura0/items/06d81c72601c7578c6d3#%E3%83%99%E3%83%B3%E3%83%81%E3%83%9E%E3%83%BC%E3%82%AF).
The AdaCos code itself is almost the same as that of ArcFace; only the way the parameters are given has been changed to the AdaCos style.
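To make the difference concrete, here is a hypothetical NumPy sketch (not the article's linked code) contrasting the two: ArcFace takes a hand-chosen scale `s` and margin `m`, while Fixed AdaCos drops the margin and derives the scale from the class count.

```python
import numpy as np

def cosine_logits(embeddings, weights):
    """Cosine similarity between L2-normalized embeddings and class weight vectors."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    return e @ w

def arcface_logits(cos, labels, s=30.0, m=0.5):
    """ArcFace: add angular margin m to the target class, then scale by s.
    Both s and m are hyperparameters that must be tuned."""
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    target = np.eye(cos.shape[1])[labels].astype(bool)
    out = cos.copy()
    out[target] = np.cos(theta[target] + m)
    return s * out

def fixed_adacos_logits(cos, num_classes):
    """Fixed AdaCos: no margin; the scale is determined by the class count."""
    s = np.sqrt(2.0) * np.log(num_classes - 1)
    return s * cos
```

Swapping `arcface_logits` for `fixed_adacos_logits` removes both hyperparameters from the loss head, which is exactly the tuning relief described above.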
The entire code is here
Fashion-MNIST
The breakdown of the data is as follows.
Normal is "sneakers", abnormal is "boots".
| Data | Quantity | Number of classes | Remarks |
|---|---|---|---|
Reference data for learning | 8000 | 8 | Excluding sneakers and boots |
Normal data for learning | 1000 | 1 | sneakers |
Test data (normal) | 1000 | 1 | sneakers |
Test data (abnormal) | 1000 | 1 | boots |
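The split above can be sketched as follows. This is my own illustration under the assumption of Fashion-MNIST's usual label numbering (7 = sneaker, 9 = ankle boot); the article's actual preprocessing is in the linked code.

```python
import numpy as np

SNEAKER, BOOT = 7, 9  # assumed Fashion-MNIST class indices

def build_anomaly_split(x, y, normal=SNEAKER, anomaly=BOOT,
                        n_ref=8000, n=1000, seed=0):
    """Split labeled data into: reference data (the 8 other classes),
    normal training data, normal test data, and anomalous test data."""
    rng = np.random.default_rng(seed)
    ref_idx = np.where((y != normal) & (y != anomaly))[0]
    normal_idx = np.where(y == normal)[0]
    anomaly_idx = np.where(y == anomaly)[0]
    rng.shuffle(normal_idx)
    return (x[rng.choice(ref_idx, n_ref, replace=False)],  # reference (8 classes)
            x[normal_idx[:n]],        # normal, for training
            x[normal_idx[n:2 * n]],   # normal, for testing
            x[anomaly_idx[:n]])       # abnormal, for testing
```

The same function covers the cifar-10 split below by passing the deer and horse class indices instead.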
cifar-10
The breakdown of the data is as follows.
Normal is "deer", abnormal is "horse".
| Data | Quantity | Number of classes | Remarks |
|---|---|---|---|
Reference data for learning | 8000 | 8 | Excluding deer and horse |
Normal data for learning | 1000 | 1 | deer |
Test data (normal) | 1000 | 1 | deer |
Test data (abnormal) | 1000 | 1 | horse |
"L2-Softmax Loss" and "ArcFace" show the results of the previous experiment; "AdaCos" shows the result of this experiment.
The median AUC of "AdaCos" is now about **the same as ArcFace's.**
Above all, I am grateful that parameter tuning is no longer necessary.
"L2-Softmax Loss" and "ArcFace" show the results of the previous experiment; "AdaCos" shows the result of this experiment.
As with Fashion-MNIST, the median AUC of "AdaCos" is about the same as ArcFace's.