What is CenterNet?

CenterNet is an object detection method proposed in the paper Objects as Points.

Because it first detects the center of each object as a keypoint and then predicts the width and height, it appears to have advantages such as being computationally lighter than conventional methods.
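
As a rough illustration of the idea, the sketch below turns a detected center point and a predicted width and height into a bounding box. The function name and the sample numbers are made up for this example; it is not code from the CenterNet repository, whose actual decoding does more than this.

```python
def center_to_box(cx, cy, w, h):
    """Convert a center point and a size into (x_min, y_min, x_max, y_max)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Hypothetical detection: center at (100, 60), size 40 x 20 pixels.
box = center_to_box(100, 60, 40, 20)
print(box)
```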

I want to use CenterNet to do the same thing as in my earlier attempt to detect the positions of tennis players, the ball, and the court, so I have been experimenting with CenterNet in various ways.

The CenterNet source code is published on GitHub, and inference with the trained models can be done by following the README. However, there was little explanation or information about training on your own data, so it was a little difficult to proceed. I wrote this short article to share my working notes. I hope it is useful as a reference.

Environment

- Ubuntu 18.04
- PyTorch 0.4.1

Data set preparation

I want to detect the players on the front side and on the back side of the court in tennis match images separately, so I created the annotation data with the following labels:

- "PlayerFront" for the player on the front side
- "PlayerBack" for the player on the back side

I created the annotations using a tool called labelImg, which outputs them as XML files in Pascal VOC format. CenterNet can only read annotation data as COCO-format JSON files, so the XML files need to be converted to JSON. If you are in the same situation, please refer to the article I wrote, "Convert Pascal VOC format xml file to COCO format json file".
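
The conversion can be sketched roughly as follows, using only the Python standard library. This is a minimal illustration, not the script from the article mentioned above: the function name `voc_to_coco`, the sample XML, and the choice to keep the box coordinates unchanged are all assumptions made for this example.

```python
import json
import xml.etree.ElementTree as ET

# Label-to-ID mapping assumed for this example; use your own assignment.
CATEGORY_IDS = {"PlayerFront": 1, "PlayerBack": 2}

# Made-up annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>frame_0001.jpg</filename>
  <size><width>1280</width><height>720</height><depth>3</depth></size>
  <object>
    <name>PlayerFront</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>40</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_coco(xml_strings):
    """Convert Pascal VOC annotation XML strings into one COCO-style dict."""
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in CATEGORY_IDS.items()],
    }
    ann_id = 1
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin = float(box.findtext("xmin"))
            ymin = float(box.findtext("ymin"))
            xmax = float(box.findtext("xmax"))
            ymax = float(box.findtext("ymax"))
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": CATEGORY_IDS[obj.findtext("name")],
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco


coco = voc_to_coco([SAMPLE_XML])
print(json.dumps(coco, indent=2))
```

In practice you would read each XML file from disk and write the resulting dictionary out with `json.dump`.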

I assigned the category IDs as follows:

- PlayerFront: 1
- PlayerBack: 2

For the annotation data, create the following two files, one for training data and one for test data. Note that if the file names are different, a "file not found" error will be output during training.

- pascal_trainval0712.json: COCO-format dataset that stores the training data information
- pascal_test2007.json: COCO-format dataset that stores the test data information

Store the two annotation files so that the directory structure looks like this:

    CenterNet/data/voc/annotations/
    |--pascal_trainval0712.json
    |--pascal_test2007.json

Then, place the image data in the images directory:

    CenterNet/data/voc/images/
    |--**.jpg
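
Since a misplaced or misnamed file only shows up as an error once training starts, it may be worth checking the layout beforehand. The following is a small sketch for that; the helper name `missing_files` is made up for this example, and the path passed at the bottom assumes the script is run from the directory that contains CenterNet/.

```python
from pathlib import Path

# File names expected under data/voc/annotations/.
REQUIRED_ANNOTATIONS = ("pascal_trainval0712.json", "pascal_test2007.json")


def missing_files(voc_root):
    """Return the expected paths that do not exist under voc_root."""
    voc_root = Path(voc_root)
    expected = [voc_root / "annotations" / name for name in REQUIRED_ANNOTATIONS]
    expected.append(voc_root / "images")
    return [path for path in expected if not path.exists()]


# Prints nothing when everything is in place.
for path in missing_files("CenterNet/data/voc"):
    print("missing:", path)
```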

Source code modification

Rewrite self.class_name on line 30 of pascal.py to match the assigned category IDs.

`/src/lib/datasets/dataset/pascal.py`


    # self.class_name = ['__background__', "playerup", "playerdown", "bird", "boat",
    #  "bottle", "bus", "car", "cat", "chair", "cow", "diningtable", "dog", 
    #  "horse", "motorbike", "person", "pottedplant", "sheep", "sofa", 
    #  "train", "tvmonitor"]
    # Index 0 is the background; indices 1 and 2 line up with the assigned category IDs.
    self.class_name = ['__background__', "playerFront", "playerBack"]

Training with the prepared dataset

    python main.py ctdet --exp_id pascal_dla_384 --dataset pascal --num_epochs 500 --lr_step 50,100,200,300,400

About the arguments:

- ctdet: Train for object detection (CenterNet).
- --exp_id: Experiment name, such as pascal_dla_384, pascal_dla_512, pascal_resdcn18_384, or pascal_resdcn18_512. The output directory under exp/ctdet/ takes this name. Please refer to the Pascal VOC section of MODEL ZOO for the recommended number of GPUs for each model.
- --dataset pascal: Train using the Pascal VOC settings (20 classes).
- --num_epochs: Number of epochs to train for.
- --lr_step 50,100,200,300,400: ~~Save the model when the number of epochs is 50, 100, 200, 300, 400~~ The learning rate is reduced to 1/10 at each of the specified epochs.
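
To see what this step schedule means for the learning rate over a long run, here is a small sketch. The function `lr_at_epoch` is hypothetical, the base learning rate is just an example value, and whether the drop takes effect at the listed epoch or at the following one is an assumption here; check main.py for the exact behavior.

```python
def lr_at_epoch(base_lr, epoch, lr_steps):
    """Learning rate after a 10x drop for every step that has been reached."""
    drops = sum(1 for step in lr_steps if epoch >= step)
    return base_lr * (0.1 ** drops)


steps = [50, 100, 200, 300, 400]
for epoch in (1, 50, 100, 200, 300, 400):
    # Example base learning rate; substitute the value you actually train with.
    print(epoch, lr_at_epoch(1.25e-4, epoch, steps))
```

With five steps, the learning rate at the end of training is five orders of magnitude smaller than at the start, so for small datasets it may be worth using fewer steps.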

If the amount of training data is small, the resulting model is likely to be undertrained even when trained as described above. In that case, it is better to fine-tune, that is, to continue training from an already trained model.

With --load_model, specify the file name of the trained model as the argument. The trained models can be downloaded from MODEL_ZOO.md.

For fine-tuning, I referred to the GitHub issue "Transfer learning on very small dataset #307".

    python main.py ctdet --exp_id pascal_dla_384 --dataset pascal --num_epochs 500 --lr_step 50,100,200,300,400 --load_model ../models/ctdet_pascal_dla_384.pth

Log files and trained models are stored in /exp/ctdet/pascal_dla_384/.

Inference

Run inference using the trained model. Before doing so, modify the code in CenterNet/src/lib/utils/debugger.py. pascal_class_name is declared on line 439, so change its contents to "PlayerFront" and "PlayerBack".

Then, inference is performed with the following command.

    python demo.py ctdet --demo ../data/voc/images/**.jpg --dataset pascal --load_model ../exp/ctdet/pascal_dla_384/model_last.pth --debug 2

With --debug 2, you can check not only the detection result image but also the heatmap image.

[PYTHON] About learning method with original data of CenterNet (Objects as Points)