TensorFlow 2 Object Detection with Colab
My short notes on using Google Colab to train TensorFlow Object Detection.
In this note, I use the TF2 Object Detection API to read captchas. You will need 200–300 captcha images and 40,000–50,000 training steps. Training takes 12–13 hours in Colab on CPU, but only 1–2 hours on GPU. Checkpoints and logs will take 3–5 GB in your Google Drive.
You can change the runtime type to GPU acceleration.
These are the reference articles.
You can create image labels by following this article.
Go to https://colab.research.google.com/notebooks/intro.ipynb and create a new notebook named tf2_captcha_test.ipynb.
Set up the paths in your Google Drive. You can see an example in my GitHub project below. My captcha pictures (15 images) are only an example; you should add your own dataset or use a public dataset for training.
Start by mounting your Google Drive. Google will authenticate and verify your account.
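In a Colab cell, mounting Drive is typically two lines. A small sketch (the `/content/drive` mount point is the Colab default; the guard is only there so the snippet degrades to a no-op outside Colab):

```python
# Mount Google Drive in Colab (the import only exists inside the Colab runtime).
try:
    from google.colab import drive  # available only in Colab
    drive.mount("/content/drive")   # opens Google's auth prompt
    MOUNTED = True
except ImportError:
    MOUNTED = False                 # not running in Colab; skip mounting
```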
Install the TensorFlow models repository from https://github.com/tensorflow/models.
Install the Object Detection API.
Check the installation by running model_builder_tf2_test.py.
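In Colab, these steps are shell commands (prefix each with `!` in a cell). The flow below follows the Object Detection API's standard TF2 install instructions; it is wrapped in a function here only so it reads as one unit, and the paths assume the default repo layout:

```shell
# Standard TF2 install flow for the Object Detection API (sketch).
install_od_api() {
  git clone --depth 1 https://github.com/tensorflow/models.git
  cd models/research
  # Compile the protobuf message definitions the API uses.
  protoc object_detection/protos/*.proto --python_out=.
  # Install the object_detection package for TF2.
  cp object_detection/packages/tf2/setup.py .
  python -m pip install .
  # Verify the installation with the bundled test script.
  python object_detection/builders/model_builder_tf2_test.py
}
```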
Create a .pbtxt label map file, which maps each id to a label. You can use the file in my repo as an example; it contains 36 items (a–z and 0–9).
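With 36 classes, the label map is easier to generate than to type by hand. A sketch (the filename and helper are mine; note that ids in Object Detection API label maps start at 1, not 0):

```python
import string

def write_label_map(path="label_map.pbtxt"):
    """Write a pbtxt label map for a-z and 0-9 (36 items, ids start at 1)."""
    labels = list(string.ascii_lowercase) + list(string.digits)
    with open(path, "w") as f:
        for idx, name in enumerate(labels, start=1):
            f.write("item {\n  id: %d\n  name: '%s'\n}\n\n" % (idx, name))
    return labels

labels = write_label_map()
```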
Create TFRecords for training and validation. You will get two new files, pascal_train.record and pascal_val.record.
You can use my_create_pascal_tf_record from my repo. I made some changes to the Object Detection API example (create_pascal_tf_record.py).
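For orientation, each image contributes a feature dict like the following to the TFRecord. The field names follow the TF Example schema that create_pascal_tf_record.py uses; the helper function and its signature are my own illustration, not code from the repo:

```python
def captcha_features(filename, width, height, boxes):
    """Sketch of the per-image features stored in the TFRecord.

    `boxes` holds (char, class_id, xmin, ymin, xmax, ymax) in pixels;
    the Object Detection API stores box coordinates normalised to [0, 1].
    """
    return {
        "image/filename": filename.encode("utf8"),
        "image/width": width,
        "image/height": height,
        "image/object/bbox/xmin": [b[2] / width for b in boxes],
        "image/object/bbox/ymin": [b[3] / height for b in boxes],
        "image/object/bbox/xmax": [b[4] / width for b in boxes],
        "image/object/bbox/ymax": [b[5] / height for b in boxes],
        "image/object/class/text": [b[0].encode("utf8") for b in boxes],
        "image/object/class/label": [b[1] for b in boxes],
    }
```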
Create a config file. I use the template for ssd_mobilenet_v2_320x320_coco17_tpu-8. You can copy the example object_detection/configs/tf2/ssd_mobilenet_v2_320x320_coco17_tpu-8.config to create a new file, "ssd_mobilenet_v2_320x320_coco17_tpu-8-collab.config".
Change the paths and parameters in the .config file.
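The fields that usually need editing look like this (a fragment only; the `<...>` paths are placeholders for your Drive layout, and values other than num_classes are illustrative):

```
model {
  ssd {
    num_classes: 36   # a-z and 0-9
    ...
  }
}
train_config {
  batch_size: 8                      # lower this if Colab runs out of memory
  num_steps: 50000
  fine_tune_checkpoint: "<path to downloaded checkpoint>/ckpt-0"
  fine_tune_checkpoint_type: "detection"   # usually set to "detection" for fine-tuning
  ...
}
train_input_reader {
  label_map_path: "<path>/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "<path>/pascal_train.record"
  }
}
eval_input_reader {
  label_map_path: "<path>/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "<path>/pascal_val.record"
  }
}
```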
You can monitor training by running TensorBoard. The log directory is the folder ../train/train.
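In a Colab cell, TensorBoard is usually launched with the notebook magics (the `<project>` part of the path is a placeholder for your own Drive layout):

```
%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/<project>/train/train
```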
Start training and save checkpoints to our train folder.
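Training is typically started with the API's model_main_tf2.py entry point. A sketch, wrapped in a function only for readability; the config path and model_dir are assumptions to adapt to your Drive layout:

```shell
# Launch training with the Object Detection API's TF2 entry point (sketch).
start_training() {
  python models/research/object_detection/model_main_tf2.py \
    --pipeline_config_path="ssd_mobilenet_v2_320x320_coco17_tpu-8-collab.config" \
    --model_dir="train" \
    --alsologtostderr
}
```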
Next, export your model for later use.
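Exporting uses the API's exporter_main_v2.py: --trained_checkpoint_dir points at the train folder and --output_directory is where the SavedModel lands (both paths are assumptions here):

```shell
# Export the trained checkpoint as a SavedModel for inference (sketch).
export_model() {
  python models/research/object_detection/exporter_main_v2.py \
    --input_type=image_tensor \
    --pipeline_config_path="ssd_mobilenet_v2_320x320_coco17_tpu-8-collab.config" \
    --trained_checkpoint_dir="train" \
    --output_directory="exported-model"
}
```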
Load your model.
Run a test.
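Loading and testing can be sketched as below. The dictionary keys are the SavedModel detection interface of the Object Detection API; the `detect` and `boxes_to_text` helpers, the 0.5 score threshold, and the left-to-right ordering are my assumptions about how to turn per-character detections into the captcha string:

```python
def detect(saved_model_dir, image):
    """Run one image (HxWx3 uint8 array) through the exported model."""
    import tensorflow as tf  # heavy import kept local to this sketch
    model = tf.saved_model.load(saved_model_dir)
    detections = model(tf.convert_to_tensor(image)[tf.newaxis, ...])
    boxes = detections["detection_boxes"][0].numpy()   # [ymin, xmin, ymax, xmax]
    scores = detections["detection_scores"][0].numpy()
    classes = detections["detection_classes"][0].numpy().astype(int)
    keep = scores > 0.5                                # arbitrary cut-off
    return boxes[keep], classes[keep]

def boxes_to_text(boxes, classes, id_to_char):
    """Read the captcha: order detected characters left-to-right by xmin."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][1])
    return "".join(id_to_char[classes[i]] for i in order)
```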
You will see results like the samples below. These are the results after training for around 50,000 steps on a dataset of 250 images.