Category Archives: Containers

Fine-Tuned MobilenetV2 on MyriadX and Coral

Fine-tuning MV2 using a GPU is not hard – especially if you use the TensorFlow Object Detection API in a Docker container. It turns out that deploying a quantization-aware model to a Coral Edge TPU is also not hard.

Doing the same thing for MyriadX devices is somewhat harder. There doesn’t appear to be a good way to take your quantized model and convert it to MyriadX. If you try to convert your quantized model to OpenVINO you get recondite errors; if you turn up logging on the tool you can see that the errors happen when it hits “FakeQuantization” nodes.

Thankfully, for OpenVINO you can just retrain without quantization and things work fine. As such it seems like you end up training twice – which is less than ideal.

Right now my pipeline is as follows:

  • Annotate using LabelImg – installable via brew (OS X)
  • Train (using my NVidia RTX 2070 Super). For this I use TensorFlow 1.15.2 and a specific hash of the tensorflow models directory. Using a different version of 1.x might work – but using 2.x definitely did not, in my case.
  • For Coral
    • Export to a frozen graph
    • Export the frozen graph to tflite
    • Compile the tflite bin to the edgetpu
    • Write custom C++ code to interface with the Coral edgetpu libraries to run the net on images (in my case my code can grab live from a camera or from a file)
  • For DepthAI/MyriadX
    • Re-train without the quantization-aware portion of the pipeline config
    • Convert the frozen graph to OpenVINO’s intermediate representation (IR)
    • Compile the IR to a MyriadX blob
    • Push the blob to a local depthai checkout

Let’s go through each stage of the pipeline:

Annotation

For my custom dataset I gathered video clips, then exploded those clips into frames using ffmpeg:

# extract roughly one frame per second from clip $1
ffmpeg -i $1 -r 1/1 $1_%04d.jpg

I then installed LabelImg and annotated all my classes. As I got more proficient in LabelImg I could do about one image every second or two – fast enough, but certainly not as fast as Tesla’s auto labeler!

Note that on my MacBook, I found the following worked for getting LabelImg:

conda create --name deeplearning
conda activate deeplearning
pip install pyqt5==5.15.2 lxml
pip install labelImg
labelImg

Training

In my case I found that MobileNetV2 meets all my needs. Specifically, it is fast, fairly accurate, and it runs on all my devices (both in software, and on the Coral and OAK-D Lite).

There are gobs of tutorials on training MobileNetV2 in general. For example, you can grab one of the great TensorFlow Docker images. By no means should you assume that once you have the image all is done – unless you are literally just running someone else’s tutorial. The moment you throw in your own dataset you’re going to have to do a number of things, and most likely they will fail. Over and over. So script them.
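For example, the core training step boils down to a single invocation of the Object Detection API (a sketch – the paths and step count are placeholders for my setup):

python object_detection/model_main.py \
    --pipeline_config_path=training/pipeline.config \
    --model_dir=training/output \
    --num_train_steps=100000 \
    --alsologtostderr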

But before we get to that, let’s talk versions. I found that the models produced by TensorFlow 2 didn’t work well with any of my HW accelerators. Version 1.15.2 worked well, however, so I went with that. I even tried other versions of TensorFlow 1.x and had issues. I would like to dive into the cause of those issues – but have not done so yet.

See my github repo for an example Dockerfile. Note that for my GPU (an RTX 2070 Super) I had to work around memory-growth issues by patching the TensorFlow Object Detection model_main.py. I also reduced my pipeline batch size from the default 24 to 6. Without these fixes the training would explode mysteriously with an unhelpful error message. If only those blasted bitcoin miners hadn’t made GPUs so expensive, perhaps I could upgrade to another GPU with more memory!
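For reference, the gist of that patch was along these lines (a sketch against TF 1.15’s Estimator-based model_main.py, which already imports tensorflow as tf and defines FLAGS – not an exact diff):

# in model_main.py: let TF grow GPU memory on demand instead of
# grabbing it all up front
session_config = tf.ConfigProto()
session_config.gpu_options.allow_growth = True
config = tf.estimator.RunConfig(model_dir=FLAGS.model_dir,
                                session_config=session_config)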

It is also worth noting that the version of MobileNet I used, and the pipeline config, were different for the Coral and Myriad devices, e.g.:

  • Coral: ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03
  • Myriad: ssd_mobilenet_v2_coco_2018_03_29

Update: it didn’t seem to matter which version of MobileNet I used – both work.

I found that I could get a loss near 0.3 with my dataset and pipeline configuration after about 100,000 iterations. On my GPU this took about 3 hours, which was not bad at all.

Deploying to Coral

Coral requires quantization-aware training exported to tflite. Once exported to tflite, you must compile it for the Edge TPU.
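The three steps look roughly like this (a sketch – the paths and checkpoint number are mine, but the tensor names are the standard ones for TF1 SSD exports):

# 1. freeze the quantization-aware checkpoint into a tflite-friendly graph
python object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path=training/pipeline.config \
    --trained_checkpoint_prefix=training/output/model.ckpt-100000 \
    --output_directory=export/ \
    --add_postprocessing_op=true

# 2. convert the frozen graph to a quantized tflite flatbuffer
tflite_convert \
    --graph_def_file=export/tflite_graph.pb \
    --output_file=export/detect.tflite \
    --input_shapes=1,300,300,3 \
    --input_arrays=normalized_input_image_tensor \
    --output_arrays=TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3 \
    --inference_type=QUANTIZED_UINT8 \
    --mean_values=128 --std_dev_values=128 \
    --allow_custom_ops

# 3. compile for the Edge TPU
edgetpu_compiler export/detect.tflite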

On the Coral it is simple enough to change from the default MobileNet model to your custom one – literally the filenames change. You must also point it at your new labelMap (so it can correctly map the class names) and your new model file.

Deploying to MyriadX

Myriad was a lot more difficult. To deploy a trained model one must first convert it to the OpenVINO IR format, as follows:

source /opt/intel/openvino_2021/bin/setupvars.sh && \
    python /opt/intel/openvino_2021/deployment_tools/model_optimizer/mo.py \
    --input_model learn_tesla/frozen_graph/frozen_inference_graph.pb \
    --tensorflow_use_custom_operations_config /opt/intel/openvino_2021/deployment_tools/model_optimizer/extensions/front/tf/ssd_v2_support.json \
    --tensorflow_object_detection_api_pipeline_config source_model_ckpt/pipeline.config \
    --reverse_input_channels \
    --output_dir learn_tesla/openvino_output/ \
    --data_type FP16

Then the IR can be converted to blob format by running the compile_tool command. I had significant problems with compile_tool, mainly because it didn’t like something about my trained output. In the end I found the cause: OpenVINO simply doesn’t like the FakeQuantization nodes that quantization-aware training puts into the graph. Removing quantization from the pipeline solved this. However, this cuts the other way for Coral – it ONLY accepts quantization-aware training (caveat: in some cases you can post-training-quantize a model, but Coral explicitly states this doesn’t work in all cases).
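For the record, the compile step itself looks something like this (the shave/CMX counts here are the ones the depthai docs suggest for a single-network pipeline – treat them as an assumption):

source /opt/intel/openvino_2021/bin/setupvars.sh && \
    /opt/intel/openvino_2021/deployment_tools/tools/compile_tool/compile_tool \
    -m learn_tesla/openvino_output/frozen_inference_graph.xml \
    -ip U8 \
    -d MYRIAD \
    -VPU_NUMBER_OF_SHAVES 4 \
    -VPU_NUMBER_OF_CMX_SLICES 4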

Once the blob file was ready I put the blob, and a json snippet, in the resources/nn/<custom> folder inside my depthai checkout. I was able to see full framerate (at 300×170), or about 10fps at full resolution (from a file). Not bad!

Runtime performance

On the Google Coral I am able to create an extremely low-latency pipeline – from capture to inference in only a few milliseconds. Inference itself completes in about 20ms with my fine-tuned model – so I am able to operate at full speed.

I have not fully characterized the MyriadX-based pipeline latency. Running from a file I was able to keep up at 30 fps. My Coral pipeline involved pulling the video frames down with my own C/C++ code – including the YUV->BGR colorspace conversion. Since the MyriadX is able to grab and process frames on-device, it has the potential to be low latency as well – but this has yet to be tested.

Object detection using the MyriadX (DepthAI OAK-D Lite) and the Google Coral produced similar results.

InfluxDB Queries on HomeAssistant Data

I recently added InfluxDB + Grafana + MariaDB to one of my HA instances. I was surprised at how easy it was to get HA happy. Essentially nothing more than running the Docker instances for each of these components and adding minimal yaml to HA.
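For reference, the HA side of the wiring is little more than this (a sketch – the host and credentials are placeholders):

# configuration.yaml
influxdb:
  host: 192.168.1.10
  database: home_assistant

recorder:
  db_url: mysql://hass:password@192.168.1.10/hass?charset=utf8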

When I went to query the data there were a few surprises that I thought I’d note. First, as noted in https://community.home-assistant.io/t/missing-sensors-in-influxdb/144948, HA puts sensors that have a unit of measurement inside a measurement named for that unit. Thus if I show measurements matching the sensors I care about, all I get are binary (i.e. unitless) sensors:

> show measurements with measurement =~ /sensoricareabout/
name: measurements
name
----
binary_sensor.sensoridontcareabout1
climate.sensoridontcareabout2
device_tracker.sensoridontcareabout3
...

However if I query series, similarly, I do see the measurements I care about:

> show series where "entity_id" =~ /sensoricareabout/
key
---
°F,domain=sensor,entity_id=sensoricareabout_foo_1
°F,domain=sensor,entity_id=sensoricareabout_foo_2

Thus, oddly, to query the data I want I form a query something like this:

 select value from "°F" where entity_id = 'sensoricareabout_foo_1'

A couple of things. First, it is odd to me to have to query from what I essentially view as a table named °F… but that is probably more HA’s fault than InfluxDB’s. Second, the quoting is a little weird. Note above that the measurement is double-quoted, whereas the tag value is single-quoted. This is important. You can sometimes (maybe?) leave the double quotes off the former, but the latter must strictly be single quotes…

The moral of the story is that if you want to find the data you are looking for, you need to look under ‘%’ or the actual unit of measurement. For example, in Grafana I would:

  • click on explore
  • select ‘%’
  • Add a where clause to filter by entity_id
  • Possibly find my series there…
    • But if not, remove the where clause, switch from ‘%’ to the unit of measurement, and repeat by adding the where clause

This all works well but is a little unintuitive, and might make one think HA has forgotten to put the data into InfluxDB.
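In raw InfluxQL the same hunt looks something like this (entity names made up):

> show series where "entity_id" =~ /my_sensor/
> select value from "%" where entity_id = 'my_sensor_battery'
> select value from "°F" where entity_id = 'my_sensor_temperature'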

Containers for Fun and Profit

With all the whirl about containers I decided I could wait no longer to join the fray. Here is a log of some of the things I learned.

Basic test, e.g. what the heck is this?

I spun up a CentOS 7 VM several months back. Apparently my repos were a tad old, and things didn’t work until I did a yum update first, then rebooted. Then I could systemctl enable docker and systemctl start docker.

Note that you MUST sudo for docker commands to work.

Also note: CentOS 7 has Docker in it natively – no need to wget/install it (unless you want to use some of the new features like the networking module).

The following is your smoke test:

sudo docker run hello-world 

Test 2: Expanded sample

This worked flawlessly…

Test 3: Interactive sample

Ditto, worked perfectly
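(For the record, the “interactive sample” of that era’s getting-started guide was along these lines – my reconstruction, not a transcript:)

	sudo docker run -t -i centos /bin/bash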

Test 4: Do my own thing

I made a “looper.py” app with the following code:

#!/usr/bin/python
import time
while True:
	print "Hi! ", time.time()
	time.sleep(1)

I then made the following Dockerfile

FROM docker.io/centos:latest
RUN yum install python -y
COPY looper.py /
CMD /looper.py

And created the image with:

	docker build -t looper .

When I ran the build, one of the things I noticed is that python was already installed in the centos image. I modified the Dockerfile by removing the RUN line, and one cool thing is that when I re-ran the build command, the python install layer was automatically removed and everything else was basically a no-op. In other words, docker appears to do a good job at being efficient.

I then ran my looper:

	docker run looper

Nothing happened… So I thought… and thought… and eventually decided to try and attach. Unbeknownst to me, by doing docker run I WAS attached, but nonetheless I learned a few things:

  • To attach you need your container id
  • To get your container id you run “docker ps”
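Concretely (the cid is whatever docker ps reports):

	sudo docker ps
	sudo docker attach <cid>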

Once I did a docker attach to my container id, I still saw nothing. I did a Ctrl-C and voila, my looper output appeared! I suspected buffering, which turned out to be the case. I modified looper as follows:

#!/usr/bin/python
import time
import sys
while True:
	print "Hi! ", time.time()
	sys.stdout.flush()
	time.sleep(1)

Then I rebuilt and re-ran, and it all worked.

Note that this only runs the command in the foreground. To run it in the background:

	docker run -d looper

You can then docker ps, find the cid, and docker attach to it. But… you cannot detach (without sending a SIGKILL)! The docs say Ctrl-P + Ctrl-Q will detach, but this appears to only work if you use the following command when running it:

	docker run -tdi looper

Where -t means “create a tty”, and -i means “keep stdin open even if not attached”. This works well.

Note that each time I make changes to looper, the rebuild takes at most 20 seconds… if nothing changed, docker takes milliseconds…

Test 5: layers

What if the container modifies a file?

I modified looper.py to write to stdout and a file:

#!/usr/bin/python
import time
import sys
while True:
	msg = "Hi! " + str(time.time())
	print msg
	with open('myfile','a') as f:
		f.write(msg + '\n')
	sys.stdout.flush()
	time.sleep(1)

For fun I created a file named “myfile”, then built the image, then ran the container. While it runs I can do a docker diff:

# docker diff f04849645523
A /myfile

And to be clear, this means the file was added inside the container’s filesystem. Docker won’t let the app reach into my own version of “myfile”.

What if I want to see the file? In older versions of docker you apparently had a few options, such as running ssh or making a snapshot, but now it’s easy:

	docker exec -t -i <cid> /bin/bash

You can then just cat the file, etc. If you actually want to copy files out, you can export the whole filesystem (docker export <cid>) as a tar, but this seems nuts. If you just want a single file, use docker cp:

docker cp <cid>:<src> <dest>

Test 6: CPU limit

You can do a couple things:

1) Limit the share of cpu usage across multiple containers. This is done by specifying a relative weighting (with -c)

2) Pin the process to certain cpus with --cpuset-cpus=

I haven’t been able to find an equivalent to the simple “limit to N processors” idea from virtual machines. The weighting is fairly close.
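For example, with the looper image from earlier (512 is relative to the default weight of 1024):

	docker run -c 512 looper
	docker run --cpuset-cpus="0,1" looper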