Improving Azure Custom Vision Object Recognition by using and correcting the prediction pictures


Last month I wrote about integrating Azure Custom Vision Object Recognition with HoloLens to recognize and label objects in 3D space. I wrote that the prediction went pretty well, although I used only 35 pictures. I also wrote that the process of taking, uploading and labeling pictures is quite tedious.

Improving on the go

It turns out Custom Vision retained all the pictures I uploaded in the course of testing. So every time I used my HoloLens and asked Custom Vision to locate toy aircraft, it stored that picture in the cloud along with its prediction. And the fun thing is, you can use those pictures to actually improve your model again.

After some playing around with my model (for the previous blog post about this subject), I clicked the Predictions tab and found about 30 pictures - one for every time I used the model from my HoloLens. I could use those to improve my model. After that, I did some more testing with the HoloLens to show you how it's done. So I clicked the Predictions tab again, and there were a couple more pictures:


If we select the first picture, we see this:


The model has already marked in red the areas where it thinks there is an airplane. Interestingly, the model is now a lot better than it originally was (when it only featured my pre-loaded images), as it now recognizes the DC-3 Dakota on top - which it has never seen before - as an airplane! And even the X-15 (the black thing on the left) is recognized. Although the X-15 had a few entries in the training images, it barely looks like an airplane (for all intents and purposes it was more a spaceship with wings to facilitate a landing).

I digress. You need to click every area you want to confirm:


And when you are done, and all relevant areas are white:


Simply click the X at the top right. The image will now disappear from the "Predictions" list and end up in the "Training images" list.
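I did all of this by clicking in the portal, but the selection step itself is easy to picture in code. Below is a minimal Python sketch, not the Custom Vision API: `PredictedRegion`, `regions_to_confirm` and the 0.5 threshold are hypothetical names and values I made up for illustration, and the sample numbers are invented. It only assumes that each predicted region carries a tag, a probability, and a bounding box given as left, top, width and height fractions of the image.

```python
from dataclasses import dataclass


@dataclass
class PredictedRegion:
    """Hypothetical stand-in for one predicted box; not an SDK type."""
    tag: str
    probability: float  # 0.0 .. 1.0
    left: float         # box position and size as fractions of the image (assumed)
    top: float
    width: float
    height: float


def regions_to_confirm(regions, tag, min_probability=0.5):
    """Return regions with the given tag at or above the threshold, most confident first."""
    keep = [r for r in regions if r.tag == tag and r.probability >= min_probability]
    return sorted(keep, key=lambda r: r.probability, reverse=True)


# Invented sample data, purely for illustration.
predictions = [
    PredictedRegion("airplane", 0.92, 0.10, 0.20, 0.30, 0.25),
    PredictedRegion("airplane", 0.305, 0.05, 0.05, 0.90, 0.90),
    PredictedRegion("airplane", 0.61, 0.55, 0.40, 0.20, 0.15),
]

for region in regions_to_confirm(predictions, "airplane"):
    print(f"{region.tag}: {region.probability:.1%}")
```

A threshold like this only produces candidates. Whether a box really contains an airplane is still something you have to judge by eye, which is exactly what the clicking in the portal is for.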

Some interesting things to note

The model really improved from adding the new images. Not only did it recognize the DC-3 'Dakota' that had not been in the training images, but also this Tiger Moth model (the bright yellow one) that it had never seen before:


Also, it stopped flagging, or even hesitating over, things like the HoloLens pouch that's lying there; my headphones and hand were likewise recognized as 'definitely not an airplane'.


Yet, I also learned it's dangerous to use the same background over and over again. Apparently the model starts to rely on that. This is what happens if I put the Tiger Moth on a dark blue desk chair instead of a light blue bed cover:


Yes... the model is quite confident there is an airplane in the picture, but it's not very good at pinpointing it.


And as far as the Curtiss P-40 'Kittyhawk' goes - even though it has been featured extensively in both the original training pictures and the ones I added from the Predictions, this is no success either. The model is better at pinpointing the aircraft, but considerably less sure it is an aircraft. And the outer box, which includes the chair, scores 30.5%. So it looks like I still need more pictures to make this model even more reliable, but then on other backgrounds, with more varied lighting, etc.
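"Good at pinpointing" is something I judged by eye here, but it can be put into a number. A common way to do that (not something I used for this post) is intersection over union: the overlap between the predicted box and a box you would have drawn yourself, divided by the area the two cover together. The Python sketch below assumes boxes are `(left, top, width, height)` tuples in fractions of the image; the function name `iou` and the sample boxes are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two (left, top, width, height) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping rectangle; zero if the boxes do not touch.
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Invented boxes: one hand-drawn around the aircraft, one tight prediction,
# and one loose prediction that also takes in most of the chair.
hand_drawn = (0.40, 0.40, 0.20, 0.20)
tight_prediction = (0.42, 0.41, 0.20, 0.20)
loose_prediction = (0.10, 0.10, 0.80, 0.80)

print("tight:", iou(hand_drawn, tight_prediction))
print("loose:", iou(hand_drawn, loose_prediction))
```

A score close to 1 means the predicted box sits almost exactly on the aircraft. A big box that swallows the chair as well scores low, even though the aircraft is technically inside it.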


You don't have to take very many pictures up front to incrementally improve a Custom Vision Object Recognition model - you can just iterate on its predictions and improve them. It feels a bit like teaching a toddler how to build something from Legos - you first show the principle, then let them muck around, and every time things go wrong, you show how it should have been done. Gradually they get the message. Or at least, that's what you hope. ;)

No (new) code this time, as the code from last time is unchanged.

Disclaimer - I have no idea how many prediction pictures are stored, or for how long - I can imagine not indefinitely, and not an unlimited number. But I can't attach numbers to that.