Adapting Custom Vision Object Recognition Windows ML code for use in Mixed Reality applications

Intro

In November I wrote about a Custom Vision Object Detection experiment that I did, which allowed the HoloLens I was wearing to recognize not only what objects were in view, but also approximately where they were in space. You might remember this picture:

You might also remember this one:

Apart from being a very cool new project type, it also showed a great limitation. You could only use an online model. You could not download it in the form of, for instance, an ONNX model to use with Windows ML. It worked pretty well, don't get me wrong, but maybe you are out and about and your device can't always reach the online model. Well guess what recently changed:

Yay! Custom Vision Object Detection now supports downloadable models that can be used in Windows ML.

Download model and code

After you have changed the type from "General" to "General (compact)" and saved that change, hit the "Performance" tab, and you will see the "Export" option appear (no idea why this is at "Performance", but what the heck):

So if you click that, you get a bit of an unwieldy screen that looks like this:

We are going to select the ONNX standard because that is what we can use in Windows Machine Learning - inside a UWP app running on the HoloLens. Please select version 1.2:

The result is a ZIP file containing the following folders and files:

We are only going to need the model.onnx file (in the next blog post). For now I want to concentrate on the file that is inside the CSharp folder - ObjectDetection.cs. That file is perfectly fine for use in a regular UWP app. However, although they are running on top of UWP, HoloLens apps are anything but regular UWP apps.

Challenges in incorporating the C# code in a Unity project

Some interesting challenges lie ahead:

  • Unity for HoloLens has this unusual concept of two Visual Studio solutions: one for use in the Unity editor, and a second one that is generated from the first. But the first one, the Unity solution, needs to be able to swallow all the code, even if it's UWP-only and will never run in the editor. To make that possible, we will have to put some stuff between preprocessor directives to be able to generate the deployment project at all.
  • The code uses a C# 7.0 concept - tuples - that is not supported by the C# version (4.0) supported in all but the newest versions of Unity, which I am not using here for various reasons.
  • I also found a pretty subtle bug in the code that only happens in a Unity runtime.

I will address all three things.

Testing in a bare bones project - here come the errors

So, I created an empty HoloLens project basically doing nothing: just imported the Mixed Reality Toolkit and hit all three configuration options in the Mixed Reality Toolkit/config menu. Then I added the ObjectDetection.cs to the project and immediately Unity started to balk:

Round 1 - preprocessor directives

The first round of fixing is pretty simple - just put everything the editor balks about between preprocessor directives:

#if !UNITY_EDITOR 
#endif

You can do this the rough way - by basically putting the whole file in these directives - or only put the minimum stuff in directives. I usually opt for the second way. So we need to put the following parts between these preprocessor directives.

First, this part of the using section at the start of the file:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
#if !UNITY_EDITOR
    using System.Threading.Tasks;
    using Windows.AI.MachineLearning;
    using Windows.Media;
    using Windows.Storage;
#endif

Then this part, at the start of the ObjectDetection class:

    public class ObjectDetection
    {
        private static readonly float[] Anchors = ....

        private readonly IList<string> labels;
        private readonly int maxDetections;
        private readonly float probabilityThreshold;
        private readonly float iouThreshold;
#if !UNITY_EDITOR
        private LearningModel model;
        private LearningModelSession session;
#endif

Then the following methods need to be put entirely between these preprocessor directives:

  • Init
  • ExtractBoxes
  • Postprocess
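To illustrate what wrapping a complete method looks like, here is a minimal sketch. The class DetectorSketch and its members are hypothetical stand-ins, not part of the generated ObjectDetection.cs; the only thing taken from this post is the #if !UNITY_EDITOR pattern itself.

```csharp
using System;

// Hypothetical stand-in for a class that mixes editor-safe and UWP-only code.
public class DetectorSketch
{
    // Plain .NET members can stay outside the directives.
    private readonly int maxDetections = 20;

#if !UNITY_EDITOR
    // Compiled only when UNITY_EDITOR is not defined, that is, in the
    // generated UWP solution. In the real file this is where the methods
    // that touch Windows.AI.MachineLearning types would live.
    public string Describe()
    {
        return "UWP build, max detections: " + maxDetections;
    }
#endif
}
```

Keep in mind that any code calling a method wrapped this way has to sit inside the same directive, otherwise the editor solution will not compile.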

And then both Unity and Visual Studio stop complaining about errors. So let's build the UWP solution...

Oops. Well I already warned you about this.

Round 2 - Tuples are a no-no

Although the very newest versions of Unity support C# 7.0, the majority of the versions that are in use today, for various reasons (mainly hologram stability), do not. But the code generated by Custom Vision has some tuples in it. The culprit is ExtractBoxes:

private (IList<BoundingBox>, IList<float[]>) ExtractBoxes(TensorFloat predictionOutput,
float[] anchors)

So we need to refactor this to C# 4 style code. Fortunately, this is not quite rocket science.

First of all, we define a class with the same properties as the tuple:

internal class ExtractedBoxes
{
    public IList<BoundingBox> Boxes { get; private set; }
    public IList<float[]> Probabilities { get; private set; }

    public ExtractedBoxes(IList<BoundingBox> boxes, IList<float[]> probs)
    {
        Boxes = boxes;
        Probabilities = probs;
    }
}

I have added this to the ObjectDetection.cs file, just behind the end of the ObjectDetection class definition. Then we need to change the return type of the method ExtractBoxes from

private (IList<BoundingBox>, IList<float[]>)

to

private ExtractedBoxes

and make the method end by returning an instance of the new class instead of a tuple:

return new ExtractedBoxes(boxes, probs);

We also have to change the method Postprocess, the place where ExtractBoxes is used:

private IList<PredictionModel> Postprocess(TensorFloat predictionOutputs)
{
    var (boxes, probs) = this.ExtractBoxes(predictionOutputs, ObjectDetection.Anchors);
    return this.SuppressNonMaximum(boxes, probs);
}

needs to become

private IList<PredictionModel> Postprocess(TensorFloat predictionOutputs)
{
    var extractedBoxes = this.ExtractBoxes(predictionOutputs, ObjectDetection.Anchors);
    return this.SuppressNonMaximum(extractedBoxes.Boxes, extractedBoxes.Probabilities);
}

and then, dear reader, Unity will finally build the deployment UWP solution. But there is still more to do.

Round 3 - fix a weird crashing bug

When I tried this in my app - and you will have to take my word for it - my app randomly crashed. The culprit, after long debugging, turned out to be the Max() call at the end of this fragment:

private IList<PredictionModel> SuppressNonMaximum(IList<BoundingBox> boxes, IList<float[]> probs)
{
    var predictions = new List<PredictionModel>();
    var maxProbs = probs.Select(x => x.Max()).ToArray();

    while (predictions.Count < this.maxDetections)
    {
        var max = maxProbs.Max();

I know, it doesn't seem to make sense. I have not checked this in plain UWP, but apparently the implementation of Max() in the Unity player on top of UWP doesn't like to calculate the Max of an empty list. (To be fair, standard LINQ behaves the same way: Max() on an empty sequence of floats throws an InvalidOperationException.) My app worked fine as long as there were recognizable objects in view. If there were none, it crashed. So I changed that piece to check for probs not being empty first:

private IList<PredictionModel> SuppressNonMaximum(IList<BoundingBox> boxes, IList<float[]> probs)
{
    var predictions = new List<PredictionModel>();
    // Added JvS
    if (probs.Any())
    {
        var maxProbs = probs.Select(x => x.Max()).ToArray();

        while (predictions.Count < this.maxDetections)
        {
            var max = maxProbs.Max();

And then your app will still be running when there are no predictions.
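If you want to see the empty-list behavior in isolation, the following sketch reproduces it with plain LINQ, outside of Unity and Windows ML. The class EmptyMaxSketch and both of its methods are made up for this illustration; only the Select/Max combination and the Any() guard come from the code above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, only here to demonstrate the behavior of Max().
public static class EmptyMaxSketch
{
    // Mirrors the unguarded code: returns true when Max() throws.
    public static bool UnguardedMaxThrows(IList<float[]> probs)
    {
        try
        {
            var maxProbs = probs.Select(x => x.Max()).ToArray();
            maxProbs.Max(); // throws when maxProbs is empty
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Mirrors the guarded code: checks Any() before calling Max().
    // Returns null when there is nothing to take a maximum of.
    public static float? SafeMax(IList<float[]> probs)
    {
        if (!probs.Any())
        {
            return null;
        }
        return probs.Select(x => x.Max()).Max();
    }
}
```

Calling UnguardedMaxThrows with an empty list returns true, while SafeMax with the same list simply returns null, which is the same idea as the probs.Any() check: no boxes, no call to Max().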

Round 4 - some minor fit & finish

Because I am lazy and it makes life easier when using this from a Unity app, I added this little overload of the Init method:

public async Task Init(string fileName)
{
    var file = await StorageFile.GetFileFromApplicationUriAsync(new Uri(fileName));
    await Init(file);
}

This will need to be in an #if !UNITY_EDITOR preprocessor directive as well. This overload allows me to call Init like this, from an async method, without first getting a StorageFile:

await _objectDetection.Init("ms-appx:///Data/StreamingAssets/model.onnx");

Conclusion

With these adaptations you have a C# file that will allow you to use Windows ML from both Unity and regular UWP apps. In a following blog post I will show a refactored version of the Toy Aircraft Finder to show how things work IRL.

There is no real demo project this time (yet) but if you want to download the finished file already, you can do so here.