Apple AI researchers introduce ‘MobileOne’, a new mobile backbone that reduces inference time on an iPhone 12 to less than one millisecond

In a recent research paper, a group of Apple researchers stressed that the challenge is to reduce latency costs while increasing the accuracy of efficient architectures, by identifying the key bottlenecks that affect on-device latency.

While reducing the number of floating-point operations (FLOPs) and the parameter count has led to efficient mobile designs with high accuracy, variables such as memory access and degree of parallelism continue to adversely affect latency during inference.

The research team introduces MobileOne, a novel and efficient neural network backbone for mobile devices, in the new publication An Improved One Millisecond Mobile Backbone. MobileOne reduces inference time on an iPhone 12 to less than one millisecond while achieving 75.9% top-1 accuracy on ImageNet.

The key contributions of the team are summarized as follows:

  • The team presents MobileOne, a novel architecture that runs on a mobile device in less than a millisecond and provides state-of-the-art image classification accuracy among efficient model architectures. Their model's performance also generalizes to desktop CPUs.
  • They analyze performance bottlenecks in activations and branching that incur high latency costs on mobile devices in today's efficient networks.
  • They investigate the effects of train-time re-parameterizable branches and dynamic relaxation of regularization during training. Together, these help overcome optimization bottlenecks encountered when training small models.
  • Their model generalizes to additional tasks, such as object detection and semantic segmentation, and outperforms previous efficient approaches.

The article begins with an overview of MobileOne's architectural blocks, which are factorized into depthwise and pointwise convolutional layers. The basis is Google's MobileNet-V1 block, consisting of a 3×3 depthwise convolution followed by a 1×1 pointwise convolution. Over-parameterized branches are also added at train time to improve model performance.
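The depthwise-then-pointwise factorization described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the function names (`depthwise_conv3x3`, `pointwise_conv1x1`, `mobileone_style_block`) and the tensor shapes are assumptions for demonstration, and activations, batch norm, and the over-parameterized branches are omitted.

```python
import numpy as np

def depthwise_conv3x3(x, w):
    """Depthwise 3x3 convolution with padding 1.
    x: (C, H, W) input; w: (C, 3, 3), one 3x3 kernel per channel."""
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            # Each channel is convolved only with its own kernel.
            out += w[:, i, j][:, None, None] * xp[:, i:i + H, j:j + W]
    return out

def pointwise_conv1x1(x, w):
    """Pointwise 1x1 convolution that mixes channels.
    x: (C_in, H, W); w: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def mobileone_style_block(x, w_dw, w_pw):
    """Depthwise 3x3 followed by pointwise 1x1 (MobileNet-V1-style factorization)."""
    return pointwise_conv1x1(depthwise_conv3x3(x, w_dw), w_pw)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 16))   # 8 channels, 16x16 feature map
w_dw = rng.standard_normal((8, 3, 3))  # depthwise kernels
w_pw = rng.standard_normal((32, 8))    # pointwise weights: expand to 32 channels
y = mobileone_style_block(x, w_dw, w_pw)
print(y.shape)  # (32, 16, 16)
```

The factorization is what makes the block cheap: an 8→32-channel standard 3×3 convolution would need 8·32·9 weights, while the depthwise + pointwise pair above needs only 8·9 + 32·8.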

MobileOne uses a depth-scaling strategy similar to MobileNet-V2: shallower early stages, where the input resolution is larger and layers are therefore slower. There is no data-movement cost at inference, as the architecture has no multi-branch structure at inference time. Compared with multi-branch systems, this lets the researchers increase model parameters aggressively without heavy latency penalties.
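The reason the multi-branch structure can be dropped at inference is that parallel convolutional branches are linear and can be folded into a single kernel. The sketch below demonstrates that linearity argument on depthwise branches; it is an assumption-laden illustration, not the paper's code, and it ignores the batch-norm folding that the actual re-parameterization also performs.

```python
import numpy as np

def depthwise_conv3x3(x, w):
    """Depthwise 3x3 convolution with padding 1; x: (C, H, W), w: (C, 3, 3)."""
    C, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += w[:, i, j][:, None, None] * xp[:, i:i + H, j:j + W]
    return out

rng = np.random.default_rng(1)
C = 4
x = rng.standard_normal((C, 10, 10))

# Train-time branches: two over-parameterized 3x3 branches plus a 1x1 branch.
w_a = rng.standard_normal((C, 3, 3))
w_b = rng.standard_normal((C, 3, 3))
w_1x1 = rng.standard_normal(C)  # a 1x1 depthwise kernel is one scalar per channel

# Multi-branch output, as computed during training.
y_train = (depthwise_conv3x3(x, w_a)
           + depthwise_conv3x3(x, w_b)
           + w_1x1[:, None, None] * x)

# Re-parameterize: embed the 1x1 kernel at the 3x3 center, then sum all kernels.
w_1x1_as_3x3 = np.zeros((C, 3, 3))
w_1x1_as_3x3[:, 1, 1] = w_1x1
w_merged = w_a + w_b + w_1x1_as_3x3

# Single-branch output at inference time: identical, by linearity of convolution.
y_infer = depthwise_conv3x3(x, w_merged)
print(np.allclose(y_train, y_infer))  # True
```

At inference the model therefore pays for exactly one convolution per block, which is where the "no data-movement cost" claim comes from: there are no parallel activations to keep resident in memory and re-read for the branch sum.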

MobileOne has been tested on mobile devices against the ImageNet benchmark. On an iPhone 12, the MobileOne-S1 model achieved an inference time of less than one millisecond while reaching 75.9% top-1 accuracy. MobileOne's adaptability has also been demonstrated in other computer vision applications: the researchers successfully used it as a backbone feature extractor for a single-shot object detector and in a DeepLabV3 segmentation network.

The research team also examined the relationship between popular metrics (FLOPs and parameter count) and latency on a mobile device, along with how various architectural design decisions affect on-phone latency. Based on the results of this evaluation, they motivate their design and training procedure.

Overall, the study confirms that the proposed MobileOne is an efficient general-purpose backbone that produces state-of-the-art results while being several times faster on mobile devices than existing efficient designs.

This article is a summary written by Marktechpost staff based on the paper 'An Improved One Millisecond Mobile Backbone'. All credit for this research goes to the researchers on this project. Check out the paper and reference post.
