MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution (ECCV'20 Oral) [arXiv]

This work proposes a method to train a single network that is executable under dynamic resource constraints (e.g., FLOPs budgets) at runtime. The proposed mutual learning scheme for input resolution and network width significantly improves the accuracy-efficiency trade-off over Slimmable Networks on tasks such as image classification, object detection, and instance segmentation. The method also serves as a promising plug-and-play strategy for boosting a single network: it substantially outperforms the powerful AutoAugment in both efficiency (15,000 GPU search hours for AutoAugment vs. 0 for MutualNet) and accuracy (77.6% vs. 78.6% ImageNet top-1).
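In outline, one training step of the mutual-learning scheme can be sketched as below. This is an illustrative sketch, not the repo's implementation: `set_width` is a hypothetical hook for switching the active channel width of a slimmable network, and the particular width range and resolution set are assumptions.

```python
import random
import torch
import torch.nn.functional as F


def mutual_learning_step(model, images, labels, optimizer,
                         width_range=(0.25, 1.0),
                         resolutions=(224, 192, 160, 128),
                         n_random=2):
    """One mutual-learning training step (illustrative sketch).

    Assumes `model` exposes a hypothetical `set_width(w)` hook that
    activates the sub-network of the given width.
    """
    optimizer.zero_grad()

    # The full-width network sees the full resolution and provides the
    # soft targets used to distill the sub-networks.
    model.set_width(width_range[1])
    full_logits = model(images)
    F.cross_entropy(full_logits, labels).backward()
    soft_targets = full_logits.detach()

    # The smallest width plus a few randomly sampled widths, each paired
    # with a randomly sampled (downsampled) input resolution.
    widths = [width_range[0]] + [random.uniform(*width_range) for _ in range(n_random)]
    for w in widths:
        model.set_width(w)
        r = random.choice(resolutions)
        sub_images = F.interpolate(images, size=(r, r), mode='bilinear',
                                   align_corners=False)
        sub_logits = model(sub_images)
        # Distill each sub-network from the full network's predictions.
        F.kl_div(F.log_softmax(sub_logits, dim=1),
                 F.softmax(soft_targets, dim=1),
                 reduction='batchmean').backward()

    # Gradients from all width/resolution pairs are accumulated, then applied once.
    optimizer.step()
```

Accumulating gradients from several width/resolution pairs in a single step is what lets one set of weights serve the whole constraint spectrum at test time.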

Install

  • PyTorch 1.0.1, torchvision 0.2.2, NumPy, pyyaml 5.1.
  • Follow the PyTorch example to prepare the ImageNet dataset.

Run

Training

To train MobileNet v1, run the command below:

python train.py app:apps/mobilenet_v1.yml

Training hyperparameters are specified in the .yml files. width_mult_list is only used to print training logs for the corresponding network widths; during testing, you can assign any desired width between the width lower bound and upper bound. To train other models, use the corresponding .yml files.
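For orientation, a minimal configuration along these lines might look as follows. This is only a sketch: `test_only`, `pretrained`, and `width_mult_list` are keys named in this README, while the remaining keys and all values are illustrative assumptions — consult the shipped apps/*.yml files for the authoritative settings.

```yaml
# Illustrative sketch only -- see apps/mobilenet_v1.yml for the real keys.
test_only: False                           # set True to run evaluation instead of training
pretrained: ''                             # /PATH/TO/YOUR/WEIGHTS when testing
width_mult_list: [0.25, 0.5, 0.75, 1.0]    # widths printed in the training log
```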

Testing

  • Modify test_only: False to test_only: True in the .yml file to enable testing.
  • Modify pretrained: /PATH/TO/YOUR/WEIGHTS to point to your trained weights.
  • Modify width_mult_list to test more network widths.

python train.py app:apps/mobilenet_v1.yml

Results and model weights

Performance over the whole FLOPs spectrum

Comparison with US-Net under different backbones on ImageNet.

Model weights: [MobileNet v1], [MobileNet v2]

Results compared with US-Net

Scaling up the model compared with EfficientNet

The best model scaling on MobileNet v1 compared with EfficientNet

| Model | Best Model Scaling | FLOPs | Top-1 Acc |
| --- | --- | --- | --- |
| EfficientNet | d=1.4, w=1.2, r=1.3 | 2.3B | 75.6% |
| MutualNet (Model) | w=1.6, r=1.3 | 2.3B | 77.1% |
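Under the common back-of-the-envelope approximation that a ConvNet's FLOPs grow linearly with depth (d) and quadratically with both width (w) and input resolution (r), the relative cost of each scaling can be estimated as below. Exact FLOPs counts depend on the architecture, so treat this only as a rough check, not as the source of the table's numbers.

```python
def flops_multiplier(d=1.0, w=1.0, r=1.0):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return d * w**2 * r**2

# EfficientNet-style compound scaling vs. MutualNet's best width/resolution pair.
efficientnet = flops_multiplier(d=1.4, w=1.2, r=1.3)  # roughly 3.4x the base model
mutualnet = flops_multiplier(w=1.6, r=1.3)            # roughly 4.3x the base model
```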

Boosting performance of a single network

Top-1 accuracy on CIFAR-10 and CIFAR-100

| WideResNet-28-10 | GPU search hours | CIFAR-10 | CIFAR-100 |
| --- | --- | --- | --- |
| Baseline | 0 | 96.1% | 81.2% |
| Cutout | 0 | 96.9% | 81.6% |
| Mixup | 0 | 97.3% | 82.5% |
| AutoAugment | 5000 | 97.4% | 82.9% |
| Fast AutoAugment | 3.5 | 97.3% | 82.7% |
| MutualNet | 0 | 97.3% | 83.8% |

Comparison with state-of-the-art performance-boosting methods on ImageNet

| ResNet-50 | Additional Cost | Top-1 Acc |
| --- | --- | --- |
| Baseline | \ | 76.5% |
| Cutout | \ | 77.1% |
| Mixup | \ | 77.9% |
| CutMix | \ | 78.6% |
| KD | Teacher Network | 76.5% |
| SENet | SE Block | 77.6% |
| AutoAugment | 15,000 GPU search hours | 77.6% |
| Fast AutoAugment | 450 GPU search hours | 77.6% |
| MutualNet (Model) | \ | 78.6% |

Reference

- The code is based on the implementation of Slimmable Networks.
