MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution (ECCV'20 Oral) [arXiv]

This work proposes a method to train a single network that is executable under dynamic resource constraints (e.g., FLOPs budgets) at runtime. The proposed mutual learning scheme for input resolution and network width significantly improves the accuracy-efficiency trade-off over Slimmable Networks on tasks such as image classification, object detection, and instance segmentation. The method also serves as a promising plug-and-play strategy for boosting a single network: it substantially outperforms the powerful AutoAugment in both efficiency (15,000 GPU search hours for AutoAugment vs. 0 for MutualNet) and accuracy (77.6% vs. 78.6% ImageNet top-1).
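In outline, one training step of the mutual-learning scheme can be sketched as below. This is an illustrative sketch, not the repo's implementation: `set_width` is a hypothetical hook for switching the active channel width of a slimmable network, and the particular width range and resolution set are assumptions.

```python
import random
import torch
import torch.nn.functional as F


def mutual_learning_step(model, images, labels, optimizer,
                         width_range=(0.25, 1.0),
                         resolutions=(224, 192, 160, 128),
                         n_random=2):
    """One mutual-learning training step (illustrative sketch).

    Assumes `model` exposes a hypothetical `set_width(w)` hook that
    activates the sub-network of the given width.
    """
    optimizer.zero_grad()

    # The full-width network sees the full resolution and provides the
    # soft targets used to distill the sub-networks.
    model.set_width(width_range[1])
    full_logits = model(images)
    F.cross_entropy(full_logits, labels).backward()
    soft_targets = full_logits.detach()

    # The smallest width plus a few randomly sampled widths, each paired
    # with a randomly sampled (downsampled) input resolution.
    widths = [width_range[0]] + [random.uniform(*width_range) for _ in range(n_random)]
    for w in widths:
        model.set_width(w)
        r = random.choice(resolutions)
        sub_images = F.interpolate(images, size=(r, r), mode='bilinear',
                                   align_corners=False)
        sub_logits = model(sub_images)
        # Distill each sub-network from the full network's predictions.
        F.kl_div(F.log_softmax(sub_logits, dim=1),
                 F.softmax(soft_targets, dim=1),
                 reduction='batchmean').backward()

    # Gradients from all width/resolution pairs are accumulated, then applied once.
    optimizer.step()
```

Accumulating gradients from several width/resolution pairs in a single step is what lets one set of weights serve the whole constraint spectrum at test time.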

Install

  • PyTorch 1.0.1, torchvision 0.2.2, NumPy, pyyaml 5.1.
  • Follow the PyTorch example to prepare the ImageNet dataset.

Run

Training

To train MobileNet v1, run the command below:

python train.py app:apps/mobilenet_v1.yml

Training hyperparameters are specified in the .yml files. width_mult_list is only used to print training logs for the corresponding network widths; during testing, you can assign any desired width between the width lower bound and upper bound. To train other models, use the corresponding .yml files.
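For orientation, a minimal configuration along these lines might look as follows. This is only a sketch: `test_only`, `pretrained`, and `width_mult_list` are keys named in this README, while the remaining keys and all values are illustrative assumptions — consult the shipped apps/*.yml files for the authoritative settings.

```yaml
# Illustrative sketch only -- see apps/mobilenet_v1.yml for the real keys.
test_only: False                           # set True to run evaluation instead of training
pretrained: ''                             # /PATH/TO/YOUR/WEIGHTS when testing
width_mult_list: [0.25, 0.5, 0.75, 1.0]    # widths printed in the training log
```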

Testing

  • Modify test_only: False to test_only: True in the .yml file to enable testing.
  • Modify pretrained: /PATH/TO/YOUR/WEIGHTS to point to your trained weights.
  • Modify width_mult_list to test more network widths.

python train.py app:apps/mobilenet_v1.yml

Results and model weights

Performance over the whole FLOPs spectrum

Comparison with US-Net under different backbones on ImageNet.

Model weights: [MobileNet v1], [MobileNet v2]

Results compared with US-Net

Scaling up the model compared with EfficientNet

The best model scaling on MobileNet v1 compared with EfficientNet

| Model | Best Model Scaling | FLOPs | Top-1 Acc |
| --- | --- | --- | --- |
| EfficientNet | d=1.4, w=1.2, r=1.3 | 2.3B | 75.6% |
| MutualNet (Model) | w=1.6, r=1.3 | 2.3B | 77.1% |
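Under the common back-of-the-envelope approximation that a ConvNet's FLOPs grow linearly with depth (d) and quadratically with both width (w) and input resolution (r), the relative cost of each scaling can be estimated as below. Exact FLOPs counts depend on the architecture, so treat this only as a rough check, not as the source of the table's numbers.

```python
def flops_multiplier(d=1.0, w=1.0, r=1.0):
    """Approximate FLOPs growth: linear in depth, quadratic in width and resolution."""
    return d * w**2 * r**2

# EfficientNet-style compound scaling vs. MutualNet's best width/resolution pair.
efficientnet = flops_multiplier(d=1.4, w=1.2, r=1.3)  # roughly 3.4x the base model
mutualnet = flops_multiplier(w=1.6, r=1.3)            # roughly 4.3x the base model
```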

Boosting performance of a single network

Top-1 accuracy on CIFAR-10 and CIFAR-100

| WideResNet-28-10 | GPU search hours | CIFAR-10 | CIFAR-100 |
| --- | --- | --- | --- |
| Baseline | 0 | 96.1% | 81.2% |
| Cutout | 0 | 96.9% | 81.6% |
| Mixup | 0 | 97.3% | 82.5% |
| AutoAugment | 5000 | 97.4% | 82.9% |
| Fast AutoAugment | 3.5 | 97.3% | 82.7% |
| MutualNet | 0 | 97.3% | 83.8% |

Comparison with state-of-the-art performance-boosting methods on ImageNet

| ResNet-50 | Additional Cost | Top-1 Acc |
| --- | --- | --- |
| Baseline | \ | 76.5% |
| Cutout | \ | 77.1% |
| Mixup | \ | 77.9% |
| CutMix | \ | 78.6% |
| KD | Teacher Network | 76.5% |
| SENet | SE Block | 77.6% |
| AutoAugment | 15,000 GPU search hours | 77.6% |
| Fast AutoAugment | 450 GPU search hours | 77.6% |
| MutualNet (Model) | \ | 78.6% |

Reference

- The code is based on the implementation of Slimmable Networks.
