NeurIPS’20: Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

Overview: This paper proposes a method to avoid bias in neural architecture search (NAS).

Problem: jointly learning architecture representations (training) and optimizing the architecture search (search) could introduce bias.

Solution: Decouple training and search, realized by unsupervised learning:

  • First, the Training: use unsupervised training to learn a set of architecture representations in latent space;
    • Unsupervised training captures structural information of architectures;
    • These architectures cluster better and distribute more smoothly in the latent space, which facilitates the downstream architecture search (the next step).
    • These architectures are visualized in the latent space (Figure 4), which shows this method is better than traditional ones in the following ways:
    • It better preserves the structural similarity of local neighborhoods;
    • It captures topology and operation similarity, which helps cluster architectures with similar accuracy.
  • Second, the Search: two strategies are used (see the sketch after this list).
    • Reinforcement learning (RL);
    • Bayesian optimization (BO).
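
The decoupling can be pictured as a two-stage pipeline: an encoder is pre-trained without any accuracy labels, its latent space is then frozen, and the RL/BO search only queries accuracies of points in that fixed space. Below is a minimal sketch of that flow, not the paper's code: the random projection standing in for the pre-trained encoder, the synthetic `query_accuracy` objective, and the linear surrogate standing in for BO are all my own placeholders.

```python
# Minimal sketch of the decoupled pipeline; all names here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
N_ARCHS, ENC_DIM, LATENT_DIM = 500, 56, 16

# A pool of architectures, each flattened to (adjacency + one-hot operations).
archs = rng.integers(0, 2, size=(N_ARCHS, ENC_DIM)).astype(float)

# --- Stage 1: "Training" (unsupervised) -------------------------------------
# The paper pre-trains a variational graph isomorphism autoencoder with a
# reconstruction loss; as a stand-in, we use a fixed random projection so the
# embedding is produced WITHOUT any accuracy labels.
W = rng.normal(size=(ENC_DIM, LATENT_DIM))

def toy_encode(x: np.ndarray) -> np.ndarray:
    """Placeholder for the pre-trained encoder: architecture -> latent vector."""
    return np.tanh(x @ W)

embeddings = toy_encode(archs)          # the latent space is now FIXED

# --- Stage 2: "Search" (RL or BO over the fixed latent space) ---------------
def query_accuracy(idx: int) -> float:
    """Placeholder for a NAS-Bench-style accuracy lookup (expensive in reality)."""
    return float(np.sin(embeddings[idx]).sum())   # synthetic objective

# Trivial stand-in for BO: fit a linear surrogate on the observations so far
# and evaluate the most promising unseen embedding next.
observed = list(rng.choice(N_ARCHS, size=10, replace=False))
for _ in range(30):
    X = embeddings[observed]
    y = np.array([query_accuracy(i) for i in observed])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # surrogate model
    pred = embeddings @ coef
    pred[observed] = -np.inf                       # do not re-query
    observed.append(int(pred.argmax()))

best = max(observed, key=query_accuracy)
print("best architecture index:", best, "accuracy:", query_accuracy(best))
```

The point of the sketch is only the separation: stage 1 never sees accuracies, and stage 2 never updates the encoder.
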
Key techniques
  • Variational Graph Isomorphism Autoencoder (for Unsupervised Learning)
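
A rough sketch of what a variational graph isomorphism autoencoder looks like, assembled from the generic GIN and VGAE recipes; the layer sizes, the two-layer depth, and the operation-reconstruction head are my assumptions rather than the paper's exact architecture.

```python
# Hedged sketch: GIN-style message passing + VGAE-style variational head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GINLayer(nn.Module):
    """GIN-style update: h' = MLP((1 + eps) * h + sum of neighbor features)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, adj, h):
        # adj: (N, N) adjacency; h: (N, in_dim) node features (operation one-hots)
        return self.mlp((1 + self.eps) * h + adj @ h)


class VariationalGraphIsoAE(nn.Module):
    def __init__(self, num_ops, hidden=32, latent=16):
        super().__init__()
        self.gin1 = GINLayer(num_ops, hidden)
        self.gin2 = GINLayer(hidden, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.op_head = nn.Linear(latent, num_ops)   # reconstruct node operations

    def encode(self, adj, ops):
        h = F.relu(self.gin1(adj, ops))
        h = self.gin2(adj, h)
        return self.mu(h), self.logvar(h)

    def forward(self, adj, ops):
        mu, logvar = self.encode(adj, ops)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        adj_rec = torch.sigmoid(z @ z.t())         # inner-product edge decoder
        ops_rec = self.op_head(z)                  # per-node operation logits
        return adj_rec, ops_rec, mu, logvar


# Toy usage on a single 7-node cell (NAS-Bench-101-sized), 5 operation types.
N, NUM_OPS = 7, 5
adj = torch.bernoulli(torch.full((N, N), 0.3))
ops = F.one_hot(torch.randint(NUM_OPS, (N,)), NUM_OPS).float()

model = VariationalGraphIsoAE(NUM_OPS)
adj_rec, ops_rec, mu, logvar = model(adj, ops)

# Unsupervised objective: reconstruction + KL (no accuracy labels involved).
loss = (F.binary_cross_entropy(adj_rec, adj)
        + F.cross_entropy(ops_rec, ops.argmax(dim=1))
        - 0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()))
print("toy loss:", float(loss))
```
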
Evaluation
  • Search spaces follow the NAS best practices checklist and come from three works:

    • NAS-Bench-101
    • NAS-Bench-201
    • DARTS
  • Evaluated two aspects:

    • Performance of Training (titled "pre-training performance" in the paper), compared with:
      • GAE and VGAE, on a) reconstruction Accuracy, b) Validity, c) Uniqueness (Table 1; see the metric sketch after this list);
      • Supervised architecture representation learning, on:
        • predictive performance (Figure 2)
        • distribution of L2 distance (Figure 3)
        • latent space 2D visualization (Figure 4)
        • architecture cells decoded from the latent space (Figure 5)
    • Performance of Search, compared with:
      • Adjacency matrix-based encoding:
        • Random Search (RS)
        • Regularized Evolution (RE)
        • REINFORCE
        • BOHB
      • Cell-based NAS methods:
        • Random Search (RS)
        • ENAS
        • ASHA
        • RS WS
        • SNAS
        • DARTS
        • BANANAS
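
For reference, one common way to compute the three pre-training metrics compared in Table 1 (reconstruction accuracy, validity, uniqueness). The helper callables below are placeholders, and the paper's exact definitions may differ in detail.

```python
# Hedged sketch of the three pre-training metrics; encode/decode and validity
# checks are passed in as placeholders.
from typing import Callable, List, Tuple

Graph = Tuple[tuple, tuple]   # (flattened adjacency, operation labels), both hashable


def reconstruction_accuracy(graphs: List[Graph],
                            encode_decode: Callable[[Graph], Graph]) -> float:
    """Fraction of architectures that round-trip encoder -> decoder exactly."""
    hits = sum(1 for g in graphs if encode_decode(g) == g)
    return hits / len(graphs)


def validity(samples: List[Graph], is_valid: Callable[[Graph], bool]) -> float:
    """Fraction of graphs decoded from prior samples that are valid cells
    (e.g. acyclic and within the search space's node/edge budget)."""
    return sum(1 for g in samples if is_valid(g)) / len(samples)


def uniqueness(samples: List[Graph], is_valid: Callable[[Graph], bool]) -> float:
    """Fraction of the *valid* decoded graphs that are distinct."""
    valid = [g for g in samples if is_valid(g)]
    return len(set(valid)) / len(valid) if valid else 0.0
```
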
Can we do it?

Lele: No. For me, this contains only pure ML concepts, most of which I do not understand, including both the novel method (the new autoencoder) and the steps of the evaluation strategy.

Questions or New ideas?

Lele: It seems like another way to decouple the two NAS stages: search-space training and searching. In this respect, it is similar to what we read in Song Han's Once-for-All paper.

  • Once-for-All designs a special super-network that, once trained, serves as the only "search space" for the downstream search stage; during search, a sub-network is simply "chosen" from it with simple rules.
  • Here, the "search space" is obtained via unsupervised learning.

But the goal of Once-for-All is different from this paper's. Once-for-All uses NAS to search for networks that are "small" enough in size, whereas this paper aims to improve NAS in a general way, without considering the actual application scenario of the network.
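
To make that contrast concrete, here is a toy illustration of the Once-for-All-style selection step mentioned above: pick a sub-network of the trained super-network under an explicit size budget. The config space and cost proxy below are invented for illustration and are not OFA's actual API.

```python
# Hypothetical sub-network selection under a size budget (illustration only).
import itertools

DEPTHS, WIDTHS, KERNELS = (2, 3, 4), (0.75, 1.0), (3, 5)

def param_cost(depth: int, width: float, kernel: int) -> float:
    """Hypothetical parameter-count proxy for one sub-network choice."""
    return depth * width * kernel ** 2

def pick_subnetwork(budget: float):
    """Largest sub-network (by our proxy) that still fits the device budget."""
    candidates = [(param_cost(d, w, k), (d, w, k))
                  for d, w, k in itertools.product(DEPTHS, WIDTHS, KERNELS)]
    feasible = [c for c in candidates if c[0] <= budget]
    return max(feasible)[1] if feasible else None

print(pick_subnetwork(budget=40.0))   # e.g. a config for a small device
```
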

For potential new ideas, we see two directions here:

  • One is to improve NAS itself by exploring statistical features of the "latent space", as this paper does;
  • Another is to customize NAS to better fit certain usage scenarios: for example, the Once-for-All paper targets using NAS to quickly find the best network under size constraints (for smaller devices).

Created Nov 19, 2020 // Last Updated Aug 31, 2021
