NeurIPS’20: Does Unsupervised Architecture Representation Learning Help Neural Architecture Search?

Overview: This paper proposes a method to avoid bias in neural architecture search (NAS).

Problem: jointly learning architecture representations (training) and optimizing the architecture search (search) could introduce bias.

Solution: Decouple training and search, realized by unsupervised learning:

  • First, the Training: use unsupervised training to learn a set of architecture representations in latent space;
    • Unsupervised training captures structural information of architectures;
    • These architectures cluster better and distribute more smoothly in the latent space, which facilitates the downstream architecture search (the next step).
    • These architectures are visualized in the latent space (Figure 4), which shows this method is better than traditional ones in the following ways:
    • It better preserves the structural similarity of local neighborhoods;
    • It captures topology and operation similarity, which helps cluster architectures with similar accuracy.
  • Second, the Search: two strategies are used (see the sketch after this list).
    • Reinforcement learning (RL);
    • Bayesian optimization (BO).
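
The decoupling can be pictured as a two-stage pipeline: an encoder is pre-trained without any accuracy labels, its latent space is then frozen, and the RL/BO search only queries accuracies of points in that fixed space. Below is a minimal sketch of that flow, not the paper's code: the random projection standing in for the pre-trained encoder, the synthetic `query_accuracy` objective, and the linear surrogate standing in for BO are all my own placeholders.

```python
# Minimal sketch of the decoupled pipeline; all names here are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
N_ARCHS, ENC_DIM, LATENT_DIM = 500, 56, 16

# A pool of architectures, each flattened to (adjacency + one-hot operations).
archs = rng.integers(0, 2, size=(N_ARCHS, ENC_DIM)).astype(float)

# --- Stage 1: "Training" (unsupervised) -------------------------------------
# The paper pre-trains a variational graph isomorphism autoencoder with a
# reconstruction loss; as a stand-in, we use a fixed random projection so the
# embedding is produced WITHOUT any accuracy labels.
W = rng.normal(size=(ENC_DIM, LATENT_DIM))

def toy_encode(x: np.ndarray) -> np.ndarray:
    """Placeholder for the pre-trained encoder: architecture -> latent vector."""
    return np.tanh(x @ W)

embeddings = toy_encode(archs)          # the latent space is now FIXED

# --- Stage 2: "Search" (RL or BO over the fixed latent space) ---------------
def query_accuracy(idx: int) -> float:
    """Placeholder for a NAS-Bench-style accuracy lookup (expensive in reality)."""
    return float(np.sin(embeddings[idx]).sum())   # synthetic objective

# Trivial stand-in for BO: fit a linear surrogate on the observations so far
# and evaluate the most promising unseen embedding next.
observed = list(rng.choice(N_ARCHS, size=10, replace=False))
for _ in range(30):
    X = embeddings[observed]
    y = np.array([query_accuracy(i) for i in observed])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # surrogate model
    pred = embeddings @ coef
    pred[observed] = -np.inf                       # do not re-query
    observed.append(int(pred.argmax()))

best = max(observed, key=query_accuracy)
print("best architecture index:", best, "accuracy:", query_accuracy(best))
```

The point of the sketch is only the separation: stage 1 never sees accuracies, and stage 2 never updates the encoder.
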
Key techniques
  • Variational Graph Isomorphism Autoencoder (for Unsupervised Learning)
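
A rough sketch of what a variational graph isomorphism autoencoder looks like, assembled from the generic GIN and VGAE recipes; the layer sizes, the two-layer depth, and the operation-reconstruction head are my assumptions rather than the paper's exact architecture.

```python
# Hedged sketch: GIN-style message passing + VGAE-style variational head.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GINLayer(nn.Module):
    """GIN-style update: h' = MLP((1 + eps) * h + sum of neighbor features)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim))

    def forward(self, adj, h):
        # adj: (N, N) adjacency; h: (N, in_dim) node features (operation one-hots)
        return self.mlp((1 + self.eps) * h + adj @ h)


class VariationalGraphIsoAE(nn.Module):
    def __init__(self, num_ops, hidden=32, latent=16):
        super().__init__()
        self.gin1 = GINLayer(num_ops, hidden)
        self.gin2 = GINLayer(hidden, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.op_head = nn.Linear(latent, num_ops)   # reconstruct node operations

    def encode(self, adj, ops):
        h = F.relu(self.gin1(adj, ops))
        h = self.gin2(adj, h)
        return self.mu(h), self.logvar(h)

    def forward(self, adj, ops):
        mu, logvar = self.encode(adj, ops)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        adj_rec = torch.sigmoid(z @ z.t())         # inner-product edge decoder
        ops_rec = self.op_head(z)                  # per-node operation logits
        return adj_rec, ops_rec, mu, logvar


# Toy usage on a single 7-node cell (NAS-Bench-101-sized), 5 operation types.
N, NUM_OPS = 7, 5
adj = torch.bernoulli(torch.full((N, N), 0.3))
ops = F.one_hot(torch.randint(NUM_OPS, (N,)), NUM_OPS).float()

model = VariationalGraphIsoAE(NUM_OPS)
adj_rec, ops_rec, mu, logvar = model(adj, ops)

# Unsupervised objective: reconstruction + KL (no accuracy labels involved).
loss = (F.binary_cross_entropy(adj_rec, adj)
        + F.cross_entropy(ops_rec, ops.argmax(dim=1))
        - 0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()))
print("toy loss:", float(loss))
```
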
Evaluation
  • Search spaces follow the NAS best practices checklist and come from three works:

    • NAS-Bench-101
    • NAS-Bench-201
    • DARTS
  • Evaluated two aspects:

    • Performance of Training (titled "pre-training performance" in the paper), compared with:
      • GAE and VGAE, on a) reconstruction Accuracy, b) Validity, c) Uniqueness (Table 1; see the metric sketch after this list);
      • Supervised architecture representation learning, on:
        • predictive performance (Figure 2)
        • distribution of L2 distance (Figure 3)
        • latent space 2D visualization (Figure 4)
        • architecture cells decoded from the latent space (Figure 5)
    • Performance of Search, compared with:
      • Adjacency matrix-based encoding:
        • Random Search (RS)
        • Regularized Evolution (RE)
        • REINFORCE
        • BOHB
      • Cell-based NAS methods:
        • Random Search (RS)
        • ENAS
        • ASHA
        • RS WS
        • SNAS
        • DARTS
        • BANANAS
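
For reference, one common way to compute the three pre-training metrics compared in Table 1 (reconstruction accuracy, validity, uniqueness). The helper callables below are placeholders, and the paper's exact definitions may differ in detail.

```python
# Hedged sketch of the three pre-training metrics; encode/decode and validity
# checks are passed in as placeholders.
from typing import Callable, List, Tuple

Graph = Tuple[tuple, tuple]   # (flattened adjacency, operation labels), both hashable


def reconstruction_accuracy(graphs: List[Graph],
                            encode_decode: Callable[[Graph], Graph]) -> float:
    """Fraction of architectures that round-trip encoder -> decoder exactly."""
    hits = sum(1 for g in graphs if encode_decode(g) == g)
    return hits / len(graphs)


def validity(samples: List[Graph], is_valid: Callable[[Graph], bool]) -> float:
    """Fraction of graphs decoded from prior samples that are valid cells
    (e.g. acyclic and within the search space's node/edge budget)."""
    return sum(1 for g in samples if is_valid(g)) / len(samples)


def uniqueness(samples: List[Graph], is_valid: Callable[[Graph], bool]) -> float:
    """Fraction of the *valid* decoded graphs that are distinct."""
    valid = [g for g in samples if is_valid(g)]
    return len(set(valid)) / len(valid) if valid else 0.0
```
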
Can we do it?

Lele: No. For me, this contains only pure ML concepts, most of which I do not understand, including both the novel method (the new autoencoder) and the steps of the evaluation strategy.

Questions or New ideas?

Lele: It seems like another way to decouple the two NAS stages: search-space training and searching. In this respect, it is similar to what we read in Song Han's Once-for-All paper.

  • Once-for-All designs a special super-network that, once trained, serves as the only "search space" for the downstream search stage; during search, a sub-network is simply "chosen" from it with simple rules.
  • Here, the "search space" is obtained via unsupervised learning.

But the goal of Once-for-All is different from this paper's. Once-for-All uses NAS to search for networks that are "small" enough in size, whereas this paper aims to improve NAS in a general way, without considering the actual application scenario of the network.
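
To make that contrast concrete, here is a toy illustration of the Once-for-All-style selection step mentioned above: pick a sub-network of the trained super-network under an explicit size budget. The config space and cost proxy below are invented for illustration and are not OFA's actual API.

```python
# Hypothetical sub-network selection under a size budget (illustration only).
import itertools

DEPTHS, WIDTHS, KERNELS = (2, 3, 4), (0.75, 1.0), (3, 5)

def param_cost(depth: int, width: float, kernel: int) -> float:
    """Hypothetical parameter-count proxy for one sub-network choice."""
    return depth * width * kernel ** 2

def pick_subnetwork(budget: float):
    """Largest sub-network (by our proxy) that still fits the device budget."""
    candidates = [(param_cost(d, w, k), (d, w, k))
                  for d, w, k in itertools.product(DEPTHS, WIDTHS, KERNELS)]
    feasible = [c for c in candidates if c[0] <= budget]
    return max(feasible)[1] if feasible else None

print(pick_subnetwork(budget=40.0))   # e.g. a config for a small device
```
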

For potential new ideas, we see two directions here:

  • One is to improve NAS itself by exploring statistical features of the "latent space", as this paper does;
  • Another is to customize NAS to better fit certain usage scenarios: for example, the Once-for-All paper targets using NAS to quickly find the best network under size constraints (for smaller devices).

Created Nov 19, 2020 // Last Updated Aug 31, 2021
