Data Science Asked by Alla on February 24, 2021
Convolutional neural networks are used in supervised learning, meaning models are usually "set in stone" after training (architecture and parameters), so this might not even be possible. But is there any research on playing around with the data paths, model size (number of layers), and architecture at runtime, i.e. after training is done? For instance, creating a model that can be modified online to use fewer or more resources, skip layers, or use only portions of the model.
There has been some recent work on flexible frameworks for training and designing networks, but that is always "offline". There is also an interesting paper on training two models, a "big" one and a "little" one, for the same application and using an accuracy/power trade-off policy to decide which of the two to deploy.
It's a great question and one I have thought about a lot too.
Are you talking about a network structure that evolves/mutates during training, or one that simply chooses whether or not to use all available resources? While I would never give a straight-up "no" to such a question (especially the former), I can't say I have really seen it in publications. The closest thing that comes to mind is a recent paper from DeepMind, which utilises a combination of "model-free and model-based aspects"; however, this is more high-level than network architectures specifically, as you refer to them.
Perhaps you are considering a model that is forced to choose between one resource and another? An example of the latter would be whether or not to use a skip connection or a dense layer (assuming all shapes match). Yann LeCun says (in the video linked below) that such paths in a network not only provide more flexibility/complexity, but also ease the optimisation problem and provide a source of regularisation.
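To make that concrete, here is a rough PyTorch sketch (my own illustration, not from any paper; the class name and arguments are made up) of a block built around a dense layer that only adds an identity skip connection when the input and output widths match:

```python
import torch
import torch.nn as nn

class DenseWithOptionalSkip(nn.Module):
    """Illustrative block: a dense layer whose output is combined with an
    identity skip connection when the shapes allow it, giving the optimiser
    an easier path through the network."""
    def __init__(self, in_dim, out_dim, use_skip=True):
        super().__init__()
        self.dense = nn.Linear(in_dim, out_dim)
        self.act = nn.ReLU()
        # A plain identity skip only makes sense when input/output widths match.
        self.use_skip = use_skip and (in_dim == out_dim)

    def forward(self, x):
        out = self.act(self.dense(x))
        return out + x if self.use_skip else out
```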
In any case, I am sure people are doing this, making use of dynamic networks created with PyTorch, the control-flow operators in TensorFlow, or the new TensorFlow Eager library (which aims to replicate the dynamic networks that PyTorch produces).
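As an illustration of that kind of define-by-run behaviour, here is a small sketch in PyTorch where the number of residual blocks actually executed is chosen at inference time; `ResourceAwareNet` and the `budget` argument are hypothetical names I made up for the example, not an established API:

```python
import torch
import torch.nn as nn

class ResourceAwareNet(nn.Module):
    """Sketch of a network whose effective depth can be reduced at inference
    time by skipping blocks, relying on PyTorch's dynamic graphs."""
    def __init__(self, dim=64, n_blocks=6, n_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_blocks)
        )
        self.head = nn.Linear(dim, n_classes)

    def forward(self, x, budget=1.0):
        # `budget` in (0, 1] controls what fraction of the blocks is executed;
        # residual connections keep the output well-defined when blocks are skipped.
        n_used = max(1, int(round(budget * len(self.blocks))))
        for block in self.blocks[:n_used]:
            x = x + block(x)
        return self.head(x)

net = ResourceAwareNet()
x = torch.randn(8, 64)
full = net(x, budget=1.0)   # use all blocks
cheap = net(x, budget=0.5)  # skip half the blocks at runtime
```

Whether the accuracy degrades gracefully as you lower the budget would of course depend on how the model was trained (e.g. randomly dropping blocks during training).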
One could also argue that giving a model a monolithic network affords it the flexibility to use whichever resources (i.e. weights) it chooses; it doesn't have to use them all. In the same way, batch normalisation layers allow the model to parametrically adjust the extent to which features are used.
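A quick sketch of that last point (again my own illustration): the learnable per-channel scale in batch normalisation can shrink some channels toward zero, so the model effectively decides how much of each feature to use.

```python
import torch
import torch.nn as nn

# BatchNorm's learnable per-channel scale (gamma) acts like a soft gate.
bn = nn.BatchNorm1d(4)
with torch.no_grad():
    bn.weight.copy_(torch.tensor([1.0, 0.5, 0.0, 2.0]))  # gamma per channel

x = torch.randn(8, 4)
y = bn(x)  # channel 2 is scaled to ~0, i.e. effectively unused
```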
There are great discussions on what is required in the long run, heading toward Artificial General Intelligence (see a recent overview in the context of RL: part 1 & part 2): whether we should be training models end-to-end, how much information we should give an algorithm learning to play a game, and so on.
The tools that would allow people to do this, as you mention, are coming along, but I haven't yet seen a paper that explicitly outlines how it is done. It would be great if someone else gives you a more satisfying answer! ;-)
Answered by n1k31t4 on February 24, 2021