Team Members
Megha Chandra Nandyala, Amisha Himanshu Somaiya, Shao-Jung Kan, Wenzheng Zhao
Walkthrough
For a comprehensive understanding of the information presented here, refer to our detailed written file. For detailed design rationale, read our README.
Motivation
Setting and tuning these hyperparameters is extremely difficult: there are a large number of hyperparameters, and each can take on multiple values. One hyperparameter can also affect another, so understanding these complex dependencies requires extensive experimentation, which is time-consuming and expensive. Worst of all, there is no universal solution; a set of hyperparameters that works for one problem or model won't work for another. So we provide a comprehensive approach to this problem. To learn more about hyperparameters, refer to this link.
Tuning Strategy
Starting with a model configuration
Model architectures typically have various hyperparameters that determine the model's size and other details (e.g., number of layers, layer width, type of activation function). We suggest starting with a small standard model and gradually increasing complexity. Choosing a standard model configuration means you don't have to spend a lot of time tuning architectural hyperparameters. Also choose a starting optimizer, a number of training epochs, and performance metrics.
Throughout this site, we will use a ResNet50 model with the Adam optimizer, trained on the CIFAR-10 dataset for 50 epochs, as the base configuration unless specified otherwise.
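To make the base configuration concrete, below is a minimal PyTorch sketch of that setup. The model, optimizer, dataset, and epoch count come from the text above; everything else (batch size, learning rate, normalization constants) is an illustrative assumption, not part of the stated configuration.

```python
# Minimal sketch of the base configuration: ResNet50 + Adam on CIFAR-10, 50 epochs.
# Batch size, learning rate, and normalization values are assumptions for illustration.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# CIFAR-10 training data with a common per-channel normalization (assumed values)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True)  # batch size is an assumption

# ResNet50 with its classifier head sized for CIFAR-10's 10 classes
model = torchvision.models.resnet50(num_classes=10).to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # lr is an assumption
criterion = nn.CrossEntropyLoss()

for epoch in range(50):  # 50 epochs, per the base configuration
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

Later sections vary individual hyperparameters against this fixed baseline, which is why a single, simple starting configuration is worth pinning down first.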