摘要
In deeper layers, ResNet heavily depends on skip connections and Relu. Although skip connections have demonstrated their usefulness in networks, a major issue arises when the dimensions between layers are not consistent. In such cases, it is necessary to use techniques such as zero-padding or projection to match the dimensions between layers. These adjustments increase the complexity of the network architecture, resulting in an increase in parameter number and a rise in computational costs. Another problem is the vanishing gradient caused by utilizing Relu. In our model, after making appropriate adjustments to the inception blocks, we replace the deeper layers of ResNet with modified inception blocks and Relu with our non-monotonic activation NMAF). To reduce parameter number, we use symmetric factorization and 1x1 convolutions. Utilizing these two techniques contributed to reducing the parameter number by around 6 M parameters, which has helped reduce the run time by 30 s/epoch. Unlike Relu, NMAF addresses the deactivation problem of the non-positive number by activating the negative values and outputting small negative numbers instead of zero in Relu, which helped in enhancing the convergence speed and increasing the accuracy by 5%, 15%, and 5% for the non-noisy datasets, and 5%, 6%, 21% for non-noisy datasets.
