TanhSoft—Dynamic Trainable Activation Functions for Faster Learning and Better Performance
By: Koushik Biswas, Sandeep Kumar, Shilpak Banerjee, Ashish Kumar Pandey
Format: Article
Published: IEEE, 2021-01-01
Description
Deep learning, at its core, is built from functions that compose a linear transformation with a nonlinear function known as the activation function. In the past few years, there has been increasing interest in constructing novel activation functions that lead to better learning. In this work, we propose three novel activation functions with learnable parameters, namely TanhSoft-1, TanhSoft-2, and TanhSoft-3, which are shown to outperform several well-known activation functions. For instance, replacing ReLU with TanhSoft-1, TanhSoft-2, and TanhSoft-3 improves top-1 classification accuracy on the CIFAR-100 dataset by 6.06%, 5.75%, and 5.38% respectively on VGG-16 (with batch normalization) and by 3.02%, 3.25%, and 2.93% respectively on PreActResNet-34, and on the Tiny ImageNet dataset by 1.76%, 1.93%, and 1.82% respectively on WideResNet 28-10. TanhSoft-1, TanhSoft-2, and TanhSoft-3 also outperformed ReLU on mean average precision (mAP) by 0.7%, 0.8%, and 0.6% respectively on the object detection task with the SSD300 model on the Pascal VOC dataset.
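To make the idea of a trainable activation concrete, here is a minimal illustrative sketch in plain Python. The exact TanhSoft-1/2/3 definitions are given in the paper; the function below only demonstrates the general pattern the abstract describes, namely combining a tanh component with a softplus-type component under a learnable parameter (the name `tanhsoft_like` and the parameter `alpha` are hypothetical, not the paper's notation).

```python
import math

def softplus(x):
    # Numerically stable softplus: log(1 + e^x).
    return math.log1p(math.exp(-abs(x))) + max(x, 0.0)

def tanhsoft_like(x, alpha=1.0):
    # Illustrative trainable activation (NOT the paper's exact formula):
    # tanh(alpha * x) * softplus(x).
    # In a network, alpha would be a learnable parameter updated by
    # backpropagation alongside the weights.
    return math.tanh(alpha * x) * softplus(x)

# The shape is ReLU-like but smooth: near zero for large negative inputs,
# approximately the identity for large positive inputs.
print(tanhsoft_like(0.0))   # exactly 0 at the origin
print(tanhsoft_like(5.0))   # close to 5
print(tanhsoft_like(-5.0))  # close to 0
```

Because the whole expression is smooth in both `x` and `alpha`, gradients flow through the parameter, which is what allows such activations to adapt their shape during training.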