ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Review of paper by Kevin Clark (Stanford University), Minh-Thang Luong (Google Brain), Quoc V. Le (Google Brain), and Christopher D. Manning (Stanford University), 2020

This paper describes a new pre-training approach for Transformer text encoders. Instead of masked language modeling, a small generator network corrupts the input by replacing some tokens with plausible alternatives, and the main model is trained as a discriminator that predicts, for each token, whether it was replaced. The authors demonstrate that this technique greatly improves training efficiency and yields better performance on common benchmark datasets (GLUE, SQuAD) compared to other state-of-the-art NLP models of similar size.
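
For a concrete picture of the objective, here is a minimal PyTorch-style sketch of replaced token detection. The tiny one-layer encoders, vocabulary size, and masking rate are toy placeholders rather than the authors' configuration; only the overall structure of the joint generator/discriminator loss follows the paper.

```python
# A minimal sketch of ELECTRA-style replaced token detection (toy sizes and a
# one-layer stand-in encoder, not the authors' configuration).
import torch
import torch.nn as nn

VOCAB, DIM, SEQ, BATCH = 1000, 64, 16, 8
MASK_ID, LAMBDA = 0, 50.0            # lambda weights the discriminator loss (the paper uses 50)

class TinyEncoder(nn.Module):
    """Stand-in for a Transformer text encoder (single layer for brevity)."""
    def __init__(self, dim):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
    def forward(self, ids):
        return self.encoder(self.embed(ids))

generator = TinyEncoder(DIM)          # small masked-language-model "generator"
gen_head = nn.Linear(DIM, VOCAB)      # predicts tokens at masked positions
discriminator = TinyEncoder(DIM)      # the main model being pre-trained
disc_head = nn.Linear(DIM, 1)         # per-token original-vs-replaced logit

tokens = torch.randint(1, VOCAB, (BATCH, SEQ))
mask = torch.rand(BATCH, SEQ) < 0.15                  # mask ~15% of positions
masked = tokens.masked_fill(mask, MASK_ID)

# 1) The generator fills in masked positions by sampling from its output.
gen_logits = gen_head(generator(masked))
sampled = torch.distributions.Categorical(logits=gen_logits).sample()
corrupted = torch.where(mask, sampled, tokens)

# 2) The discriminator predicts, for every token, whether it was replaced.
is_replaced = (corrupted != tokens).float()
disc_logits = disc_head(discriminator(corrupted)).squeeze(-1)

# 3) Joint objective: generator MLM loss + weighted replaced-token-detection loss.
mlm_loss = nn.functional.cross_entropy(gen_logits[mask], tokens[mask])
rtd_loss = nn.functional.binary_cross_entropy_with_logits(disc_logits, is_replaced)
(mlm_loss + LAMBDA * rtd_loss).backward()
print(f"mlm={mlm_loss.item():.3f}  rtd={rtd_loss.item():.3f}")
```

Because the replacements are sampled, no gradient flows from the discriminator back into the generator; after pre-training, the generator is discarded and only the discriminator is fine-tuned on downstream tasks.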

CONTINUE READING >

N-BEATS: Neural Basis Expansion Analysis for Interpretable Time Series Forecasting

Review of paper by Boris N. Oreshkin (Element AI), Dmitry Carpov (Element AI), Nicolas Chapados (Element AI), and Yoshua Bengio (Mila), 2019

This paper presents a block-based deep neural architecture for univariate time series point forecasting, similar in philosophy to the very deep residual models (e.g., ResNet) used in more common deep learning applications such as image recognition. The authors also demonstrate how the blocks can be constrained to produce interpretable forecasts that decompose into trend and seasonality components.
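
To make the architecture concrete, here is a minimal PyTorch-style sketch of a generic N-BEATS block and the doubly residual stacking that chains blocks together. Layer sizes are toy values, and the learned linear bases stand in for the paper's interpretable trend and seasonality bases.

```python
# A minimal sketch of a generic N-BEATS block plus doubly residual stacking
# (toy sizes; learned linear bases instead of the interpretable trend/seasonality bases).
import torch
import torch.nn as nn

class NBeatsBlock(nn.Module):
    """One block: an FC stack predicts expansion coefficients, which two bases
    map to a backcast (fit to the input window) and a forecast."""
    def __init__(self, backcast_len, forecast_len, hidden=64, theta_dim=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(backcast_len, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.theta_b = nn.Linear(hidden, theta_dim)            # backcast coefficients
        self.theta_f = nn.Linear(hidden, theta_dim)            # forecast coefficients
        self.backcast_basis = nn.Linear(theta_dim, backcast_len, bias=False)
        self.forecast_basis = nn.Linear(theta_dim, forecast_len, bias=False)

    def forward(self, x):
        h = self.fc(x)
        return self.backcast_basis(self.theta_b(h)), self.forecast_basis(self.theta_f(h))

def nbeats_forward(blocks, x):
    """Each block subtracts its backcast from the running residual and adds its
    partial forecast to the running sum (the paper's doubly residual stacking)."""
    residual, forecast = x, 0.0
    for block in blocks:
        backcast, partial = block(residual)
        residual = residual - backcast
        forecast = forecast + partial
    return forecast

backcast_len, forecast_len = 24, 6                  # look-back window and horizon
blocks = nn.ModuleList(NBeatsBlock(backcast_len, forecast_len) for _ in range(3))
y_hat = nbeats_forward(blocks, torch.randn(8, backcast_len))
print(y_hat.shape)                                  # torch.Size([8, 6])
```

Subtracting each block's backcast from its input lets later blocks focus on the part of the signal that earlier blocks did not explain, which is what allows very deep stacks of these simple blocks to work well.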

CONTINUE READING >

Compounding the Performance Improvements of Assembled Techniques in a Convolutional Neural Network

Review of paper by Jungkyu Lee, Taeryun Won, and Kiho Hong (Clova Vision, NAVER Corp), 2019

A great review of many state-of-the-art tricks that can be used to improve the performance of a deep convolutional network (ResNet), combined with actual implementation details, source code, and performance results. A must-read for Kaggle competitors and for anyone who wants to achieve maximum performance on computer vision tasks.
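
As a flavor of the kind of tricks the paper assembles, the sketch below applies two common ones, mixup and label smoothing, to a single toy training step. The stand-in linear model and all hyperparameters here are illustrative placeholders, not the paper's ResNet setup or settings.

```python
# An illustrative sketch of two widely used training tricks, mixup and label
# smoothing, on one toy training step (placeholder model and hyperparameters).
import torch
import torch.nn as nn

num_classes, batch = 10, 16
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, num_classes))  # stand-in for a ResNet
images = torch.randn(batch, 3, 32, 32)
labels = torch.randint(0, num_classes, (batch,))

# Mixup: blend random pairs of images and their one-hot targets.
lam = torch.distributions.Beta(0.2, 0.2).sample().item()
perm = torch.randperm(batch)
mixed_images = lam * images + (1 - lam) * images[perm]
targets = nn.functional.one_hot(labels, num_classes).float()
mixed_targets = lam * targets + (1 - lam) * targets[perm]

# Label smoothing: move a small amount of probability mass off the target classes.
eps = 0.1
soft_targets = mixed_targets * (1 - eps) + eps / num_classes

# Cross-entropy against the resulting soft targets.
log_probs = nn.functional.log_softmax(model(mixed_images), dim=-1)
loss = -(soft_targets * log_probs).sum(dim=-1).mean()
loss.backward()
print(f"loss={loss.item():.3f}")
```

The point of the paper is that such tricks, applied together with architectural tweaks, compound into substantially larger gains than any single one delivers on its own.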

CONTINUE READING >
