Convolutional Neural Networks for Sentence Classification
Why this mattered
TBD
Abstract
We report on a series of experiments with convolutional neural networks (CNN) trained on top of pre-trained word vectors for sentence-level classification tasks. We show that a simple CNN with little hyperparameter tuning and static vectors achieves excellent results on multiple benchmarks. Learning task-specific vectors through fine-tuning offers further gains in performance. We additionally propose a simple modification to the architecture to allow for the use of both task-specific and static vectors. The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, which include sentiment analysis and question classification.
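The abstract describes the model only at a high level, so the following is a minimal sketch of the general idea rather than the paper's implementation. It assumes a sentence is a list of word vectors, a filter spans a window of `h` consecutive words, each window score passes through a ReLU, and the resulting feature map is reduced by max-over-time pooling. The function names (`window_score`, `feature_map`, `max_over_time`), the toy vectors, and the choice to sum filter responses across the static and fine-tuned channels before the nonlinearity are illustrative assumptions.

```python
from typing import List

Vector = List[float]
Sentence = List[Vector]  # one vector per word


def window_score(window: Sentence, filt: List[Vector]) -> float:
    """Dot product between a window of word vectors and a filter of the same shape."""
    return sum(x * w for vec, row in zip(window, filt) for x, w in zip(vec, row))


def feature_map(channels: List[Sentence], filt: List[Vector], bias: float = 0.0) -> List[float]:
    """Slide one filter over every channel and sum the responses per position.

    `channels` holds one sentence per channel (for example, a static copy and a
    fine-tuned copy of the same word vectors). Summing across channels before
    the ReLU is an assumption of this sketch.
    """
    h = len(filt)            # filter width in words
    n = len(channels[0])     # sentence length in words
    features = []
    for i in range(n - h + 1):
        score = bias + sum(window_score(ch[i:i + h], filt) for ch in channels)
        features.append(max(0.0, score))  # ReLU
    return features


def max_over_time(features: List[float]) -> float:
    """Keep only the strongest response of a filter across all positions."""
    return max(features)


# Toy example: a 4-word sentence with 2-dimensional word vectors.
static = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
tuned = [row[:] for row in static]      # fine-tuned copy, identical before training
filt = [[1.0, -1.0], [0.5, 0.5]]        # one filter spanning 2 words

single = feature_map([static], filt)        # [1.5, 0.0, 0.0]
both = feature_map([static, tuned], filt)   # [3.0, 0.0, 0.0]
print(max_over_time(single), max_over_time(both))
```

In a full classifier, many such filters of several widths would each contribute one pooled value, and the concatenated values would feed a final classification layer. The static and fine-tuned variants mentioned in the abstract differ only in whether the word vectors are updated during training.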
Related
- cite → Gradient-based learning applied to document recognition — Kim adapts the convolution-and-pooling architecture popularized by LeCun et al. for document recognition to sentence-level text classification.
- cite → Speech recognition with deep recurrent neural networks — Kim cites Graves et al. as evidence that neural sequence models had recently achieved strong results in speech recognition, motivating deep learning for NLP.
- cite → ImageNet classification with deep convolutional neural networks — Kim cites AlexNet as a landmark demonstration that deep convolutional networks trained with dropout and ReLU-style nonlinearities can achieve major classification gains.
- cite → Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank — Kim compares his CNN sentence classifier against Socher et al.'s recursive neural models on the Stanford Sentiment Treebank.
- enables ← Gradient-based learning applied to document recognition — LeCun's document-recognition CNN demonstrated convolution and pooling for pattern extraction, which sentence CNNs adapted from images to word-sequence classification.