Paper Note: Text Understanding from Scratch

Zhang, Xiang, and Yann LeCun. “Text understanding from scratch.” arXiv preprint arXiv:1502.01710 (2015).

Convolutional neural networks (CNNs) have been widely used on image problems; this paper replicates that success on text understanding. The authors convert a sentence into a character-level, image-like representation, feed it into a carefully designed CNN, and train the model to classify the sentence.

Binary Encoding

The alphabet consists of 70 characters: abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'"/\|_@#$%^&*~`+-=<>()[]{}

Every character in a sentence is encoded as a one-of-m (one-hot) vector, where m is the alphabet size (70). For a sentence of length l, the input is therefore an m x l matrix in which each column represents one character. The result looks somewhat like Braille, the tactile writing system used by blind readers.
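
The encoding can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function name `encode`, the lowercasing step, the zero-padding, and the choice to map unknown characters (including spaces) to all-zero columns are assumptions. The alphabet string as printed in this note has 69 symbols, with '-' appearing twice, so the sketch takes the number of rows from the string instead of hard-coding 70.

```python
# Alphabet as listed in this note, with quotes and accents written in ASCII.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+-=<>()[]{}"

# Map each character to a row index; the first occurrence wins for the repeated '-'.
CHAR_TO_ROW = {}
for i, ch in enumerate(ALPHABET):
    CHAR_TO_ROW.setdefault(ch, i)


def encode(text, length):
    """Return an m x length matrix (a list of rows) of 0/1 values.

    Column j is the one-hot code of the j-th character of `text`.
    Assumptions: text is lowercased, truncated or zero-padded to `length`,
    and characters outside the alphabet become all-zero columns.
    """
    m = len(ALPHABET)
    matrix = [[0] * length for _ in range(m)]
    for j, ch in enumerate(text.lower()[:length]):
        row = CHAR_TO_ROW.get(ch)
        if row is not None:
            matrix[row][j] = 1
    return matrix


# Encode a three-character string into a matrix with 5 columns.
code = encode("Hi!", 5)
```

Each of the first three columns of `code` contains a single 1, and the two padding columns are all zeros.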


CNN Model

The input is the encoded matrix described above. Two models (large and small) are proposed; both have 9 layers (6 convolutional + 3 fully connected).
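
To get a feel for how the 6 convolutional layers shrink the input along the character axis, the frame length can be traced layer by layer. This is only a rough sketch: the kernel sizes, pooling sizes, and the input length of 1014 are not stated in this note and are assumptions here, as is the helper name `frame_lengths`.

```python
# (kernel_size, pool_size) for each convolutional layer; None means no pooling.
# Assumption: these values are illustrative and are not given in this note.
CONV_LAYERS = [(7, 3), (7, 3), (3, None), (3, None), (3, None), (3, 3)]


def frame_lengths(input_length, layers=CONV_LAYERS):
    """Return the frame length after each convolutional layer."""
    lengths = []
    length = input_length
    for kernel, pool in layers:
        length = length - kernel + 1  # convolution without padding, stride 1
        if pool is not None:
            length = length // pool   # non-overlapping max pooling
        lengths.append(length)
    return lengths


# Assumed input length l = 1014 characters.
lengths = frame_lengths(1014)
print(lengths)
```

The last frame length, multiplied by the number of feature maps, gives the input size of the first fully connected layer.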


Data Augmentation using Thesaurus

The authors augment the training data by replacing words with synonyms from a thesaurus. This is very different from the usual image augmentation techniques such as flipping, cropping, and random rotation.
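
A minimal sketch of synonym replacement, assuming a tiny hand-written thesaurus and a fixed per-word replacement probability `p`. The dictionary `THESAURUS`, the function `augment`, and the sampling scheme are all hypothetical and only illustrate the idea; they are not the procedure from the paper.

```python
import random

# Toy thesaurus (hypothetical); a real one would come from an external resource.
THESAURUS = {
    "good": ["great", "fine"],
    "movie": ["film"],
    "buy": ["purchase"],
}


def augment(sentence, thesaurus=THESAURUS, p=0.5, rng=None):
    """Replace each word found in the thesaurus by a random synonym with probability p."""
    if rng is None:
        rng = random.Random()
    out = []
    for word in sentence.split():
        synonyms = thesaurus.get(word.lower())
        if synonyms and rng.random() < p:
            out.append(rng.choice(synonyms))
        else:
            out.append(word)  # keep words without synonyms, or when not sampled
    return " ".join(out)


# p=1.0 replaces every word that has a synonym; p=0.0 leaves the sentence unchanged.
always = augment("a good movie", p=1.0, rng=random.Random(0))
never = augment("a good movie", p=0.0, rng=random.Random(0))
```

Unlike flipping an image, a replacement can change the meaning of a sentence slightly, so the quality of the thesaurus matters.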


Moreover, the model is not limited to English. The authors also test it on Chinese news data and report good results.



The experiments cover five datasets:

  1. DBpedia Ontology Classification
  2. Amazon Review Sentiment Analysis
  3. Yahoo! Answers Topic Classification
  4. AG’s news corpus
  5. Chinese news
