Warning: this assignment is out of date. It may still need to be updated for this year's class. Check with your instructor before you start working on this assignment.
This assignment is due on Friday, February 14, 2020 before 01:30PM.
You can download the materials for this assignment here:

Homework 3: Natural Language Understanding

Instructions

In this homework, you will experiment with the ROCStories dataset, which consists of five-sentence stories. You will build a few classifiers for the Story Cloze task, which involves predicting which of two candidate fifth sentences best ends a story.
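
To make the task format concrete, here is a minimal sketch of how one Story Cloze example and the accuracy metric could be represented. The class name, field names, and toy stories are made up for illustration; they are not the dataset's actual column names or contents.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ClozeExample:
    """One Story Cloze item (field names are illustrative assumptions)."""
    context: List[str]       # the first four sentences of the story
    endings: Sequence[str]   # the two candidate fifth sentences
    label: int               # index (0 or 1) of the correct ending


def accuracy(predict: Callable[[ClozeExample], int],
             examples: Sequence[ClozeExample]) -> float:
    """Fraction of examples whose predicted ending index equals the label."""
    if not examples:
        raise ValueError("need at least one example")
    correct = sum(predict(ex) == ex.label for ex in examples)
    return correct / len(examples)


# Made-up toy data, only to show the shape of the inputs.
toy_examples = [
    ClozeExample(
        context=["Ann lost her keys.", "She looked everywhere.",
                 "Her neighbor knocked.", "He had found them."],
        endings=("Ann thanked him.", "Ann bought a boat."),
        label=0,
    ),
    ClozeExample(
        context=["Raj studied all week.", "The exam was hard.",
                 "He stayed calm.", "He finished early."],
        endings=("Raj forgot to attend the exam.", "Raj passed the exam."),
        label=1,
    ),
]

# A trivial baseline that always picks the first ending.
baseline_accuracy = accuracy(lambda ex: 0, toy_examples)
```

Any classifier you build can be wrapped as a `predict` function of this shape and scored the same way on each evaluation set.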

You should work through the IPython Notebook here. It walks you through building a sentiment-based system for predicting the correct ending, as well as two neural network-based approaches that learn a classifier for the task.
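
As a rough illustration of what a sentiment-based system might look like, the sketch below picks the ending whose sentiment is closest to that of the story so far. This is an assumption about one possible heuristic, not necessarily the notebook's method, and the tiny word lists stand in for a real sentiment analyzer.

```python
import re
from typing import List, Sequence

# Tiny illustrative lexicon (an assumption for this sketch; a real system
# would use a proper sentiment analyzer instead).
POSITIVE = {"happy", "glad", "great", "love", "loved", "won", "thanked", "fun"}
NEGATIVE = {"sad", "angry", "lost", "cried", "hate", "hated", "broke", "failed"}


def sentiment(text: str) -> int:
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def predict_ending(context: List[str], endings: Sequence[str]) -> int:
    """Return the index of the ending whose sentiment is closest to the context's."""
    target = sentiment(" ".join(context))
    distances = [abs(sentiment(ending) - target) for ending in endings]
    return distances.index(min(distances))  # ties go to the first ending


# Made-up toy story, only to exercise the functions.
choice = predict_ending(
    ["Sam won the race.", "He was happy."],
    ("He cried all night.", "He loved the fun day."),
)
```

Comparing against only the last context sentence, rather than the whole story, is one easy variant to try.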

You should create a writeup describing the experiments you conduct. We will be primarily grading you on this writeup, although you should also submit your code. Your writeup should include:

  • Your accuracy on the two validation sets and the 2016 test set using…
    • A sentiment-based classifier
    • A neural network trained only on the train set, as well as 2 variants.
    • A neural network trained in a supervised way on the validation set, as well as 2 variants.
  • Descriptions of your approaches for each of the above methods/variants. Try to make your descriptions detailed enough that another student could reimplement your approach from them.
  • An error analysis:
    • Find a couple of examples your sentiment classifier gets wrong and discuss them.
    • Are there any examples that all of your classifiers get wrong? What do you notice about them?
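
For the error analysis, one way to find examples that every classifier gets wrong is to intersect the sets of misclassified indices. The sketch below assumes each classifier's predictions are stored as a list of ending indices aligned with the gold labels; the classifier names and numbers are made up.

```python
from typing import Dict, List, Sequence, Set


def wrong_indices(predictions: Sequence[int], labels: Sequence[int]) -> Set[int]:
    """Indices of examples where the prediction differs from the gold label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    return {i for i, (p, y) in enumerate(zip(predictions, labels)) if p != y}


def wrong_for_all(all_predictions: Dict[str, Sequence[int]],
                  labels: Sequence[int]) -> List[int]:
    """Sorted indices of examples that every classifier misclassifies."""
    if not all_predictions:
        raise ValueError("need at least one classifier")
    error_sets = [wrong_indices(p, labels) for p in all_predictions.values()]
    return sorted(set.intersection(*error_sets))


# Made-up labels and predictions for three hypothetical classifiers.
labels = [0, 1, 1, 0, 1]
all_predictions = {
    "sentiment": [0, 0, 1, 1, 1],
    "nn_train":  [1, 0, 1, 1, 1],
    "nn_val":    [0, 0, 1, 1, 0],
}
shared_errors = wrong_for_all(all_predictions, labels)
```

The resulting indices can then be used to print the corresponding stories and endings for manual inspection.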

What parameters can you vary?

Here are some ideas:

  • The learning rate and optimizer
  • The number of training steps
  • The number of layers and types of activation functions in the model architecture
  • The batch size
  • For the supervised model that trains on the validation set, can you also incorporate the unlabeled training data?
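
To keep track of the variants you try, it can help to enumerate configurations explicitly. The sketch below builds every combination from a small search space; the parameter names and values are placeholders chosen for illustration, not recommended settings.

```python
import itertools
from typing import Any, Dict, Iterator, List

# Hypothetical search space; names and values are placeholders only.
search_space: Dict[str, List[Any]] = {
    "learning_rate": [1e-3, 1e-4],
    "optimizer": ["adam", "sgd"],
    "num_layers": [1, 2],
    "batch_size": [32],
}


def iter_configs(space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one dict per combination of values in the search space."""
    names = list(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


configs = list(iter_configs(search_space))
```

Each configuration can be passed to your training code and logged next to the resulting accuracies, which makes the variants easier to describe in the writeup.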

What to submit

  • A file containing your writeup, as either report.pdf or report.md.
  • Your modified IPython notebook as rocstories.ipynb.