OpenAI Developer Playground

This is the in-class activity for Thursday February 1.

The materials that you will need for this in-class activity are:

OpenAI Playground

In Class Activity: Using GPT to Write Descriptions for Text Adventure Games

Today in class you will get an introduction to the OpenAI Developer’s Playground. You can work on this activity in groups of 2-5 people.

Developer Playground

We will start by looking into different ways of prompting in order to write evocative descriptions for the location and items in your text adventure games. To do so we’ll start with using the Chat model in the Playground. Here you’ll see several things:

System: This contains instructions on how you’d like GPT-4 to behave. By default, its instructions are “You are a helpful assistant”.
User: This is a field where you can type in input to the system, similar to how you would on the ChatGPT website.
Submit: This button sends your input to the system and causes it to generate a response.
Model: This drop down menu lets you pick which chat model you’d like to use. By default you’ll see gpt-4 and gpt-3.5-turbo.
Model Settings: You’l see a variety of model settings that you can adjust, including Temperature, Maximum length, Stop sequences, Top P, Frequency penalty and Presence penalty. You can mouse over them to see what they each mean. For now, we’ll keep the defaults for all of them except for increasing the maximum length (which you can set to a higher value like 1000).

Current LLMs like ChatGPT have been fine-tuned to follow instructions. This means that we can give them instructions in English, and they do a fairly good job of interpreting and following our instructions.

These instructions can go in the System field. You can read through examples prompts from OpenAI to see how they prompt the system to do a wide range of tasks including:

Grammar correction where they use the system prompt You will be provided with statements, and your task is to convert them to standard English.
Tweet classifier where they use the system prompt You will be provided with a tweet, and your task is to classify its sentiment as positive, neutral, or negative.
Lesson plan writer where they use the system prompt Generate a lesson plan for a specific topic. and they give the sample user input Write a lesson plan for an introductory algebra class. The lesson plan should cover the distributive law, in particular how it works in simple cases involving mixes of positive and negative numbers. Come up with some examples that show common student errors. (I didn’t use this to write today’s lesson)

Convert natural language into SQL queries where they use the system prompt Given the following SQL tables, your job is to write queries given a user’s request.

  CREATE TABLE Orders (
    OrderID int,
    CustomerID int,
    OrderDate datetime,
    OrderTime varchar(8),
    PRIMARY KEY (OrderID)
  );
    
  CREATE TABLE OrderDetails (
    OrderDetailID int,
    OrderID int,
    ProductID int,
    Quantity int,
    PRIMARY KEY (OrderDetailID)
  );
    
  CREATE TABLE Products (
    ProductID int,
    ProductName varchar(50),
    Category varchar(50),
    UnitPrice decimal(10, 2),
    Stock int,
    PRIMARY KEY (ProductID)
  );
    
  CREATE TABLE Customers (
    CustomerID int,
    FirstName varchar(50),
    LastName varchar(50),
    Email varchar(100),
    Phone varchar(20),
    PRIMARY KEY (CustomerID)
  );

The prompts usually tell GPT what it should do, by refering to the system as “you”. Prompts can get quite complicated. Some of my favorite complex prompts were developed by Lilach Mollick and Etan Mollick from the Wharton School in thier YouTube series “Practical AI for Instructors and Students”. Here are two of their prompts:

Prompt for teachers to use:

You are an experienced teacher and can generate clear, accurate examples for students of concepts. I want you to ask me two questions. What concept do I want explained. Wait for me to answer before asking me the second question. Who is the audience for the explanation? Then look up the concept and examples of the concept. Provide a clear multiple-paragraph explanation of the concept using 2 specific examples and give me 5 analogies I can use to understand the concept in different ways.

Prompt for students to use:

You are an upbeat, encouraging tutor who helps students understand concepts by explaining ideas and asking students questions. Start by introducing yourself to the student as their AI-Tutor who is happy to help them with any questions. Only ask one question at a time. First, ask them what they would like to learn about. Wait for the response. Then ask them about their learning level: Are you a high school student, a college student or a professional? Wait for their response. Then ask them what they know already about the topic they have chosen. Wait for a response. Given this information, help students understand the topic by providing explanations, examples, analogies. These should be tailored to students learning level and prior knowledge or what they already know about the topic. Give students explanations, examples, and analogies about the concept to help them understand. You should guide students in an open-ended way. Do not provide immediate answers or solutions to problems but help students generate their own answers by asking leading questions. Ask students to explain their thinking. If the student is struggling or gets the answer wrong, try asking them to do part of the task or remind the student of their goal and give them a hint. If students improve, then praise them and show excitement. If the student struggles, then be encouraging and give them some ideas to think about. When pushing students for information, try to end your responses with a question so that students have to keep generating ideas. Once a student shows an appropriate level of understanding given their learning level, ask them to explain the concept in their own words; this is the best way to show you know something, or ask them for examples. When a student demonstrates that they know the concept you can move the conversation to a close and tell them you’re here to help if they have further questions.

Clearly a lot of thought went into the prompt design! The process of iterating through different prompts to find one that causes the LLM to perform well on your task is sometimes called “prompt engineering”.

Part 1: Narration Prompts

Let’s try doing some prompt engineering to create a good prompt for the task of having GPT narrate our text adventure games. To get you started, here’s an example that I used:

I started with that as my base prompt, and then tried several different variants “on in the style of”. I tried:

What to do

Using the OpenAI playground, create 6-12 different system prompts trying your prompt for several turns of the game. You can do this in collaboration with your classmates. We will ask you to upload your system prompts to Gradescope, and to save and upload a link to your playground. For each of your prompts, click the “Save” button on the OpenAI playground, and turn on the toggle button so that “Anyone with the link can view”. You can then get a sharable link using the “Share” button. You’ll submit these links to gradescope today.

Try to create interesting variation in the narration. You can do this by changing the system prompt anyway that you like. You can change the identity that I gave it “You are the narrator for a text adventure game.” or the writing instructions “You create short, evocative descriptions of the scenes in the game.” or the style “ Use the style of a fairy tale.” Feel free to vary it however you would like. If you wanted to create a prompt for a different game with a different setting – let’s say sci-fi versus medieval – what would you do?

You should test your prompts on several turns in a text adventure game. You can use Action Castle for the game, or you can use your own game. Your playground should have a User input (the user’s command like “get pole”), followed by an Assistant turn where you paste in the basic game output for the user turn, and another Assistant turn which you should generate by clicking on the “Submit button”

Here are the first few turns of Action Castle for you to copy and paste into the Playground (feel free to use your own game instead). For the next homework assignment, you’ll use the API, rather than copying and pasting into the playground. Using the OpenAI playground is a good way to do your prompt engineering. Once you find a good system prompt, then you can implement it in code via the API.

Hints

Check out some of the links at the bottom of the page which give guidance to dungeon masters on how to write evocative text for games.

Here’s a cool experiment from a text generation system called sudowrite that shows off some of the sensory aspects of things that you could encourage GPT to descirbe.

What to submit

You should upload 10 of your prompts to the gradescope assignment “Use GPT to Write Descriptions for Text Adventure Games”, plus publicly shared links to your developer playgrounds for each of your prompts that demonstrate its outputs on several turns of the games. You should also pick your favorite prompt, and explain why you liked it best.

Part 2: Analyzing Elaborations

LLMs have the tendency to “hallucinate” meaning that they generate things that aren’t factually accurate. In the case of creative writing and fiction this can be an asset rather a liability.

Pick 5 of your narrator’s outputs for different game locations, and say what things it added that weren’t present in the input.

For example, for the game input:

You are standing in a small cottage.
Exits:
Out to Garden Path
You see:
 * a fishing pole
 * a poisonous potion

One of my system prompts generated:

You find yourself in a quaint, rustic cottage. The room is filled with the scent of pine and old books. A crackling fire in the stone hearth casts dancing shadows on the wooden walls. A worn, but comfortable-looking armchair sits invitingly near the fire. On a small table next to the chair, there’s a fishing pole. A single window reveals a lush garden path outside. The only exit appears to be a sturdy wooden door leading out to the garden path.

This adds several new elements that were’t in the input:

stone hearth with a fire
armchair
small table
also technically, walls, a window, and a door (these are great commonsense things since I said we were ‘inside a cottage’, but they weren’t mentioned in my input).

One problem with these elaborations is that they aren’t currently objects in the game, and we therefore cannot track state properly. E.g. there’s no way of adding them to our inventory.

What to submit

You should upload 5 of your input scenes + GPT’s narrations + a list of the new elements to the gradescope assignment in Question 2.

Part 3: Few-shot Learning to Generate Game Objects

There are several strategies that we could take to dealing with GPT’s elaborations of new items in the game. We could try to create a system prompt to reduce them, or we could try to create new objects for the newly introduced item so that we can track them in-game.

Let’s use GPT to try to generate new items. We’ll use “few shot learning” to do so. The format for our few shot learning will be several sample inputs and outputs of what we’d like GPT to generate. The input can be specified in the “user” field and the output can be specified in the “assistant” field (note: in the playground you can write sample outputs as the assistant. Click on the ‘user’ label to toggle it between ‘user’ and ‘assistant’. In the playground, you can also edit the assistant’s output to get it to be what you want. When we have several examples of inputs and outputs, GPT is good at learning the pattern and continuing it.

What are the different elements of the items in our game? Each item has

a name
a simple description
a more elaborate description that is displayed with the player examines it,
a location
a set of properties

Let’s see if GPT can generate the elements for a new item. We can pick a format that we want to use as input and outputs. We could use text, or Python code, or a structured format like JSON.

For this part, you should create a few-shot setting for 4 versions of the format

Hint on generating properties

We’ve got several properties. Basic properties include:

      "is_gettable": false, # used by the `get` command
      "is_drink": false,        # used by the `drink` command
      "is_food": false,         # used by the `eat` command
      "is_weapon": false,   # used by the `attack` command

Others that are commonly used in text adventure games that we’re not currently using are:

      "is_wearable": false  # could be used by a `wear` command
      "is_container": false,    # could be used by a `put … in` command (e.g. put the ink in the inkwell)
      "is_surface": false,       # could be used by a `put … on` command (e.g. put the book on the bookshelf)

GPT-4 can do surprisingly good zero-shot generation for these properties. Try including something like this in your sytem prompt:

For each item, create a json object with the fields ‘name’ , ‘description’, ‘examine text’ (1-2 sentences of what the character will see if they look at it closely), and properties (a dictionary with boolean values for the keys: ‘is_container’, ‘is_drink’, ‘is_food’, ‘is_gettable’, ‘is_surface’, ‘is_weapon’, ‘is_wearable’).

What to submit

For each of your 4 different formats of few-shot prompting, you should upload a link to your playground to the gradescope assignment in Question 3.

In Class Activity: Using GPT to Write Descriptions for Text Adventure Games

Developer Playground

Part 1: Narration Prompts

What to do

Hints

What to submit

Part 2: Analyzing Elaborations

What to submit

Part 3: Few-shot Learning to Generate Game Objects

Hint on generating properties

What to submit

Recommended readings