# HW4: Building an AI Game Master

*Author: Andrew Zhu (andrz@seas.upenn.edu)*

In this homework, we'll use Kani to build a simple Game Master (GM) for the [Mausritter TTRPG](https://losing-games.itch.io/mausritter). Your GM will help walk you through creating a mouse and rolling dice to overcome challenges in the game. Along the way, you'll learn the basics of the [Kani library](https://kani.readthedocs.io/en/latest/), function calling with LLMs, and how an LLM can use structured information to act as a game state manager.

We won't implement more complex systems like combat or spellcasting in this homework, but extending this system to make a more comprehensive GM (e.g. adding inventory tracking, combat, spellcasting, structured tracking of NPCs, ...) would make for a great class final project.

## What You'll Need

- A copy of the [Mausritter rules](https://losing-games.itch.io/mausritter) (available for free on itch.io)
- Your [Helicone key](https://www.helicone.ai/developer)
- Recommended: the [Kani documentation](https://kani.readthedocs.io/en/latest/)
- Optional: the [d20 documentation](https://d20.readthedocs.io/en/latest/start.html)

# Part 0: Setup

*Reference: [Installation (Kani Documentation)](https://kani.readthedocs.io/en/latest/install.html)*

First, let's install the libraries we're using and configure Kani to use the Helicone proxy. Kani requires Python 3.10 or higher - if you encounter an error saying "No matching distribution found", you may need to upgrade your Python version.

## Virtual Environment

Before we get started, I recommend setting up a virtual environment for this homework. To set up a virtual environment, use the following terminal commands:

Mac/Linux:
```shell
python3.10 -m venv ./venv
source ./venv/bin/activate
pip install jupyter
```

Windows:
```shell
python3.10 -m venv venv
venv\Scripts\activate.bat
pip install jupyter
```

You may need to restart your IDE after creating a new virtual environment for it to detect the interpreter. Now, when we install dependencies, it won't affect the global environment.

## Line Wrapping

If you're using VS Code, you may want to enable word wrap for your notebook so long outputs on a single line automatically fold to multiple lines. To find the word wrap option, click the gear icon in the upper-right corner of your .ipynb file, click "Customize Notebook Layout", then scroll down to the bottom of the settings and select "Notebook Output Word Wrap".

In [None]:
# Install dependencies
!pip install -q d20 'kani[openai]' openai

We'll use GPT-4 as our LLM of choice, so we'll configure it as our *engine* (a Kani concept used to provide a standard interface to an LLM).

<div>
<img src="https://kani.readthedocs.io/en/latest/_images/concepts-figure.png" width="700"/>
</div>

To save you money with all of your prompting to GPT-4, you can use the Helicone account for the class. You should set your Helicone API Key (which you can find [here](https://www.helicone.ai/developer)) on the terminal with the command `export HELICONE_API_KEY=sk-helicone-cp-###########`. If you're using VS Code, you can launch it from your terminal with the `code` command and prefix it with this export statement on the same line. For example:

```
cd your/homework/dir
source venv/bin/activate
HELICONE_API_KEY=sk-helicone-cp-###########  code .
```

Alternatively, you can enter it by running the next cell. For more information on using the Helicone proxy, see [this Ed post](https://edstem.org/us/courses/50468/discussion/4413041).

We'll use the `engine` global throughout the rest of this homework.

In [ ]:
import os
from getpass import getpass

# Set up your Helicone API key here
if "HELICONE_API_KEY" not in os.environ:
    print("You didn't set your Helicone key to the HELICONE_API_KEY env var on the command line.")
    os.environ["HELICONE_API_KEY"] = getpass("Please enter your Helicone API Key now: ")

In [None]:
# Set up a GPT-4 engine using the Helicone proxy
from kani.engines.openai import OpenAIEngine

engine = OpenAIEngine(api_key=os.environ["HELICONE_API_KEY"], model="gpt-4", api_base="https://oai.hconeai.com/v1")

# Part 1: Dice Rolling & Function Calling

First, let's get familiar with the libraries we're going to use. 

## 1.1: Dice Rolling

To implement dice rolling, we'll use `d20`, a Python dice library I wrote for D&D. This library parses RPG dice notation and rolls the specified dice. For this homework, we'll use its output as Markdown, but if you're interested in a deeper dive into the library, the output is a [structured eval tree](https://d20.readthedocs.io/en/latest/expression.html).

Mausritter uses standard RPG dice notation throughout.

For example:
- `d20` means: Roll a single 20-sided die
- `1d8` means: Roll a single 8-sided die
- `3d6` means: Roll three 6-sided dice, add them together

In certain circumstances, you'll need to roll some dice and keep only the N highest or lowest (e.g. 4d6, keep highest 3). `d20` expresses this with "keep-highest" (`kh`) and "keep-lowest" (`kl`) notation:

- `4d6kh3` means: Roll four six-sided dice, add the highest three together
- `2d20kl1` means: Roll two d20s, keep the lower ("disadvantage" in D&D, "advantage" in Mausritter)
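
If the keep notation feels opaque, here's a minimal sketch of what `kh` and `kl` mean, written with only Python's built-in `random` module. The `roll_keep` helper is made up for illustration: it isn't part of `d20`, and it isn't how `d20` works internally.

```python
import random


def roll_keep(num_dice, sides, keep, highest=True, rng=random):
    """Roll `num_dice` dice with `sides` sides and sum the `keep` highest (or lowest)."""
    rolls = [rng.randint(1, sides) for _ in range(num_dice)]
    # sort so the dice we keep are at the front of the list
    kept = sorted(rolls, reverse=highest)[:keep]
    return rolls, kept, sum(kept)


rolls, kept, total = roll_keep(4, 6, keep=3)  # same idea as 4d6kh3
print(f"4d6kh3: rolled {rolls}, kept {kept}, total {total}")

rolls, kept, total = roll_keep(2, 20, keep=1, highest=False)  # same idea as 2d20kl1
print(f"2d20kl1: rolled {rolls}, kept {kept}, total {total}")
```

In the rest of this homework you should still use `d20`, since it handles parsing the notation and formatting the result for you.
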

For more examples, check out [the documentation](https://d20.readthedocs.io/en/latest/start.html#examples). Try running this cell a couple of times to see what the results of these rolls are! Feel free to add more rolls to experiment, too.

In [None]:
from d20 import roll

print(roll("d20"))
print(roll("1d8"))
print(roll("3d6"))
print(roll("4d6kh3"))
print(roll("2d20kl1"))

In [None]:
# d20 can also handle basic math:
print(roll("2d6 * 3"))
print(roll("1d6 * 10 + 1d6"))

The `roll()` function returns a `RollResult` object, which contains the stringified roll (`RollResult.result`), its numeric total (`RollResult.total`), and the eval tree. You can use this, for example, to roll stats for a mouse!

Mausritter defines your mouse's stats as such:

> For each attribute, in order, roll 3d6. Keep the two highest dice results for a value between 2–12.

To make sure you understand how to use d20, let's translate this into a roll statement for one attribute, and print the result and final rolled total here.

In [None]:
# TODO: roll one attribute and print its roll and value
attr_result = roll("...")
print(...)
print(...)

## 1.2: Function Calling

*Reference: [Function Calling (Kani Documentation)](https://kani.readthedocs.io/en/latest/function_calling.html)*

Now, let's give a LLM the ability to roll dice! With Kani, you can write functions in Python and expose them to the model with just one line of code: the `@ai_function` decorator.

To create a kani with function calling, make a subclass of Kani and write your functions as methods. In order for a language model to effectively know what our AI functions do, we need to document them. We do this inline in the function: through type annotations and the docstring.

The allowed types are:

- Python primitive types (`None`, `bool`, `str`, `int`, `float`)
- an enum (`enum.Enum`)
- a list or dict of the above types (e.g. `list[str]`, `dict[str, int]`, `list[SomeEnum]`)

When the AI calls into the function, kani validates the AI’s requested parameters and guarantees that the passed parameters are of the annotated type by the time they reach your code.

By default, the function’s description will be taken from its docstring, and name from the source.

To specify the descriptions of parameters, you can provide an `AIParam` annotation using a `typing.Annotated` type annotation.

For example, you might annotate a parameter `timezone: str` with an example, like `timezone: Annotated[str, AIParam("The IANA time zone, e.g. America/New_York")]`.
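
If you haven't used `typing.Annotated` before, the sketch below shows the general mechanism: extra metadata rides along with the type, and a library can read it back later. It uses only the standard library. `ParamNote` and `describe_params` are hypothetical stand-ins written for this illustration; they are not Kani's `AIParam` or Kani's actual implementation.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


@dataclass
class ParamNote:
    """Hypothetical stand-in for a description marker like Kani's AIParam."""

    desc: str


def get_time(timezone: Annotated[str, ParamNote("The IANA time zone, e.g. America/New_York")]) -> str:
    """Get the current time in the given time zone."""
    return f"(pretend this is the current time in {timezone})"


def describe_params(func):
    """Collect {parameter name: (base type, description)} from Annotated hints."""
    described = {}
    # include_extras=True keeps the Annotated metadata instead of stripping it
    for name, hint in get_type_hints(func, include_extras=True).items():
        if name == "return" or get_origin(hint) is not Annotated:
            continue
        base_type, *metadata = get_args(hint)
        notes = [m.desc for m in metadata if isinstance(m, ParamNote)]
        if notes:
            described[name] = (base_type, notes[0])
    return described


print(describe_params(get_time))
```

The function still accepts a plain `str` at runtime; the annotation only adds information that a framework can pass along to the model.
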

Let's use this to expose the `roll()` function to GPT-4. Your method should take in the dice as a string, and return the resulting Markdown roll to the model.

In [None]:
from kani import Kani, ai_function, ChatMessage


class DiceKani(Kani):
    @ai_function()
    def roll(
        self,
        # TODO: add a parameter to this method to tell the model what to pass to it
    ):
        """
        TODO: Add documentation to this method to tell the model how to use it
        HINT: Try giving the model examples, or telling it to use RPG notation
        """
        # TODO: roll the dice provided by the model and return the result to it
        result = ...
        return result

Now, let's create an instance of your Kani with dice rolling and ask it to roll dice. In addition to standard queries like "roll me 4d6," an LLM can translate more complex queries from natural language to RPG dice syntax. Try asking it to roll an attribute for a mouse using Mausritter's natural language description, or even something that relies on background knowledge, like damage for a fireball spell.

To exit the chat session, send the word `!stop`.

In [None]:
from kani import chat_in_terminal

dice_ai = DiceKani(engine)
chat_in_terminal(dice_ai, stopword="!stop", verbose=True)

Easy, right? You can add more functions easily by defining more methods in your Kani subclass, and implement complex logic in the function body too. Let's use this to do something a little more complex: creating a character.

# Part 2: Character Creation

*Reference: Mausritter rulebook, pg. 8; [Function Calling (Kani Documentation)](https://kani.readthedocs.io/en/latest/function_calling.html)*

Before we get started playing Mausritter, you need to make a mouse! Luckily, making a mouse is easy: just roll for stats, background, and details. We'll record the equipment your mouse starts with in this homework, but won't implement a full structured inventory system (unless you are doing an extension).

## 2.1: Character Creator Agent

Let's build an AI agent to help walk us through character creation. Since character creation is fairly algorithmic, it's possible to write a character generator (like the one at https://mausritter.com/mouse/) without the use of an LLM at all - but in this homework, we'll use GPT-4 to parse the rules and output a mouse, ready to go.

First, let's define the goal: your agent should output a `Mouse` as defined here. Each of the structured character attributes matches those defined in the Mausritter rules, and we'll also add an LLM-generated prose description of your mouse.


In [None]:
import dataclasses
from dataclasses import dataclass


@dataclass
class Mouse:
    # structured character attributes
    strength: int
    dexterity: int
    will: int
    hp: int
    pips: int
    background: str
    birthsign: str
    disposition: str
    coat: str
    physical_detail: str
    name: str

    # LLM-generated
    description: str = ""


# Here's an example Mouse:
horatio = Mouse(
    name="Horatio Seedfall",
    background="Ale brewer",
    strength=11,
    dexterity=10,
    will=8,
    hp=2,
    birthsign="Wheel",
    disposition="Industrious / Unimaginative",
    coat="Chocolate, flecked",
    physical_detail="Tiny body",
    pips=3,
    description=(
        'Horatio Seedfall, more commonly known as "Ale Brewer" in his local mouse village, is a pint-sized powerhouse.'
        " His fur, a rich chocolate hue speckled with an array of lighter flecks, is reminiscent of the fine, roasted"
        " barley he uses in his brewing. His small size might deceive the unassuming observer, but beneath that tiny"
        " body of his lies a heart as tenacious as a bear's."
    ),
)

Now, we'll define a subclass of the `DiceKani` we made in part 1. This means that this Kani will also have access to roll dice!

You have a lot of freedom on how to implement the mouse creator here. Maybe you'll add functions to roll on each of the tables in the rulebook? Maybe you'll generate your mouse's background and other story attributes using the tables only for inspiration? Maybe you'll be able to simply prompt GPT-4 with the mouse creation instructions?

Remember, we aren't tracking inventory in this homework in the structured JSON (unless you are doing an extension), so you don't need to generate structured data regarding items. **You should have the LLM write down what items come with your mouse's background in their prose description, though.**

Regardless of the approach you choose, your Kani should call the provided `make_mouse` function at least once. This function shows how you can use function calling to make GPT-4 output a fairly large amount of structured data by presenting the desired data format as function parameters.
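
To build some intuition for why function parameters work as a data format, here's a rough sketch of how a typed Python signature could be turned into the kind of JSON Schema that function-calling APIs accept. This is a simplified illustration using only the standard library, not Kani's actual implementation; `sketch_schema`, `JSON_TYPES`, and `make_pet` are names invented for this example, and it only handles the four primitive types listed in `JSON_TYPES`.

```python
import inspect
import json

# simplified mapping from Python annotations to JSON Schema type names
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_schema(func):
    """Build a rough JSON Schema for a function's parameters (skipping `self`)."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": JSON_TYPES[param.annotation]}
        # parameters without a default must be supplied by the model
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def make_pet(name: str, legs: int, friendly: bool = True):
    """Record a new pet with the given details."""


print(json.dumps(sketch_schema(make_pet), indent=2))
```

Notice that the function name, docstring, parameter names, and types all end up in what the model sees, which is why clear names and documentation matter when you write your own AI functions.
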

In [None]:
class MouseCreatorKani(DiceKani):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.mouse = None

    # TODO: add additional ai_functions here if needed
    # e.g. roll_attribute, lookup_background, roll_birthsign, generate_description, etc.

    # your Kani should call this at least once and output a valid mouse!
    # this parses the parameters GPT-4 generates and passes them to the Mouse constructor,
    # then saves them to an instance attribute so we can use the created Mouse later
    @ai_function()
    def make_mouse(
        self,
        # TODO: Modify the types of the parameters of this method to add model prompting if needed
        strength: int,
        dexterity: int,
        will: int,
        hp: int,
        pips: int,
        background: str,
        birthsign: str,
        disposition: str,
        coat: str,
        physical_detail: str,
        name: str,
        description: str,
    ):
        """
        TODO: Add documentation to this method to tell the model how to use it
        """
        self.mouse = Mouse(
            strength=strength,
            dexterity=dexterity,
            will=will,
            hp=hp,
            pips=pips,
            background=background,
            birthsign=birthsign,
            disposition=disposition,
            coat=coat,
            physical_detail=physical_detail,
            name=name,
            description=description,
        )
        # TODO: Tell GPT-4 that you successfully created or updated the mouse
        return "..."

You can add additional prompting here to tell the agent how to use your provided functions (e.g. a high-level flow).

In [None]:
# TODO: instruct GPT-4 on how to call the functions you provided to it to create a mouse.
# Not sure where to start? Try giving it the whole mouse creation rules
MOUSE_CREATOR_PROMPT = """
...
"""

Finally, chat with your agent and make a mouse! We'll use your mouse in the last part to play a small game of Mausritter.

In [None]:
# Run this cell to reset the state of your mouse creator
mouse_creator_ai = MouseCreatorKani(engine, always_included_messages=[ChatMessage.user(MOUSE_CREATOR_PROMPT)])

In [None]:
import time

# You can rerun this cell to continue conversation with your mouse creator
# Send "!stop" when you are done making your mouse
chat_in_terminal(mouse_creator_ai, stopword="!stop", verbose=True)
mouse_creator_ai.save(f"creator-{int(time.time())}.json")

Let's retrieve your mouse and print them here, to make sure everything saved correctly. Let's also save it to a JSON file in case you rerun this notebook and want to continue from here.

In [None]:
import json

my_mouse = mouse_creator_ai.mouse
mouse_dict = dataclasses.asdict(my_mouse)

with open("my-character.json", "w") as f:
    json.dump(mouse_dict, f, indent=2)

mouse_dict

In [ ]:
# To load a saved mouse, run this cell
with open("my-character.json") as f:
    mouse_dict = json.load(f)
my_mouse = Mouse(**mouse_dict)

If all goes well, you now have a mouse to play with! In the cell below, write down your high-level approach to the character creator -- e.g. did you give the model flexible control over the process or strict step-by-step instructions -- and any insights you learned while iterating on the agent.

Some things you could mention: Do you think a player who's never played a TTRPG before could create a mouse using your character creator by chatting with your agent and asking questions when they don't understand something? How might you implement this agent as part of a larger system?

In the next part, we'll make a Kani that can roll checks and saves, and play a small game of Mausritter.

In [None]:
CHARACTER_CREATOR_INSIGHTS = """
TODO: Add your thoughts here
"""

## 2.2: Use DALL-E to generate a character portrait

*Reference: [Images (OpenAI Documentation)](https://platform.openai.com/docs/api-reference/images)*

Now we've generated a structured representation of your mouse and a prose description of what they look like, but can we turn this collection of attributes into an image? Let's use DALL-E 3 to generate a portrait of your mouse!

You'll need to use your personal OpenAI account for this, as Helicone cannot proxy image generation requests. It costs $0.04-$0.12 to generate one image with DALL-E 3, depending on the resolution and quality settings (https://openai.com/pricing). Run the next cell to set up your personal OpenAI API key.

In [None]:
if "OPENAI_API_KEY" not in os.environ:
    print("You didn't set your OpenAI key to the OPENAI_API_KEY env var on the command line.")
    os.environ["OPENAI_API_KEY"] = getpass("Please enter your OpenAI API Key now: ")

In [ ]:
import httpx
from openai import OpenAI

# create the OpenAI client
dalle_client = OpenAI()
http = httpx.Client()

# make a folder to save generated images
os.makedirs("dalle", exist_ok=True)

This code cell defines what you'll need to call the DALL-E API. It provides the `generate_image(prompt)` function which will call the Generate Image endpoint and download the generated image.

DALL-E 3 will also revise the prompt you provide to it by default, augmenting it with more synthetic details like we did in HW3. This can often help the model generate higher-fidelity images, but can also introduce weird hallucinations into the final image. The `generate_image` function will print out both the original prompt and its revision as sent to the image generation model.

In [ ]:
import re
from IPython.display import Image, display


def make_safe_filename(name, ext=".png"):
    """Ensure that a filename is safe to save to disk (replace any non-alphanumeric characters with an underscore)."""
    name = re.sub(r"[^a-zA-Z0-9-]+", "_", name)
    name = name[:100]
    return f"{name}{ext}"


def generate_image(prompt):
    """Generate an image, save it to disk, and display it.

    Returns a dict {"image_path": "...", "original_prompt": "...", "revised_prompt": "..."}.
    """

    # generate the image using the OpenAI API
    resp = dalle_client.images.generate(
        prompt=prompt,
        model="dall-e-3",
        response_format="url",
        quality="hd",  # feel free to change to "standard" for some cost savings
        size="1024x1024",  # also try "1024x1792" for portrait or "1792x1024" for landscape
        style="vivid",  # "vivid" or "natural"
    )
    image = resp.data[0]

    # download the generated image from URL (expires after 60m)
    fp = f"dalle/{make_safe_filename(image.revised_prompt)}"
    with open(fp, "wb") as f:
        with http.stream("GET", image.url) as r:
            for data in r.iter_bytes():
                f.write(data)

    # show the prompt, DALL-E rewrite, and the image
    print("Original prompt:", prompt)
    print()
    print("Revised prompt:", image.revised_prompt)
    print()
    print(f"Saved to {fp}")
    display(Image(fp, width=500, height=500))

    return {"image_path": fp, "original_prompt": prompt, "revised_prompt": image.revised_prompt}

Now it's up to you to generate an image for the mouse you created in part 2.1. You could manually write a prompt, create a template based on the structured attributes you have available, or create a new Kani agent to write prompts for you! Whichever you choose, please describe your approach below.

Feel free to modify the commented parameters in the `dalle_client.images.generate` call above.

In [ ]:
# TODO: Use generate_image() to create a portrait for your mouse

In [ ]:
# TODO: What approach did you use to create prompts for DALL-E? What strengths and weaknesses does this approach have? If you tried multiple approaches, how do they compare?
DALLE_APPROACHES = """
...
"""

# Part 3: Playing the Game

*Reference: Mausritter rulebook, pg. 12 & pg. 18*

Finally, let's make a GM Kani! If you're familiar with [AI Dungeon](https://aidungeon.com/), the concept here is similar: the model will act as our GM, defining the world and challenges in it. Unlike AI Dungeon, though, we have a character that we created and the ability to roll based off that character sheet!

Let's implement Saves (Mausritter p. 12):

> When you describe your mouse doing something risky where the outcome is uncertain and failure has consequences, the GM will ask you to make a save against either STR, DEX or WIL.
> To make a save, roll a d20. If the result is less than or equal to the relevant attribute, your mouse succeeds, and suffers no consequences. If the result is over the attribute, your mouse fails, and suffers the consequences described by the GM.

Again, we'll make a Kani that subclasses the `DiceKani` from part 1. You should implement the `roll_save` method, which takes in the stat and whether the roll is at advantage or disadvantage, and returns whether or not the save was successful.

In [None]:
class MausritterKani(DiceKani):
    # This Kani should reference your mouse - let's pass it in the constructor.
    def __init__(self, *args, mouse: Mouse, **kwargs):
        super().__init__(*args, **kwargs)
        self.mouse = mouse

    @ai_function()
    def roll_save(
        self,
        # TODO: add necessary parameters
    ):
        """TODO: describe this method for the LLM"""
        # TODO: roll a save for the given stat, compare it to your mouse's stats, and tell the LLM if you succeed or fail
        return ...

In [None]:
# TODO: instruct GPT-4 on how to call the functions you provided to it and high-level instructions to run the game.
# You should also include information on the character you created so the GM can reference it.
# Not sure where to start? Try telling it to introduce the world first.
# You might also want to look at the play example on pg. 18 of the Mausritter rulebook.
GM_PROMPT = """
...
"""

In [None]:
# Run this cell to reset the state of your game
# You can also add existing chat_history here if you'd like
gm_ai = MausritterKani(engine, mouse=my_mouse, system_prompt=GM_PROMPT)

Now, explore the world created by your model! When you're done, send `!stop`. The game transcript will be automatically saved to a timestamped file named `gm-<timestamp>.json`.

In [None]:
# You can rerun this cell to continue the game
# Send "!stop" when you are done playing
chat_in_terminal(gm_ai, stopword="!stop", verbose=True)
gm_ai.save(f"gm-{int(time.time())}.json")

## Structureless Comparison

Cool, isn't it? But is it really *better* to provide LLMs with this structured information and access to tools?

To answer this question, let's try playing without structure: only prompt GPT-4 to act as the GM WITHOUT access to your character sheet or dice rolling.

In [None]:
# TODO: instruct GPT-4 on how to act as a Mausritter GM.
# You should NOT include information on function calling, but may tell it about your character (your choice).
GM_PROMPT_STRUCTURELESS = """
...
"""

In [None]:
# We'll use an unmodified Kani here -- it won't have access to dice rolling.
gm_ai_structureless = Kani(engine, system_prompt=GM_PROMPT_STRUCTURELESS)

In [None]:
# You can rerun this cell to continue the game
# Send "!stop" when you are done playing
chat_in_terminal(gm_ai_structureless, stopword="!stop", verbose=True)
gm_ai_structureless.save(f"gm-structureless-{int(time.time())}.json")

Did you notice any significant difference in the gameplay in the short term? How about in the long term? Why do you think this might be? Could you think of any other functions you could expose to the GM to improve its story coherence over multiple play sessions? Write down your thoughts here.

In [None]:
AI_GM_THOUGHTS = """
TODO: Write your thoughts here.
"""

That's it for HW4! Hopefully you've gained an understanding of how structured game state can influence LLMs, and how to use Kani with function calling to give LLMs powerful capabilities. These skills will come in handy as you begin working on your final projects!

## What you should submit

You should submit to Gradescope:
- this notebook with all TODOs and free-response sections completed
- the latest saved transcripts of your character creator agent and play (with a structured and unstructured AI GM)
- the image(s) you generated of your character

# Extension Ideas (for groups of 3+)

- Implement more of the Mausritter mechanics (e.g. inventory, NPCs, spells, or combat)
- Give your AI GM the ability to call `generate_image` to visualize the game as it's played