One Thousand And One Neural Network Nights

Samples from the GPT-2 neural network are generally short – a few paragraphs – because it can only write 3 or 4 paragraphs of text in a single sample. (This is vastly better than earlier networks like char-rnn.)

I wanted to try having vanilla GPT-2 create a single unbroken sample by feeding each sample into the next, over and over again, just to see where it went.

I discovered that the bane of this neural network is a list. With the default 345M model, almost every single run ended in an infinite list (Bible verses, Roman numerals, vaguely sequential numbers). In between there were a few megabytes of climate speeches, but everything ended in numbers stations. I may do an ‘absurdly long lists’ post later. But if you need to defeat an evil robot powered by the GPT-2 neural network, don’t go with the classic approach of “This statement is a lie.” Start a list, because once a neural network starts counting IT CAN NOT STOP.

I still wanted to try a longer sample. One Thousand and One Nights is sort of a single story, sort of a series of short stories. Meandering narratives, asides, stories inside stories – a story told by design to never end – it already sounds a lot like what you get out of a neural network! So I began with the first paragraph of One Thousand and One Nights.


One Thousand and One Nights Story Text

Easy screenshots on Windows: use the built-in Snipping Tool (Start --> snip).
You can also link to any particular night with your current browser URL.

(Some earlier attempts available with the 'Default Model' selector above.)

How It Works

The GPT-2 network can only make samples 1024 tokens long. One way to cheat this limitation is to keep the last sample and use it as the contextual prompt for the next sample, for as much of it as you can fit into the 1024-token memory. By repeating this process you get a sample with no breaks. I added 100 tokens at a time, so every sample has the previous 924 tokens as context. (This takes forever, especially since you can't batch the samples, because each sample depends on the one before it.)
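Here is a minimal sketch of that sliding-window loop in Python. It is not the code used for this post: `generate_unbroken` and `sample_fn` are hypothetical names, and `sample_fn` stands in for whatever call actually asks GPT-2 for the next few tokens given a context. Only the numbers (1024, 100, 924) come from the description above.

```python
from typing import Callable, List

CONTEXT_WINDOW = 1024          # GPT-2's token limit, as described above
STEP = 100                     # new tokens requested per call
KEEP = CONTEXT_WINDOW - STEP   # 924 trailing tokens reused as the prompt


def generate_unbroken(
    prompt: List[int],
    sample_fn: Callable[[List[int], int], List[int]],
    total_new_tokens: int,
) -> List[int]:
    """Extend `prompt` by sampling STEP tokens at a time from a sliding window.

    `sample_fn(context, n)` is an assumed stand-in for the model: it takes the
    context tokens and returns up to `n` new tokens.
    """
    tokens = list(prompt)
    produced = 0
    while produced < total_new_tokens:
        # Only the most recent KEEP tokens fit alongside the new ones.
        context = tokens[-KEEP:]
        n = min(STEP, total_new_tokens - produced)
        new_tokens = sample_fn(context, n)[:n]
        if not new_tokens:
            break  # the model returned nothing; stop rather than loop forever
        tokens.extend(new_tokens)
        produced += len(new_tokens)
    return tokens
```

Each call depends on the tokens produced by the previous call, which is why the calls have to run one after another instead of in a batch.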

Because of the way GPT-2 was trained, it will occasionally spit out an endoftext marker indicating the end of one sample and the start of a new one. In the story you see here, the first paragraph of the first story is the original text. The story then continues until this endoftext marker naturally comes up. Then the next day begins.
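A small sketch of how that check might look, working on decoded text. GPT-2 writes the marker as `<|endoftext|>`; the function name `split_at_end_marker` is made up for illustration, and throwing away whatever follows the marker is an assumption about how a story gets closed off.

```python
from typing import Tuple

END_OF_TEXT = "<|endoftext|>"  # GPT-2's end-of-sample marker


def split_at_end_marker(text: str) -> Tuple[str, bool]:
    """Return (story_so_far, finished).

    If the marker appears, keep only the text before it and report that the
    story has ended; otherwise return the text unchanged.
    """
    head, marker, _discarded = text.partition(END_OF_TEXT)
    return head, bool(marker)
```

For example, `split_at_end_marker("And so he slept.<|endoftext|>Breaking news")` gives back `"And so he slept."` and `True`, so the generation loop knows it is time to start the next story.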

A Pinch of Nudging...

Before I generate the sample for the next day, I add the single sentence you see at the top of every page, "...Then, when it was the eighth night, SHE CONTINUED:", which is the sentence that starts each story in the Penguin translation. The context from the previous story remains.
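As a rough illustration of the nudge, the sketch below appends that sentence to the previous story's text to form the next prompt. The helper name `next_night_prompt`, passing the ordinal in as a plain word, and the blank line used as a separator are all assumptions; only the sentence itself comes from the post.

```python
NUDGE_TEMPLATE = "...Then, when it was the {ordinal} night, SHE CONTINUED:"


def next_night_prompt(previous_story: str, ordinal: str) -> str:
    """Build the prompt for the next night.

    The previous story stays in place as context, and the nudge sentence is
    added after it (separated by a blank line, which is an assumed format).
    """
    nudge = NUDGE_TEMPLATE.format(ordinal=ordinal)
    return previous_story.rstrip() + "\n\n" + nudge
```

Calling `next_night_prompt(story_text, "eighth")` returns the old story followed by the eighth-night opening line, ready to be trimmed to the context window and fed back to the model.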

Other Long Experiments

Link to some complete Star Trek scripts (where I keep generating until it literally says THE END on a line).
