Random AI Prompt: untangling the word lists
The lists at the heart of the generator get a deep cleanup — folders, a filename-based SFW model, dictionary-grounded categorisation, and group files.
Lists are the heart of Random AI Prompt. Each is a plain text file of options, one
of which is picked at random when referenced, so {color} in a prompt becomes one
line from color.txt. Over the years the lists had grown into a tangle: a giant
unsorted dictionary, proper nouns mixed with common words, and clunky duplicate
lists. The day's work was a long, careful overhaul of all of it.
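As a rough illustration of the pick-one-line idea, here is a minimal Python sketch. It is not the project's code: the in-memory LISTS dictionary stands in for the text files, and the expand function and its behaviour for unknown names are assumptions made for the example.

```python
import random
import re

# Hypothetical in-memory stand-in for the list files; in the real project
# each list is a plain text file such as color.txt.
LISTS = {
    "color": ["red", "green", "blue"],
    "animal": ["fox", "owl"],
}

# Matches a {name} reference with no nested braces.
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand(prompt, lists, rng=random):
    """Replace each {name} with one randomly chosen entry from that list."""

    def pick(match):
        options = lists.get(match.group(1))
        if not options:
            # Assumption: an unknown or empty list leaves the reference as-is.
            return match.group(0)
        return rng.choice(options)

    return PLACEHOLDER.sub(pick, prompt)


print(expand("a {color} {animal} in the snow", LISTS))
```

Passing a seeded random.Random instance as rng makes the output repeatable, which is handy when testing a prompt template.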
Folders and path-suffix resolution
The flat pile of files moved into category folders, with resolution by path suffix so a bare name, a partial path, or a full path all work:
data/lists/
  color.txt                      → {color}
  danbooru/d/general-sfw.txt     → {general-sfw} or {d/general-sfw}
  artist/dhigh.txt               → {dhigh} or {artist/dhigh}
That keeps the ~78 existing {name} references working while allowing deep
organisation underneath.
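Path-suffix resolution can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's resolver: the FILES index and the resolve function are hypothetical, and returning every match (rather than choosing one) is a choice made here because the post does not say how ambiguous names are handled.

```python
from pathlib import PurePosixPath

# Hypothetical index of list files, relative to data/lists/.
FILES = [
    "color.txt",
    "danbooru/d/general-sfw.txt",
    "artist/dhigh.txt",
]


def resolve(ref, files):
    """Return the files whose extension-less path ends with the parts of ref."""
    wanted = PurePosixPath(ref).parts
    matches = []
    for f in files:
        parts = PurePosixPath(f).with_suffix("").parts
        # Compare whole path components, so "sfw" does not match "general-sfw".
        if wanted and parts[-len(wanted):] == wanted:
            matches.append(f)
    return matches


for ref in ("color", "general-sfw", "d/general-sfw", "artist/dhigh"):
    print(ref, "->", resolve(ref, FILES))
```

Matching on whole components is what lets a bare name, a partial path, and a full path all land on the same file.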
A filename-based SFW model
The trickiest part was content gating, which went through several iterations before settling on a rule keyed entirely off the filename — no special configuration, no runtime filtering. A mixed-topic list is two files plus an implicit combined reference:
data/lists/danbooru/d/
  general-sfw.txt     # the safe half
  general-nsfw.txt    # the gated half
  general-sfw.json    # { "description": "Danbooru general descriptor tags (SFW)." }
The resolver combines them by mode: {general} yields the SFW half when adult
content is off and both halves when it’s on, while {general-sfw} is always just the
safe half. The design goal was that a safe-only user never has to type anything
special to stay safe. A content-safety pass also tightened the lists, keeping
ordinary adult content gated rather than deleted, with real place names and artist
handles protected from false positives.
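The filename rule might look something like the following Python sketch. The names (LISTS, entries_for, the adult flag) are hypothetical, and two behaviours are assumptions not stated in the post: an explicit -nsfw reference yields nothing when adult content is off, and an unsuffixed file is treated as safe.

```python
# Hypothetical in-memory contents of the list files, keyed by file stem.
LISTS = {
    "color": ["red", "green"],
    "general-sfw": ["smile", "long hair"],
    "general-nsfw": ["<gated tag>"],
}


def entries_for(name, lists, adult=False):
    """Resolve a reference to its entries, gating only on the filename suffix."""
    if name.endswith("-sfw"):
        # The safe half is always available, whatever the mode.
        return list(lists.get(name, []))
    if name.endswith("-nsfw"):
        # Assumption: the gated half is empty when adult content is off.
        return list(lists.get(name, [])) if adult else []
    # Bare name: the implicit combined reference.
    # Assumption: an unsuffixed file with this name counts as safe.
    combined = list(lists.get(name, [])) + list(lists.get(name + "-sfw", []))
    if adult:
        combined += lists.get(name + "-nsfw", [])
    return combined


print(entries_for("general", LISTS))              # safe half only
print(entries_for("general", LISTS, adult=True))  # both halves
```

Because the gate lives in the resolver, a template that only ever says {general} stays safe by default without any extra markup.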
Group files instead of hardcoded composites
Composite lists became real .group files — each line references another list (or
group), resolved by the same suffix lookup:
# data/lists/artist/digipa.group
# Group: the three digital-painting impact artist lists.
artist/dhigh
artist/dmed
artist/dlow
Folders with two or more lists now become implied groups automatically, optional
<list>.json sidecars supply tooltips in the editor, and keyword is now a reserved
wildcard that draws a word from any loaded list.
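To make the group mechanism concrete, here is a small Python sketch of flattening a .group file into entries. It is an assumption-laden illustration: parse_group and expand_group are hypothetical names, the data is held in memory, and references are looked up by exact key rather than by the suffix lookup the real resolver uses.

```python
# Hypothetical in-memory stand-ins: lists hold entries, groups hold file text.
LISTS = {
    "artist/dhigh": ["artist A"],
    "artist/dmed": ["artist B"],
    "artist/dlow": ["artist C"],
}
GROUPS = {
    "artist/digipa": (
        "# Group: the three digital-painting impact artist lists.\n"
        "artist/dhigh\n"
        "artist/dmed\n"
        "artist/dlow\n"
    ),
}


def parse_group(text):
    """Return the references in a group file, skipping blanks and # comments."""
    refs = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            refs.append(line)
    return refs


def expand_group(name, lists, groups, seen=None):
    """Flatten a list or group into entries, following nested groups."""
    seen = set() if seen is None else seen
    if name in seen:
        # Each list or group is visited once, which also stops reference cycles.
        return []
    seen.add(name)
    if name in lists:
        return list(lists[name])
    entries = []
    for ref in parse_group(groups.get(name, "")):
        entries.extend(expand_group(ref, lists, groups, seen))
    return entries


print(expand_group("artist/digipa", LISTS, GROUPS))
```

An implied group for a folder could be built the same way, by generating the reference lines from the folder's contents instead of reading them from a file.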
Letting a dictionary do the sorting
The cleanup ran on a principle worth stating: for proper nouns, a model’s world knowledge is the right classifier (no dictionary knows Achernar is a star), but for parts of speech a real dictionary states the answer rather than guessing it. So the dictionary dump was re-sorted into parts of speech using WordNet as the authority, and a roughly 8,800-entry “keyword” junk drawer of proper nouns was hand-classified into people, places, organisations, mythology, astronomy, religion, history, and more — with coverage checks so nothing was silently dropped. Several parallel review passes over the curated lists then caught and fixed hundreds of misfiled entries (surnames sitting in a first-names list, and the like).
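The dictionary-driven sort and its coverage check can be sketched as follows. This is a Python illustration only (the post's tooling used wordpos from Node): the lookup callable is a hypothetical stand-in for a WordNet query, TOY_DICTIONARY is invented sample data, and filing a word under every part of speech it has is an assumption.

```python
from collections import defaultdict


def sort_by_pos(words, lookup):
    """Bucket words by the parts of speech that lookup reports for them."""
    buckets = defaultdict(list)
    for word in words:
        # Words the dictionary does not know go to an explicit bucket
        # instead of disappearing.
        tags = lookup(word) or ["unknown"]
        for tag in tags:
            buckets[tag].append(word)
    return dict(buckets)


def check_coverage(words, buckets):
    """Raise if any input word is missing from every bucket."""
    placed = {w for bucket in buckets.values() for w in bucket}
    missing = sorted(set(words) - placed)
    if missing:
        raise ValueError(f"dropped words: {missing}")


# Hypothetical stand-in for a WordNet query.
TOY_DICTIONARY = {
    "run": ["noun", "verb"],
    "blue": ["adjective", "noun"],
    "quickly": ["adverb"],
}

words = ["run", "blue", "quickly", "zzyzx"]
buckets = sort_by_pos(words, lambda w: TOY_DICTIONARY.get(w, []))
check_coverage(words, buckets)
print(buckets)
```

The coverage check is deliberately separate from the sort, so it can be rerun after any later hand edits to the buckets.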
References
- Random AI Prompt repository · documentation site
- Princeton WordNet · wordpos (the Node interface used)