On the downside, it's chewing up an increasingly large amount of my free time, and taking away valuable energy from other projects (*cough* Epoch *cough*). But I sort of needed the distraction for a while, and now it's just too much fun to quit without a decent project to show for all of it.
Of course no IRC bot is complete without the ability to converse, so I started writing up a simple order-2 Markov chain sentence generator. These are pretty bog standard in the language generation world for low-fidelity conversation.
The big trick with Markov chains is training data. An order-2 chain requires a pretty substantial amount of English to inspect before it starts sounding even vaguely sensible. Most of the output is just direct quotes from what it was fed, since it doesn't have enough contextual information to "understand" how to rearrange phrases and sentences yet.
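To make the order-2 idea concrete, here's a minimal sketch of the technique (not the bot's actual code — function names and structure are my own): training maps each pair of adjacent words to the list of words seen after that pair, and generation repeatedly samples a successor of the last two words emitted.

```javascript
// Order-2 Markov chain sketch: the chain is a plain object mapping
// a two-word context ("w1 w2") to the list of words observed next.

function train(chain, sentence) {
  const words = sentence.split(/\s+/);
  for (let i = 0; i < words.length - 2; i++) {
    const key = words[i] + " " + words[i + 1];   // order-2 context
    (chain[key] = chain[key] || []).push(words[i + 2]);
  }
  return chain;
}

function generate(chain, w1, w2, maxWords = 20) {
  const out = [w1, w2];
  for (let i = 0; i < maxWords; i++) {
    const key = out[out.length - 2] + " " + out[out.length - 1];
    const successors = chain[key];
    if (!successors) break;                      // dead end: stop early
    out.push(successors[Math.floor(Math.random() * successors.length)]);
  }
  return out.join(" ");
}
```

With only a little training data every context has exactly one successor, so the output is a verbatim quote of the input — which is exactly the "not enough contextual information yet" problem described above. The chain object also serializes trivially with `JSON.stringify`, which is one plausible way to end up with a pile of JSON as internal state.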
I'm currently at something like 75KB of JSON representing the internal state of the Markov chain system, and it still sounds pretty idiotic. I'm confident based on past experiments that it can be improved quite a bit - the only limiting factor is how much data I can feed it.
I'm secretly having it listen to IRC channels and selecting the longer and more lucid-looking sentences to add to the training repository. This should be a fun way to add some flavor to the bot.
If I decide to continue hacking on this, I'll probably start by cleaning up the rest of the CIRC code I... erm... borrowed. Once that's done, I'm considering posting a copy of the extension code on my scribblings site for other intrepid Chrome users to futz with.
As a stretch goal, I kind of want to write a Bayesian classifier to try and get a bit of a stimulus/response thing going, but that's a ways down the road in all probability.
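For the curious, the stimulus/response idea could be as simple as a naive Bayes text classifier that buckets incoming lines into response categories. A toy sketch under that assumption (everything here is hypothetical, including the per-class add-one smoothing shortcut):

```javascript
// Toy naive Bayes classifier: count word occurrences per label, then
// score a new text by log prior + smoothed log likelihoods.

function trainNB(model, label, text) {
  model.docs = model.docs || {};
  model.words = model.words || {};
  model.docs[label] = (model.docs[label] || 0) + 1;
  model.words[label] = model.words[label] || {};
  for (const w of text.toLowerCase().split(/\s+/)) {
    model.words[label][w] = (model.words[label][w] || 0) + 1;
  }
  return model;
}

function classifyNB(model, text) {
  const totalDocs = Object.values(model.docs).reduce((a, b) => a + b, 0);
  let best = null, bestScore = -Infinity;
  for (const label of Object.keys(model.docs)) {
    const counts = model.words[label];
    const n = Object.values(counts).reduce((a, b) => a + b, 0);
    const vocab = Object.keys(counts).length;     // per-class vocab (a simplification)
    let score = Math.log(model.docs[label] / totalDocs);
    for (const w of text.toLowerCase().split(/\s+/)) {
      // add-one smoothing so unseen words don't zero out the score
      score += Math.log(((counts[w] || 0) + 1) / (n + vocab));
    }
    if (score > bestScore) { bestScore = score; best = label; }
  }
  return best;
}
```

Classify a line as "greeting" vs. "question", say, and pick a canned response from the winning bucket - crude, but it would get a stimulus/response loop going.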
Sounds fun. Diversion is good and will probably be better for Epoch in the long run.