Tuesday, October 12, 2010

A Google App Engine failure

Long ago, I wrote a Perl script that generates random names for my roleplaying games. It can take input lists from any language and spit out similar-sounding made-up names. It's a simple but powerful tool, and it seemed a natural fit for my first exploration of Google App Engine. Sadly, it didn't work out that way, and I thought it might serve as a useful caution to others planning the same sort of work.

The fundamental problem is that my app is IO-hungry. Every time someone asks for a made-up word, it reads in the entire source list and crunches it down into first-parts, mid-parts and end-parts (2-3 letter segments rooted at the beginning of the word, at the end of the word, or at neither). It then sorts the lists of parts by frequency of occurrence and performs a weighted random pick of a first part. Each subsequent part is chosen the same way, but from the subset of parts that overlap the previous segment. The combination of weighted choice and overlapping leads to words which tend to be pronounceable in the source language of the input list.
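To make that description concrete, here is a minimal sketch in Python rather than the original Perl. It is not my actual script: it assumes fixed three-letter segments that overlap the previous segment by two letters, and the names digest, weighted_pick, overlapping and make_name are invented for the illustration.

```python
import random
from collections import Counter

SEG = 3  # assumed segment length; the real script uses 2-3 letter segments


def digest(words):
    """Count first-, mid- and end-parts across the whole source list."""
    first, mid, end = Counter(), Counter(), Counter()
    for word in words:
        w = word.lower()
        if len(w) < SEG:
            continue
        last = len(w) - SEG
        for i in range(last + 1):
            part = w[i:i + SEG]
            if i == 0:
                first[part] += 1      # rooted at the start of the word
            if i == last:
                end[part] += 1        # rooted at the end of the word
            if 0 < i < last:
                mid[part] += 1        # rooted at neither
    return first, mid, end


def weighted_pick(counts, rng):
    """Pick one part at random, weighted by how often it occurred."""
    parts = sorted(counts)
    weights = [counts[p] for p in parts]
    return rng.choices(parts, weights=weights, k=1)[0]


def overlapping(counts, prev):
    """Subset of parts whose first two letters match the tail of prev."""
    tail = prev[1:]
    return Counter({p: n for p, n in counts.items() if p.startswith(tail)})


def make_name(words, rng=random, mid_parts=2):
    """Digest the list, then walk from a first part towards an end part."""
    first, mid, end = digest(words)
    if not first:
        raise ValueError("source list has no usable words")
    name = part = weighted_pick(first, rng)
    for _ in range(mid_parts):
        options = overlapping(mid, part)
        if not options:
            break
        part = weighted_pick(options, rng)
        name += part[-1]              # only the non-overlapping letter is new
    options = overlapping(end, part)
    if options:
        part = weighted_pick(options, rng)
        name += part[-1]
    return name.capitalize()
```

Calling make_name(["aragorn", "arathorn", "boromir", "faramir"]) returns a name in which every three-letter run appears somewhere in the source list. Note that digest runs on every call, which is exactly the repeated work described above.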

Reading and processing all of the words on every request wasn't something I was going to be able to do in Google App Engine, however, since costs are tied to resource consumption. So, I set out to store pre-digested versions of the input lists as sorted word-segments in the Google App Engine datastore. This is where my problems began. While it's entirely possible to store the data this way, I found that fetching so many records from the datastore as I performed my random walk down the lists of word-parts left GAE gasping for breath. In practical terms, I'd created the world's slowest tool for producing babble. Of this, I'm sure my mother feels proud.

Frankly, I'm not sure what I can do about this. GAE just doesn't seem to have been designed for this sort of thing. A shame, really. Of course, I could pre-compute a queue of results for each source namelist and keep re-populating it with a periodic job, but that really seems like a cheesy way to solve a problem that takes a few seconds for my original Perl script.
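For what it's worth, the pre-computed queue idea could look roughly like the sketch below. It is plain in-memory Python and uses no App Engine API at all. The NameQueue class and its target size are hypothetical, and on App Engine the buffer would presumably have to live in the datastore or a cache, with refill called from a scheduled job.

```python
from collections import deque


class NameQueue:
    """Hypothetical buffer of ready-made names for one source namelist."""

    def __init__(self, generate, target=50):
        self.generate = generate  # zero-argument callable returning one name
        self.target = target      # how many names to keep ready
        self.ready = deque()

    def refill(self):
        """Top the buffer up to the target; meant for the periodic job."""
        added = 0
        while len(self.ready) < self.target:
            self.ready.append(self.generate())
            added += 1
        return added

    def pop(self):
        """Serve one request, generating on the spot only if the buffer is empty."""
        if self.ready:
            return self.ready.popleft()
        return self.generate()
```

The expensive generation happens in refill, away from the request, so pop is a single cheap read as long as the periodic job keeps up with demand.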

Comments:

  1. If your original Perl script did it all in a few seconds, then you probably could have just ported that over. Python and Perl have similar performance, and you have 30 seconds to get the job done.

  2. "In practical terms, I'd created the world's slowest tool for producing babble. Of this, I'm sure my mother feels proud."

    I just love that quote :)

    Thanks for the interesting post. It's been illustrative to watch folks bang their heads against Google App Engine. It seems to have been architected for a comparatively common niche in the computing ecosystem, and if your problem definition falls outside of that niche, you can pretty much forget about implementing it using App Engine.

    For what it's worth I think Google could and should have done a better job of educating prospective users about this, as there have been instances where people have spent WAY more time than you did with this somewhat toy example, only to walk away frustrated in the end.
