2019-12-16

Gremlin Development

We wanted to build a server that could, on request, return a large amount of data from a particular state. This isn’t something either of us had much experience with, and finding the right tools to store and share our database was quite challenging. Initially, we looked at using a SQL database, but found it difficult to rapidly return our heavily linked data, particularly where transpositions were involved. This turned into something of a running theme, with transpositions repeatedly causing unexpected problems. Eventually, we came across graph databases, which are perfect for storing the data behind our graph!

We wanted to use the Gremlin query language to return data, as it seemed well suited to our needs. We were first drawn to Azure Cosmos DB, Microsoft’s cloud graph database, by its generous free trial. However, we found Cosmos ill-suited to our needs. In particular, it only supports a subset of the Gremlin language, seemingly intentionally cutting off its computational power. This made returning only data above a threshold number of games very challenging.
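
To give a rough idea of the kind of query we mean, here is a minimal Python sketch of a thresholded walk over a small in-memory graph. It is not our server code and it is not Gremlin: the state names, move labels, game counts and the `moves_above_threshold` helper are all made up for illustration. It walks outward from a starting state, keeps only moves played in at least `min_games` games, and visits a state only once even when a transposition makes it reachable by several paths.

```python
from collections import deque

# Hypothetical toy data (assumption, not from the real database):
# each state maps to its outgoing moves as (move, resulting_state, games_played).
GRAPH = {
    "S0": [("a", "S1", 500), ("b", "S2", 400), ("c", "S3", 3)],
    "S1": [("b", "S4", 120)],
    "S2": [("a", "S4", 90)],  # transposition: S4 is reachable via S1 and via S2
    "S3": [("d", "S5", 1)],
    "S4": [("e", "S6", 8)],
}


def moves_above_threshold(graph, start, min_games):
    """Breadth-first walk from start, keeping moves with at least min_games games."""
    seen = {start}          # states already queued, so transpositions are expanded once
    queue = deque([start])
    kept = []
    while queue:
        state = queue.popleft()
        for move, target, games in graph.get(state, []):
            if games < min_games:
                continue    # drop rarely played moves and everything behind them
            kept.append((state, move, target, games))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return kept


edges = moves_above_threshold(GRAPH, "S0", min_games=50)
for state, move, target, games in edges:
    print(f"{state} --{move}--> {target} ({games} games)")
```

In a graph database, the same idea would typically be expressed as a single Gremlin traversal built from steps such as `repeat()` and `has()` with a filter on an edge property, which is the sort of query we struggled to write on Cosmos.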

Ultimately, we decided to set up a JanusGraph server on a VM. We primarily did this to unlock all of Gremlin’s features, but also found it was cheaper and faster than Cosmos. Setting up the JanusGraph server wasn’t the easiest, but it gave us a lot more control to optimize for our use case. For instance, graph databases are often distributed across multiple machines; with JanusGraph, however, we could select a storage backend (Berkeley DB) optimized for use on a single VM.

Now, with our server up and running, we had to find a way to pay for it! We launched JanusGraph on an Azure VM with a 30-day free trial to get the website off the ground; after that, we would need supporters to help pay for it. But first we needed an audience, which I will talk about in the next blog post.