TinkerPop as of Spring 2011
TinkerPop is an open-source software development group focused on technologies in the emerging graph database ecosystem. The group started in the Fall of 2009 and has been actively releasing software ever since. This post highlights the current state of some of its more popular, freely available products as of Spring 2011.
Blueprints: A Property Graph Model Interface
Blueprints was one of the first TinkerPop products. There are numerous graph database providers. The usual suspects include Neo4j, OrientDB, DEX, InfiniteGraph, Sones, HyperGraphDB, and others. Each database has its own graph data model and its own tools for manipulating and analyzing it. Blueprints is an attempt to provide a common API, so that any tool written to the Blueprints API can work over various graph database providers. The hope is to prevent vendor lock-in and to produce software that is generally useful to developers in the graph scene. Blueprints is relatively simple to implement and places few strict requirements on the underlying graph database. It also provides some conveniences: representing an RDF store as a graph database, representing a graph database as an RDF store, exposing a graph database as a JUNG graph, and so on. The future of the project includes more implementations by more graph database vendors and the development of more utilities and tools that make use of the Blueprints API.
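The idea of a common API can be illustrated with a minimal sketch. The code below is not the Blueprints API (which is a Java library); the Graph interface, the two toy backends, and the friends_of_friends tool are hypothetical names invented here purely to show how a tool written against one interface can run unchanged over different storage layouts.

```python
from abc import ABC, abstractmethod


class Graph(ABC):
    """Hypothetical common interface (an assumption, not the real Blueprints API)."""

    @abstractmethod
    def add_vertex(self, vertex_id, **properties): ...

    @abstractmethod
    def add_edge(self, out_id, in_id, label): ...

    @abstractmethod
    def out_neighbors(self, vertex_id, label=None): ...


class AdjacencyListGraph(Graph):
    """Toy backend that stores outgoing edges per vertex."""

    def __init__(self):
        self._props = {}
        self._out = {}

    def add_vertex(self, vertex_id, **properties):
        self._props[vertex_id] = dict(properties)
        self._out.setdefault(vertex_id, [])

    def add_edge(self, out_id, in_id, label):
        self._out[out_id].append((label, in_id))

    def out_neighbors(self, vertex_id, label=None):
        return [v for (l, v) in self._out[vertex_id] if label is None or l == label]


class EdgeListGraph(Graph):
    """Toy backend that stores one flat list of (out, label, in) triples."""

    def __init__(self):
        self._vertices = {}
        self._edges = []

    def add_vertex(self, vertex_id, **properties):
        self._vertices[vertex_id] = dict(properties)

    def add_edge(self, out_id, in_id, label):
        self._edges.append((out_id, label, in_id))

    def out_neighbors(self, vertex_id, label=None):
        return [i for (o, l, i) in self._edges
                if o == vertex_id and (label is None or l == label)]


def friends_of_friends(graph, vertex_id):
    """A 'tool' that only uses the Graph interface, so any backend works."""
    result = []
    for friend in graph.out_neighbors(vertex_id, "knows"):
        result.extend(graph.out_neighbors(friend, "knows"))
    return result


def build(graph):
    """Load the same small illustrative graph into whichever backend is given."""
    for vertex_id in ("a", "b", "c"):
        graph.add_vertex(vertex_id)
    graph.add_edge("a", "b", "knows")
    graph.add_edge("b", "c", "knows")
    graph.add_edge("a", "c", "created")
    return graph


for backend in (AdjacencyListGraph(), EdgeListGraph()):
    print(type(backend).__name__, friends_of_friends(build(backend), "a"))
```

Both backends give the same answer because friends_of_friends never touches their internals. That is the kind of portability the common API is after.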
Pipes: A Dataflow Framework using Process Graphs
Pipes was a later development in the TinkerPop suite. Its purpose is to provide the atomic operations that are commonly used in a graph traversal. It was realized that every property graph analysis algorithm could be represented as a chain of operations, each of which is an instance of one of the following categories.
- Transform: Take an object and turn it into another object. For example, given a vertex, return its outgoing edges.
- Filter: Take an object and decide whether to return it or not. For example, given an edge, should it be traversed?
- Side Effect: Take an object and return it unchanged, but produce some side effect in the process. For example, count the number of times a particular vertex has been seen.
Pipes has come to serve as a foundation for the development of property graph traversal algorithms and languages.
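The three categories above can be sketched with Python generators. This is only an illustration of the idea, not the Pipes library itself: the toy GRAPH dictionary and the function names out_edges, label_filter, in_vertex, and count_seen are assumptions made up for this example.

```python
from collections import Counter

# Toy graph (hypothetical data): vertex -> list of (label, target) outgoing edges.
GRAPH = {
    "a": [("knows", "b"), ("created", "c")],
    "b": [("knows", "c"), ("knows", "a")],
    "c": [],
}


def out_edges(vertices):
    """Transform: turn each vertex into its outgoing (source, label, target) edges."""
    for v in vertices:
        for label, target in GRAPH[v]:
            yield (v, label, target)


def label_filter(edges, label):
    """Filter: let an edge through only if it carries the given label."""
    for edge in edges:
        if edge[1] == label:
            yield edge


def in_vertex(edges):
    """Transform: turn each edge into the vertex it points to."""
    for _source, _label, target in edges:
        yield target


def count_seen(vertices, counter):
    """Side effect: pass each vertex through unchanged while counting it."""
    for v in vertices:
        counter[v] += 1
        yield v


# Chain the steps; nothing is computed until the pipeline is consumed.
seen = Counter()
pipeline = count_seen(in_vertex(label_filter(out_edges(["a", "b"]), "knows")), seen)
result = list(pipeline)
print(result, dict(seen))
```

Each step consumes an iterator and produces an iterator, so steps can be composed freely and objects flow through the chain one at a time.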
Gremlin: A Graph Traversal Language
Gremlin was one of the primary reasons why TinkerPop was formed. This is apparent from the fact that TinkerPop's mailing list is called Gremlin Users. The beauty of a property graph (and of multi-relational graphs in general) is that it can be traversed in numerous ways. There are many paths through a graph when one can filter on edges, check properties, update a counter, check a counter, jump back 2 steps, etc. As such, there is a need for a language that can concisely express such traversals. Gremlin is one such language. Gremlin is a DSL written in Groovy that compiles down to Pipes. In short, Gremlin is a language for constructing data flow pipelines to traverse a graph.
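To give a feel for what "a concise expression that compiles down to a pipeline" means, here is a rough sketch in Python rather than in Gremlin itself. The Traversal class, its out, has, and values methods, and the toy GRAPH data are all hypothetical and do not reproduce Gremlin's actual syntax or steps. Each chained call simply appends one step to a list, and run() wires the steps into a lazy pipeline.

```python
# Toy property graph (hypothetical data): id -> properties and outgoing edges.
GRAPH = {
    1: {"props": {"name": "a", "age": 29}, "out": [("knows", 2), ("knows", 3)]},
    2: {"props": {"name": "b", "age": 27}, "out": []},
    3: {"props": {"name": "c", "age": 32}, "out": [("created", 4)]},
    4: {"props": {"name": "d"}, "out": []},
}


class Traversal:
    """Hypothetical fluent builder: each method call adds one pipeline step."""

    def __init__(self, graph, starts):
        self.graph = graph
        self.starts = list(starts)
        self.steps = []

    def out(self, label):
        """Transform: move from each vertex along outgoing edges with this label."""
        def step(vertices):
            for v in vertices:
                for edge_label, target in self.graph[v]["out"]:
                    if edge_label == label:
                        yield target
        self.steps.append(step)
        return self

    def has(self, key, predicate):
        """Filter: keep vertices that have the property and satisfy the predicate."""
        def step(vertices):
            for v in vertices:
                props = self.graph[v]["props"]
                if key in props and predicate(props[key]):
                    yield v
        self.steps.append(step)
        return self

    def values(self, key):
        """Transform: replace each vertex with one of its property values."""
        def step(vertices):
            for v in vertices:
                yield self.graph[v]["props"][key]
        self.steps.append(step)
        return self

    def run(self):
        """Compile the recorded steps into one chained iterator and drain it."""
        stream = iter(self.starts)
        for step in self.steps:
            stream = step(stream)
        return list(stream)


# "Names of the people vertex 1 knows who are older than 30."
names = Traversal(GRAPH, [1]).out("knows").has("age", lambda age: age > 30).values("name").run()
print(names)
```

The one-line traversal at the bottom reads like a path description, while the work is done by a chain of small transform and filter steps underneath.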
Rexster: A RESTful Graph Shell
The Future of TinkerPop
Many projects get added to and then removed from the TinkerPop suite. What sounded like a good idea at one point loses steam as more is learned. However, one project in the works is Frames, and it just might stand the test of time. Frames provides a schema layer for any Blueprints-enabled graph database. RDF stores already have rich schema layers via RDFS and OWL. In contrast, as of today the graph database scene relies primarily on the semantics of the high-level languages that the graph databases were developed in. Frames hopes to lift this constraint by allowing developers to easily create schemas with varied semantics and rules of inference.
TinkerPop is a relatively new software development group. The team has grown over the last 2 years, and with each new member comes new expertise. Hopefully more developers will join and contribute to the wonderful world of graphs.