
The three biggest differences between graph databases

It is important to understand that not all graph databases are created equal. As I pull back the covers on graph databases, I'm finding that they differ along three main lines:

RDF versus property graph
Resource Description Framework (RDF) graph databases, sometimes known as triple stores, offer a way of accessing data that follows W3C standards. Data sources that conform to these standards are accessible without data conversion. RDF databases also natively use SPARQL, which is the W3C standard query language for RDF data.

Labeled property graphs (LPGs) like Neo4j were originally purpose-built and conform less closely to the Web standards. However, they may perform better for certain types of graph analysis. LPGs may use non-standard languages, like Gremlin and Cypher, to carry out analysis. That said, SPARQL is not exactly well known either. SQL does not have native graph functions, so you'll need to pick up another language: SPARQL, Cypher or Gremlin.

Some products, like AnzoGraph, offer both property graphs and RDF. AnzoGraph is an all-in-one product for performing both W3C-conforming RDF-style analytics and LPG-style analytics.
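
To make the difference between the two data models concrete, here is a minimal sketch in plain Python. It is not how any particular product stores data; the people, the `ex:` prefixes, the `since` property and the helper functions `match` and `neighbors` are all made up for illustration.

```python
# RDF style: every fact is a (subject, predicate, object) triple.
# The "ex:" names are hypothetical identifiers, not a real vocabulary.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:alice", "ex:knows", "ex:bob"),
]


def match(pattern, data):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]


# Property graph style: nodes and edges carry labels plus key/value properties.
nodes = {
    "alice": {"label": "Person", "props": {"name": "Alice"}},
    "bob": {"label": "Person", "props": {"name": "Bob"}},
}
edges = [
    {"src": "alice", "dst": "bob", "type": "KNOWS", "props": {"since": 2019}},
]


def neighbors(node_id, edge_type):
    """Follow outgoing edges of one type from a node."""
    return [
        e["dst"] for e in edges
        if e["src"] == node_id and e["type"] == edge_type
    ]


# The same question ("who does Alice know?") asked of each model.
rdf_friends = [o for (_, _, o) in match(("ex:alice", "ex:knows", None), triples)]
lpg_friends = neighbors("alice", "KNOWS")
```

In the property graph version, the `since` value sits directly on the edge. A plain triple has no slot for that, so the RDF version would need an extra modeling step to record it.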

Analytic databases (OLAP) versus operational databases (OLTP)
OLAP databases are designed for analytics that look across an entire set of data, while OLTP databases are aimed at pinpoint queries. In other words, if you want to look across the database and complete a historical analysis of things that happened this month, OLAP databases do this well. If you’re more concerned about transactions, such as whether a seat on an airplane is available or whether a switch is on or off, OLTP systems are designed for this.

All operational databases support some degree of analytics, but how well they perform depends on the underlying architecture. If you have an even mix of OLAP-style and OLTP-style queries, it may benefit you to split these workloads, especially if you have a high data volume.
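
The two access patterns can be sketched in a few lines of Python, reusing the airplane seat example. The flight numbers, seat records and function names below are hypothetical, and a real database would use indexes and storage layouts far beyond a dictionary; the point is only the shape of each query.

```python
from collections import Counter

# Hypothetical seat records, keyed by (flight, seat) for fast point lookups.
seats = {
    ("GR101", "12A"): {"booked": True, "month": "2024-05"},
    ("GR101", "12B"): {"booked": False, "month": "2024-05"},
    ("GR202", "3C"): {"booked": True, "month": "2024-05"},
    ("GR202", "3D"): {"booked": True, "month": "2024-06"},
}


def seat_available(flight, seat):
    """OLTP-style query: touch exactly one record by its key."""
    return not seats[(flight, seat)]["booked"]


def bookings_per_flight(month):
    """OLAP-style query: scan every record and aggregate the results."""
    counts = Counter()
    for (flight, _seat), record in seats.items():
        if record["booked"] and record["month"] == month:
            counts[flight] += 1
    return dict(counts)


one_seat = seat_available("GR101", "12B")
monthly = bookings_per_flight("2024-05")
```

The first function does the same small amount of work no matter how large the data set grows, while the second has to read everything, which is why the two workloads tend to favor different architectures.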

Native engine versus “built-on” 
Some graph databases have been built from the ground up around a native graph engine. Others have been built on top of other technologies, including Hadoop and Cassandra. It’s important to look at whether you need to manage an underlying infrastructure, or whether the engine is self-contained. The key considerations here are performance and the overhead of managing multiple systems.

Understand what you're downloading before you start in on your graph database selection process.
