Neo4j Graph Database
Graph databases are used to manage and understand how different pieces of data are connected. Unlike relational databases, which use tables, graph databases use nodes and edges to represent relationships. Big companies like Facebook, LinkedIn, and Netflix use these databases to keep track of users' social networks and to make recommendations based on relationships. As our world gets more connected, graph databases help us make sense of complex data more quickly and easily. This blog post is about the Neo4j graph database.

Jordan Wu

What is Neo4j?
Neo4j is a popular graph database that stores nodes and relationships. It's a great fit for storing users' social networks and for generating recommendations. The world can be represented by objects, and each object can be related to other objects. These relationships between objects are much easier to understand in a graph. Relational (SQL) databases and non-graph NoSQL databases can store graph data, but the queries tend to get complex and performance suffers as the relationships get deeper. Neo4j is a graph database built to store graph data.
What is a Graph Data Model?
A graph has two fundamental components, Nodes and Relationships.
Nodes represent entities and can contain properties, which hold name-value pairs of data. Each node can have zero or more labels to help group them. A label is a named graph construct that is used to group nodes into sets. All nodes labeled with the same label belong to the same set. The naming convention for labels is camel case beginning with an upper-case character, like Person, Actor, and Director.
A relationship connects two nodes and allows us to find related nodes of data. Just like nodes, it can contain properties that hold name-value pairs of data. It has a source node and a target node, which give the direction of the arrow. Although you must store a relationship in a particular direction, you can traverse it in either direction when querying. A relationship always connects two nodes, and you cannot delete a node without also deleting its associated relationships. The naming convention for relationship types is to use a verb in upper case, with underscores to separate words, like ACTED_IN and DIRECTED.
The graph represents people and their roles in movies. The nodes are drawn as circles with labels and properties. The relationships connect two nodes with a direction and can also have properties.
Here's a simplified graph that is easier to understand. Looking at the graph, you can tell which actors acted in a movie and who directed it.
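To make the data model concrete, here is a minimal in-memory sketch in Python. It is not how Neo4j stores data internally. The Node and Relationship classes, the people_with helper, and the sample data are assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set[str]                                  # zero or more labels
    properties: dict = field(default_factory=dict)    # name-value pairs


@dataclass
class Relationship:
    type: str      # exactly one type, e.g. "ACTED_IN"
    start: Node    # source node
    end: Node      # target node
    properties: dict = field(default_factory=dict)


# Illustrative sample data (an assumption, not taken from a real database)
keanu = Node({"Person", "Actor"}, {"name": "Keanu Reeves"})
lana = Node({"Person", "Director"}, {"name": "Lana Wachowski"})
matrix = Node({"Movie"}, {"title": "The Matrix"})

relationships = [
    Relationship("ACTED_IN", keanu, matrix, {"roles": ["Neo"]}),
    Relationship("DIRECTED", lana, matrix),
]


def people_with(rel_type: str, movie_title: str) -> list[str]:
    """Names of people linked to the movie by the given relationship type."""
    return [
        r.start.properties["name"]
        for r in relationships
        if r.type == rel_type and r.end.properties.get("title") == movie_title
    ]


actors = people_with("ACTED_IN", "The Matrix")
directors = people_with("DIRECTED", "The Matrix")
```

Every relationship holds a reference to both of its nodes, which is why a node cannot be removed while relationships still point at it.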
Cypher Query Language
The graph query language that Neo4j uses is called Cypher. It is used to perform operations on the graph database such as reading, writing, updating, and deleting data. Cypher is unique because it provides a visual way of matching patterns and relationships. It lets users focus on what to retrieve from a graph rather than how to retrieve it. Cypher patterns are built from three core entities: nodes, relationships, and paths.
Cypher uses an ASCII-art style of syntax, (nodes)-[:ARE_CONNECTED_TO]->(otherNodes), with rounded brackets for circular (nodes) and -[:ARROWS]-> for relationships. It follows this common syntax:
// Node syntax
()
(matrix)
(:Movie)
(matrix:Movie)
(matrix:Movie {title: 'The Matrix'})
(matrix:Movie {title: 'The Matrix', released: 1999})
// Relationship syntax
-->
-[role]->
-[:ACTED_IN]->
-[role:ACTED_IN]->
-[role:ACTED_IN {roles: ['Neo']}]->
// Pattern syntax
(keanu:Person:Actor {name: 'Keanu Reeves'})-[role:ACTED_IN {roles: ['Neo']}]->(matrix:Movie {title: 'The Matrix'})
// Pattern variable
acted_in = (:Person)-[:ACTED_IN]->(:Movie)
Nodes
The data entities in a Neo4j graph database are called nodes. Nodes are referred to in Cypher using parentheses ().
MATCH (n:Person {name:'Anna'})
RETURN n.born AS birthYear
In this example, the MATCH clause finds all Person nodes in the graph with the name property set to Anna and binds them to the variable n. The variable n is then passed along to the subsequent RETURN clause, which returns the value of a different property, born, belonging to the same node.
Relationships
Nodes in a graph can be connected with relationships. A relationship must have a start node, an end node, and exactly one type. Relationships are represented in Cypher with arrows --> indicating the direction of a relationship.
MATCH (:Person {name: 'Anna'})-[r:KNOWS WHERE r.since < 2020]->(friend:Person)
RETURN count(r) AS numberOfFriends
The query above matches relationships of type KNOWS whose since property is less than 2020. The query also requires the relationships to go from a Person node named Anna to any other Person node, referred to as friend. The count() function is used in the RETURN clause to count all the relationships bound by the r variable in the preceding MATCH clause (i.e. how many friends Anna has known since before 2020).
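The logic of that query can be sketched in plain Python as a filter followed by a count. This is only an illustration of what the pattern selects, not how Neo4j executes it. The knows list and the count_friends_before function are hypothetical names with made-up data.

```python
# Hypothetical sample data: (start name, relationship type, since, end name)
knows = [
    ("Anna", "KNOWS", 2015, "Ben"),
    ("Anna", "KNOWS", 2019, "Cleo"),
    ("Anna", "KNOWS", 2022, "Dev"),
    ("Ben", "KNOWS", 2010, "Anna"),   # points at Anna, so it does not count for her
]


def count_friends_before(person: str, year: int) -> int:
    """Count outgoing KNOWS relationships from person with since < year."""
    return sum(
        1
        for start, rel_type, since, _end in knows
        if start == person and rel_type == "KNOWS" and since < year
    )


number_of_friends = count_friends_before("Anna", 2020)
```

The relationship from Ben to Anna is skipped because the pattern only follows arrows that start at Anna.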
Paths
Paths in a graph consist of connected nodes and relationships. Exploring these paths sits at the very core of Cypher.
MATCH (n:Person {name: 'Anna'})-[:KNOWS]-{1,5}(friend:Person WHERE friend.born < n.born)
RETURN DISTINCT friend.name AS olderConnections
This example finds all paths up to 5 hops away, traversing only relationships of type KNOWS from the start node Anna to other older Person nodes (as defined by the WHERE clause). The DISTINCT operator is used to ensure that the RETURN clause only returns unique results.
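A bounded traversal like this can be approximated in Python with a breadth-first search that stops after a fixed number of hops. This is a rough sketch of the idea rather than Neo4j's actual matching algorithm. The born and knows data and the older_connections function are assumptions for illustration, and born is assumed to be a birth year.

```python
from collections import deque

# Hypothetical sample data: birth years and undirected KNOWS relationships
born = {"Anna": 1990, "Ben": 1985, "Cleo": 1995, "Dev": 1970, "Eve": 1960}
knows = [("Anna", "Ben"), ("Ben", "Cleo"), ("Cleo", "Dev")]


def older_connections(start: str, max_hops: int) -> list[str]:
    """People within max_hops KNOWS hops of start who were born earlier."""
    # Build an adjacency map that ignores direction, like -[:KNOWS]-
    neighbours: dict[str, set[str]] = {}
    for a, b in knows:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {start}
    queue = deque([(start, 0)])
    result = set()   # a set keeps names unique, like DISTINCT
    while queue:
        person, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for other in neighbours.get(person, ()):
            if other not in seen:
                seen.add(other)
                queue.append((other, hops + 1))
                if born[other] < born[start]:
                    result.add(other)
    return sorted(result)


within_five = older_connections("Anna", 5)
within_two = older_connections("Anna", 2)
```

Eve is older than Anna but never appears in the result, because no chain of KNOWS relationships connects them.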
MATCH p=shortestPath((:Person {name: 'Anna'})-[:KNOWS*1..10]-(:Person {nationality: 'Canadian'}))
RETURN p
Paths can also be assigned to variables. For example, the query above binds a whole path pattern to the variable p. The pattern matches the shortest path from Anna to another Person node in the graph, up to 10 hops away, with the nationality property set to Canadian. In this case, the RETURN clause returns the full path between the two nodes.
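The idea behind a shortest-path search can be sketched with a breadth-first search that remembers how each person was reached. This is a simplification that returns only the path to the nearest matching person, and it is not Neo4j's implementation. The nationality and knows data and the shortest_path function are hypothetical.

```python
from collections import deque

# Hypothetical sample data
nationality = {
    "Anna": "Swedish", "Ben": "German", "Cleo": "Canadian",
    "Dev": "Canadian", "Eve": "French",
}
knows = [("Anna", "Ben"), ("Ben", "Cleo"), ("Cleo", "Dev"), ("Anna", "Eve")]


def shortest_path(start, is_target, max_hops):
    """Return the shortest list of names from start to a target, or None."""
    neighbours = {}
    for a, b in knows:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    parent = {start: None}           # who each person was reached from
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        if person != start and is_target(person):
            path = []
            while person is not None:   # walk back to the start
                path.append(person)
                person = parent[person]
            return path[::-1]
        if hops == max_hops:
            continue
        for other in sorted(neighbours.get(person, ())):
            if other not in parent:
                parent[other] = person
                queue.append((other, hops + 1))
    return None


def is_canadian(name):
    return nationality[name] == "Canadian"


path = shortest_path("Anna", is_canadian, 10)
too_short = shortest_path("Anna", is_canadian, 1)
```

Because breadth-first search visits people in order of distance, the first Canadian it reaches is guaranteed to be the closest one.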
Summary
A graph database is a powerful tool for working with related data. Relationships play a critical part in our daily lives, from the people we know to the things we love. Storing the data in a graph helps connect everything together. AI is the future, and with Neo4j's new generative AI features you can start building next-generation apps that perform tasks based on the user's relationships to things in the world.