
Wilson Mar


Don’t use this if you’re fond of SQL joins


This tutorial is an introduction for “newbies” on how to install Neo4j, configure it, create a database from commands, load data, etc. Rather than inundating you with facts and conceptual words to remember, commentary here is provided along the way after you take some action, step by step, like a guided walking tour. So you learn by doing.

NOTE: Neptune is Amazon’s graph database cloud service.

Why graph databases?

Graph databases provide the latest in the evolution of data storage mechanisms to handle complexity.


In VIDEO: Graph Databases Will Change Your Freakin’ Life published Nov 28, 2016, Ed Finkler (CTO of SaaS vendor GraphStory.com) presents this sample graph:

Both nodes (data records) carry the label “Person”. The arrow on the “CHILD_OF” relationship line points from the child toward the parent. The relationship has a property (named “Created”) recording when the relationship was established (in “2002”).
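To make that model concrete, here is a minimal Python sketch of the sample graph: two “Person” nodes joined by a “CHILD_OF” relationship that carries a “created” property. The class names, the node names, and the describe helper are my own illustration (assumptions), not Neo4j's storage format or API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                   # entity type, e.g. "Person"
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    start: Node                                  # where the arrow begins
    rel_type: str                                # e.g. "CHILD_OF"
    end: Node                                    # where the arrow points
    properties: dict = field(default_factory=dict)


# Hypothetical names; the sample graph only says both nodes are "Person".
child = Node("Person", {"name": "Child"})
parent = Node("Person", {"name": "Parent"})
child_of = Relationship(child, "CHILD_OF", parent, {"created": 2002})


def describe(rel):
    """Render a relationship in a Cypher-like arrow notation."""
    props = ", ".join(f"{k}: {v}" for k, v in rel.properties.items())
    return f"(:{rel.start.label})-[:{rel.rel_type} {{{props}}}]->(:{rel.end.label})"


print(describe(child_of))
```

Notice that the property lives on the relationship itself rather than on either node.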

Third-party add-ons can add a GUID to each entity.

PLUG: The ability to attach labels to relationships can be used to narrow searches. This is what lets a graph database handle complexity.

Graph databases manage the connections of nodes (data entities) to other nodes. Databases from Oracle, MySQL, etc. need to join physical tables together. Unique to graph databases is

The advantage of Neo4j appears when we work with complex indirect relationships. Graph databases build “meaning” into the data. This has proven useful for use cases such as recommendations, network/IT analysis, fraud detection, Internet of Things (IoT), and more.

In VIDEO: What are Graph Databases and Why should I care? Mar 23, 2017, Dave Bechberger (@bechbd of Expero Labs) explains that with Neo4j, one can easily “traverse” a graph with arbitrary hops such as “similar” to identify a recommendation.

With Neo4j, such traversals work without the need to build foreign key joins or bridge tables.
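A traversal with a limited number of hops can be sketched in a few lines of Python. This is only an in-memory illustration of the idea, not how Neo4j is implemented; the SIMILAR data and the within_hops function are hypothetical names of mine.

```python
from collections import deque

# Hypothetical "similar" relationships, stored as adjacency lists.
SIMILAR = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],
    "Carol": ["Dave"],
    "Dave": [],
}


def within_hops(graph, start, max_hops):
    """Return the nodes reachable from start in 1..max_hops hops, breadth-first."""
    seen = {start}
    found = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, depth + 1))
    return found


print(within_hops(SIMILAR, "Alice", 2))
```

Changing the hop limit is a one-argument change here, whereas in SQL each extra hop typically means another self-join.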

SQL is hard due to the need for joins and “de-normalized” physical structures. SQL makes it difficult to answer questions that were not anticipated ahead of time, so new questions require refactoring.

PLUG: Moreover, as more relationships are added in Neo4j, performance is not degraded by joins.

Other Graph databases

This ranking lists Neo4j as the most popular graph database.

JanusGraph (@JanusGraph, at http://janusgraph.org/) was open-sourced in 2017 under The Linux Foundation, with participants from Google, Hortonworks, IBM, Amazon, GRAKN.AI, Expero Labs, etc.

It is a distributed graph database with multiple scalable storage backends:

  • Apache Cassandra®
  • Apache HBase®
  • Google Cloud Bigtable
  • Oracle BerkeleyDB



Cloud services


Microsoft offers its Cosmos graph database running within an Azure HDInsight Spark 2.0 cluster. See https://docs.microsoft.com/en-us/azure/cosmos-db/spark-connector-graph

Enterprise unlock

Neo4j installers include Desktop and Enterprise editions. Enterprise requires buying a license (unless the application built on top of it is also open-sourced) to unlock limitations, allowing for LDAP role-based and subgraph access control, lock manager, clustering, hot backups, and monitoring. See https://neo4j.com/startup-program/?ref=developers


PROTIP: If you value your data, don’t install. Instead of hacking your way, pay a SaaS vendor such as GraphStory.com.

  1. Install Java JDK, since Neo4j is written in Java.

    Neo4j is therefore multi-platform.

    Install Neo4J on Mac

    PROTIP: If you’re installing on your laptop, I recommend use of Homebrew for its ease-of-use, even though its version can be behind the official website.

  2. Homebrew install on Mac or Linux from any folder:

    brew install neo4j

    The response presents tips for starting the server:

    ==> Downloading https://neo4j.com/artifact.php?name=neo4j-community-3.3.0-unix.tar.gz
    ######################################################################## 100.0%
    ==> Caveats
    To have launchd start neo4j now and restart at login:
      brew services start neo4j
    Or, if you don't want/need a background service you can just run:
      neo4j start
    ==> Summary
    🍺  /usr/local/Cellar/neo4j/3.3.0: 105 files, 96.9MB, built in 7 minutes 10 seconds

    Configure Environment Variable

    PROTIP: The Summary response provides a hint of where Neo4j’s binary is located.

  3. Create an environment variable to hold the path to the goodies:

    export NEO4J_HOME="/usr/local/Cellar/neo4j/3.3.0/libexec/"
    echo $NEO4J_HOME
    ls $NEO4J_HOME

    The response:

    LICENSES.txt   bin      conf     import      logs     run
    UPGRADE.txt certificates   data     lib      plugins

    Now skip to the Invoke section.


Manual download

The Introduction to Graph Databases and Neo4j 2h video course by Microsoft MVP Roland Guijt (@RolandGuijt, rmgsolutions.nl) was released February 5, 2015 while using Neo4j version 2.1.3 on Windows. So the UI has changed.

But if you’d like to follow along anyway:

  1. In an internet browser, go to Neo4J’s Other Releases to download:


  2. Find the link for the version you want to use.

  3. Click the “Download” button for sign-ups

  4. Get the O’Reilly book.

    Manual install


  5. Click “Open” to “Are you sure you want to open it?”.
  6. Click “I Agree”.
  7. Login using your email or through social media.

    NOTE: It says Java 8 is downloaded if it doesn’t exist, but I got errors.

  8. Set the $NEO4J_HOME environment variable to point to where Neo4j is installed.


  1. Invoke the neo4j executable without any parameters:

    neo4j

    The help response lists the sub-commands:

    Usage: neo4j { console | start | stop | restart | status | version }
  2. List the release you are using:

    echo $(neo4j version)

    The response at time of writing was:

    neo4j 3.3.0
  3. Initially we don’t want/need a background service, so:

    neo4j start

    If you installed using Homebrew, the response is similar to:

    Active database: graph.db
    Directories in use:
      home:         /usr/local/Cellar/neo4j/3.3.0/libexec
      config:       /usr/local/Cellar/neo4j/3.3.0/libexec/conf
      logs:         /usr/local/Cellar/neo4j/3.3.0/libexec/logs
      plugins:      /usr/local/Cellar/neo4j/3.3.0/libexec/plugins
      import:       /usr/local/Cellar/neo4j/3.3.0/libexec/import
      data:         /usr/local/Cellar/neo4j/3.3.0/libexec/data
      certificates: /usr/local/Cellar/neo4j/3.3.0/libexec/certificates
      run:          /usr/local/Cellar/neo4j/3.3.0/libexec/run
    Starting Neo4j.
    Started neo4j (pid 21292). It is available at http://localhost:7474/
    There may be a short delay until the server is ready.
    See /usr/local/Cellar/neo4j/3.3.0/libexec/logs/neo4j.log for current status.

    Alternately, to start neo4j now and automatically restart at login:

    brew services start neo4j

    Edit Config for upgrade

  4. Within a Terminal, edit the configuration file using “subl” or substituting “vi”, “nano”, or your preferred text editor:

    subl $NEO4J_HOME/conf/neo4j.conf

    See a description of the dozens of keys at: https://neo4j.com/docs/operations-manual/3.2/reference/configuration-settings/

  5. Find the text “dbms.allow_upgrade=” within the file.
  6. Remove the leading # comment character from that line so your database is upgraded automatically in case it was built by a version older than your current Neo4j version.

  7. Set logging specifications:


    Query Logging

  8. Query logging must be enabled by setting the parameter to:


    To set all queries to be logged:


    Alternately, set a threshold for the number of seconds before logging, such as 7:

  9. There are additional logging parameters that are false by default:

  10. Save the file and exit the editor.
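The un-commenting in step 6 can also be scripted. Below is a minimal Python sketch, assuming the shipped file contains a line of the form #dbms.allow_upgrade=true. The sample text and the uncomment_setting function are my own illustration, not part of Neo4j; a real script would read and write $NEO4J_HOME/conf/neo4j.conf instead of a string.

```python
import re


def uncomment_setting(conf_text, key):
    """Remove the leading '#' from lines of the form '#key=value'."""
    pattern = re.compile(r"^#\s*(" + re.escape(key) + r"=.*)$", re.MULTILINE)
    return pattern.sub(r"\1", conf_text)


# Assumed excerpt of neo4j.conf, for demonstration only.
sample = (
    "# Enable this to be able to upgrade a store from an older version.\n"
    "#dbms.allow_upgrade=true\n"
)

print(uncomment_setting(sample, "dbms.allow_upgrade"))
```

Descriptive comment lines are left alone because only lines that start with the key itself match the pattern.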

PROTIP: Alternately, rather than doing the above by hand, I recommend that you create and store in GitHub a shell script or pre-edited file that does the above, then fetch it with a single command:

   curl -O https://raw.githubusercontent.com/wilsonmar/Dockerfiles/master/Neo4j/conf/neo4j.conf

-O (uppercase O) takes the filename from the URL and uses it as the filename to store the result.

-o (lowercase o) saves the result in the filename provided on the command line:

   curl -o your.conf https://raw.githubusercontent.com/wilsonmar/Dockerfiles/master/Neo4j/conf/neo4j.conf
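To see which filename the uppercase -O option would pick for that URL, here is a small Python approximation. It only mimics the idea of taking the last path segment; the remote_name function is my own, and curl's actual logic may differ in edge cases.

```python
from os.path import basename
from urllib.parse import urlparse


def remote_name(url):
    """Approximate the filename derived from a URL: the last path segment."""
    return basename(urlparse(url).path)


url = "https://raw.githubusercontent.com/wilsonmar/Dockerfiles/master/Neo4j/conf/neo4j.conf"
print(remote_name(url))
```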

### Start Browser Console

  1. Open the Neo4j Browser client

    In Linux, open a browser different from the one used to display this tutorial (Firefox instead of Chrome) so you can quickly switch between the two using Alt+Tab (Command+Tab on a Mac), then go to the URL suggested above (http://localhost:7474/):


    Alternately, when running within Docker Machine, open the Neo4j Browser client using your default browser. On a Mac:

    open http://$(docker-machine ip default):7474

    Ports (within Docker)

    PROTIP: By default the Docker image for Neo4j exposes three ports for remote access:

    • 7474 for HTTP
    • 7473 for HTTPS
    • 7687 for Bolt

    Publish the ports you need when starting the container:

    docker run \
     --publish=7474:7474 --publish=7687:7687 \
     --volume=$HOME/neo4j/data:/data \
     --volume=$HOME/neo4j/logs:/logs \
     neo4j

    PROTIP: Neo4j v3 uses the Bolt binary protocol (instead of http/https) to communicate with the Neo4j database. Bolt operates over a TCP connection or WebSocket. Built-in TLS is enabled by default. It’s defined at https://boltprotocol.org/


    Commands can be entered in the Editor field at the top which begins with a dollar sign in gray.

  2. Click the command entry field to the right of the dollar sign and type a colon (:), the first character of Browser commands, to see an auto-completion list of the most common commands:


### Dark theme color for commands

  1. Click the gear icon near the lower-left corner among menu icons.

  2. Click Theme Dark.
  3. Click the icon again to dismiss menu contents. (it’s a toggle)

### Play intro

  1. Click the icon that looks like a book (previously this was an i icon for information).
  2. Click “Getting Started”.

Notice that these commands appear in the command field:

   :play intro

Notice that the commands now have different colors.

  1. To submit the command, press Enter or click the arrow icon at the upper-right corner.

The page says for multi-line commands to press Shift+Enter to enter multi-line mode, then press Ctrl+Enter instead of the arrow icon.

  1. Scroll down the page to see that each new response is added above the previous frame, as in a stack.
  2. Click the gray X to dismiss a content frame.
  3. Click “Operations Manual” under Useful Resources to pop up a new browser tab to the version installed: https://neo4j.com/docs/operations-manual/3.2/

### Monitoring limits

  1. Type command :sysinfo and press Enter for:


PROTIP: Execute this on a schedule to ensure that more space is allocated before need.

### Manage users and roles

PROTIP: Passwords can be changed in the command line. See: https://neo4j.com/docs/operations-manual/3.2/reference/user-management-community-edition/#userauth-list-all-users-ce


To list all users, run this in the Browser:

   CALL dbms.security.listUsers()

### Change password

Notice in the sample console image that the password is blank.

  1. In the Password field, type “neo4j” (lower case).
  2. Type your own password. Twice.

Note in the response your user name is “neo4j”.

Database creation commands

Neo4j comes with instructions to create two databases from the command line.

  1. Click “Write Code” and press Enter to invoke :play write code.


    Movie sample creation

  2. See instructions to build a relationship graph among actors and directors:

    :play movie graph

    The “Create” pane appears with code that begins with command “CREATE”.

  3. Click the code for it to be posted in the command line.

  4. Click the full screen icon.


  5. BLAH: Drag and drop items to arrange the graph to your aesthetic taste.


    This is like ER (Entity-Relation) diagrams for SQL databases.

    PROTIP: Neo4J data is stored the same way as illustrated by the data model, whereas with SQL data is stored in separate tables joined together.

  6. Click on a colored dot (The Polar Express).

  7. Click the “X” in the wheel to delete the node.

    PLUG: Thus, Neo4j is naturally adaptive. Entities can be added dynamically, without schema migrations required in SQL databases.

  8. Click the database icon at the top of the left menu to view a list of

    • Node Labels
    • Relationship Types
    • Property Keys



Northwind sample creation


PROTIP: Relations are not “first-class citizens” in a relational database. But they are in graph databases.

  1. Open the Northwind database which Microsoft provides with its SQL database.

    :play northwind graph

    This is a more complex database with data common to business accounting.


Neo4j uses “index-free adjacency”: each node directly references its adjacent nodes, which makes traversing from node to node quicker than the index lookups SQL needs for each join.
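The difference between following direct references and searching an index on every hop can be sketched in Python. This is a conceptual toy under my own assumptions (the Node class, the EDGES list, and both functions are hypothetical), not a model of Neo4j's or any SQL engine's internals.

```python
from bisect import bisect_left


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []  # direct references to adjacent nodes


def hop_by_reference(node):
    """Graph style: follow the references stored on the node itself."""
    return [n.name for n in node.neighbors]


# Relational style: a sorted list of (from, to) pairs standing in for
# an index on a join table. Every hop starts with a search of the index.
EDGES = sorted([("Ann", "Bo"), ("Ann", "Cy"), ("Bo", "Di")])


def hop_by_index(name):
    """Binary-search the index, then scan the matching rows."""
    i = bisect_left(EDGES, (name,))
    found = []
    while i < len(EDGES) and EDGES[i][0] == name:
        found.append(EDGES[i][1])
        i += 1
    return found


ann, bo, cy, di = Node("Ann"), Node("Bo"), Node("Cy"), Node("Di")
ann.neighbors = [bo, cy]
bo.neighbors = [di]

print(hop_by_reference(ann))
print(hop_by_index("Ann"))
```

Both calls return the same neighbors. The index search grows with the total number of relationships, while following references depends only on the node's own connections.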

PROTIP: The Cypher string operators ENDS WITH and CONTAINS are, as of v3, index-backed.

  1. To schedule resampling of an index:


Load data

  1. First, stop the service:

    Stop Browser

  2. Open a Terminal window to issue command:

    neo4j stop

    The response:

    Stopping Neo4j.. stopped

    Now you can back up and dump the database.

  3. Switch to or open a Mac/Linux Terminal instance and:

    cd $NEO4J_HOME/data/databases
    ls -al
    cd graph.db
    ls -al

    Notice graph.db.

    PROTIP: In the Neo4j world, a physical database consists of files stored under a folder named with a .db suffix. A “graph” references a physical Neo4j database that stores data.

Backup default database

See https://neo4j.com/docs/operations-manual/3.2/tools/dump-load/ for the various attributes to add.

### Download sample database

  1. Stop the server.
  2. Open a new browser tab to download a zip file containing a sample database from:


  3. Right-click to Save Link as… (download) Jim Webber’s Doctor Who Data Set drwho.zip file to your Downloads folder.
  4. Unzip the file to create folder drwho.
  5. Look into the folder index.

PROTIP: Neo4j uses Lucene to index, same as ElasticSearch and others.

  1. Copy the drwho folder into the Neo4j database folder (-R copies the folder recursively):

   cp -R ~/Downloads/drwho  /usr/local/Cellar/neo4j/3.3.0/libexec/data/databases

### Check db consistency

  1. See https://neo4j.com/docs/operations-manual/3.2/tools/consistency-checker/

### Use cases

Recommendations can be made. You like Tom Hanks? Here are his other movies.
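The Tom Hanks recommendation boils down to following ACTED_IN relationships out from one actor. Here is a toy Python sketch of that idea. The small ACTED_IN list is illustrative sample data of mine (not the full movie graph), and other_movies is a hypothetical helper.

```python
# Illustrative (actor, movie) pairs standing in for ACTED_IN relationships.
ACTED_IN = [
    ("Tom Hanks", "The Polar Express"),
    ("Tom Hanks", "Cast Away"),
    ("Tom Hanks", "Apollo 13"),
    ("Kevin Bacon", "Apollo 13"),
]


def other_movies(actor, seen_movie):
    """Recommend the actor's movies other than the one already seen."""
    return sorted(m for a, m in ACTED_IN if a == actor and m != seen_movie)


print(other_movies("Tom Hanks", "The Polar Express"))
```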

More sophisticated versions of such a database are being used to detect fraud. See https://t.co/OMHaHOrYtq

Video of use cases “What SQL had to process in batch can now be processed in real-time with Neo4j.”



### More samples

The developer community has provided example data models and queries for a variety of use cases outlined in Neo4j GraphGists.



Both datasets are easily accessible using the blue “Write Code” button under the “Jump into Code” section of the guides.

Additional datasets: Neo4j says it is working on datasets for a variety of other use cases. Are you looking for something in particular or have a suggestion? Reach out to devrel@neo4j.com.

The Open Source Mental Illness Neo4j database is at: https://github.com/OSMIHelp/osmi-survey-graph

Console Online


relationships from the movie “Matrix”.

As of version 2.0, indexing was added to Cypher with the introduction of schemas.

Cypher Query


A big innovation by Neo4j is that it provides programmers with a flexible network structure of nodes and relationships rather than static SQL tables.

Cypher is a language akin to SQL.

Cypher Style Guide (for M08), a 10-page PDF, covers indentation and line breaks, meta-characters, casing, and pattern spacing. The document presents this classic piece by Mark Needham as a “sane query”:

MATCH (member:Member {name: 'Mark Needham'})
      -[:HAS_MEMBERSHIP]->()-[:OF_GROUP]->(:Group)-[:HAS_TOPIC]->(topic)
WITH member, topic, count(*) AS score
MATCH (topic)<-[:HAS_TOPIC]-(otherGroup:Group)
WHERE NOT (member)-[:HAS_MEMBERSHIP]->(:Membership)-[:OF_GROUP]->(otherGroup)
RETURN otherGroup.name, collect(topic.name), sum(score) AS score

PROTIP: Do not use a semicolon at the end of the statement.

View a query’s execution plan by prefixing it with EXPLAIN or PROFILE.



Neo4j is counted among the “NoSQL” database technologies, which also include Key-Value Stores, Column-Family Stores, and Document Databases. But those others use aggregate data models, whereas graph databases such as Neo4j work with simple records and complex interactions.

Instead of SQL union statements, an example of code is:


### Direction

The Neo4j API allows developers to completely ignore relationship direction when querying the graph.

MATCH (boss)-[:MANAGES*0..3]->(sub),
      (sub)-[:MANAGES*1..3]->(report)
WHERE boss.name = "John Doe"
RETURN sub.name AS subordinate,
   count(report) AS Total


Technology Compatibility Kit (TCK)

### Railroad diagrams


EBNF grammar

### Fast?

For many applications, Neo4j offers orders of magnitude performance benefits compared to relational DBs.

  • Antlr grammar
  • EBNF grammar
  • Railroad diagrams
  • Grammar specification
  • TCK specification


  1. Click the star icon on the menu to reveal the import area to drag files.


    Only text files are dropped there, not images of Neo4j databases (.db files).

    Import your own data

    You can import your data from CSV files using Cypher’s LOAD CSV command.


    Learn more about import in the Neo4j Developer Manual: Load CSV and Importing CSV Data into Neo4j.



    Neo4j believes in polyglot persistence (multiple ways to store connected data), with columnar, tabular and document data stored elsewhere. The various types of data integrations possible with Neo4j are listed at:


    For information about importing transactions into the database, see: https://neo4j.com/docs/operations-manual/3.2/tools/import/ https://neo4j.com/docs/operations-manual/3.2/tutorial/import-tool/


    cycli -f import-file.cypher

Awesome Procedures

APOC (Awesome Procedures on Cypher) is a library of procedures for complex operations that can’t be expressed directly in Cypher. They support data integration, graph algorithms, and data conversion.


refers to the code repository at:


PROTIP: Have every cypher query use parameters - as stated in the Neo4j documentation.
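Here is a minimal Python sketch of what using parameters means: the query text stays constant and the values travel separately in a map. The query, the Person and Movie labels, and the build_request helper are my own illustrative assumptions. The $name placeholder is the Neo4j 3.x parameter syntax.

```python
# Constant query text; $name is a placeholder filled in by the server.
QUERY = (
    "MATCH (p:Person {name: $name})-[:ACTED_IN]->(m:Movie) "
    "RETURN m.title AS title"
)


def build_request(actor_name):
    """Return the unchanged query text plus a separate parameter map."""
    return QUERY, {"name": actor_name}


query, params = build_request("Tom Hanks")
print(query)
print(params)

# A value containing a quote does not alter the query text at all.
query2, params2 = build_request("O'Brien")
print(query2 == QUERY)
```

With the official Python driver, the pair would be handed over as session.run(query, params), so no value is ever concatenated into the Cypher string.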

Java coded queries


For large amounts of data, Cypher run-time performance may not equal that of traversals and writes coded against the Java API. So do massive writes using the Java API, and do reads and queries using parameterized Cypher queries.

Neo4j provides Native server-side extensions in Java.

Get the official drivers for Javascript, Java, .NET, and Python


Additionally, the community has built a wide variety of other drivers in languages like PHP, Ruby, Go, Haskell and more.



ALGO libraries

Neo4j has surrounded itself with a rich ecosystem of visualization and analytics tools. Its open source query language, openCypher, is the most widely used graph query language.

Neo4j likely has more documentation than all other graph tools combined. When you run into problems or have questions, you’ll find a large community of users and meetup groups around the world.

If you are just getting started with graphs, you can’t go wrong by learning Neo4j.


People using graph databases call themselves “Graphistas”.

  1. Click the icon at the bottom-left corner among menu icons.


  2. Q&A on http://stackoverflow.com/questions/tagged/neo4j

  3. Sign up for http://neo4j.com/slack at http://neo4j-users-slack-invite.herokuapp.com/

  4. Join https://groups.google.com/forum/#!forum/neo4j

  5. Subscribe to Neo4j’s YouTube channel and view videos.

  6. Visit a Meetup group - https://www.meetup.com/Neo4j-Online-Meetup/

  7. Tweet to https://twitter.com/neo4j - #GraphViz #Neo4j #GraphDatabases

  8. Read https://neo4j.com/blog/

  9. Read https://en.wikipedia.org/wiki/Neo4j

Live in person





Ryan Boyd (LinkedIn) a SF-based ex-Googler, now Neo4j Head of Developer Relations.

Emil Eifrem (@emileifrem, emil@neotechnologies.com), CEO

Johan Svensson CTO

Neo4j, Inc.’s board of directors includes Rod Johnson (founder of the Spring Framework).

Philip Rathle is the products veep.

Mats Rydberg, living in Sweden

Jim Webber provides a “koan”-style tutorial that presents a set of databases which have something not right, so students learn by fixing things. Brilliant approach and a great learning tool:

  • https://github.com/jimwebber/neo4j-tutorial


William Lyon (@lyonwj, lyonwj.com) is a software developer at Neo4j and an engineer on the Developer Relations team. He works primarily on integrating Neo4j with other technologies, building demo apps, helping other developers build applications with Neo4j, and writing documentation. Applying an active learning algorithm for entity de-duplication in graph data.

  • https://www.slideshare.net/neo4j/building-a-graphql-service-backed-by-neo4j
  • https://github.com/neo4j-graphql/neo4j-graphql
  • https://grandstack.io (GraphQL, React, Apollo, Neo4j)

Mark Needham

Rik van Bruggen

Michael Hunger

References and tutorials

BTW in academic communities, relationships are also called “edges” and nodes are called “vertices”.


O’Reilly’s Graph Databases 211 page ebook from May 2015 with Neo4j 2.2:

https://github.com/graphaware/neo4j-nlp Implementation of Microsoft Concept Graph

https://www.experfy.com/training/courses/an-introduction-to-neo4j#description $80 class

neo4j-hierarchy-graph-1250x476 From https://neo4j.com/blog/7-ways-data-is-graph/

The Top 13 Resources for Understanding Graph Theory & Algorithms https://buff.ly/2w9PQFy

https://r.neo4j.com/2iSaBRi Geocoding #ParadisePapers Addresses in #Neo4j to Build Interactive Geographical Data Visualizations

https://www.slideshare.net/bachmanm/modelling-data-in-neo4j-plus-a-few-tips neo4j-property-graph-pulp-fiction.png


  • https://www.youtube.com/watch?v=78r0MgH0u0w

  • https://www.youtube.com/watch?v=vJcxRjJ982k

  • https://www.youtube.com/watch?v=NO3C-CWykkY&index=4&list=PL9Hl4pk2FsvWM9GWaguRhlCQ-pa-ERd4U

https://www.youtube.com/watch?v=jiE3wsrVUQs Using Neo4j and Machine Learning to Create a Decision Engine by Tim Ward (@jerrong, tiw@CluedIn.com) https://www.slideshare.net/neo4j/graphconnect-europe-2017-using-neo4j-and-machine-learning-to-create-a-decision-engine-cluedin