Deleting all nodes and relationships in neo4j using cypher exceeds heap space

seandavi picture seandavi · Feb 4, 2013 · Viewed 16.8k times · Source

I have been trying to run this query as recommended in the neo4j google group and in other sources online:

START n = node(*) MATCH n-[r?]-() WHERE ID(n)>0 DELETE n, r;

in order to delete all nodes and relationships between tests. When I do so from the console, I run out of java heap space. When I do so from python (using the newish graph_db.clear(), which appears uses the same query), I get a "SystemError: None" which, I assume, is the same java heap space error. I have a database with 500k nodes, only 5k relationships, and 7M properties. I am running on a Mac laptop (10.6.8) with 8GB RAM using neo4j-1.8.1. I guess I am a bit surprised that deleting nodes (with essentially no relationships, so very small subgraphs) would exceed the java heap space, but I am pretty naive about how neo4j works. Any suggestions in how to go forward are appreciated. I do know that rm -rf in the data directory and starting from scratch will work, but I thought there might be a less-drastic solution.

[cross-posted to neo4j google groups]

Answer

Axel Morgner picture Axel Morgner · Feb 4, 2013

The cypher statement above causes all nodes (besides the root node with ID 0) to be instantiated before deletion in one single transaction. This eats up too much memory when done with 500k nodes.

Try to limit the number of nodes to delete to something around 10k-50k, like e.g.:

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<10000) 
DELETE n, r;

START n = node(*) 
MATCH n-[r?]-() 
WHERE (ID(n)>0 AND ID(n)<20000) 
DELETE n, r;

etc.

However, there's nothing wrong with removing the entire database directory, it's good practice.