Reputation: 3390
I can't find how to return a node's labels with Cypher.
Does anybody know the syntax for this operation?
Upvotes: 82
Views: 67336
Reputation: 5754
Use the labels() function, as in this example, which matches nodes whose name property has the value 'Alice':
MATCH (a) WHERE a.name = 'Alice'
RETURN labels(a)
The return type for labels() is LIST<STRING>, so it can return several labels for a single node (or an empty list for a node that has none).
More info here: https://neo4j.com/docs/cypher-manual/5/functions/list/#functions-labels
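To see the list return type on a node with more than one label, here is a minimal sketch. The Person and Employee labels and the sample node are made up for illustration and are not part of the original question.

```cypher
// Hypothetical sample node carrying two labels (names are illustrative only)
CREATE (:Person:Employee {name: 'Alice'});

// One row per matching node; each row holds that node's labels as a single list
MATCH (a)
WHERE a.name = 'Alice'
RETURN a.name AS name, labels(a) AS labels;
```

The row for Alice should carry both labels in one list value rather than one row per label.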
There are multiple upvoted answers on this question, only one of which you should use (listed below as "Solution #1"). I've posted three ways of getting all in-use labels in the graph. The test data set has 109,120 nodes in the graph.
MATCH (x) RETURN count(x)
109120
Solution #1: db.labels()
The usage looks like this:
CALL db.labels();
On my test data set, this query completed in ~1 ms (successive runs shown):
Started streaming 9 records in less than 1 ms and completed in less than 1 ms.
Started streaming 9 records after 1 ms and completed after 1 ms.
Started streaming 9 records in less than 1 ms and completed after 1 ms.
Here's the execution plan output:
EXPLAIN CALL db.labels();
ProcedureCall
  label
  db.labels() :: (label :: STRING)
  10 estimated rows
ProduceResults
  label
  label
  10 estimated rows
Result
Note: estimated rows is 10, with no mention of the ~109,000 nodes in the graph.
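If the labels need to be sorted or counted, the procedure call above can be combined with YIELD. This is a sketch that relies on the output column being named label, as the signature in the execution plan shows; the labelCount alias is just an illustrative name.

```cypher
// Return the labels in alphabetical order
CALL db.labels() YIELD label
RETURN label
ORDER BY label;

// Count how many labels the procedure reports
CALL db.labels() YIELD label
RETURN count(label) AS labelCount;
```

Both queries still go through the procedure rather than scanning nodes, so they should keep the cost profile described above.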
Solution #2: labels() on each node, get distinct results
The query looks like this:
MATCH (n) RETURN DISTINCT labels(n)
Here are several runs of that query, each more than an order of magnitude slower than solution #1:
Started streaming 9 records after 1 ms and completed after 41 ms.
Started streaming 9 records after 1 ms and completed after 36 ms.
Started streaming 9 records in less than 1 ms and completed after 37 ms.
The execution plan is more complicated, and clearly shows that all nodes in the graph are evaluated. Again, my test data set has 109,120 nodes in it, and we see exactly that number of nodes evaluated in the first step. If we had 1 million nodes in the graph, this approach would scan all 1 million (or 10 million, etc.).
EXPLAIN MATCH (n) RETURN DISTINCT labels(n)
AllNodesScan
  n
  n
  109,120 estimated rows
Distinct
  `labels(n)`
  labels(n) as `labels(n)`
  103,664 estimated rows
ProduceResults
  `labels(n)`
  `labels(n)`
  103,664 estimated rows
Result
While the result is correct, this approach is significantly more expensive to evaluate than solution #1.
Solution #3: unwind labels() into individual labels, get distinct results
The query looks like this:
MATCH (n)
WITH DISTINCT labels(n) AS labels
UNWIND labels AS label
RETURN DISTINCT label
ORDER BY label
Here are several runs of this query, all in the mid-30 ms range, like solution #2:
Started streaming 9 records in less than 1 ms and completed after 33 ms.
Started streaming 9 records after 1 ms and completed after 32 ms.
Started streaming 9 records in less than 1 ms and completed after 37 ms.
Started streaming 9 records after 4 ms and completed after 36 ms.
The execution plan is similar to solution #2 at the beginning, but adds Unwind, Distinct, and Sort steps, each estimated at roughly 1 million rows:
EXPLAIN MATCH (n)
WITH DISTINCT labels(n) AS labels
UNWIND labels AS label
RETURN DISTINCT label
ORDER BY label
AllNodesScan
  n
  n
  109,120 estimated rows
Distinct
  labels
  labels(n) AS labels
  103,664 estimated rows
Unwind
  labels, label
  labels AS label
  1,036,640 estimated rows
Distinct
  label
  label
  984,808 estimated rows
Sort
  label
  label ASC
  Ordered by label ASC
  984,808 estimated rows
ProduceResults
  label
  label
  Ordered by label ASC
  984,808 estimated rows
Result
If your goal is to determine which labels exist in a graph, Solution #1 looks like the clear winner: it is not only the fastest and simplest approach, but its performance is not bound by the number of nodes in the graph (so it should remain fast even if you have more nodes).
I do not see any measurable benefit for using Solutions #2 or #3. Compared to Solution #1, both are slower and more complicated to write, and - unlike Solution #1 - their execution plans show that their performance is bound directly by the number of nodes in the graph. They will run more slowly with larger data sets.
Upvotes: 0
Reputation: 1
match(n) where n.name="abc" return labels(n)
This returns all the labels of the node whose name is "abc".
Upvotes: 0
Reputation: 16355
Neo4j 3.0 introduced the procedure db.labels(), which returns all available labels in the database. Use:
call db.labels();
Upvotes: 47
Reputation: 1334
If you want to get the labels of a specific node, use labels(node). If you only want to get all node labels in Neo4j, use this procedure instead: CALL db.labels(). Never use the query MATCH (n) RETURN DISTINCT labels(n) for this purpose. It scans every node in the graph, which is very slow.
Upvotes: 4
Reputation: 2782
To get all distinct node labels (each row is the full list of labels on a node, so multi-label nodes show up as combinations):
MATCH (n) RETURN distinct labels(n)
To get the node count for each label combination:
MATCH (n) RETURN distinct labels(n), count(*)
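The count query above groups by the whole list returned by labels(n), so a node with two labels is counted under that pair rather than under each label separately. A sketch of a per-label count, assuming that is what is wanted (the nodeCount alias is an illustrative name):

```cypher
// Expand each node's label list into one row per label, then count rows per label.
// A node with two labels is counted once under each of them;
// nodes without any label produce no rows and are left out.
MATCH (n)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodeCount
ORDER BY label
```

Like the other MATCH (n) queries in this thread, this visits every node, so it costs more as the graph grows.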
Upvotes: 110
Reputation: 1304
If you want all the individual labels (not the combinations) you can always expand on the answers:
MATCH (n)
WITH DISTINCT labels(n) AS labels
UNWIND labels AS label
RETURN DISTINCT label
ORDER BY label
Upvotes: 25
Reputation: 23955
If you're using the Java API, you can quickly get an iterable of all the Labels in the database like so:
GraphDatabaseService db = (new GraphDatabaseFactory()).newEmbeddedDatabase(pathToDatabase);
ResourceIterable<Label> labs = GlobalGraphOperations.at(db).getAllLabels();
Upvotes: 4
Reputation: 2592
There is a function labels(node) that can return all labels for a node.
Upvotes: 63