The difference between blank nodes and variables in SPARQL queries

Andy White · Feb 17, 2014

I've studied the SPARQL specification on this topic and also found this answer rather interesting. However, the definitions are complicated enough that I still don't see the answer to my question.

I can't find any example of a query with blank nodes that returns different results from the same query with variables in place of the blank nodes.

For example, is there any case in which the following queries return different results?

  1. SELECT ?a ?b
    WHERE {
        ?a :predicate _:blankNode .
        _:blankNode :otherPredicate ?b .
    }
    
  2. SELECT ?a ?b
    WHERE {
        ?a :predicate ?variable .
        ?variable :otherPredicate ?b .
    }
    

Maybe there are more complex queries that cause different behavior?

In particular, I wonder whether there are any examples of such queries returning different results when executed on an RDF graph that doesn't contain blank nodes.

Thanks.

P.S. Yes, I know that, unlike a variable, a blank node label can be used in only one basic graph pattern. But this is not the difference I'm talking about.

Answer

Joshua Taylor · Feb 17, 2014

The answer that you linked to is about blank nodes in the data that is being queried, not about blank nodes in the query. You're absolutely right that blank nodes in the query act just like variables. The specification says this (emphasis added):

4.1.4 Syntax for Blank Nodes

Blank nodes in graph patterns act as variables, not as references to specific blank nodes in the data being queried.

Blank nodes are indicated by either the label form, such as "_:abc", or the abbreviated form "[]". A blank node that is used in only one place in the query syntax can be indicated with []. A unique blank node will be used to form the triple pattern. Blank node labels are written as "_:abc" for a blank node with label "abc". The same blank node label cannot be used in two different basic graph patterns in the same query.

As such, your queries

SELECT ?a ?b
WHERE {
    ?a :predicate _:blankNode .
    _:blankNode :otherPredicate ?b .
}

SELECT ?a ?b
WHERE {
    ?a :predicate ?variable .
    ?variable :otherPredicate ?b .
}

behave identically. The benefit of using a blank node instead of a variable is that it allows a more compact syntax. In this case, you could write:

SELECT ?a ?b
WHERE {
    ?a :predicate [ :otherPredicate ?b ] .
}

Actually, in this case, since you're only looking for one property on the thing that the blank node matches, you could also use a SPARQL 1.1 property path:

SELECT ?a ?b
WHERE {
    ?a :predicate/:otherPredicate ?b .
}
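
For comparison, here is a sketch of a case where the property path no longer fits but the bracket syntax still does: asking for two properties of the same intermediate node. The predicate :thirdPredicate, the variable ?c, and the example.org prefix are made up for illustration and are not part of the original question.

PREFIX : <http://example.org/>    # hypothetical namespace, assumed for this sketch

SELECT ?a ?b ?c
WHERE {
    # One anonymous node must carry both properties.
    # :thirdPredicate and ?c are illustrative assumptions.
    ?a :predicate [ :otherPredicate ?b ;
                    :thirdPredicate ?c ] .
}

Writing this as two separate paths (?a :predicate/:otherPredicate ?b and ?a :predicate/:thirdPredicate ?c) would not be equivalent in general, because each path could pass through a different intermediate node. The bracket form, like the version with an explicit ?variable, requires a single node to match both patterns.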