[Pellet-users] Duplicated results using Pellet with the Jena API
Evren Sirin
evren at clarkparsia.com
Wed Oct 31 14:52:13 UTC 2007
On 10/30/07 8:55 PM, Ibach, Brandon L wrote:
> Now we're getting somewhere. :) I've been able to reproduce Bruno's results and have further produced a very minimal test case that demonstrates the issue. The following, compiled against Pellet 1.5.0, will list just three statements for the plain model, but six for the Pellet model. I haven't yet had a chance to try to track this down, nor to try it with Pellet 1.5.1.
>
Thanks for the complete test case, Brandon. With it, the problem was very
easy to reproduce and fix. The bug is fixed in SVN and we'll include the
fix in the next release. FWIW, this bug affects only queries of the form
model.listStatements(null,p,o), not model.listStatements(s,p,null).
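
Until a release with the fix is available, one possible workaround is to
drop the repeated results on the client side. The following is only a
minimal sketch in plain Java, with no Jena or Pellet calls. The class
DistinctSubjects and its method distinctSubjects are hypothetical names,
and each String[] row merely stands in for a statement that
listStatements(null,p,o) would return.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Jena or Pellet.
class DistinctSubjects {

    // Each row is {subject, predicate, object}, standing in for a statement.
    // Returns every subject once, in the order it was first seen.
    static List<String> distinctSubjects(List<String[]> rows) {
        Set<String> seen = new LinkedHashSet<>();
        for (String[] row : rows) {
            seen.add(row[0]); // add() ignores a subject that is already present
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        // Simulate the doubled result set: the same three statements, twice.
        List<String[]> rows = new ArrayList<>();
        for (int pass = 0; pass < 2; pass++) {
            for (String subject : new String[] {"b", "c", "d"}) {
                rows.add(new String[] {subject, "op", "a"});
            }
        }
        System.out.println(distinctSubjects(rows)); // prints [b, c, d]
    }
}
```

With a real model, the same idea would apply to the subjects read from the
StmtIterator. A LinkedHashSet is used here so that the first-seen order of
the subjects is preserved.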
Cheers,
Evren
> -Brandon :)
>
> --- dupStmts.owl ---
> <?xml version="1.0"?>
> <rdf:RDF
> xmlns="http://example.com/dupStmts.owl#"
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:owl="http://www.w3.org/2002/07/owl#"
> xml:base="http://example.com/dupStmts.owl">
> <owl:Ontology rdf:about=""/>
> <owl:ObjectProperty rdf:about="#op"/>
> <owl:Thing rdf:about="#a"/>
> <owl:Thing rdf:about="#b"><op rdf:resource="#a"/></owl:Thing>
> <owl:Thing rdf:about="#c"><op rdf:resource="#a"/></owl:Thing>
> <owl:Thing rdf:about="#d"><op rdf:resource="#a"/></owl:Thing>
> </rdf:RDF>
>
> --- dupStmts.java ---
> import java.util.Iterator;
>
> import org.mindswap.pellet.jena.PelletReasonerFactory;
>
> import com.hp.hpl.jena.ontology.OntModel;
> import com.hp.hpl.jena.rdf.model.Model;
> import com.hp.hpl.jena.rdf.model.ModelFactory;
> import com.hp.hpl.jena.rdf.model.Property;
> import com.hp.hpl.jena.rdf.model.Resource;
> import com.hp.hpl.jena.rdf.model.StmtIterator;
>
> public class dupStmts {
>     private static final String ONTURI = "http://example.com/dupStmts.owl#";
>
>     public static void main(String[] args) {
>         Resource a;
>         Property op;
>         StmtIterator iter;
>
>         // create an empty plain RDF model
>         Model model = ModelFactory.createDefaultModel();
>         model.read("file:dupStmts.owl");
>
>         System.out.println("---- Plain model -----------");
>         a = model.getResource(ONTURI + "a");
>         op = model.getProperty(ONTURI + "op");
>         System.out.println("OP: " + (op == null ? "-" : op.toString()) +
>                 " A: " + (a == null ? "-" : a.toString()));
>         iter = model.listStatements(null, op, a);
>         while (iter.hasNext()) { System.out.println(iter.nextStatement().toString()); }
>
>         model.close();
>
>         // create an empty ontology model using Pellet spec
>         OntModel ontModel = ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);
>         ontModel.read("file:dupStmts.owl");
>
>         System.out.println("\n---- Ontology model --------");
>         a = ontModel.getResource(ONTURI + "a");
>         op = ontModel.getProperty(ONTURI + "op");
>         System.out.println("OP: " + (op == null ? "-" : op.toString()) +
>                 " A: " + (a == null ? "-" : a.toString()));
>         iter = ontModel.listStatements(null, op, a);
>         while (iter.hasNext()) { System.out.println(iter.nextStatement().toString()); }
>
>         ontModel.close();
>     }
> }
>
>
>
>> -----Original Message-----
>> From: pellet-users-bounces at lists.owldl.com
>> [mailto:pellet-users-bounces at lists.owldl.com] On Behalf Of
>> Bruno Antunes
>> Sent: Tuesday, October 30, 2007 2:49 PM
>> To: 'Evren Sirin'; pellet-users at lists.owldl.com
>> Subject: Re: [Pellet-users] Duplicated results using Pellet
>> with the Jena API
>>
>> Hi again,
>>
>> Thanks for your support. I now have more details to share. The ontology I'm
>> using is defined as follows:
>>
>> <?xml version="1.0"?>
>> <rdf:RDF
>> xmlns="ont.owl#"
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> xmlns:owl="http://www.w3.org/2002/07/owl#"
>> xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
>> xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
>> xml:base="ont.owl">
>> <owl:Ontology rdf:about=""/>
>> <owl:Class rdf:ID="Word"/>
>> <owl:Class rdf:ID="Concept"/>
>> <owl:ObjectProperty rdf:ID="hyponymOf">
>> <rdfs:domain rdf:resource="#Concept"/>
>> <rdfs:range rdf:resource="#Concept"/>
>> </owl:ObjectProperty>
>> <owl:ObjectProperty rdf:ID="referencedBy">
>> <rdfs:domain rdf:resource="#Concept"/>
>> <rdfs:range rdf:resource="#Word"/>
>> </owl:ObjectProperty>
>> <owl:DatatypeProperty rdf:ID="lexicalForm">
>> <rdfs:domain rdf:resource="#Word"/>
>> <rdfs:range
>> rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
>> </owl:DatatypeProperty>
>> <owl:DatatypeProperty rdf:ID="waitingValidation">
>> <rdfs:domain rdf:resource="#Concept"/>
>> <rdfs:range
>> rdf:resource="http://www.w3.org/2001/XMLSchema#boolean"/>
>> </owl:DatatypeProperty>
>> <owl:DatatypeProperty rdf:ID="informationContent">
>> <rdfs:domain rdf:resource="#Concept"/>
>> <rdfs:range
>> rdf:resource="http://www.w3.org/2001/XMLSchema#float"/>
>> </owl:DatatypeProperty>
>> <owl:DatatypeProperty rdf:ID="indexedSDKE">
>> <rdfs:range
>> rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
>> <rdfs:domain rdf:resource="#Concept"/>
>> </owl:DatatypeProperty>
>> <owl:DatatypeProperty rdf:ID="synsetID">
>> <rdfs:range
>> rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
>> <rdfs:domain rdf:resource="#Concept"/>
>> </owl:DatatypeProperty>
>> <owl:DatatypeProperty rdf:ID="globalTagCount">
>> <rdfs:domain rdf:resource="#Concept"/>
>> <rdfs:range rdf:resource="http://www.w3.org/2001/XMLSchema#int"/>
>> </owl:DatatypeProperty>
>> </rdf:RDF>
>>
>> I've tried the "listStatements" method with both persistent and in-memory
>> approaches; the results were the same. I then tried a SPARQL query and again
>> got the same results. The code I've used is:
>>
>> OntModel jOntModel =
>>     ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);
>> jOntModel.read("file:c:\\do.owl");
>> String query = "PREFIX do: <ont.owl#> " +
>>     "SELECT ?resource " +
>>     "WHERE " +
>>     "{ " +
>>     " ?resource do:hyponymOf do:concept-1740 " +
>>     "}";
>>
>> ResultSet results =
>>     QueryExecutionFactory.create(QueryFactory.create(query), jOntModel)
>>         .execSelect();
>> while (results.hasNext())
>>     System.out.println(results.next().toString());
>> jOntModel.close();
>>
>> I'm really lost here. I only get duplicated results when using something
>> like this:
>>
>> jOntModel.listStatements(null, propertyX, resourceY);
>>
>> If I use the "listStatements" method without parameters I get the complete
>> list of statements without duplicated values.
>>
>> Best regards,
>> Bruno
>>
>> -----Original Message-----
>> From: Evren Sirin [mailto:evren at clarkparsia.com]
>> Sent: Tuesday, October 30, 2007 15:52
>> To: Bruno Antunes
>> Cc: 'Ibach, Brandon L'; pellet-users at lists.owldl.com
>> Subject: Re: [Pellet-users] Duplicated results using Pellet
>> with the Jena
>> API
>>
>> On 10/30/07 8:51 AM, Bruno Antunes wrote:
>>
>>> Hi,
>>>
>>> After printing the statements I got this result:
>>>
>>> [ont.owl#concept-2056, ont.owl#concept-1740, ont.owl#hyponymOf]
>>> [ont.owl#concept-5598, ont.owl#concept-1740, ont.owl#hyponymOf]
>>> [ont.owl#concept-16236, ont.owl#concept-1740, ont.owl#hyponymOf]
>>> [ont.owl#concept-16236, ont.owl#hyponymOf, ont.owl#concept-1740]
>>> [ont.owl#concept-5598, ont.owl#hyponymOf, ont.owl#concept-1740]
>>> [ont.owl#concept-2056, ont.owl#hyponymOf, ont.owl#concept-1740]
>>>
>>> I don't understand why I'm getting the same statements twice, but with the
>>> predicate switched with the object.
>>>
>>>
>> It is not the same statement if the predicate and object are switched ;)
>> It is weird that you are getting triples where a property is in the
>> object position. You can compare the results of listStatements from
>> a raw model and an inference model to see whether these statements are
>> inferred or not. My guess is that these statements are asserted, but if not,
>> you can use Pellet's explanation feature to see why the inference happens.
>>
>> Cheers,
>> Evren
>>
>>
>>
>>
>>> Regards,
>>> Bruno
>>>
>>> -----Original Message-----
>>> From: Evren Sirin [mailto:evren at clarkparsia.com]
>>> Sent: Tuesday, October 30, 2007 0:26
>>> To: Bruno Antunes
>>> Cc: 'Ibach, Brandon L'; pellet-users at lists.owldl.com
>>> Subject: Re: [Pellet-users] Duplicated results using Pellet with the Jena
>>> API
>>>
>>> OntModel.listStatements is guaranteed not to return duplicate results.
>>> The only possibility for duplicate results in the code fragment you
>>> provide is a URI problem. If either
>>>
>>> jOntModel.getIndividual(DO_BASE_NAMESPACE + cptID)
>>>
>>> or
>>>
>>> jOntModel.getProperty(DO_PROPERTY_HYPONYMOF_URI)
>>>
>>> returns null, there might be more than one possibility for the object (or
>>> predicate) for the same subject, which would explain why you see the same
>>> resource more than once in the output. You can check whether any of these
>>> parameters is null to pinpoint the problem. Also, printing the statements
>>> themselves (rather than only the subjects) will confirm whether any
>>> duplicates are actually returned.
>>>
>>> Cheers,
>>> Evren
>>>
>>>
>>> On 10/29/07 6:41 PM, Bruno Antunes wrote:
>>>
>>>
>>>> Ok. Thanks for answering quickly. I'll do some tests according to what
>>>> you've pointed out and I'll keep you posted.
>>>>
>>>> Best regards,
>>>> Bruno
>>>>
>>>> -----Original Message-----
>>>> From: Ibach, Brandon L [mailto:brandon.l.ibach at lmco.com]
>>>> Sent: Monday, October 29, 2007 22:32
>>>> To: Bruno Antunes; pellet-users at lists.owldl.com
>>>> Subject: {Spam?} RE: RE: [Pellet-users] Duplicated results using Pellet
>>>> with the Jena API
>>>>
>>>> Bruno,
>>>> Unfortunately, this doesn't really give me much more to go on.
>>>> From your description and assuming, for the moment, that this is not a
>>>> bug in Pellet, I see two possible sources of the problem. The first is
>>>> the use of the database for storage. Not that I don't trust the
>>>> interface to the database, but if I were running into this problem, I'd
>>>> want to see if I could get the same results if the database were not in
>>>> the picture.
>>>>
>>>> The second possibility, which I think is more likely, is that
>>>> you're getting these duplicates because of some OWL inference that you
>>>> did not expect. The fact that you didn't see this with the Jena OWL
>>>> reasoner may well be due to the fact that Jena's OWL reasoner is not
>>>> complete, while Pellet is, so the inference that is causing the problem
>>>> may be one that the Jena reasoner does not fully implement.
>>>>
>>>> For both of these issues, my approach would be to create a
>>>> fairly small dataset and a standalone Java class that I can compile and
>>>> run against the dataset to demonstrate the problem. As I said, the mere
>>>> exercise of producing these two items may very well help you to find the
>>>> problem yourself, but if not, providing them to the mailing list will
>>>> give me and/or others everything we would need to observe the
>>>> problem and begin to diagnose the cause.
>>>>
>>>> And don't worry about avoiding complexity... if these sorts of
>>>> problems could be solved without a little complexity, the stuff wouldn't
>>>> be nearly as much fun. ;)
>>>>
>>>> -Brandon :)
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: Bruno Antunes [mailto:bema at student.dei.uc.pt]
>>>>> Sent: Monday, October 29, 2007 5:59 PM
>>>>> To: Ibach, Brandon L; pellet-users at lists.owldl.com
>>>>> Subject: RE: RE: [Pellet-users] Duplicated results using
>>>>> Pellet with the Jena API
>>>>>
>>>>> Ok. I'll try to explain as far as I can without getting complex... :)
>>>>> The ontology I'm dealing with stores data similar to WordNet. I have a
>>>>> class called "Concept" and a property "hyponymOf" between instances of
>>>>> this class. To find all the instances which are hyponyms of a specific
>>>>> instance I use something like this:
>>>>>
>>>>> OntClass jOntClassConcept = jOntModel.getOntClass(DO_CLASS_CONCEPT_URI);
>>>>> Individual jIndividualConcept =
>>>>>     jOntModel.getIndividual(DO_BASE_NAMESPACE + cptID);
>>>>> StmtIterator jStmtIterator = jOntModel.listStatements(null,
>>>>>     jOntModel.getProperty(DO_PROPERTY_HYPONYMOF_URI), jIndividualConcept);
>>>>> while (jStmtIterator.hasNext()) {
>>>>>     Individual jIndividual = (Individual)
>>>>>         jStmtIterator.nextStatement().getSubject().as(Individual.class);
>>>>>     hyponymOfList.add(jIndividual);
>>>>> }
>>>>>
>>>>> The results I get in the "hyponymOfList" are all duplicated. I've made
>>>>> some queries in the database storage and found no duplicated statements.
>>>>> This code worked well with the OWL reasoner of the Jena API, so I suppose
>>>>> it is something related to the Pellet inference. I hope this makes it
>>>>> clearer... :)
>>>>>
>>>>> Thanks,
>>>>> Bruno
>>>>>
>>>>> -----Original Message-----
>>>>> From: Ibach, Brandon L [mailto:brandon.l.ibach at lmco.com]
>>>>> Sent: Monday, October 29, 2007 21:40
>>>>> To: Bruno Antunes; pellet-users at lists.owldl.com
>>>>> Subject: {Spam?} RE: [Pellet-users] Duplicated results using
>>>>> Pellet with the
>>>>> Jena API
>>>>>
>>>>> Bruno,
>>>>> There is likely a reasonable explanation for why you're getting
>>>>> these duplicates, but without significantly more detail, I'd guess it
>>>>> would be impossible for anyone to tell you what the reason is. Can you
>>>>> provide a minimal, yet complete, ontology and code sample that
>>>>> demonstrate the problem? Sometimes just the process of creating these
>>>>> can help you to spot the problem.
>>>>>
>>>>> -Brandon :)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: pellet-users-bounces at lists.owldl.com
>>>>>> [mailto:pellet-users-bounces at lists.owldl.com] On Behalf Of
>>>>>> Bruno Antunes
>>>>>> Sent: Monday, October 29, 2007 5:30 PM
>>>>>> To: pellet-users at lists.owldl.com
>>>>>> Subject: [Pellet-users] Duplicated results using Pellet with
>>>>>> the Jena API
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm using Pellet with the Jena API and I'm getting a strange, or not,
>>>>>> behavior when I use the "listStatements(Resource s, Property p,
>>>>>> RDFNode o)" method of the "Model" interface.
>>>>>>
>>>>>> When I call this method to find the resources to which a specific
>>>>>> property with a specific object applies, I'm getting duplicated results.
>>>>>>
>>>>>> For instance, if I use something like this:
>>>>>>
>>>>>> myModel.listStatements(null, myProperty, myInstance);
>>>>>>
>>>>>> I get the results just like this:
>>>>>>
>>>>>> Resource1
>>>>>> Resource1
>>>>>> Resource2
>>>>>> Resource2
>>>>>> ...
>>>>>>
>>>>>> I hope someone can help me with this. I'm using Jena v2.5.2 and Pellet
>>>>>> v1.5.0 with persistent storage in a MySQL database.
>>>>>>
>>>>>> Thanks in advance, best regards,
>>>>>> Bruno Antunes
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pellet-users mailing list
>>>>>> Pellet-users at lists.owldl.com
>>>>>> http://lists.owldl.com/mailman/listinfo/pellet-users
>>>>>> _______________________________________________
>>>>>>
>>>>>> Sponsored by Clark & Parsia, LLC http://clarkparsia.com/