[Pellet-users] Slow inference/Restrict inference
Alejandro Rodríguez González
jalo.javier at gmail.com
Tue Feb 5 22:14:08 UTC 2008
Hi again,
I have a new question about this topic.
I ran some tests and saw that type queries are answered very quickly
when the individuals are already in the loaded ontology.
If, for example, I create a new individual:
Individual i1 = modelo.createIndividual(
    this.getOntologyURI() + "#PR_SYMS_A_B_C",
    modelo.getResource(this.getOntologyURI() + "#Consult")); // PR_SYMS_A_B_C
i1.addProperty(hasSymp, modelo.getResource(getOntologyURI() + "#SYM_A"));
i1.addProperty(hasSymp, modelo.getResource(getOntologyURI() + "#SYM_B"));
i1.addProperty(hasSymp, modelo.getResource(getOntologyURI() + "#SYM_C"));
and then try to get the inferred classes, it takes approximately 30
seconds (not counting prepare, classify and realize).
The question is: can this inference be optimized? Thanks :-)
Evren Sirin wrote:
> On 2/4/08 2:54 PM, Alejandro Rodríguez González wrote:
>> Hi Evren,
>>
>> First of all, thanks for your answer; this project was starting to
>> exasperate me.. :-)
>>
>> I read your email and the attached code carefully and ran some
>> tests. Indeed, the problem was with the query I was making, but I
>> have a question.
>>
>> I tried running the type query without calling prepare, classify
>> and realize first, and it took a long time. I suppose I should call
>> prepare, classify and realize before running the queries. Is this true?
>
> Yes, if you call prepare/classify/realize explicitly upfront then most
> of the reasoning will be done at that time and the subsequent queries
> will be faster (provided that your queries correspond to what has been
> cached during classification and realization which is why I suggested
> not using queries with a null value in the predicate position).
>
>>
>> Because if I don't do this, Pellet will run prepare, classify and
>> realize for me when I make a query?
>
> Yes, depending on the type of the query, classification and/or
> realization will be triggered. This will cause the first query to be
> very slow compared to subsequent queries.
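
The behaviour described here (the expensive work happens either upfront or on the first query, and later queries reuse the cached result) can be illustrated with a minimal, library-free sketch. `CachedInference` and its methods are hypothetical names, not part of Pellet or Jena; the supplier only stands in for the cost of classification and realization.

```java
import java.util.List;
import java.util.function.Supplier;

/** Toy stand-in for a reasoner that caches an expensive result (not Pellet API). */
final class CachedInference {
    private final Supplier<List<String>> expensiveStep; // hypothetical "classify + realize"
    private List<String> cache;                          // null until computed
    private int computations = 0;                        // how often the expensive step ran

    CachedInference(Supplier<List<String>> expensiveStep) {
        this.expensiveStep = expensiveStep;
    }

    /** Do the expensive work upfront, e.g. at program start. */
    void prepare() {
        if (cache == null) {
            cache = expensiveStep.get();
            computations++;
        }
    }

    /** A query triggers the expensive work only if it has not been done yet. */
    List<String> queryTypes() {
        prepare();
        return cache;
    }

    int computations() {
        return computations;
    }

    public static void main(String[] args) {
        CachedInference kb = new CachedInference(() -> List.of("Consult", "Disease"));
        kb.prepare();     // pay the cost once, upfront
        kb.queryTypes();  // served from the cache
        kb.queryTypes();  // served from the cache
        System.out.println("Expensive step ran " + kb.computations() + " time(s)");
    }
}
```

Skipping the explicit `prepare()` call gives the same results; the cost simply moves to the first `queryTypes()` call, which is the pattern the reply describes.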
>
>>
>> I think that if I only need to run prepare, classify and realize
>> once (when the program starts, for example), it will not be a
>> problem. Is this approach correct?
>
> Yes, doing the reasoning upfront generally makes sense.
>
> Cheers,
> Evren
>
>>
>> Thanks!!
>>
>>
>> Evren Sirin wrote:
>>> Hi Alejandro,
>>>
>>> I think you need to change your query not your ontology to get
>>> better performance (trying a subset of the ontology will no doubt
>>> improve the performance but I don't think it is required). Currently
>>> you are running the following query
>>>
>>> model.listStatements(i, null, (RDFNode) null);
>>>
>>> which would try to find all the types, property assertions, sameAs
>>> and differentFrom inferences regarding that individual. This is
>>> going to take considerable time, especially because querying sameAs
>>> and differentFrom assertions is generally slow. Repeating this for
>>> all 600 individuals in the ontology will be quite slow.
>>> If you just query types and property assertions, things would be much
>>> better. I modified your code as shown at the end of this message and
>>> added explicit timing measurements to show how long each operation
>>> takes. The explicit calls to classify and realize are not required;
>>> they are made in the code only to time these two operations
>>> separately. Results I get on my laptop are like this (timings given
>>> in milliseconds):
>>>
>>> Read            |  8573
>>> Prepare         |  1062
>>> Classify        |   924
>>> Realize         | 36002
>>> QueryTypes      |    17
>>> QueryProperties |     6
>>> QuerySames      |     0
>>> QueryDifferents | 32525
>>> QueryAll        | 31842
>>>
>>> The operations Read, Prepare, Classify and Realize are all one-time
>>> operations that take a total of about 46.6 seconds. QueryAll is the
>>> query you were trying, which takes about 32 seconds. QueryTypes,
>>> QueryProperties, QuerySames and QueryDifferents break up the query
>>> into four disjoint queries (the union of the results of those four
>>> queries is exactly the same set of results as QueryAll). As you can
>>> see, it is just querying differentFrom's that takes all the time
>>> (even though you set UNA, the unique name assumption, to true,
>>> Pellet tries all combinations of individuals to see whether they
>>> are different or not). I would think that QueryTypes and
>>> QueryProperties are what you are interested in, and together they
>>> take 23 ms (roughly three orders of magnitude faster than QueryAll).
>>> Also note that the performance of QueryTypes and QueryProperties
>>> will not be affected by the UNA option. So the decision to use UNA
>>> should be based on semantic considerations, not performance results.
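
The summary figures can be re-derived from the timing table with a few lines of arithmetic. The following is only a sketch: the constants are copied from the table above, and the class and method names are made up for illustration.

```java
/** Re-derives the summary figures from the timing table quoted above (milliseconds). */
final class TimingSummary {
    // Values copied from the table in the message.
    static final long READ = 8573, PREPARE = 1062, CLASSIFY = 924, REALIZE = 36002;
    static final long QUERY_TYPES = 17, QUERY_PROPERTIES = 6, QUERY_ALL = 31842;

    /** Total cost of the one-time operations. */
    static long oneTimeMillis() {
        return READ + PREPARE + CLASSIFY + REALIZE;
    }

    /** Cost of the two queries that are usually of interest. */
    static long targetedQueryMillis() {
        return QUERY_TYPES + QUERY_PROPERTIES;
    }

    /** How many times slower the wildcard query is (integer division). */
    static long slowdownFactor() {
        return QUERY_ALL / targetedQueryMillis();
    }

    public static void main(String[] args) {
        System.out.println("One-time total (ms): " + oneTimeMillis());
        System.out.println("Types + properties (ms): " + targetedQueryMillis());
        System.out.println("QueryAll / targeted: " + slowdownFactor());
    }
}
```

With the table's numbers, the one-time total is 46561 ms, the two targeted queries take 23 ms together, and QueryAll is about 1384 times slower than that pair.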
>>>
>>> Cheers,
>>> Evren
>>>
>>>
>>> private void createModelToLoadData() {
>>>     PelletOptions.USE_UNIQUE_NAME_ASSUMPTION = true;
>>>     Timers timers = new Timers();
>>>
>>>     OntModel model = ModelFactory.createOntologyModel(
>>>             PelletReasonerFactory.THE_SPEC );
>>>
>>>     timers.startTimer("Read");
>>>     model.read( "http://www.jalojavier.es/humandisease.owl" );
>>>     timers.stopTimer("Read");
>>>
>>>     timers.startTimer("Prepare");
>>>     model.prepare();
>>>     timers.stopTimer("Prepare");
>>>
>>>     timers.startTimer("Classify");
>>>     ((PelletInfGraph) model.getGraph()).getKB().classify();
>>>     timers.stopTimer("Classify");
>>>
>>>     timers.startTimer("Realize");
>>>     ((PelletInfGraph) model.getGraph()).getKB().realize();
>>>     timers.stopTimer("Realize");
>>>
>>>     Individual i1 = model.getIndividual(
>>>             "http://www.jalojavier.es/humandisease.owl" + "#PR_SYMS_A_B_C" );
>>>
>>>     int count = 0;
>>>
>>>     timers.startTimer("QueryTypes");
>>>     count += countStatements( model, i1, RDF.type, null );
>>>     timers.stopTimer("QueryTypes");
>>>
>>>     timers.startTimer("QueryProperties");
>>>     for( Iterator i = model.listOntProperties(); i.hasNext(); )
>>>         count += countStatements( model, i1, (Property) i.next(), null );
>>>     timers.stopTimer("QueryProperties");
>>>
>>>     timers.startTimer("QuerySames");
>>>     count += countStatements( model, i1, OWL.sameAs, null );
>>>     timers.stopTimer("QuerySames");
>>>
>>>     timers.startTimer("QueryDifferents");
>>>     count += countStatements( model, i1, OWL.differentFrom, null );
>>>     timers.stopTimer("QueryDifferents");
>>>
>>>     System.out.println( "Count for the first 4 queries: " + count );
>>>
>>>     timers.startTimer("QueryAll");
>>>     count = countStatements( model, i1, null, null );
>>>     timers.stopTimer("QueryAll");
>>>
>>>     System.out.println( "Count for the last query: " + count );
>>>
>>>     timers.print( true, null );
>>> }
>>>
>>> private int countStatements(OntModel m, Resource s, Property p,
>>>         RDFNode o) {
>>>     int c = 0;
>>>     for( StmtIterator i = m.listStatements( s, p, o ); i.hasNext(); ) {
>>>         i.nextStatement();
>>>         c++;
>>>     }
>>>     return c;
>>> }
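
The code above splits one wildcard query into several predicate-specific queries. The idea can be shown in isolation with a toy, in-memory triple matcher. This is a sketch only: `WildcardSplit`, its methods, and the sample triples are made up for illustration and do not use Jena or Pellet. It borrows just the convention that a null argument acts as a wildcard, as in Jena's `listStatements`.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy triple matcher (not Jena); each triple is a 3-element list: subject, predicate, object. */
final class WildcardSplit {

    /** Made-up sample data, loosely modelled on the names used in this thread. */
    static List<List<String>> sampleData() {
        return List.of(
            List.of("i1", "rdf:type", "Consult"),
            List.of("i1", "hasSymp", "SYM_A"),
            List.of("i1", "owl:differentFrom", "i2"),
            List.of("i2", "rdf:type", "Consult"));
    }

    /** Returns the triples matching the pattern; a null argument matches anything. */
    static Set<List<String>> match(List<List<String>> data, String s, String p, String o) {
        Set<List<String>> out = new HashSet<>();
        for (List<String> t : data) {
            if ((s == null || s.equals(t.get(0)))
                    && (p == null || p.equals(t.get(1)))
                    && (o == null || o.equals(t.get(2)))) {
                out.add(t);
            }
        }
        return out;
    }

    /** Union of one predicate-specific query per listed predicate. */
    static Set<List<String>> matchEach(List<List<String>> data, String s, List<String> predicates) {
        Set<List<String>> out = new HashSet<>();
        for (String p : predicates) {
            out.addAll(match(data, s, p, null));
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> data = sampleData();
        List<String> all = List.of("rdf:type", "hasSymp", "owl:sameAs", "owl:differentFrom");
        List<String> cheap = List.of("rdf:type", "hasSymp");
        // Same result set as the wildcard query when every predicate is covered.
        System.out.println(match(data, "i1", null, null).equals(matchEach(data, "i1", all)));
        // Restricting to the cheap predicates simply omits the other statements.
        System.out.println(matchEach(data, "i1", cheap).size());
    }
}
```

In the toy data the two result sets are equal, and restricting the predicate list to types and properties returns two of the three statements about `i1`. The payoff in the real setting is that the expensive predicates are never asked for.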
>>>
>>> On 1/31/08 2:25 PM, Alejandro Rodríguez González wrote:
>>>> Hello,
>>>>
>>>> I asked a few days ago how to restrict the inference domain in
>>>> Pellet, but no one answered, so I will try asking the question
>>>> again (maybe no one understood me, I don't know).
>>>>
>>>> I have an ontology ( http://www.jalojavier.es/humandisease.owl )
>>>> and Jena+Pellet code ( http://rafb.net/p/k1yHie11.html ) to make
>>>> the inferences.
>>>>
>>>> The problem is that the inference I try to make with that code
>>>> takes a lot of time (close to 180 seconds).
>>>>
>>>> I ran some tests splitting the ontology into small parts, and I
>>>> think the problem is the number of instances in the ontology
>>>> (I now have close to 600 individuals and 1500 classes).
>>>>
>>>> I think it may be possible to optimize the inference speed by
>>>> restricting the inference domain. I have "Diseases", "Symptoms"
>>>> and "Lab Test" superclasses that are involved in the inferences,
>>>> but the results of the inferences are only subclasses of
>>>> "Diseases" (which has only about 30 instances).
>>>>
>>>> Is it possible to tell Pellet to search for results only in this
>>>> superclass and ignore the rest?
>>>>
>>>> Or is there any other solution that would optimize the inference
>>>> speed?
>>>>
>>>> Thanks.