[Pellet-users] Slow inference/Restrict inference
Evren Sirin
evren at clarkparsia.com
Mon Feb 4 21:03:07 UTC 2008
On 2/4/08 2:54 PM, Alejandro Rodríguez González wrote:
> Hi Evren,
>
> First of all, thanks for your answer; this project was starting to
> exasperate me. :-)
>
> I read your email and the attached code carefully and ran some
> tests. Indeed, the problem was with the query I was making, but I
> have a question.
>
> I tried running the type query without calling prepare, classify and
> realize first, and it took a very long time. I suppose I should call
> prepare, classify and realize before making the queries. Is this true?
Yes, if you call prepare/classify/realize explicitly upfront then most
of the reasoning will be done at that time and the subsequent queries
will be faster (provided that your queries correspond to what has been
cached during classification and realization which is why I suggested
not using queries with a null value in the predicate position).
>
> Because if I don't do this, Pellet will run prepare, classify and
> realize for me when I make a query?
Yes, depending on the type of the query, classification and/or
realization will be triggered. This will cause the first query to be
very slow compared to subsequent queries.
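
To illustrate why the first query pays the cost, here is a minimal sketch of the lazy-versus-upfront pattern. It is a toy stand-in in plain Java, not Pellet's API: the class ToyKb, its realize() and queryTypes() methods, and the realizeRuns counter are all hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy illustration only; none of these names come from Pellet or Jena. */
public class LazyReasoningDemo {

    /** Hypothetical knowledge base that does its expensive work at most once. */
    static class ToyKb {
        private boolean realized = false;
        int realizeRuns = 0; // how many times the expensive step actually ran

        /** Expensive one-time step; calling it upfront moves the cost to startup. */
        void realize() {
            if (realized) {
                return; // results are already cached, nothing to do
            }
            realizeRuns++;
            realized = true;
        }

        /** A query triggers realize() itself if nobody called it explicitly. */
        List<String> queryTypes(String individual) {
            realize();
            List<String> types = new ArrayList<>();
            types.add("Thing"); // placeholder answer
            return types;
        }
    }

    public static void main(String[] args) {
        ToyKb lazy = new ToyKb();
        lazy.queryTypes("a"); // first query pays for realization
        lazy.queryTypes("b"); // later queries reuse the cached work
        System.out.println("lazy realize runs: " + lazy.realizeRuns);

        ToyKb eager = new ToyKb();
        eager.realize();      // pay the cost at program start
        eager.queryTypes("a");
        System.out.println("eager realize runs: " + eager.realizeRuns);
    }
}
```

Either way the expensive step runs once; the only difference is whether the first query or the startup code absorbs the delay.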
>
> I think that if I only need to run prepare, classify and realize
> once (when the program starts, for example), it will not be a
> problem. Is this approach correct?
Yes, doing the reasoning upfront generally makes sense.
Cheers,
Evren
>
> Thanks!!
>
>
> Evren Sirin wrote:
>> Hi Alejandro,
>>
>> I think you need to change your query not your ontology to get better
>> performance (trying a subset of the ontology will no doubt improve
>> the performance but I don't think it is required). Currently you are
>> running the following query
>>
>> model.listStatements(i, null, (RDFNode) null);
>>
>> which tries to find all the types, property assertions, sameAs and
>> differentFrom inferences regarding that individual. This is going to
>> take considerable time, especially because querying sameAs and
>> differentFrom assertions is generally slow. Repeating this for all
>> 600 individuals in the ontology will be quite slow.
>> If you just query types and property assertions, things will be much
>> better. I modified your code as shown at the end of this message and
>> added explicit timing measurements to show how long each operation
>> takes. The explicit calls to classify and realize are not required;
>> they are done in the code only to time these two operations
>> separately. The results I get on my laptop look like this (timings
>> given in milliseconds):
>>
>> Read | 8573
>> Prepare | 1062
>> Classify | 924
>> Realize | 36002
>> QueryTypes | 17
>> QueryProperties | 6
>> QuerySames | 0
>> QueryDifferents | 32525
>> QueryAll | 31842
>>
>> The operations Read, Prepare, Classify and Realize are all one-time
>> operations that take a total of about 46 seconds. QueryAll is the
>> query you were trying, which takes about 32 seconds. QueryTypes,
>> QueryProperties, QuerySames and QueryDifferents break up that query
>> into four disjoint queries (the union of the results of those four
>> queries is exactly the same set of results as QueryAll). As you can
>> see, it is just querying differentFrom's that takes all the time
>> (even though you set UNA to true, Pellet tries all combinations of
>> individuals to see whether they are different or not). I would think
>> that QueryTypes and QueryProperties are what you are interested in,
>> and together they take a total of 23 ms (about three orders of
>> magnitude faster than QueryAll). Also note that the performance of
>> QueryTypes and QueryProperties will not be affected by the UNA
>> option. So the decision to use UNA should be based on semantic
>> considerations, not performance results.
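
The totals quoted above can be rechecked from the table with a few lines of arithmetic. The sketch below only re-adds the reported millisecond figures; the class name TimingSummary and its helper methods are hypothetical and exist just for this check.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Re-adds the timings from the table; the names here are illustrative only. */
public class TimingSummary {

    /** Timings in milliseconds, copied from the table in the message. */
    static Map<String, Long> timings() {
        Map<String, Long> t = new LinkedHashMap<>();
        t.put("Read", 8573L);
        t.put("Prepare", 1062L);
        t.put("Classify", 924L);
        t.put("Realize", 36002L);
        t.put("QueryTypes", 17L);
        t.put("QueryProperties", 6L);
        t.put("QuerySames", 0L);
        t.put("QueryDifferents", 32525L);
        t.put("QueryAll", 31842L);
        return t;
    }

    /** Sums the named entries of the timing table. */
    static long sum(Map<String, Long> t, String... names) {
        long total = 0;
        for (String name : names) {
            total += t.get(name);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> t = timings();
        long oneTime = sum(t, "Read", "Prepare", "Classify", "Realize");
        long cheap = sum(t, "QueryTypes", "QueryProperties");
        System.out.println("One-time setup (ms): " + oneTime);
        System.out.println("Types + properties (ms): " + cheap);
        // Integer division is enough to see the order of magnitude.
        System.out.println("QueryAll / (types + properties): " + t.get("QueryAll") / cheap);
    }
}
```

The one-time operations add up to 46561 ms, the two cheap queries to 23 ms, and QueryAll is slower than those two combined by a factor of roughly 1400.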
>>
>> Cheers,
>> Evren
>>
>>
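>> // Imports assumed for Jena 2.x and Pellet 1.5.x; adjust the package
>> // names to the versions you are using.
>> import java.util.Iterator;
>> import com.hp.hpl.jena.ontology.Individual;
>> import com.hp.hpl.jena.ontology.OntModel;
>> import com.hp.hpl.jena.rdf.model.ModelFactory;
>> import com.hp.hpl.jena.rdf.model.Property;
>> import com.hp.hpl.jena.rdf.model.RDFNode;
>> import com.hp.hpl.jena.rdf.model.Resource;
>> import com.hp.hpl.jena.rdf.model.StmtIterator;
>> import com.hp.hpl.jena.vocabulary.OWL;
>> import com.hp.hpl.jena.vocabulary.RDF;
>> import org.mindswap.pellet.PelletOptions;
>> import org.mindswap.pellet.jena.PelletInfGraph;
>> import org.mindswap.pellet.jena.PelletReasonerFactory;
>> import org.mindswap.pellet.utils.Timers;
>>
>> private void createModelToLoadData() {
>>     PelletOptions.USE_UNIQUE_NAME_ASSUMPTION = true;
>>     Timers timers = new Timers();
>>
>>     OntModel model = ModelFactory.createOntologyModel(
>>             PelletReasonerFactory.THE_SPEC );
>>
>>     // One-time operations: load the ontology and reason upfront.
>>     timers.startTimer("Read");
>>     model.read( "http://www.jalojavier.es/humandisease.owl" );
>>     timers.stopTimer("Read");
>>
>>     timers.startTimer("Prepare");
>>     model.prepare();
>>     timers.stopTimer("Prepare");
>>
>>     // classify() and realize() are called explicitly only so that
>>     // each step can be timed separately.
>>     timers.startTimer("Classify");
>>     ((PelletInfGraph) model.getGraph()).getKB().classify();
>>     timers.stopTimer("Classify");
>>
>>     timers.startTimer("Realize");
>>     ((PelletInfGraph) model.getGraph()).getKB().realize();
>>     timers.stopTimer("Realize");
>>
>>     // The individual PR_SYMS_A_B_C
>>     Individual i1 = model.getIndividual(
>>             "http://www.jalojavier.es/humandisease.owl" + "#PR_SYMS_A_B_C" );
>>
>>     // Four disjoint queries that together cover the wildcard query.
>>     int count = 0;
>>
>>     timers.startTimer("QueryTypes");
>>     count += countStatements( model, i1, RDF.type, null );
>>     timers.stopTimer("QueryTypes");
>>
>>     timers.startTimer("QueryProperties");
>>     for( Iterator i = model.listOntProperties(); i.hasNext(); )
>>         count += countStatements( model, i1, (Property) i.next(), null );
>>     timers.stopTimer("QueryProperties");
>>
>>     timers.startTimer("QuerySames");
>>     count += countStatements( model, i1, OWL.sameAs, null );
>>     timers.stopTimer("QuerySames");
>>
>>     timers.startTimer("QueryDifferents");
>>     count += countStatements( model, i1, OWL.differentFrom, null );
>>     timers.stopTimer("QueryDifferents");
>>
>>     System.out.println( "Count for the first 4 queries: " + count );
>>
>>     // The original wildcard query (null in the predicate position).
>>     timers.startTimer("QueryAll");
>>     count = countStatements( model, i1, null, null );
>>     timers.stopTimer("QueryAll");
>>
>>     System.out.println( "Count for the last query: " + count );
>>
>>     timers.print( true, null );
>> }
>>
>> // Counts the statements matching the pattern (s, p, o); null is a wildcard.
>> private int countStatements(OntModel m, Resource s, Property p,
>>         RDFNode o) {
>>     int c = 0;
>>     for( StmtIterator i = m.listStatements( s, p, o ); i.hasNext(); ) {
>>         i.nextStatement();
>>         c++;
>>     }
>>     return c;
>> }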
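
The Timers class in the code above is a Pellet utility. For timing similar steps without it, a minimal stand-in can be sketched with System.nanoTime(). The class SimpleTimers below is hypothetical and only mirrors the startTimer/stopTimer calls used above; it is not Pellet's implementation, and its print() takes no arguments.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical named-timer helper; not Pellet's Timers class. */
public class SimpleTimers {
    private final Map<String, Long> started = new LinkedHashMap<>();
    private final Map<String, Long> totalNanos = new LinkedHashMap<>();

    /** Records the start time for the named timer. */
    public void startTimer(String name) {
        started.put(name, System.nanoTime());
    }

    /** Adds the elapsed time since startTimer(name) to that timer's total. */
    public void stopTimer(String name) {
        Long begin = started.remove(name);
        if (begin == null) {
            throw new IllegalStateException("Timer not started: " + name);
        }
        totalNanos.merge(name, System.nanoTime() - begin, Long::sum);
    }

    /** Total elapsed time for the named timer in milliseconds (0 if never run). */
    public long totalMillis(String name) {
        return totalNanos.getOrDefault(name, 0L) / 1_000_000L;
    }

    /** Prints one "name | millis" row per timer, in the order they first stopped. */
    public void print() {
        for (Map.Entry<String, Long> e : totalNanos.entrySet()) {
            System.out.printf("%-16s | %d%n", e.getKey(), e.getValue() / 1_000_000L);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SimpleTimers timers = new SimpleTimers();
        timers.startTimer("Sleep");
        Thread.sleep(20); // stands in for an expensive operation
        timers.stopTimer("Sleep");
        timers.print();
    }
}
```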
>>
>> On 1/31/08 2:25 PM, Alejandro Rodríguez González wrote:
>>> Hello,
>>>
>>> I asked a few days ago how to restrict the inference domain in
>>> Pellet, but no one answered, so I will try asking the question
>>> again (maybe no one understood me, I don't know).
>>>
>>> I have an ontology ( http://www.jalojavier.es/humandisease.owl ) and
>>> Jena+Pellet code ( http://rafb.net/p/k1yHie11.html ) to make the
>>> inferences.
>>>
>>> The problem is that the inference I try to make with the code
>>> mentioned takes a lot of time (close to 180 seconds).
>>>
>>> I ran some tests splitting the ontology into small parts, and I
>>> think that the problem is the number of instances in the ontology
>>> (I now have close to 600 individuals and 1500 classes).
>>>
>>> I think it may be possible to optimize the inference speed by
>>> restricting the inference domain. I have "Diseases", "Symptoms" and
>>> "Lab Test" superclasses that are involved in the inferences, but
>>> the results of the inferences are only subclasses of "Diseases"
>>> (which has only about 30 instances).
>>>
>>> Is it possible to tell Pellet that it must only search for results
>>> in this superclass and ignore the rest?
>>>
>>> Or is there any other solution that improves the inference speed?
>>>
>>> Thanks.
>>> _______________________________________________
>>> Pellet-users mailing list
>>> Pellet-users at lists.owldl.com
>>> http://lists.owldl.com/mailman/listinfo/pellet-users
>>> _______________________________________________
>>>
>>> Sponsored by Clark & Parsia, LLC http://clarkparsia.com/
>>>
>>
>>
>