[Pellet-users] Slow inference/Restrict inference

Alejandro Rodríguez González jalo.javier at gmail.com
Mon Feb 4 19:54:13 UTC 2008


Hi Evren,

First of all, thanks for your answer; this project was starting to
exasperate me.. :-)

I read your email and the attached code carefully and ran some tests.
Indeed, the problem was with the query I was making, but I have a
question.

I tried running the types query without first calling prepare,
classify and realize, and it took a very long time. I suppose I should
call prepare, classify and realize before making the queries. Is that
right?

If I don't call them myself, will Pellet run prepare, classify and
realize for me when I make a query?

If I only need to run prepare, classify and realize once (when the
program starts, for example), that will not be a problem. Is this
approach correct?
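
To make the "once at startup" idea concrete, here is a minimal sketch in
plain Java. It does not use the Pellet or Jena API: FakeKnowledgeBase and
its methods are hypothetical stand-ins, and the on-demand behaviour is an
assumption based on the note in the quoted reply that the explicit
classify and realize calls are there only for timing. Each step runs at
most once, whether it is called explicitly at startup or triggered by the
first query.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of a "pay once, query many times" setup; NOT the Pellet API. */
class OneTimeSetupSketch {

    /** Hypothetical stand-in that records which steps actually ran. */
    static class FakeKnowledgeBase {
        final List<String> log = new ArrayList<>();
        private boolean prepared, classified, realized;

        void prepare() {
            if (!prepared) { prepared = true; log.add("prepare"); }
        }

        void classify() {
            prepare(); // classification needs a prepared knowledge base
            if (!classified) { classified = true; log.add("classify"); }
        }

        void realize() {
            classify(); // realization builds on the class hierarchy
            if (!realized) { realized = true; log.add("realize"); }
        }

        /** Assumed behaviour: a query runs any missing step on demand. */
        String queryTypes(String individual) {
            realize();
            log.add("query:" + individual);
            return "types-of-" + individual;
        }
    }

    public static void main(String[] args) {
        FakeKnowledgeBase kb = new FakeKnowledgeBase();
        kb.realize();        // one-time cost, paid when the program starts
        kb.queryTypes("a");  // later queries skip the expensive steps
        kb.queryTypes("b");
        System.out.println(kb.log);
    }
}
```

If the explicit kb.realize() call is left out, the first queryTypes call
pays the same cost instead, so the total work is the same; only the
moment at which it is paid changes.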

Thanks!!


Evren Sirin wrote:
> Hi Alejandro,
>
> I think you need to change your query, not your ontology, to get
> better performance (trying a subset of the ontology will no doubt
> improve the performance but I don't think it is required). Currently
> you are running the following query:
>
> model.listStatements(i, null, (RDFNode) null);
>
> which would try to find all the types, property assertions, sameAs
> and differentFrom inferences regarding that individual. This is going
> to take considerable time, especially because querying sameAs and
> differentFrom assertions is generally slow. Repeating this for all
> 600 individuals in the ontology will be quite slow.
> If you just query types and property assertions things would be much
> better. I modified your code as shown at the end of this message and
> added explicit timing measurements to show how long each operation
> takes. The explicit calls to classify and realize are not required;
> they are in the code only to time these two operations separately.
> The results I get on my laptop look like this (timings given in
> milliseconds):
>
> Read            |      8573
> Prepare         |      1062
> Classify        |       924
> Realize         |     36002
> QueryTypes      |        17
> QueryProperties |         6
> QuerySames      |         0
> QueryDifferents |     32525
> QueryAll        |     31842
>
> The operations Read, Prepare, Classify and Realize are all one-time
> operations that take a total of about 46.5 seconds. QueryAll is the
> query you were trying, which takes about 32 seconds. QueryTypes,
> QueryProperties, QuerySames and QueryDifferents break up that query
> into four disjoint queries (the union of the results of those four
> queries is exactly the same set of results as QueryAll). As you can
> see, it is just querying differentFrom's that is taking all the time
> (even though you set UNA to true, Pellet tries all combinations of
> individuals to see if they are different from each other or not). I
> would think that just QueryTypes and QueryProperties are what you are
> interested in, and they take a total of 23ms (about three orders of
> magnitude faster than QueryAll).
> Also note that the performance of QueryTypes and QueryProperties will
> not be affected by the UNA option. So the decision to use UNA should
> be based on semantic considerations, not performance results.
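
As a quick check of the arithmetic in the timings above, here is a small
plain-Java sketch. The class and method names are only illustrative; the
numbers are copied from the table in the quoted message.

```java
/** Sums the quoted timings (milliseconds) and compares the query costs. */
class TimingSummary {
    // Values copied from the timing table in the quoted message.
    static final long READ = 8573, PREPARE = 1062, CLASSIFY = 924, REALIZE = 36002;
    static final long QUERY_TYPES = 17, QUERY_PROPERTIES = 6, QUERY_ALL = 31842;

    /** Cost paid once, however many queries follow. */
    static long oneTimeCost() {
        return READ + PREPARE + CLASSIFY + REALIZE;
    }

    /** Cost of asking only for types and property assertions. */
    static long cheapQueryCost() {
        return QUERY_TYPES + QUERY_PROPERTIES;
    }

    /** How many times slower QueryAll is (integer division, rounds down). */
    static long speedup() {
        return QUERY_ALL / cheapQueryCost();
    }

    public static void main(String[] args) {
        System.out.println("One-time setup (ms): " + oneTimeCost());
        System.out.println("Types + properties (ms): " + cheapQueryCost());
        System.out.println("QueryAll is about " + speedup() + "x slower");
    }
}
```

With these figures the one-time setup comes to 46561 ms, the two cheap
queries to 23 ms, and QueryAll is roughly 1,400 times slower than the
two cheap queries combined.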
>
> Cheers,
> Evren
>
>
>    private void createModelToLoadData() {
>        PelletOptions.USE_UNIQUE_NAME_ASSUMPTION = true;
>
>        Timers timers = new Timers();
>
>        OntModel model =
>            ModelFactory.createOntologyModel( PelletReasonerFactory.THE_SPEC );
>
>        timers.startTimer("Read");
>        model.read( "http://www.jalojavier.es/humandisease.owl" );
>        timers.stopTimer("Read");
>
>        timers.startTimer("Prepare");
>        model.prepare();
>        timers.stopTimer("Prepare");
>
>        timers.startTimer("Classify");
>        ((PelletInfGraph) model.getGraph()).getKB().classify();
>        timers.stopTimer("Classify");
>
>        timers.startTimer("Realize");
>        ((PelletInfGraph) model.getGraph()).getKB().realize();
>        timers.stopTimer("Realize");
>
>        // PR_SYMS_A_B_C
>        Individual i1 = model.getIndividual(
>            "http://www.jalojavier.es/humandisease.owl" + "#PR_SYMS_A_B_C" );
>
>        int count = 0;
>
>        timers.startTimer("QueryTypes");
>        count += countStatements( model, i1, RDF.type, null );
>        timers.stopTimer("QueryTypes");
>
>        timers.startTimer("QueryProperties");
>        for( Iterator i = model.listOntProperties(); i.hasNext(); )
>            count += countStatements( model, i1, (Property) i.next(), null );
>        timers.stopTimer("QueryProperties");
>
>        timers.startTimer("QuerySames");
>        count += countStatements( model, i1, OWL.sameAs, null );
>        timers.stopTimer("QuerySames");
>
>        timers.startTimer("QueryDifferents");
>        count += countStatements( model, i1, OWL.differentFrom, null );
>        timers.stopTimer("QueryDifferents");
>
>        System.out.println( "Count for the first 4 queries: " + count );
>
>        timers.startTimer("QueryAll");
>        count = countStatements( model, i1, null, null );
>        timers.stopTimer("QueryAll");
>
>        System.out.println( "Count for the last query: " + count );
>
>        timers.print( true, null );
>    }
>
>    private int countStatements(OntModel m, Resource s, Property p, RDFNode o) {
>        int c = 0;
>        for( StmtIterator i = m.listStatements( s, p, o ); i.hasNext(); ) {
>            i.nextStatement();
>            c++;
>        }
>        return c;
>    }
>
> On 1/31/08 2:25 PM, Alejandro Rodríguez González wrote:
>> Hello,
>>
>> A few days ago I asked how to restrict the inference domain in
>> Pellet, but no one answered, so I will try asking again (maybe no
>> one understood me, I don't know).
>>
>> I have an ontology ( http://www.jalojavier.es/humandisease.owl ) and 
>> Jena+pellet code ( http://rafb.net/p/k1yHie11.html ) to make the 
>> inferences.
>>
>> The problem is that the inference I am trying to run with that code
>> takes a lot of time (close to 180 seconds).
>>
>> I ran some tests splitting the ontology into smaller parts, and I
>> think the problem is the number of instances in the ontology (I now
>> have close to 600 individuals and 1500 classes).
>>
>> I think it may be possible to speed up the inference by restricting
>> the inference domain. I have "Diseases", "Symptoms", and "Lab Test"
>> superclasses that are involved in the inferences, but the results of
>> the inferences are only subclasses of "Diseases" (which has only
>> about 30 instances).
>>
>> Is it possible to tell Pellet to search for results only in this
>> superclass and ignore the rest?
>>
>> Or is there any other solution that improves the inference speed?
>>
>> Thanks.


