[Pellet-users] Pellet performance on a Jena project

Rinke Hoekstra hoekstra at uva.nl
Mon May 26 07:50:17 UTC 2008


(sorry for crosspost)

Hi,

Actually, I would really like to know why the listStatements() method
is so slow on some ontologies. I suppose that many Jena-based projects
use this method to retrieve reasoner results from a Pellet model, so
this performance bottleneck will affect many users.

Cheers,

	Rinke

On 21 May 2008, at 18:32, Matteo Montalto wrote:

> Rinke Hoekstra wrote on 21/05/2008 at 17.50:
>> Hi Matteo,
>
> Hi again Rinke, thanks for the help. :)
>
>> You can trigger Pellet by running:
>>        model.prepare();
>> and then either one of the following:
>>        ((PelletInfGraph) model.getGraph()).getKB().classify();
>>        ((PelletInfGraph) model.getGraph()).getKB().realize();
>
> I'm trying it right now. Talking with people on the jena-dev mailing
> list, I learned that prepare() does exactly what I'm looking for,
> provided the reasoner is forward-chaining. Here's a snippet from that
> conversation:
> "For a forward rule engine that is everything, once the prepare is done
> you have all the answers in "working memory" and no more inference is
> needed at query time.
>
> For a backward rule engine prepare would do nothing, until you have a
> specific query to answer you sit tight.
>
> For Pellet I don't know specifically but my guess is that they would do
> the classification (generate the inferred class hierarchy) and cache that."
>
> That was my doubt: I don't know how Pellet gets triggered. :-)
> I'm trying your solution right now, without success: adding just the
> prepare() call gives an OutOfMemoryError: Java heap space. (To be
> specific, I'm running the VM with the -Xmx1536M flag.)
> Since I see you know both Jena and Pellet, can I ask you to try my
> code on your machine? It seems strange to me that Pellet can't handle a
> 50 k RDF/XML ontology; maybe there's something wrong with me, my code,
> or my machine. xD
> The ontology is here:
> http://www.megafileupload.com/en/file/65625/family_swrl_originale-owl.html
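
One quick check for the heap-space error described above is to confirm that the -Xmx setting actually reached the JVM that runs the code (an IDE launch configuration, for instance, may not pass it through). The sketch below uses only the standard java.lang.Runtime API. The class name HeapCheck is made up for illustration, and the reported maximum can be somewhat lower than the -Xmx value depending on the garbage collector.

```java
/** Minimal sketch: report the heap limits the running JVM actually has. */
final class HeapCheck {
    private static final long MIB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied by objects (allocated minus free), in MiB. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    public static void main(String[] args) {
        System.out.println("max heap  : " + maxHeapMiB() + " MiB");
        System.out.println("used heap : " + usedHeapMiB() + " MiB");
    }
}
```

Calling usedHeapMiB() before and after the prepare() step would give a rough idea of how much memory that step needs.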
>
> The code is really simple:
> imports:
> import java.io.*;
> import com.hp.hpl.jena.ontology.*;
> import com.hp.hpl.jena.rdf.model.*;
> import org.mindswap.pellet.jena.*;
>
> and here's the sketch of the main():
> OntModel pelletModel =
>     ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);
> pelletModel.read("file:myOntFile.owl");
> pelletModel.prepare();
> // realize on the knowledge base behind pelletModel
> ((PelletInfGraph) pelletModel.getGraph()).getKB().realize();
>
>> But actually listStatements is really slow as well, just like  
>> writeAll... but I haven't tried it yet.
>
> One idea is to copy the inferred model into a new, plain one (OWL_MEM
> for example). Then a writeAll() call will surely be efficient,
> because individual queries no longer trigger any inference process;
> each one becomes just an in-memory lookup. :)
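
A point worth keeping in mind about the copy idea: filling the plain model still requires enumerating the inferred model once, so the reasoning cost is paid during the copy rather than avoided. The library-free sketch below illustrates this with a stand-in iterator that counts every element it produces. SnapshotSketch, expensiveSource and the work counter are hypothetical names, not Jena or Pellet API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch: drain an expensive, lazily computed source once, then read from the copy. */
final class SnapshotSketch {

    /** Copies every element of the source into a plain in-memory list. */
    static <T> List<T> snapshot(Iterator<T> source) {
        List<T> copy = new ArrayList<>();
        while (source.hasNext()) {
            copy.add(source.next());
        }
        return copy;
    }

    /**
     * Hypothetical stand-in for an inferred model: each next() call
     * increments the counter, representing work done by a reasoner.
     */
    static Iterator<String> expensiveSource(final int size, final AtomicInteger work) {
        return new Iterator<String>() {
            private int index = 0;

            @Override
            public boolean hasNext() {
                return index < size;
            }

            @Override
            public String next() {
                if (index >= size) {
                    throw new NoSuchElementException();
                }
                work.incrementAndGet();
                return "statement-" + index++;
            }
        };
    }
}
```

After snapshot() returns, the counter equals the number of elements and does not grow on later lookups in the list. With Jena, the analogous step would presumably be adding the inferred model's statements to a model created without a reasoner; that part is not shown here.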
>
>> Btw, for timing, I usually use the Timers class
>> (org.mindswap.pellet.utils.Timers), for instance:
>>    Timers timers = new Timers();
>>    // read the file
>>    timers.startTimer("Read");
>>    model.read( ont );
>>    timers.stopTimer("Read");
>>    timers.print(true,null);
>
> Thanks, I'll give them a try later. At the moment I'm using
> System.nanoTime() and just printing the (end - start) difference. :)
> Thanks again for your precious help :)
> Matteo
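
For the System.nanoTime() approach mentioned above, a small helper keeps the per-phase bookkeeping in one place. This is a minimal sketch using only the JDK. PhaseTimer and its method names are invented here and are not the Pellet Timers API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of a named-phase stopwatch built on System.nanoTime(). */
final class PhaseTimer {
    private final Map<String, Long> startedAt = new LinkedHashMap<>();
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    /** Records the start time of a phase. */
    void start(String phase) {
        startedAt.put(phase, System.nanoTime());
    }

    /** Stops a running phase, adds its duration to the total, and returns it in nanoseconds. */
    long stop(String phase) {
        Long begin = startedAt.remove(phase);
        if (begin == null) {
            throw new IllegalStateException("phase not started: " + phase);
        }
        long delta = System.nanoTime() - begin;
        elapsedNanos.merge(phase, delta, Long::sum);
        return delta;
    }

    /** Total time accumulated for a phase, in milliseconds (0 if never stopped). */
    long totalMillis(String phase) {
        return elapsedNanos.getOrDefault(phase, 0L) / 1_000_000L;
    }

    /** Prints one line per phase, in the order the phases were first stopped. */
    void print() {
        elapsedNanos.forEach((phase, nanos) ->
                System.out.printf("%-10s %d ms%n", phase, nanos / 1_000_000L));
    }
}
```

Typical use would be timer.start("Read") before model.read(...), timer.stop("Read") after it, and timer.print() at the end. Timing prepare() and the listStatements() loop as separate phases would show where the time actually goes.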
>
>> Cheers,
>> -Rinke
>> On 21 May 2008, at 17:35, Matteo Montalto wrote:
>>> Hello Rinke, and thanks for your help,
>>> performance is good if I don't call the writeAll() method: to
>>> give an idea, about 2 seconds with a 50+ k ontology written in
>>> RDF/XML. But that's trivially expected: if I don't call a method
>>> that triggers Pellet, all I actually get is my OWL file stored in
>>> an OntModel object. :)
>>> There's no inference at all in such an OntModel, and nothing can
>>> be realized if Pellet isn't triggered.
>>> That leads to the second question in my reply: what's the minimum
>>> code needed to trigger Pellet? :-P I could try to do my task in two
>>> distinct steps: 1) trigger Pellet (so that I can measure the
>>> engine's running time against my solution); 2) write out the model
>>> by calling listStatements() and iterating over the results (this
>>> should waste less memory than repeated queries, which is what
>>> writeAll() does).
>>>
>>>
>>>
>>> Rinke Hoekstra wrote on 21/05/2008 at 17.05:
>>>> Hi Matteo,
>>>> What is the performance like if you do not call the  
>>>> model.writeAll method?
>>>> -Rinke
>>>> On 19 May 2008, at 17:30, Matteo Montalto wrote:
>>>>> Hello list, please forgive me for this long post :-P
>>>>>
>>>>> I'm working with Jena to build a SWRL rule interpreter.
>>>>> To check results and performance, I wrote two simple programs
>>>>> that configure the system as follows:
>>>>>
>>>>> 1) First solution (my own): I load an ontology from an .owl file into
>>>>> a base model (with no inference at all); then I build a Pellet model
>>>>> over it (using Pellet as the OWL reasoner, with SWRL safe rule
>>>>> support disabled). Finally, a rule model is built over the Pellet
>>>>> model. Visually, the schema is as follows:
>>>>> -------------------------------
>>>>> |  upperModel (ruleReasoner)  |
>>>>> |   ---------------------     |
>>>>> |  | innerModel (pellet  |    |
>>>>> |  |           no_SWRL)  |    |
>>>>> |  |  ----------------   |    |
>>>>> |  |  | base (OWL_MEM)|  |    |
>>>>> |  |  ----------------   |    |
>>>>> |  |                     |    |
>>>>> |   ---------------------     |
>>>>> -------------------------------
>>>>>
>>>>> 2) Second solution (the simpler one): I create a Pellet model and load
>>>>> my ontology into it (configuring Pellet to support SWRL safe rules).
>>>>> This is the schema:
>>>>> -------------------------------
>>>>> |    pelletModel (pellet and  |
>>>>> |                SWRL support)|
>>>>> -------------------------------
>>>>>
>>>>> An equivalent schema would be:
>>>>>
>>>>> -------------------------------
>>>>> |    pelletModel (pellet and  |
>>>>> |                SWRL support)|
>>>>> |     -------------------     |
>>>>> |    | base (OWL_MEM)    |    |
>>>>> |     -------------------     |
>>>>> -------------------------------
>>>>> meaning, in this last case, that pelletModel is built over a
>>>>> base model that performs no inference and is used only to load the
>>>>> ontology from the file. (I say "equivalent" because creating the
>>>>> nested model has no effect on the performance of the overall system
>>>>> in this case.)
>>>>>
>>>>> The problem:
>>>>> while solution 1) runs smoothly and performs well, solution 2)
>>>>> seems REALLY slow (my test ontology for this case is this one:
>>>>> http://www.megafileupload.com/en/file/65341/daycare-swrl-owl.html, 35 k,
>>>>> expressivity ALCOIF(D), with 4 SWRL rules).
>>>>>
>>>>> To give an idea: while solution 1) produces its output in a few
>>>>> seconds (the output comes from a call to the Jena API's writeAll()
>>>>> method), the second solution produces the expected output
>>>>> ("expected" meaning that it looks correct to me) in minutes.
>>>>> I ran some tests using the following Java code (which uses Jena):
>>>>>
>>>>> // solution 2 - snippet of the main()
>>>>> OntModel pelletModel =
>>>>>     ModelFactory.createOntologyModel(PelletReasonerFactory.THE_SPEC);
>>>>> pelletModel.read("file:myOntFile.owl");
>>>>>
>>>>> Then, in order to get an output file, I used:
>>>>> File f = new File("PelletOutput.owl");
>>>>> FileOutputStream fos = new FileOutputStream(f);
>>>>> PrintStream ps = new PrintStream(fos);
>>>>> // ps writes directly to the file; "N3" is the output language
>>>>> pelletModel.writeAll(ps, "N3", null);
>>>>> ps.close(); // flush and release the file handle
>>>>>
>>>>> The writeAll() call triggers Pellet, so I know that the output
>>>>> will contain the entire model (the base one plus the inferred
>>>>> statements).
>>>>> This implementation takes more than half an hour to terminate
>>>>> successfully, giving correct output (e.g., it also reports
>>>>> assertions inferred by applying the SWRL rules).
>>>>> A similar implementation of solution 1) produces its output in a
>>>>> few seconds.
>>>>>
>>>>> I also ran some other tests and asked on the Jena mailing list, but
>>>>> everything makes me think that this is a Pellet-related question,
>>>>> especially since in solution 1), where Pellet is used as the OWL
>>>>> reasoner without SWRL support, performance is definitely much better.
>>>>>
>>>>> Do you have any hints or suggestions for me?
>>>>> Thanks! :-)
>>>>> Thanks! :-)
>>>

-----------------------------------------------
Drs. Rinke Hoekstra

Email: hoekstra at uva.nl    Skype:  rinkehoekstra
Phone: +31-20-5253499     Fax:   +31-20-5253495
Web:   http://www.leibnizcenter.org/users/rinke

Leibniz Center for Law,          Faculty of Law
University of Amsterdam,            PO Box 1030
1000 BA  Amsterdam,             The Netherlands
-----------------------------------------------




