[Pellet-users] Pellet stalling / OutOfMemory exceptions during classification

Tony Burdett tburdett at ebi.ac.uk
Tue Nov 20 15:15:48 UTC 2007


Evren,

This is really helpful.  The ontology is kind of funky because it's 
being automatically generated to model a database, but your tips are 
very useful and have certainly given me something to actively target.  
I'll let you know how I get on!

Thanks very much for your time!

Tony

Evren Sirin wrote:
> I don't have anything that will solve your immediate problems, but
> below are some suggestions that might help you. I will describe the
> steps I took, as they may be helpful to people having performance
> problems with other ontologies.
>
> 1) Try classifying the ontology using the command line Pellet. This is
> just to make sure that the problem can be replicated without any
> dependency on custom code. In this case, I saw the behavior you
> described: classification goes very fast until 60%, then gets stuck
> and progresses very slowly after that. It eventually halts with
> memory exceptions.
>
> 2) Load the ontology into Protege4 and try with FaCT++. This step is
> to see if the bad behavior is specific to Pellet (a bug, missing
> optimization, etc.). I saw that FaCT++ stalls just like Pellet,
> suggesting the modeling in the ontology has some issues.
>
> 3) Set the logging level for ABox to INFO in the log4j.properties file 
> to see where exactly the reasoner gets stuck. I have seen that the 
> concept Seq_region_attrib_Record was the first problematic concept:
>
> INFO [ABox] - Consistency Seq_region_attrib_Record for 0 individuals []
> INFO [ABox] - Consistent: true Tree depth: 25 Tree size: 26726 Time: 23745
>
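For anyone repeating step 3 on a large ontology, a minimal Python sketch for ranking concepts by completion tree size might look like the following. It assumes the ABox log lines have exactly the shape of the two lines quoted above; the regular expressions and the names parse_abox_log and largest_trees are inventions of this sketch, not part of Pellet.

```python
import re

# Patterns modelled on the two ABox log lines quoted above (assumed format).
START = re.compile(r"INFO \[ABox\] - Consistency (\S+) for \d+ individuals")
RESULT = re.compile(
    r"INFO \[ABox\] - Consistent: (true|false) "
    r"Tree depth: (\d+) Tree size: (\d+) Time: (\d+)"
)


def parse_abox_log(lines):
    """Pair each 'Consistency <concept>' line with the result line after it."""
    records = []
    concept = None
    for line in lines:
        start = START.search(line)
        if start:
            concept = start.group(1)
            continue
        result = RESULT.search(line)
        if result and concept is not None:
            records.append({
                "concept": concept,
                "consistent": result.group(1) == "true",
                "depth": int(result.group(2)),
                "size": int(result.group(3)),
                "time": int(result.group(4)),
            })
            concept = None
    return records


def largest_trees(records, top=10):
    """Return the records with the biggest completion trees first."""
    return sorted(records, key=lambda r: r["size"], reverse=True)[:top]


sample = [
    "INFO [ABox] - Consistency Seq_region_attrib_Record for 0 individuals []",
    "INFO [ABox] - Consistent: true Tree depth: 25 Tree size: 26726 Time: 23745",
]
for rec in largest_trees(parse_abox_log(sample)):
    print(rec["concept"], rec["size"], rec["depth"], rec["time"])
```

Sorting by tree size rather than by time is a choice of this sketch; either column would point at the same kind of problematic concept.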
> Without getting too much into the details of the tableau algorithm,
> the tree depth and size refer to the completion tree created by the
> reasoner to prove the satisfiability of the concept. The reasoner
> starts with one node and every existential restriction
> (someValuesFrom, minCardinality) creates a new node. The reasoner
> stops creating new nodes when certain conditions are met. The fact
> that more than 26,000 nodes were created for this concept suggests
> there are some complicated existential restrictions in this ontology.
>
> I've investigated the Seq_region_attrib_Record concept briefly and
> found that this was the case. A Seq_region_attrib_Record is related
> (via cardinality restrictions) to a Seq_region_Record instance, a
> Dna_Record instance and a Dnac_Record instance. A Seq_region_Record
> has separate cardinality restrictions relating it to a Dna_Record and
> a Dnac_Record. I suspect these are the same Dna/Dnac records but there
> is nothing that states this equivalence. Therefore, the reasoner
> creates a new node for each of these restrictions in the completion
> tree. A Dna_Record and a Dnac_Record have separate cardinality
> restrictions relating them to a Seq_region_Record. Again, this is
> probably the same Seq_region_Record that Seq_region_attrib_Record
> refers to, but there is no such explicit relation stated.
>
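To make the growth concrete, here is a toy Python model of the restrictions described above. It is only a sketch under strong assumptions: every restriction always creates a fresh successor node, and expansion stops at a fixed depth, whereas a real tableau reasoner applies blocking and other optimizations. The dictionary and the function tree_size are invented for this illustration.

```python
# Successors each concept demands, as described in the paragraph above.
# One entry per restriction; nothing says two successors are the same node.
ORIGINAL = {
    "Seq_region_attrib_Record": [
        "Seq_region_Record", "Dna_Record", "Dnac_Record",
    ],
    "Seq_region_Record": ["Dna_Record", "Dnac_Record"],
    "Dna_Record": ["Seq_region_Record"],
    "Dnac_Record": ["Seq_region_Record"],
}


def tree_size(restrictions, concept, depth):
    """Count nodes in a tree expanded `depth` levels below `concept`."""
    if depth == 0:
        return 1
    return 1 + sum(
        tree_size(restrictions, child, depth - 1)
        for child in restrictions.get(concept, [])
    )


# Print how the node count grows as the expansion gets deeper.
for depth in (1, 2, 3, 4, 8, 12):
    print(depth, tree_size(ORIGINAL, "Seq_region_attrib_Record", depth))
```

Because the records refer back to each other, every extra level multiplies the number of nodes instead of adding a constant, which is the kind of blow-up the tree size in the log hints at.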
> My suggestion would be to express these relationships using complex
> subproperty axioms and to remove as many of the cardinality
> restrictions as possible. For example, you can write:
>
> (Seq_region_attrib_Record_has_Seq_region_Record 
> Seq_region_Record_has_Dna_Record)
>   subPropertyOf Seq_region_attrib_Record_has_Dna_Record
>
> (Seq_region_attrib_Record_has_Seq_region_Record 
> Seq_region_Record_has_Dnac_Record)
>   subPropertyOf Seq_region_attrib_Record_has_Dnac_Record
>
> and leave only one cardinality restriction in the 
> Seq_region_attrib_Record concept. You probably need to do this for 
> many other concepts to solve all the issues. For example, I can also 
> suggest adding the following axioms and removing one cardinality 
> restriction from Seq_region_Record:
>
> (Seq_region_Record_has_Dna_Record Dna_Record_has_Dnac_Record)
>   subPropertyOf Seq_region_Record_has_Dnac_Record
>
> (Seq_region_Record_has_Dnac_Record Dnac_Record_has_Dna_Record)
>   subPropertyOf Seq_region_Record_has_Dna_Record
>
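As a rough illustration of what the four chain axioms above entail, the Python sketch below applies them to a handful of asserted relations until nothing new can be derived. It only mimics the chain semantics on plain triples and is not how Pellet works internally; the individuals attrib1, region1, dna1 and dnac1 and the function apply_chains are made up for the example.

```python
def apply_chains(triples, chains):
    """Add (x, r, z) whenever (x, p, y) and (y, q, z) hold for a chain (p, q, r)."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        for p, q, r in chains:
            new = {
                (x, r, z)
                for (x, p1, y) in facts if p1 == p
                for (y2, q1, z) in facts if q1 == q and y2 == y
            }
            if not new <= facts:
                facts |= new
                changed = True
    return facts


# The four chain axioms from the message, written as (p, q, r): p o q => r.
CHAINS = [
    ("Seq_region_attrib_Record_has_Seq_region_Record",
     "Seq_region_Record_has_Dna_Record",
     "Seq_region_attrib_Record_has_Dna_Record"),
    ("Seq_region_attrib_Record_has_Seq_region_Record",
     "Seq_region_Record_has_Dnac_Record",
     "Seq_region_attrib_Record_has_Dnac_Record"),
    ("Seq_region_Record_has_Dna_Record",
     "Dna_Record_has_Dnac_Record",
     "Seq_region_Record_has_Dnac_Record"),
    ("Seq_region_Record_has_Dnac_Record",
     "Dnac_Record_has_Dna_Record",
     "Seq_region_Record_has_Dna_Record"),
]

# Hypothetical individuals, one asserted relation per step of the chains.
ASSERTED = [
    ("attrib1", "Seq_region_attrib_Record_has_Seq_region_Record", "region1"),
    ("region1", "Seq_region_Record_has_Dna_Record", "dna1"),
    ("dna1", "Dna_Record_has_Dnac_Record", "dnac1"),
]

for fact in sorted(apply_chains(ASSERTED, CHAINS) - set(ASSERTED)):
    print(fact)
```

The derived relations all point at individuals that already exist, which is the intent of the suggestion: the links follow from the chains instead of each needing its own restriction and its own new node.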
> I don't know how much sense this makes semantically, but I think these
> changes would simplify the models the reasoner constructs.
>
> Hope this helps,
> Evren
>
> On 11/16/07 4:17 AM, Tony Burdett wrote:
>> Hi guys,
>>
>> I have a tool to generate ontologies that reflect explicit semantics of
>> databases, and I'm having problems reasoning over the larger stuff.
>> This particular example was generated from the Ensembl Human database.
>>
>> Once I get over a certain size, I either get OutOfMemory exceptions or
>> the reasoner stalls altogether. I've tried reasoning over my ontology
>> using both the Java API and the Pellet command line jar, and I've
>> tried it with both Pellet 1.5 and 1.5.1 with no apparent difference.
>> I try to classify it from the command line using:
>>
>> pellet -if homo_sapiens_core_47_36i.owl -c TREE -s off > output.txt
>>
>> I set USE_CLASSIFICATION_MONITOR = swing in pellet.properties and
>> increased the -Xmx option to 1g. The little progress bar rips straight
>> up to around 60-70% and then completely stalls, before throwing an
>> OutOfMemoryError. Any feedback on what's going on here would be
>> really welcome. I've attached the ontology I'm using for reference.
>>
>> Thanks,
>>
>> Tony Burdett.


-- 
Tony Burdett
Software Developer,
ComparaGrid.

European Bioinformatics Institute
email: tburdett at ebi.ac.uk
tel:   01223 494624


