[Pellet-users] Pellet stalling / OutOfMemory exceptions during classification
Tony Burdett
tburdett at ebi.ac.uk
Tue Nov 20 15:15:48 UTC 2007
Evren,
This is really helpful. The ontology is kind of funky because it's
being automatically generated to model a database, but your tips are
very useful and have certainly given me something to actively target.
I'll let you know how I get on!
Thanks very much for your time!
Tony
Evren Sirin wrote:
> I don't have anything that will solve your immediate problems but
> below are some suggestions that might help you. I will describe the
> steps I have done as they may be helpful to people having performance
> problems with other ontologies.
>
> 1) Try classifying the ontology using the command line Pellet. This is
> just to make sure that the problem can be replicated without any
> dependency on custom code. In this case, I have seen the behavior you
> described: Classification goes very fast until 60% then gets stuck and
> progresses very slowly after that. It eventually halts with memory
> exceptions.
>
> 2) Load the ontology to Protege4 and try with FaCT++. This step is to
> see if the bad behavior is specific to Pellet (a bug, missing
> optimization, etc.). I have seen that FaCT++ stalls just like Pellet,
> suggesting that the modeling in the ontology has some issues.
>
> 3) Set the logging level for ABox to INFO in the log4j.properties file
> to see where exactly the reasoner gets stuck. I have seen that the
> concept Seq_region_attrib_Record was the first problematic concept:
>
> INFO [ABox] - Consistency Seq_region_attrib_Record for 0 individuals []
> INFO [ABox] - Consistent: true Tree depth: 25 Tree size: 26726 Time: 23745
>
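For anyone wanting to scan a long log for the worst offenders, here is a minimal sketch in Python. It assumes the log lines look exactly like the two quoted above (the format is taken from this message only and may differ in other Pellet versions); the function name largest_trees is made up for illustration.

```python
import re

# Matches a "Consistency <concept> for ..." line followed by its result line.
# The layout is assumed from the two log lines quoted in this thread.
CHECK = re.compile(
    r"Consistency\s+(?P<concept>\S+)\s+for[^\n]*\n"
    r"[^\n]*Tree depth:\s*(?P<depth>\d+)\s+"
    r"Tree size:\s*(?P<size>\d+)\s+Time:\s*(?P<time>\d+)"
)

def largest_trees(log_text, top=5):
    """Return (concept, depth, size, time) tuples, largest tree first."""
    rows = [
        (m["concept"], int(m["depth"]), int(m["size"]), int(m["time"]))
        for m in CHECK.finditer(log_text)
    ]
    return sorted(rows, key=lambda r: r[2], reverse=True)[:top]

# Sample copied from the message, including the wrapped "Time:" value.
sample = """INFO [ABox] - Consistency Seq_region_attrib_Record for 0 individuals []
INFO [ABox] - Consistent: true Tree depth: 25 Tree size: 26726 Time:
23745
"""

print(largest_trees(sample))
```

Concepts with an unusually large tree size are the ones worth inspecting first.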
> Without getting too much into the details of the tableau algorithm, the
> tree depth and size refer to the completion tree created by the
> reasoner to prove the satisfiability of the concept. The reasoner
> starts with one node and every existential restriction (someValuesFrom,
> minCardinality) creates a new node. The reasoner stops creating new
> nodes when certain conditions are met. The fact that more than 20K
> nodes were created for this concept suggests there are some complicated
> existential restrictions in this ontology.
>
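To make the node-creation idea concrete, here is a toy sketch in Python. It is not Pellet's algorithm: it only counts one node per existential-style restriction and stops expanding a node when an ancestor has the same class (a naive stand-in for the real blocking conditions). The class names come from this thread, but the restriction table itself is an assumption made for illustration.

```python
# Toy model: each class maps to the classes it needs a successor in,
# one entry per existential or min-cardinality restriction (assumed).
RESTRICTIONS = {
    "Seq_region_attrib_Record": ["Seq_region_Record", "Dna_Record", "Dnac_Record"],
    "Seq_region_Record": ["Dna_Record", "Dnac_Record"],
    "Dna_Record": ["Seq_region_Record"],
    "Dnac_Record": ["Seq_region_Record"],
}

def count_nodes(cls, restrictions, ancestors=()):
    """Count nodes created when expanding cls, with naive ancestor blocking."""
    if cls in ancestors:
        return 1  # the node is created but blocked, so it is not expanded
    path = ancestors + (cls,)
    return 1 + sum(
        count_nodes(child, restrictions, path)
        for child in restrictions.get(cls, [])
    )

full = count_nodes("Seq_region_attrib_Record", RESTRICTIONS)

# Variant with fewer restrictions on the root class, for comparison.
FEWER = dict(RESTRICTIONS, Seq_region_attrib_Record=["Seq_region_Record"])
reduced = count_nodes("Seq_region_attrib_Record", FEWER)

print(full, reduced)
```

Even in this tiny model the tree shrinks noticeably when the root carries fewer restrictions; with hundreds of generated classes referring to each other the effect is far larger.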
> I've investigated the Seq_region_attrib_Record concept briefly and
> confirmed this was the case. A Seq_region_attrib_Record is related (via
> cardinality restrictions) to a Seq_region_Record instance, a
> Dna_Record instance and a Dnac_Record instance. A Seq_region_Record has
> separate cardinality restrictions relating it to a Dna_Record and a
> Dnac_Record. I suspect these are the same Dna/Dnac records but there is
> nothing that states this equivalence. Therefore, the reasoner creates
> a new node for each of these restrictions in the completion tree. A
> Dna_Record and a Dnac_Record have separate cardinality restrictions
> relating them to a Seq_region_Record. Again, this is probably the same
> Seq_region_Record that Seq_region_attrib_Record refers to, but there is
> no such explicit relation stated.
>
> My suggestion would be to express these relationships using complex
> subproperty axioms and remove as many of the cardinality restrictions
> as possible. For example, you can write:
>
> (Seq_region_attrib_Record_has_Seq_region_Record
> Seq_region_Record_has_Dna_Record)
> subPropertyOf Seq_region_attrib_Record_has_Dna_Record
>
> (Seq_region_attrib_Record_has_Seq_region_Record
> Seq_region_Record_has_Dnac_Record)
> subPropertyOf Seq_region_attrib_Record_has_Dnac_Record
>
> and leave only one cardinality restriction in the
> Seq_region_attrib_Record concept. You probably need to do this for
> many other concepts to solve all the issues. For example, I can also
> suggest adding the following axioms and removing one cardinality
> restriction from Seq_region_Record:
>
> (Seq_region_Record_has_Dna_Record Dna_Record_has_Dnac_Record)
> subPropertyOf Seq_region_Record_has_Dnac_Record
>
> (Seq_region_Record_has_Dnac_Record Dnac_Record_has_Dna_Record)
> subPropertyOf Seq_region_Record_has_Dna_Record
>
> I don't know how much sense this makes semantically, but I think these
> changes would simplify the models the reasoner constructs.
>
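As a rough illustration of what a complex subproperty (property chain) axiom entails, here is a small Python sketch. It does a single pass over a set of triples and is not how a reasoner implements chains; the individuals attrib1, region1 and dna1 and the function name apply_chain are hypothetical, only the property names come from the axioms above.

```python
def apply_chain(triples, first, second, result):
    """Add (x, result, z) whenever (x, first, y) and (y, second, z) hold."""
    inferred = set(triples)
    for (x, p, y) in triples:
        if p != first:
            continue
        for (y2, q, z) in triples:
            if q == second and y2 == y:
                inferred.add((x, result, z))
    return inferred

# Hypothetical individuals linked by two of the properties from the axioms.
facts = {
    ("attrib1", "Seq_region_attrib_Record_has_Seq_region_Record", "region1"),
    ("region1", "Seq_region_Record_has_Dna_Record", "dna1"),
}

closed = apply_chain(
    facts,
    "Seq_region_attrib_Record_has_Seq_region_Record",
    "Seq_region_Record_has_Dna_Record",
    "Seq_region_attrib_Record_has_Dna_Record",
)

for triple in sorted(closed):
    print(triple)
```

The point is that the link from attrib1 to dna1 is derived from the existing links, so it reuses the Dna_Record already reached through the Seq_region_Record instead of needing a separate restriction that makes the reasoner create another node.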
> Hope this helps,
> Evren
>
> On 11/16/07 4:17 AM, Tony Burdett wrote:
>> Hi guys,
>>
>> I have a tool to generate ontologies that reflect explicit semantics of
>> databases, and I'm having problems reasoning over the larger stuff.
>> This particular example was generated from the Ensembl Human database.
>>
>> Once I get over a certain size, I'm getting either OutOfMemory
>> exceptions or else it's stalling altogether. I've tried reasoning over
>> my ontology using both the Java API and the Pellet command line
>> jar, and I've tried it with both Pellet 1.5 and 1.5.1 with no apparent
>> difference. I tried to classify it from the command line using:
>>
>> pellet -if homo_sapiens_core_47_36i.owl -c TREE -s off > output.txt
>>
>> I set USE_CLASSIFICATION_MONITOR = swing in pellet.properties and
>> increased the -Xmx option to 1g. The little progress bar rips straight
>> up to around 60-70% and then completely stalls, before throwing an
>> OutOfMemoryError. Any feedback on what's going on here would be
>> really welcome. I've attached the ontology I'm using for reference.
>>
>> Thanks,
>>
>> Tony Burdett.
--
Tony Burdett
Software Developer,
ComparaGrid.
European Bioinformatics Institute
email: tburdett at ebi.ac.uk
tel: 01223 494624