[Pellet-users] Reasoning Speed and Memory Problems

Bijan Parsia bijan at clarkparsia.com
Mon Jan 21 12:26:42 UTC 2008


Hi Stuart,

The question of what makes an ontology difficult (for certain reasoners) to reason with is a complex one without simple answers. (I know of two PhD proposals that have crashed and burned on this!)

One real danger in giving advice about how to distort your modeling is that, really, the best we can do is give *conservative* advice (which is maximally distorting). Plus, any advice is likely to be made obsolete relatively quickly. There's a lot of amazing work coming out on large expressive tboxes *and* large aboxes (hypertableaux and abox summarization are just two examples). While it takes resources for these to migrate to production systems, it does happen, and it would be unfortunate to have adopted a modeling style that is *bad* for your application or domain. Also, if you model "to the existing reasoners" you can distort how reasoners (and reasoner research) develop, since we'll be tuning for what you give us, not what you *want* to give us.

You might check out my profiling ontologies paper for more discussion:
    http://iswc2007.semanticweb.org/papers/589.pdf

One general methodology is to model first, then approximate or otherwise tune your ontology. (As long as your reasoner works fast enough for modeling, this is fine.)

These caveats aside, the most robust current advice would be to stick inside a tractable fragment:
   http://iswc2007.semanticweb.org/papers/589.pdf

For a long time, the only known tractable fragments, well, sucked, but the past few years have turned up a lot of quite nice and expressive ones.

While the implementations of these are still few, making new ones is a much much easier task and they tend to be robust in their performance. That is, it's harder to build ontologies that bust them hard. C&P is experimenting with the relevant algorithms which I imagine will trickle into Pellet one way or another.

Sound ontology segmentation can help quite a bit, not so much because it makes things faster overall, necessarily, but because it lets you separate your tough performing bits from the rest. Many ontologies are only hard in a few cases (see paper).

One specific "sore spot" at the moment is that the mere mention of inverses (even if you don't use them) disables a number of optimizations in Pellet. This will go away, I think, as we figure out how to integrate anywhere caching, but currently it can blow out time and space pretty easily (see the paper for a discussion).
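
As a rough way to see whether an ontology mentions inverses at all, a plain text scan along the following lines might be enough to start with. This is only a sketch under stated assumptions: the function name and the token list are made up for illustration, the tokens cover a few common OWL serializations and are not exhaustive, and this is not how Pellet itself decides which optimizations to disable. A reliable check would load the ontology with a real OWL parser.

```python
import re

# Tokens that usually signal an inverse property in common OWL
# serializations (RDF/XML and Turtle, functional syntax, Manchester
# syntax). Assumption: this list is illustrative, not exhaustive.
INVERSE_TOKENS = re.compile(
    r"owl:inverseOf|ObjectInverseOf|InverseObjectProperties|InverseOf:"
)


def find_inverse_mentions(ontology_text):
    """Return (line_number, stripped_line) pairs that mention an inverse."""
    hits = []
    for lineno, line in enumerate(ontology_text.splitlines(), start=1):
        if INVERSE_TOKENS.search(line):
            hits.append((lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical RDF/XML fragment, used only to exercise the function.
    sample = (
        '<owl:ObjectProperty rdf:about="#hasPart">\n'
        '  <owl:inverseOf rdf:resource="#partOf"/>\n'
        "</owl:ObjectProperty>"
    )
    for lineno, line in find_inverse_mentions(sample):
        print(lineno, line)
```

If the scan turns up inverse declarations that nothing else in the ontology relies on, that would be a cheap thing to try removing before timing the reasoner again.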

I hope this helps. I hope to take up again the general challenge of user performance tuning and understanding at some point. OWL WG and debugging work are consuming me at the moment :)

Cheers,
Bijan.


