[Pellet-users] non-determinism across runs due to lack of hashcode for Node

Aditya Kalyanpur adityak at gmail.com
Tue Jun 26 17:24:28 UTC 2007


On 6/26/07, Evren Sirin <evren at clarkparsia.com> wrote:
> On 6/26/07 11:00 AM, Aditya Kalyanpur wrote:
> > Hi All,
> >
> > A quick fyi for those who don't know: at IBM Watson we are working on
> > a scalable reasoning tool SHER that internally uses Pellet for
> > reasoning and justification finding. Evren, as always, has been
> > enormously helpful in discussing and integrating fixes in Pellet for
> > bugs that we discovered in our experiments with SHER. I have made the
> > mistake in the past of discussing ideas with him on IM and forgetting
> > to share it with the list, so my apologies for that.
> >
> > Anyway, on to two issues we recently discovered in Pellet 1.4 (and just
> > checked that they are still there in the latest release, 1.5 RC1):
> >
> > 1. We noticed a fair bit of non-determinism across Pellet runs for the
> > exact same dataset (runtimes varying by a factor of 10 for consistency
> > checking of a particular ontology),
>
> Hmm, that's a huge difference in performance. I'd be interested to see
> the ontology.

Hmm. Just checked with Aaron, who experienced this, and he mentioned it
was a version of Galen. I'll ask him to send it to you.

> > and we narrowed down the reason to
> > the lack of a hashcode implementation for the Node class.
>
> This would surely reduce the non-determinism but note that there are
> many other places in the tableaux algorithm that will introduce
> non-determinism like certain uses of unionOf and maxCardinality. So what
> you are describing will just address one part of the issue regarding
> non-determinism.

Understood.

> > Is there a
> > reason for the Node class not having a hashCode?
> Simply put it was never required. The default implementation was enough
> for our purposes.
>
> > We have implemented a
> > simple hashCode function that uses the hashcode of the ATerm
> > associated with the node -- is this ok?
> >
> That is certainly ok, i.e. it will not break anything. I don't know how
> useful it is, e.g. making the runs more deterministic might mean you are
> deterministically making them run at the slowest speed all the time. I
> guess what you have seen with the ontologies you tried was the opposite
> and you deterministically got fast results with this change. I don't
> know if that is the case in general. I would be more interested in
> figuring out why there is so much change in the reasoning performance
> for different hashCode implementations. That might reveal some issues in
> some other part of the reasoner.

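Well, without a hashCode override, the default hash function inherited from
Object is identity-based (typically derived from the object's address), so it
changes from run to run, and any hash-based collection of Nodes can iterate
in a different order each time. But yes, such a big difference between runs
is surprising.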
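
For anyone following along, here is a minimal sketch of the kind of change described above. It is not the actual SHER or Pellet code: `TermNode` is a hypothetical stand-in for Pellet's Node, and a plain String plays the role of the node's ATerm. It assumes the term's hash is computed from its content and is therefore stable across runs, which holds for String but is only an assumption here for ATerm.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NodeHashDemo {

    // Hypothetical stand-in for a tableau node; "name" plays the role of the ATerm.
    static final class TermNode {
        final String name;

        TermNode(String name) {
            this.name = name;
        }

        // Delegate to the term's hash instead of Object's identity-based hash.
        // equals() is left as identity, so the equals/hashCode contract still holds.
        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Collects the names in the order a HashSet of nodes iterates over them.
    static List<String> iterationOrder(String... names) {
        Set<TermNode> nodes = new HashSet<>();
        for (String n : names) {
            nodes.add(new TermNode(n));
        }
        List<String> order = new ArrayList<>();
        for (TermNode node : nodes) {
            order.add(node.name);
        }
        return order;
    }

    public static void main(String[] args) {
        // With a content-based hash, this order no longer depends on object identity.
        System.out.println(iterationOrder("x", "y", "z", "anon1"));
    }
}
```

With the override removed, the same program may print the names in a different order on different runs, which is the effect that can change the order in which a reasoner visits nodes.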

> > 2. We also noticed that there was no ATermUtils function that
> > normalizes to a CNF or DNF form?
> No, there is not. ATermUtils provides two different normalization
> functions used in different parts of the reasoner: Negation Normal Form
> (NNF) and concept normalization as described in DLHB chapter 9. CNF and
> DNF were never internally required and thus were not implemented.

You're right, never mind :-)
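
As a reminder of what NNF does, here is a rough sketch over plain boolean connectives only: negation is pushed inward until it sits directly on atomic concepts. The `Concept` class and the `nnf`/`negate` helpers are hypothetical names made up for this illustration; this is not ATermUtils, and a real description logic NNF would also have to handle quantifiers and cardinality restrictions.

```java
public class NnfDemo {

    enum Kind { ATOM, NOT, AND, OR }

    // Hypothetical concept tree; NOT uses only "left".
    static final class Concept {
        final Kind kind;
        final String name;
        final Concept left;
        final Concept right;

        private Concept(Kind kind, String name, Concept left, Concept right) {
            this.kind = kind;
            this.name = name;
            this.left = left;
            this.right = right;
        }

        static Concept atom(String name) { return new Concept(Kind.ATOM, name, null, null); }
        static Concept not(Concept c) { return new Concept(Kind.NOT, null, c, null); }
        static Concept and(Concept a, Concept b) { return new Concept(Kind.AND, null, a, b); }
        static Concept or(Concept a, Concept b) { return new Concept(Kind.OR, null, a, b); }

        @Override
        public String toString() {
            switch (kind) {
                case ATOM: return name;
                case NOT:  return "not(" + left + ")";
                case AND:  return "and(" + left + ", " + right + ")";
                default:   return "or(" + left + ", " + right + ")";
            }
        }
    }

    // Returns an equivalent concept in which negation applies only to atoms.
    static Concept nnf(Concept c) {
        switch (c.kind) {
            case ATOM: return c;
            case AND:  return Concept.and(nnf(c.left), nnf(c.right));
            case OR:   return Concept.or(nnf(c.left), nnf(c.right));
            default:   return negate(c.left); // NOT: push the negation inward
        }
    }

    // Returns the NNF of not(c), using double negation and De Morgan's laws.
    static Concept negate(Concept c) {
        switch (c.kind) {
            case ATOM: return Concept.not(c);
            case NOT:  return nnf(c.left);
            case AND:  return Concept.or(negate(c.left), negate(c.right));
            default:   return Concept.and(negate(c.left), negate(c.right)); // OR
        }
    }

    public static void main(String[] args) {
        Concept c = Concept.not(Concept.and(Concept.atom("A"),
                Concept.or(Concept.atom("B"), Concept.not(Concept.atom("C")))));
        System.out.println(nnf(c)); // prints or(not(A), and(not(B), C))
    }
}
```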

Regards,
Aditya

> Cheers,
> Evren
> > If you are interested, we can share
> > the code we have for generating these normal forms.
> >
> > Regards,
> > Aditya
> >
> > -------------------------
> > Aditya Kalyanpur
> > IBM TJ Watson Research Center, Hawthorne NY
> > 914-784-7097 (t/l 863)

