{ by david linsin }

June 29, 2008

Does Your API Docs Leave Users in the Dark?

Have you ever used an API and you didn't know what would happen, if you called a method and passed null or an empty List as a parameter? And not only didn't you know what's gonna happen to you as a caller, but you also had no idea what the state of the system would be, in case something went wrong. We don't have dig deep to find APIs like that. Take a look at the documentation of the method save in org.hibernate.Session:

save
public Object load(Class theClass, Serializable id) throws HibernateException

Return the persistent instance of the given entity class with the given identifier, assuming that the instance exists.

You should not use this method to determine if an instance exists (use get() instead). Use this only to retrieve an instance that you assume exists, where non-existence would be an actual error.

Parameters:
theClass - a persistent class
id - a valid identifier of an existing persistent instance of the class

Returns:
the persistent instance or proxy

Throws:
HibernateException

You might argue that in this example it is pretty obvious what's going to happen, if you call load with null as the identifier. But is it really that obvious? Maybe it is to you, because you know the API. To be completely honest, I had to find out the hard way.

My first thought was that it must throw a HibernateException, because that's what the method throws, according to the API docs. Obviously that was a wrong assumption, since @Test(expect=org.hibernate.HibernateException) failed. So what's it gonna be? IllegalArgumentException, NullPointerException or maybe it simply returns null? I guess you get my point by now: document what happens, in case your users screws up calling your API! Don't let them grope in the dark!

In case of Hibernate, I had to run tests to find out that the method actually throws an IllegalArgumentException, if you pass in null as identifier. I think it doesn't need to be that way. If you'd simply document your API according to Item 28 of Effective Java's guidelines, coding-life would be a lot simpler. Take a look at the following javadocs of the List interface:

get
public E get(int index)

Returns the element at the specified position in this list.

Parameters:
index - index of the element to return

Returns:
the element at the specified position in this list

Throws:
IndexOutOfBoundsException - if the index is out of range (index < 0 || index >= size())

It tells you everything you need to know in order to be prepared in case a user screws up: the upper and lower boundary of the argument and the Exception which is possibly thrown in case those boundaries are not respected.

I'm a big fan of documenting limits and constraints of arguments, possible Exceptions thrown in case of violating those constraints and any state changes that might result due to those infringements. One big advantage of this kind of documentation, is that it's easy to write test cases against that API. I was wondering if there was a tool, which could infer all the possible violations based on the javadocs and generate test cases for them, but I couldn't find any. Another benefit is that you don't have to look through all the possible code paths, for each implementations of the interface, just to find out what would happen, if you messed up. Sometimes you might not even have the source code. In that case it's virtually impossible to code against an API without good documentation. At the time writing against the Hibernate API, I didn't have the source code - for particular reasons - so my only resource was the javadocs, which turned out to be a quite disappointing.

I consider documentation an integral part of the contract an API publishes. Tip 21 of The Pragmatic Programmer puts it perfectly:

Design with Contracts
Use contracts to document and verify that code does no more and no less than it claims to do.

I usually tell developers using my APIs to go and take a look at the javadocs, when they want to know e.g. whether to check the parameters before calling a method. Frankly, I'm guilty of the same thing. When using the Hibernate API, I called the load method to check whether an instance exists in the database. Actually, I should have called get instead. I didn't consult the documentation, partially because of its quality. However it took me quite a while to find the bug, when a look at the javadocs would have solved the problem in no time.

Maybe we are so used to bad API docs, that we don't consult them anymore. I think it doesn't need to this way. We should try to shed some light on the usage of our APIs by providing a reasonable documentation.

3 comments:

unmaintainable said...

I kind of disagree with your example; documenting everything in every tiny detail may be a requirement for widely used public APIs (like the JDK), but IMHO a more relaxed approach is useful for the rest of us. Like "configuration by exception" is considered a good idea, there should also be "documentation by exception". That is, don't document every tiny detail but instead follow some higher level conventions. Many people do that intuitively, but unfortunately forget to document those conventions.

In many APIs it's a common (but unstated) assumption that you musn't pass null unless explicitly permitted. If you do, the usual Java conventions say to throw a NullPointerException (the JDK does, might even be mentioned in Effective Java). Not everybody does that, it's also common to throw an IllegalArgumentException. In any way, programming errors (and I'd say this is) yield RuntimeExceptions.

Apart from that I think that "design by contract" could be a huge gain for the industry as a whole. I don't know if there's decent tool support for Java (generating code from comments isn't my idea of "decent"), but if common IDEs and build systems supported it, we could have better error checking and better documentation at the same time.

Unfortunately, I've never seen DbC used outside of university. It's mentioned as the practical application of Hoare Calculus (at least at my uni) but hasn't been widely adopted. Pretty sad.

david said...

unmaintainable > ...documenting everything in every tiny detail may be a requirement for widely used public APIs (like the JDK), but IMHO a more relaxed approach is useful for the rest of us. Like "configuration by exception" is considered a good idea, there should also be "documentation by exception". That is, don't document every tiny detail but instead follow some higher level conventions.

I totally agree that you shouldn't document everything in detail. There are exceptions, e.g. internal interfaces or dependencies. I tend to document everything, which is kind of a bad habit of mine. I'm trying to quit :-)

However, I think if you are developing an API of a subsystem or component, which could potentially be reused or share across teams, it's crucial to document it. I'm not talking about internals! Just the interfaces, like Spring does it. If you are consuming an undocumented API, you are left assuming or should I say hoping that the developer of that module was thinking the same way we did, which is following common Java conventions.

unmaintainable said...

I completely agree, decent documentation for published interfaces should be mandatory. Otherwise your clients have to write test cases to check their assumptions (which may not always be the worst idea anyway).

A while ago, I blogged about my personal strategy for code comments and documentation, in case you're interested.


com_channels

  • mail(dlinsin@gmail.com)
  • jabber(dlinsin@gmail.com)
  • skype(dlinsin)

recent_postings

loading...