Compellingly Beautiful Software
I could have written this piece twenty years ago. Given what I still see and the frequency with which I see it, it is no less relevant today.
Remember when you learned geometry? You encountered that big word perpendicular. What does it mean? Somebody drew a picture with two intersecting lines, pointed at it, and said “Perpendicular means… that.” You easily recognized what that was, but your teacher still made you state that more carefully. When formulating proofs, you couldn’t depend on seeing that to know that was there. The more careful definition said: if two intersecting lines form a pair of adjacent angles that are equal, then the lines are perpendicular.
Remember when you learned about vectors? You soon learned that if A and B are vectors, and A∙B = 0, then A and B are perpendicular. Amazing! It’s one of those things that’s hard to believe until you see it. This works in two dimensions. It also works in three dimensions. Then your teacher told you it works in any number of dimensions. Of course, you wondered what perpendicular looks like in seven dimensions, or thirteen, or thirty-eight. But you eventually came to understand that what it looks like doesn’t matter as much as that perpendicular vectors behave in a particular way, and you don’t have to be able to see it to know it or to use it. If you have vectors A and B in any number of dimensions, and A∙B = 0, you know they are perpendicular, and that has interesting and useful consequences beyond what they might look like.
If you went farther in mathematics, you noticed that, in upper-division courses, the definition of perpendicular was turned on its head. It was no longer about what it looks like, where things that look like that happen to have A∙B = 0. The more elementary thing about perpendicular is given as A∙B = 0, which also happens to mean A and B look like that in situations where you can see them.
Remember when you learned to program? You learned to think very carefully about sequences of actions performed on numbers. Eventually, you learned that it is often useful to group certain numbers together into a structure. Then you learned to write procedures that work inside these structures. You also learned to write procedures that treat structures as things in their own right, regardless of what is inside them. Typically, you defined structures in one place, then wrote the code that worked with them somewhere else, probably in a different file.
Somebody had the bright idea to syntactically associate the code that works on structures with the structure definitions themselves.
It is not just a matter of how code is grouped together in files. There is a style of thinking that goes along with this, and it has significant payoffs in the economics of code production. Amazing! It’s one of those things that’s hard to believe until you see it. And, just as with the definition of perpendicular, there is a place where the definition of structure is turned on its head. The most important thing about the turned-over definition is not the data grouped together inside the structure. The most important thing is what the structure does. How does the structure behave? Whatever data may be inside is important, but it is secondary. With this new kind of thinking, we don’t call it a structure anymore. We call it an object.
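To make the shift concrete, here is a minimal sketch in Java with made-up names. The first class is a structure in the old sense; the second is an object in the sense just described, where callers deal in behavior rather than reaching inside for the data.

    // Structure-style: the data is public and the interesting code lives somewhere else.
    class AccountRecord {
        public long balanceInCents;
    }

    // Object-style: the behavior lives with the data it depends on.
    class Account {
        protected long balanceInCents;

        public void deposit(long amountInCents) {
            if (amountInCents < 0) throw new IllegalArgumentException("negative deposit");
            balanceInCents += amountInCents;
        }

        public boolean canCover(long amountInCents) {
            return balanceInCents >= amountInCents;
        }
    }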
It is telling that the first real object-oriented programming language was named Simula. The idea was to have software objects simulate the behavior of things in the real world. Birds fly, fish swim, living cells divide, bridges connect. The world is full of things and their behavior. As an object-oriented programmer, it is your job to notice and name this behavior and associate it with who is doing it. The conversion of behavior into code and data happens later.
Some software objects have no correlated object in the real world. StringBuffer or DataInputStream are good examples. But the style of thinking that goes into simulating real-world objects carries over into these software-only objects, too. If you use this style of thinking, several nice things often happen: your code is less complex, easier to change, and may even be reusable in contexts far outside of your original work.
Some objects exist only to hold data. But it is valuable to try to think beyond this. One of the classic examples of OO programming is a class called Employee. An instance of this class has a name, home address, job title, start date, end date, hours worked, and probably lots of other data, all of which it needs to know. But let’s look at Employee from another angle: an Employee joins the company, then works and gets paid, might get sick, hopefully takes vacation now and then. Eventually, an Employee quits or retires or dies or is fired. All of these things involve Employee attribute data somehow, but that is almost beside the point. And the fact that it is almost beside the point is exactly the point of real OO programming.
That sentence about how an Employee eventually quits or retires or dies or is fired is especially interesting. All of these behaviors indicate that the Employee leaves the company, but for different reasons. These differences may be significant in what happens with the Employee’s underlying data. Even more significant differences may be found in the Employee’s subsequent relationships with other objects. It is helpful to think and write about these behaviors in their original behavioral terms, not only in terms of what happens to the underlying data.
One important realization of OO programming is that objects often have a life cycle. For example, an Employee does not work before joining the company, or start its life by being fired. If any Employee were to endure such treatment, that would indicate serious flaws in the program.
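Here is a sketch of what that behavioral view of Employee might look like in Java. The behaviors come straight from the paragraphs above; the field types, the guard conditions, and everything else are illustrative choices, not a prescription.

    import java.time.LocalDate;

    class Employee {
        protected String name;
        protected LocalDate startDate;   // set when the Employee joins
        protected LocalDate endDate;     // set when the Employee leaves
        protected double hoursWorked;

        public void join(LocalDate when) {
            if (startDate != null) throw new IllegalStateException(name + " already joined");
            startDate = when;
        }

        public void work(double hours) {
            if (startDate == null || endDate != null)
                throw new IllegalStateException(name + " is not an active employee");
            hoursWorked += hours;
        }

        // Different reasons for leaving; the life cycle check is the same.
        public void quit(LocalDate when)   { leave(when); }
        public void retire(LocalDate when) { leave(when); }

        protected void leave(LocalDate when) {
            if (startDate == null) throw new IllegalStateException(name + " never joined");
            endDate = when;
        }
    }

The data is still there, but callers talk to the Employee in terms of joining, working, and leaving, and the life cycle rules from the previous paragraph are enforced in exactly one place.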
Another important realization of OO programming is that objects rarely exist by themselves. Usually, they exist in a larger context that includes other kinds of objects with different kinds of behavior. The overall activity of an OO program is to have these various objects invoke behavior in each other. For example, the bird hunts the fish, which causes the fish to evade the bird.
It often happens that new OO developers are shown some supposedly object-oriented code. They see one or more objects that hold internal attribute data. They also see methods (things that look suspiciously like functions), where some of these methods are referred to as getters and setters. These particular methods return or change the values of the internal attributes. Someone who is paying attention then asks “Why do I need these getters and setters, when I can just access the data directly as variables?” Someone who is in the know then answers “Because getters and setters are object-oriented.” The novice developer shrugs and goes on, assuming he will understand this wisdom at some point.
Meanwhile, the code that does the interesting behavioral stuff often appears elsewhere in a different class or file, often with the word Util or Helper or in some cases Service in its name.
No. If this is what you are doing, you have missed the party. All that interesting code found in your Helper class is supposed to be in the objects themselves. How best to arrange behavior within objects is often obvious. Sometimes, it requires finesse. Occasionally, you have to make a difficult choice and live with the consequences. But saying the behavior goes somewhere, for example in a Helper class, blunders past the point of the object-oriented paradigm. If you want the benefits of OO, you have to do OO, which means that behavior lodges in particular places, not in some random file over there somewhere. Your job as a software designer is to discover where those places are so the economies of OO can take their best effect. In most situations, if you write a class called Util or Helper, you are doing it wrong.
None of this implies that getters and setters are bad. For some objects, getting and setting their internal attribute values really is all they do. But for many objects, the state of internal attributes is only an implementation detail of more interesting behaviors that aren’t called get and set.
Are helper classes ever a good thing? Let’s look at different angles of this question.
Some languages allow classes that cannot be subclassed. Java final classes are an example. Do you control the source code for these classes? Maybe you or someone you work with wrote them. Or maybe they are like java.lang.String: part of a standard library and beyond your reach. In this latter case, you can’t modify the class, and neither can you subclass it, even if your application could really use a subclass. If so, a Helper or Util class is the best you can do, where all methods in that class are probably static, which is to say they are just functions that don’t really play the OO polymorphism game. Sorry. But if you control the source code for the class in question, don’t even think of making your class final. Remove the silly final modifier and merrily modify or subclass away, being sure to lodge behavior in the appropriate objects so you can use them as objects.
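For the final-and-not-yours case, the sketch below is about all you can do: a bag of static functions that operate on String from the outside, because the behavior cannot live where it belongs. The class and method names here are made up.

    class StringUtil {
        // Just a function in disguise: String is final, so this behavior
        // cannot be lodged in a String subclass where it would rather live.
        static String reversed(String s) {
            return new StringBuilder(s).reverse().toString();
        }
    }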
Why is there a final class modifier at all? When Java was first released, it was anticipated that browser-based applets would become a major presence in the software world. These are little applications embedded in web pages, doing things the usual web machinery might otherwise do poorly. In this context, and without a final declaration, one could use a subclass of common things like String to subvert the security mechanisms in the so-called Java sandbox, thereby allowing unrestricted access to the host machine. I saw a demonstration of this once, but I don’t remember the details. It was the kind of clever trick that people who like to break into other people’s computers seem to be good at thinking up. Declaring a carefully chosen set of classes as final helps to prevent this possibility. It also prevents you from doing things you would otherwise like to do, like real OO programming.
Please note that applets became a non-issue as events unfolded. They are no longer officially supported by modern web browsers. JavaScript became powerful enough on modern processors that people prefer to do applet-like things with that language instead. So final classes in Java turned out to be a complete waste of effort (and maybe the complicated aspects of ClassLoaders are, too). Unless you are actually writing applets (let’s face it: you aren’t), there is no good reason for the final class modifier to exist.
Is final a bad thing everywhere? No, it isn’t.
If you need a constant value, final is of course how you declare it in Java.
Declaring a method as final sometimes gives the compiler the option of generating the method’s code inline, which can improve performance in some situations. Simple getters and setters are the usual candidates. But this has risks. A final method cannot be overridden in a subclass, which means you as the code author must take responsibility for two things. First, whatever code you write in this method had better be right, because anyone using your class will not be able to fix it in a subclass if you mess up. Second, remember polymorphism: whatever code you write in this method had better be the only thing anyone will ever want to do with the method’s behavioral concept. Both of these are high-risk propositions.
On a similar note, the private modifier is high-risk in every situation where it can appear. In theory, anything private is considered an implementation detail that users aren’t supposed to know about, much less use or change. In practice, this just doesn’t work, and for the same reasons that final classes are a bad thing. In true OO thinking, code that cannot be used in novel ways is not very useful. And in any paradigm, broken code that cannot be fixed is not useful at all. What private needs to mean is unsupported: those responsible for the code can change this interface or implementation without warning, and if you choose to make direct use of it, the consequences are yours. Compared to the alternatives, that is usually just fine.
In any situation where you find something declared private, change it to protected and add a comment that this code feature is unsupported (it would be nice if there were an @annotation for this). Then you can get on with your life, and those who have to use your code can, too.
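Java does not ship such an annotation, but nothing stops you from writing a trivial one of your own. A minimal sketch, with a name I made up, reusing the Account sketch from earlier:

    import java.lang.annotation.Documented;

    @Documented
    @interface Unsupported {
        String value() default "May change without warning; if you use it directly, the consequences are yours.";
    }

    class Account {
        @Unsupported
        protected long balanceInCents;   // implementation detail, but reachable from subclasses
    }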
If well-developed behavior is the OO goal, some still wonder, with good reason, why getters and setters exist. The answer has already been implied, but let’s make it explicit. When one object interacts with another, it is supposed to do it through behavior, not through data. So if one object wants to know how many coconuts another object has, that is done by invoking a method (calling a function) instead of accessing a variable. Purists will then say Just Get Used To It. If you dislike such admonitions, I can’t blame you. But the purists have a point. Those simple little methods that put a cover over variable access also cover the possibility of side effects.
There are high-temperature articles in the literature of Computer Science disparaging functions with side effects: code that does what its name implies, but also does something else it doesn’t tell you about. There are times and places where these disparagements are right on point. There are other times and places where they only miss the point. There are situations in which getting or setting a value needs to have other consequences. The most common of these result from Observer relationships. Restricting variable access to an object’s own methods is the easiest way to ensure these consequences happen in a uniform and reliable way: when a method accesses a variable, it also invokes the side effects. It is still your responsibility as a programmer to ensure that the side effects of these side effects don’t run out of control.
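Here is what that looks like in practice. This sketch uses java.beans.PropertyChangeSupport to implement the Observer relationship; the Thermostat class and its property are made up for illustration.

    import java.beans.PropertyChangeListener;
    import java.beans.PropertyChangeSupport;

    class Thermostat {
        protected double targetTemperature;
        protected PropertyChangeSupport observers = new PropertyChangeSupport(this);

        public void addObserver(PropertyChangeListener listener) {
            observers.addPropertyChangeListener(listener);
        }

        public double getTargetTemperature() {
            return targetTemperature;
        }

        public void setTargetTemperature(double newValue) {
            double oldValue = targetTemperature;
            targetTemperature = newValue;
            // The intentional side effect: every observer hears about the change,
            // uniformly, because all access goes through this one method.
            observers.firePropertyChange("targetTemperature", oldValue, newValue);
        }
    }

If callers poked at a public targetTemperature variable instead, some changes would notify the observers and some would not, depending on who remembered.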
Perhaps you are thinking “OK, but I’m just going to refer to public variables because that Intentional Side Effect issue doesn’t apply to my object in this situation,” and you may be right. As your program evolves, will you continue to be right? The use of getters and setters is an insurance policy in this regard. It is not without cost, but the cost is reasonable, and the compiler may be able to make that cost go away entirely.
So just get used to it. It isn’t pointless religion. Always declare your instance variables as protected. Then write and use the getters and setters, assuming getting and setting is what your objects really do (see the discussion of Employee, above). If you are concerned about the impact of this policy on your program’s efficiency, please wait until you can demonstrate that it actually matters to your running program. It probably doesn’t.
When OO was new back in the ‘70s and ‘80s, the first thing one tended to hear about it in any discussion was inheritance: ThingB is like ThingA, but with some other behavior (and maybe some other data) added on. There were lots of cute examples illustrating the point. As a result, people had to be forgiven for thinking OO was about inheritance. This sometimes had sad consequences. A common misunderstanding was that every class in your program was supposed to inherit from one class you wrote called something like MyApplication. Perhaps you are thinking “That’s really stupid. No one would do that.” But lots of not-stupid people tried to do exactly that back then, thinking they were doing the right thing, and then gave up on OO programming when it turned out badly for them.
No. OO programming is not about inheritance. OO programming is about encapsulation (wrapping data inside behavior), and then it is about polymorphism (the same method name invokes different behavior in different classes). Many OO languages offer inheritance, and a good program may use that in some aspects of its design.
If inheritance can be a good thing, then multiple inheritance must surely be even better. But we learned the hard way that it is not just possible, not just easy, but likely that you or someone close to you will make a costly mess with multiple inheritance. These days, programmers are widely discouraged from using it even in languages where it is available to them, just as programmers are discouraged from using the goto statement if their language happens to have one.
Even with just-plain-old single inheritance, one can do regrettable things. It is easy to find single-inheritance hierarchies where the aspects being inherited offer little benefit, while aspects that would genuinely benefit from inheritance go wanting. Given the messes people can and do make with inheritance, it has become doctrine in some circles that all inheritance is bad.
I will tell you flat out: the doctrine that all inheritance is bad is bad. It’s a matter of understanding your tool set and then using the right tool for the job. Inheritance isn’t always the right tool. In the cases where it is the right tool, it can have an amazingly positive payoff. But just because you start off thinking of your design in terms of some obvious inheritance hierarchy doesn’t mean you are right about it. Sometimes you have to be willing to recognize a mistake and back away from it. This might result in a different inheritance hierarchy that is more helpful than whatever you started with. It might also result in no inheritance at all.
One of the reasons inheritance is appealing is because it sets up an implicit basis for polymorphism. If ThingA has subclasses ThingB and ThingC, we have lots of opportunities for writing simple expressions involving ThingA, with the option of different behavior if the runtime instances are actually ThingB or ThingC. The expressions don’t change, and there are no instanceof checks, but the right thing happens because the methods in ThingA, ThingB, and ThingC are polymorphic. This means your program has less control structure, which probably makes it smaller, and also reduces the opportunities for bugs. Really. It does. Polymorphism is a kind of invisible control structure that simplifies your code.
This means you should be thinking in terms of polymorphism, which most OO languages carry out via a fancy function call mechanism whose details you really don’t need to see.
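The ThingA situation, written out in a small sketch (the describeYourself method is my own invention for the example). Note that the calling code never asks what it is dealing with; the dispatch mechanism is the control structure.

    import java.util.List;

    abstract class ThingA {
        abstract String describeYourself();
    }

    class ThingB extends ThingA {
        String describeYourself() { return "I am a ThingB"; }
    }

    class ThingC extends ThingA {
        String describeYourself() { return "I am a ThingC"; }
    }

    class Caller {
        static void describeAll(List<ThingA> things) {
            for (ThingA thing : things) {
                // No if, no switch, no instanceof: the right method runs anyway.
                System.out.println(thing.describeYourself());
            }
        }
    }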
Here are a couple of good design rules for your OO code:
1) If your class definition contains a variable named anything like type, you are probably doing it wrong. There are exceptions to this, often resulting from so-called Object-Relational mapping, which may be necessary. But usually, if you create an object with a variable called type, ask yourself sternly if you should instead be making subclasses with polymorphic methods. Why? That type value is probably going to run through a switch statement at some point, right? But a polymorphic subclass would just invoke the right behavior because that’s what that kind of object does (see the sketch just after these rules).
2) If your code needs instanceof, you are probably doing it wrong. There are exceptions to this, too, often resulting from the aforementioned final classes, which would never have been defined that way but for reasons that looked good in the increasingly distant past. So you may not be able to avoid instanceof when dealing with instances of String or Integer or the like. Sorry. In all other situations, use polymorphism.
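Here is rule 1 in miniature, with made-up shapes. First the type-variable version, then the polymorphic one; notice that the switch simply disappears.

    // The "type variable" version: every operation that cares about type
    // ends up repeating a switch like this one.
    class ShapeRecord {
        String type;          // "circle" or "square"
        double size;

        double area() {
            switch (type) {
                case "circle": return Math.PI * size * size;
                case "square": return size * size;
            }
            throw new IllegalArgumentException("unknown type: " + type);
        }
    }

    // The polymorphic version: no type variable, no switch, no instanceof.
    abstract class Shape {
        abstract double area();
    }

    class Circle extends Shape {
        double radius;
        double area() { return Math.PI * radius * radius; }
    }

    class Square extends Shape {
        double side;
        double area() { return side * side; }
    }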
Not long ago, I worked with a guy who insisted on implementing polymorphism the hard way. In his base classes, the method definitions had chains of if/instanceof/elseif/instanceof… to discover what kind of object the code was really dealing with at that point, so it could then provide the right behavior for the subclasses in the base class code. This was instead of overriding implementations in subclasses. His rationale? “So all the code is in one place!” He explained that he didn’t like to read method definitions horizontally across subclasses. I understand his objection, but I don’t agree with his conclusion. Reading horizontally is a skill you need to develop if you’re going to either create or consume real OO code, and most people don’t have that skill at the outset. So develop that skill and get on with it. You can also look for better development tools to make this task easier. But don’t backslide into stone-age code. Language-based polymorphism is real. It works. Use it.
It sometimes happens that one needs to think of unrelated classes (those that don’t inherit from one another or from some common superclass) as behaving similarly in some aspects. Most likely, this means they have some of the same method names. One may even want to share concrete method implementations across unrelated classes, not just their names.
This is what gave rise to multiple inheritance, which has already been mentioned as being bad. Nevertheless, the need exists and persists. In recognition of this need, some languages provide the concept of interface: a named set of method declarations that can be applied to any class, with the expectation that the developer will fill in appropriate implementations of those methods. Some languages also allow default method implementations to go along with the interface.
One may wonder at this point: what is the difference between multiple inheritance and interfaces with default implementations? That is a good question, and the answer is: not much. But these ideas are currently viewed as acceptable, while multiple inheritance is not.
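A sketch of the mechanism, using Java’s default methods and classes that share no superclass. The names are illustrative.

    interface Describable {
        String name();

        // A default implementation shared by every implementing class...
        default String describe() {
            return "This is " + name();
        }
    }

    class Bridge implements Describable {
        public String name() { return "the bridge"; }
    }

    class Fish implements Describable {
        public String name() { return "the fish"; }

        // ...which any class can still override with its own behavior.
        public String describe() { return "This is " + name() + ", and it swims"; }
    }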
This is still an opportunity for you to make a real mess with hardly any effort at all. One of the hallmarks of good OO code is factorization: taking the task apart into the right arrangement of distinct objects. So if you find yourself using interfaces to create a class that might realistically be called MultiPurposeMegaObject, it’s time to step back and think again about what you’re doing.
Earlier, I mentioned the thinking that resulted in “If inheritance is good, then multiple inheritance must be better.” Except that it isn’t. And a similar phenomenon arises around interfaces: “If interfaces some places are good, then interfaces everywhere must be better.” Except that they aren’t.
There are a few good reasons for using interfaces. The place where people start to go crazy with them is the speculative one: defining an interface just in case some unrelated class might want to implement it someday. In a few situations, that speculation really is the best one can do. Usually, what should be done instead is to define an abstract superclass, probably with a lot of implementation already in place. Unless there is a real need to implement across unrelated classes, either now or in the near future, an interface is wasted work. Interfaces should not often be regarded as ends in themselves.
Interfaces are one kind of abstraction. Moving farther into abstractive concepts, we encounter design patterns. These are not language features. They are assemblies of characteristic objects that solve well-known problems in well-known ways.
Design patterns are good to know. But design patterns should never be regarded as ends in themselves. They should take a back seat to the original intent of object design, which is to simulate things in the real world. If your simulation is good, applying design patterns can make the difference between good software and great software. But thinking in terms of design patterns first is probably unhelpful.
Some people haven’t figured this out. One of my recent co-workers told me about being taught OO at a major university with the well-known “Gang of Four” Design Patterns book as the primary text. I suspect this is done because it is easier to teach design patterns than it is to really teach OO. Design patterns involve thinking about code; OO involves thinking about the world, and that makes the assignments harder to grade, which can influence the plans teachers make, not always for the better.
I recently encountered some code that had been written twelve years before. To produce instances of one class, call it ObjectT, it defined six stages of machinery: the class itself, a Factory interface and an implementation of it, a FactoryFactory interface and an implementation of that, and so on.
How many implementations of the FactoryFactory interface were there? One. How many implementations of the Factory interface were there? One. How many lines of code were in any of the methods? One. How much value did stages 2 through 6 add to the code? It wasn't obvious they added any, so I asked why we were doing this.
“Well, applying these design patterns allows us a lot of flexibility if we ever need to extend this.”
In twelve years, no such need had ever appeared.
“Yes, but it might… some day.”
Yes, but… not likely… ever. And I claim that anyone who ever looked at that code knew that. Meanwhile, we have work to do. Imagine a debugging run where the code needs an instance of ObjectT. How much useless nonsense do we have to wade through before this instance appears? Why is this better than just invoking the constructor for ObjectT? In some situations, there may be a real answer to that question, but “Because we can!” isn’t it.
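For the record, here is roughly what all of that machinery amounted to. The names are my reconstruction, not the original source, but the shape is faithful.

    interface Factory        { ObjectT create(); }
    interface FactoryFactory { Factory newFactory(); }

    class FactoryImpl implements Factory {
        public ObjectT create() { return new ObjectT(); }           // the one line
    }

    class FactoryFactoryImpl implements FactoryFactory {
        public Factory newFactory() { return new FactoryImpl(); }   // the one line
    }

    class ObjectT { }

    class Caller {
        ObjectT theLongWay  = new FactoryFactoryImpl().newFactory().create();
        ObjectT theShortWay = new ObjectT();   // what the debugging run wishes it could read
    }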
Code like this is not hard to find nowadays. If someone can create an interface or apply a design pattern, they will, and then they will brag about it, even when it accomplishes nothing useful, which is often the case. There is a name for this: Compulsive Object Pattern Design (COPD). Like other addictions, it is nothing to brag about.
Many software designers like to learn and do sophisticated stuff. This can be a good thing. But we need to ask ourselves often: "Am I writing better code, or am I just showing off?" Sometimes that question is hard to answer. Sometimes it isn’t.
—
This article is obviously aimed at Java programmers, but others can take note.
Back in the early 1990s, it was true that, if you really wanted to learn OO programming, you needed to learn Smalltalk. That is still true today. The people who created Java learned some things from Smalltalk. They didn’t learn enough. In some respects, their language or its libraries couldn’t easily do what Smalltalk easily can, and still can’t. In other respects, they just chose to do it wrong. If your software does a whole lot of primitive arithmetic (e.g. inverting large matrices, computing Fourier transforms, etc.), Java will get up and walk away from Smalltalk. But if you do the kinds of things people actually do with OO languages, then Java, the language, has little to recommend it. It does have a really good VM. Interesting: that VM seems to have been designed so one couldn’t reasonably implement Smalltalk on it. That was true then. I don’t know if it is still true now.
Why did Java take over the OO workspace?
It is hard to say which factors mattered most in the early days of Java. However it happened, Java didn’t kill Smalltalk, but it did force it into the background. The people who still use Smalltalk now are either cheerful dilettantes or people with real work to do, more than can reasonably be done with Java. If you are a competitor of such people, they would prefer you not know why they can easily do things you can’t. This is why ParcPlace and its successors have never been able to make a persuasive client reference list.
And then we could talk about what happened to Lisp. But let’s do that another day.
© 2019 brising.com