Encapsulation Boundary Diagrams
Posted 2 Aug 2001 at 20:16 UTC by Bram 
Encapsulate - To seal from external contamination, as if putting in a
capsule.
I highly recommend planning your development using Encapsulation
Boundary Diagrams. Before getting far into any large piece of code, I
always draw an encapsulation diagram and keep it in plain sight at
all times. This isn't from religious devotion - the technique is simply
indispensable.
Encapsulation boundary diagrams give a bird's-eye view of a program's
structure.
First, create a list of all the units in your program. Units should have
the following properties -
- They each have a tagline, a single sentence which describes what
they do, such as 'writes information to the log file'. If you
have trouble writing taglines, then chances are your boundaries
are fuzzy or some of your modules should be broken into smaller pieces.
- They're maintained separately. If you find yourself altering two
units in tandem, separate their shared functionality into its own unit.
- Each unit has its own tests, which can be run without
linking in other units.
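A minimal sketch of a unit with those three properties, in Python. The LogWriter name and its interface are invented for illustration and are not taken from BitTorrent; the tagline is borrowed from the example above and doubles as the docstring.

```python
import io


class LogWriter:
    """Writes information to the log file."""

    def __init__(self, stream):
        # The unit only needs an object with a write() method, so a test
        # can hand it an in-memory buffer instead of a real file.
        self.stream = stream

    def log(self, message):
        self.stream.write(message + "\n")


def test_log_writer():
    # Runs without linking in any other unit: the only collaborator
    # is a StringIO from the standard library.
    buffer = io.StringIO()
    writer = LogWriter(buffer)
    writer.log("connection opened")
    writer.log("connection closed")
    assert buffer.getvalue() == "connection opened\nconnection closed\n"
    return buffer.getvalue()


if __name__ == "__main__":
    test_log_writer()
```

If the test needed a second unit to be importable before it could run, that would be a hint that the boundary is in the wrong place.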
Now for the diagram. Write the name of each unit on a piece of paper and
draw circles around them. Draw lines connecting each pair of units for which
one calls the other. Draw lines instead of arrows - return values often
change into callbacks and vice versa.
In general, each unit has one interface per line which
terminates at it.
Sometimes you can be clever and make a single interface work for two other
units, but that's the exception, not the rule. Don't be afraid to
separate interfaces
if fusing them becomes even a little bit hackish. Try to keep interfaces as
small as possible - the goal is to reduce interdependencies.
Here's the encapsulation diagram for my current project, BitTorrent -
       ___________
      / RawServer \
      \___________/
            |
       _____|_____
      / Encrypter \
      \___________/
            |
       _____|_____
      / Connecter \
      \___________/
         /     \
        /       \
       /         \
 _____/____   ____\_______
/ Uploader \ / Downloader \
\__________/ \____________/
     \             /
      \           /
       \         /
       _\_______/_
      / Throttler \
      \___________/
And here are the taglines -
- RawServer - calls select()
- Encrypter - encrypts connections
- Connecter - creates and keeps track of connections
- Uploader - sends files to peers
- Downloader - gets files from peers
- Throttler - stops peers from downloading when they get too far in debt
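The same diagram can be written down as data, which makes a couple of the rules above checkable. This is only a sketch: the unit names, taglines, and lines are copied from the diagram and list above, while the helper function names are made up for the example.

```python
# Taglines, copied from the list above.
TAGLINES = {
    "RawServer": "calls select()",
    "Encrypter": "encrypts connections",
    "Connecter": "creates and keeps track of connections",
    "Uploader": "sends files to peers",
    "Downloader": "gets files from peers",
    "Throttler": "stops peers from downloading when they get too in debt",
}

# One pair per line in the diagram. Pairs are unordered, since the
# diagram uses lines rather than arrows.
LINES = [
    ("RawServer", "Encrypter"),
    ("Encrypter", "Connecter"),
    ("Connecter", "Uploader"),
    ("Connecter", "Downloader"),
    ("Uploader", "Throttler"),
    ("Downloader", "Throttler"),
]


def interfaces_per_unit(lines):
    """Count the lines terminating at each unit (roughly one interface each)."""
    counts = {}
    for a, b in lines:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


def units_missing_taglines(lines, taglines):
    """List units that appear in the diagram but have no tagline yet."""
    units = {unit for pair in lines for unit in pair}
    return sorted(unit for unit in units if not taglines.get(unit))


if __name__ == "__main__":
    for unit, count in sorted(interfaces_per_unit(LINES).items()):
        print("%-10s %d interface(s) - %s" % (unit, count, TAGLINES[unit]))
```

By this count Connecter has three lines terminating at it, so by the one-interface-per-line rule of thumb it would expose about three interfaces.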
Disclaimer - I left out several modules and a small connection.
Communicativeness is more important than absolute accuracy.
Encapsulation diagrams are great for time estimation - just estimate
how long each unit will take and add them up. I generally
take 3-7 days for a unit, depending on how tricky it is.
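The arithmetic is simple enough to do in your head, but as a sketch, assuming the six units shown in the diagram and the 3-7 day range per unit (the function names are invented for the example):

```python
UNITS = ["RawServer", "Encrypter", "Connecter",
         "Uploader", "Downloader", "Throttler"]


def estimate_bounds(units, low_days=3, high_days=7):
    """Best and worst case if every unit takes low_days to high_days."""
    return len(units) * low_days, len(units) * high_days


def total_estimate(days_by_unit):
    """Add up explicit per-unit guesses, given as a {unit: days} mapping."""
    return sum(days_by_unit.values())


if __name__ == "__main__":
    low, high = estimate_bounds(UNITS)
    print("%d units: between %d and %d days" % (len(UNITS), low, high))
```

For the six units drawn above this gives a range of 18 to 42 days, before counting the modules left out of the diagram.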
I learned encapsulation diagrams from looking at one which happened to
be posted on a wall, an ironically
chance way to learn my most valuable diagramming tool. Hopefully
this wonderful technique will become more common in the future.
-Bram Cohen
A couple people pointed out that 'boundary' is extraneous, so I'm
now officially shortening the term to 'encapsulation diagrams' :-)
How is this any different from OOA/OOD class diagrams?
It seems that this idea of visualising your modules is very good. I for
one find myself opening up my code and staring at it blankly sometimes
just wondering how I ever wrote it.
The problem I can foresee with such a system is constant updating as
program requirements change. Is there any decent software that is
specifically built for doing this?
It's worth noting, posted 3 Aug 2001 at 16:09 UTC by Krelin (Journeyer)
That doxygen will spit out these diagrams for you if you install "dot"
from the Graphviz package... It calls them collaboration diagrams.
Excellent way to start understanding a new code base (especially C++, as
convoluted and complex as C++ object hierarchies can get)...
Cheers
More notes, posted 3 Aug 2001 at 18:59 UTC by Bram (Master)
movement asked 'How is this any different from OOA/OOD diagrams?'
Encapsulation diagrams are similar to OOA/OOD diagrams, but have some
important differences -
- Encapsulation diagrams are much more inclusive in what dependencies
they show. If some code calls mywidget.mydoorstop.myocelot.meow(), there
will be a connection to widget, doorstop, and ocelot. Those
dependencies are hidden in diagrams centered around data structures.
- Encapsulation diagrams lump together whole units, which can include
multiple classes and functions. There's a common misperception of unit =
class, which causes people to fret needlessly about which object within
a unit to give a certain responsibility, when it can easily be moved
around later.
- OOA/OOD diagrams include much more technical information about how
objects are related to each other, and in particular how instances are
structured, which is completely missing in encapsulation diagrams. So
encapsulation diagrams contain less information, but that extra detail
can be a distraction when you're trying to get an at-a-glance
understanding of a program.
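To make the first point concrete, here is a rough sketch that pulls every object out of a chained call using Python's standard ast module. The attribute_chain helper is invented for the example, and mapping each name back to the unit that owns it is left out.

```python
import ast


def attribute_chain(expression):
    """Return the dotted names in a call such as a.b.c.d(), left to right."""
    node = ast.parse(expression, mode="eval").body
    if isinstance(node, ast.Call):
        node = node.func  # step from the call to the thing being called
    names = []
    # Attribute nodes nest from the right, so walk inward and reverse.
    while isinstance(node, ast.Attribute):
        names.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        names.append(node.id)
    names.reverse()
    return names


chain = attribute_chain("mywidget.mydoorstop.myocelot.meow()")
# Everything before the final method name is an object the caller touches,
# so each one earns a line on the encapsulation diagram.
depends_on = chain[:-1]
print(depends_on)
```

For the call in the example, depends_on holds mywidget, mydoorstop, and myocelot, which is the widget, doorstop, and ocelot connection described above.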
jono asks 'Is there any software specifically built for doing this?'
Encapsulation diagrams are simple enough that a pen, paper, and flatbed
scanner work quite well :-)
I'll once in a while slip and say 'module' when I mean 'unit'. That's
because I've been writing in Python and following the one unit per
module rule, which is quite handy if you're writing in a language which
allows it.
UML, posted 3 Aug 2001 at 19:26 UTC by RyanMuldoon (Journeyer)
UML is nicely suited for this kind of task. Dia (a GNOME program) or
Kivio (a KDE program) could both be used for something like this.
Rational puts out a whole suite of programs designed to make
diagramming like this integral to application design. There are even
books on the "Rational Unified Development Process" (or something
similar to that). I think it is an excellent way to do things, as it
really encourages people to focus on architectural issues, rather than
just jumping into code. It also makes it easier to split up the
application development: those who are good at architecture do the
initial modelling, and then the programmers finish off the job.
Is this a joke?, posted 4 Aug 2001 at 11:49 UTC by exa (Master)
This is just a graph of physical modules (= vertices) and
dependencies or uses (= edges). :)
Similar graphs are used in any modular design. This isn't an
invention, so "encapsulation boundary diagrams" is just a fancy
name for what we already use.
Besides, many languages actually implement what we call "class package"
or "class cluster" in OO design. In C++, that is a namespace or a
physical file (according to your preference). In Java, you have the
package construct. In Ada-like languages that's a module.
Thus, such graphs can be automatically generated. Make each such
language construct a node, and draw an arc between two modules if a
use (function call/data access) exists.
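A rough sketch of that kind of generator for Python, with two simplifications that are assumptions of the example: an import stands in for a "use", and the output is an undirected graph in Graphviz's dot language. The function names and the three sample modules are invented for illustration.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by a piece of source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def dependency_edges(sources):
    """sources maps module name -> source text; returns sorted (user, used) pairs."""
    edges = set()
    for name, text in sources.items():
        for used in imported_modules(text):
            # Only keep edges between modules of the program itself.
            if used in sources and used != name:
                edges.add((name, used))
    return sorted(edges)


def to_dot(edges):
    """Render the edges as an undirected graph for Graphviz's dot."""
    lines = ["graph units {"]
    for a, b in edges:
        lines.append('    "%s" -- "%s";' % (a, b))
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical three-module program, not real project code.
    sample = {
        "uploader": "import throttler\nimport socket\n",
        "downloader": "from throttler import check\n",
        "throttler": "import time\n",
    }
    print(to_dot(dependency_edges(sample)))
```

Counting imports misses uses that arrive through objects passed in at run time, so the result is a starting point and not the whole diagram.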
In the reverse direction, you may write such graphs in a formal
language, and a code generator can create skeleton code for your
modules. In a more complex setting, I would presume that you could
access this graph any time to view and edit in an IDE. Note: a
language that supports modules is better suited for this purpose.
Thanks,
No I'm not joking, posted 6 Aug 2001 at 16:36 UTC by Bram (Master)
Encapsulation diagrams may be 'obvious', but -
- They're extremely useful. I get more out of them than all of UML
combined.
- They're very underused. Hopefully this article will help popularize
them.
I didn't even mention UML. That's just crap from some bigots who think
they know how to program.
Things like UML might help you design a simple business transaction
system, but they won't take you anywhere when you're writing a compiler.
For that kind of design you need abstraction, not a religious devotion
to OO jargon and design. In good programming, there is no single winning
paradigm. That's why drawing graphs that contain only relevant
information is a good idea, though I can't say that I like the
"encapsulation boundary diagram" name.
I'd rather write some code and document it properly than use
those stupid UML "computer aided software engineering" tools.
Those idiots who did UML even suggest using UML for kinds of modelling
other than software. They must be really really dumb. I've seen their
documents and the first thing they do is to try to convey UML in terms
of UML, claiming to give the semantics for UML. Hah, semantics doesn't
work like that. You can't use your formalism before you fully define
it.
In cognitive sci./AI there are vastly superior studies on ontology
compared to the nonsense language of UML. A couple of people writing
crappy OO stuff for corporates can't punch a needle's hole in that
kind of research. It'd be like Britney Spears trying to imitate NIN. :P
Thanks,
__
Eray Ozkural
I often do diagrams like this on paper for visualization, but when
I need a computer version (documentation, etc.) I use the GNU
implementation of the Pic language. You can use groff to make PostScript
directly and the output
looks beautiful---especially when you use a font like Helvetica, you can
achieve a visual style much like the Design Patterns book.
It's trivial to make and update diagrams with it (my example took
just a few minutes) and given the number of relative placement options
(i.e. put this thing down and to the left of that last thing etc) it
would probably not be hard to make Pic files automatically generated
from some other program.
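As a rough sketch of what such a generator might look like in Python: the to_pic function and the coordinates are invented for the example, and it uses absolute positions instead of Pic's relative placement to keep the code short. I have not tuned the layout, so treat the output as a starting point.

```python
def to_pic(positions, lines):
    """Emit Pic source for a diagram.

    positions maps unit name -> (x, y) in inches;
    lines is a list of (unit, unit) pairs to connect.
    """
    labels = {}
    out = [".PS"]
    for index, (unit, (x, y)) in enumerate(positions.items()):
        # Pic labels must start with an uppercase letter, so generate
        # U0, U1, ... and do not reuse the unit names themselves.
        label = "U%d" % index
        labels[unit] = label
        out.append('%s: ellipse "%s" at (%g, %g)' % (label, unit, x, y))
    for a, b in lines:
        # "chop" trims both ends so the line stops short of the centers.
        out.append("line from %s to %s chop" % (labels[a], labels[b]))
    out.append(".PE")
    return "\n".join(out)


if __name__ == "__main__":
    # Hand-picked coordinates for the top of the diagram above.
    positions = {
        "RawServer": (0, 0),
        "Encrypter": (0, -1),
        "Connecter": (0, -2),
    }
    lines = [("RawServer", "Encrypter"), ("Encrypter", "Connecter")]
    print(to_pic(positions, lines))
```

The printed text can be saved to a file and run through pic and groff like any hand-written Pic source.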
I put a short example online here. This quickie
just reproduces the ASCII diagram you did above but on a GNU system you
can make PostScript from it directly (see comments in file) and then
include it in anything else. (TeX will allow somewhat better
integration, as GNU Pic has a TeX mode.)