Component System Looking for Design Peer-review

Posted 1 Jan 2001 at 01:50 UTC by pphaneuf

XPLC (cross-platform lightweight components) is a component system with goals of providing extensibility and reusability both inside and between applications, while being portable across platforms (and languages) and having the lowest possible overhead.

I am looking for an early peer-review of its design itself.

Initial insights

My idea went through multiple iterations of prototyping and attempts to think outside the box. After some rather complex schemes, it became clear to me that I wasn't going to overturn the whole compiler and linker situation. Better solutions have existed for quite a while (at least 20 years) and are now tried-and-true, but the good old compiler/linker duo is still there, seemingly here to stay. So my thinking worked around this.

To work within these constraints, I tried to abstract what goes on when you link software together. Two main things happen: gathering interface information and actually getting to the code.

In C++, the former is done by the compiler when going through the header files and is codified in the generated assembler as offset constants into virtual method tables and other such things. The latter is done by the linker when resolving symbols.

In a component system, interfaces provide the first, and the service manager the second.

Complexity is to be avoided as much as possible. DCE is a prime example of what happens otherwise: its complexity led to a whole lot of problems, and the result can arguably be called a failure.

Transparent distributed components are explicitly not part of my plans. RPC (remote procedure call) as a means of achieving distributed components never really took off, and even its bigger supporters, such as Microsoft, are phasing it out in favor of more explicit message-oriented communication. See AnoteOnDistributedComputing.

I will cite Mozilla XPCOM and Microsoft COM as influences on my basic design. This article will assume a basic understanding of the common interface-based COM/XPCOM style components.
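
For readers who want a quick refresher, here is what such an interface typically looks like in C++ (the names are invented for illustration and are not part of XPLC):

    #include <ctime>

    // A COM/XPCOM-style interface is a C++ class containing only pure
    // virtual methods. Callers program against the interface pointer
    // and never see the implementation class. Names are illustrative.
    struct IClock {
        virtual long getTime() = 0;   // seconds since the epoch
    };

    // An implementation can live in another module entirely.
    struct SystemClock : public IClock {
        virtual long getTime() { return static_cast<long>(std::time(0)); }
    };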

XPLC Core

The bare essentials of XPLC consist of a small number of interfaces and an even smaller number of components.

There is a single entry point that is not part of a component, the XPLC::getServiceManager() function. This entry point is how XPLC is bootstrapped, and it doesn't need to be called more than once. An example of calling it more than once would be two unrelated pieces of code wishing to use XPLC, say, the main executable code and some library code. The service manager, following the SingletonPattern, is shared among those unrelated pieces of code, providing a rendez-vous point. XPLC-aware code gets the service manager passed to it as part of its initialization process, as it is needed to do pretty much anything.
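
As a rough sketch of how this is meant to be used (only XPLC::getServiceManager() is part of the design; the IServiceManager name and the helper function are assumptions on my part for illustration):

    struct IServiceManager;  // assumed name for the service manager interface

    namespace XPLC {
        IServiceManager* getServiceManager();  // the single non-component entry point
    }

    void initMyApplication(IServiceManager* servmgr);  // hypothetical application code

    int main() {
        // Bootstrap once, then pass the shared service manager along to
        // whatever XPLC-aware code needs it.
        IServiceManager* servmgr = XPLC::getServiceManager();
        initMyApplication(servmgr);
        return 0;
    }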

The service manager itself is a very simple piece of code; its whole purpose is to be a hook for extensions. The most often used method of the service manager is getObject, which gives you the object associated with a UUID.

A UUID (universally unique identifier), for those not familiar with them, is a 128-bit number that is unique across time and space, for all practical purposes. UUIDs were invented for use in DCE, but have since been used in various other environments, like COM (which based its RPC layer on the DCE RPC) and the Linux ext2 filesystem (as a volume identifier).
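
In C++, such an identifier is typically a small struct with the DCE/COM GUID layout; whether XPLC will use exactly this representation is not settled, so take the following only as an illustration:

    // A 128-bit UUID, laid out like a DCE/COM GUID.
    struct UUID {
        unsigned int   data1;     // 32 bits
        unsigned short data2;     // 16 bits
        unsigned short data3;     // 16 bits
        unsigned char  data4[8];  // 64 bits
    };

    // A component or interface identifier is then just a constant
    // (the value below is a meaningless placeholder).
    static const UUID SomeComponent_ID =
        { 0x12345678, 0x1234, 0x1234,
          { 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0 } };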

The service manager does not do the mapping of UUIDs to objects itself, but rather leverages components called service handlers, which can be added to and removed from the service manager. When the getObject method of the service manager is invoked, it iterates over its list of service handlers and invokes their own getObject method, until one returns an object. XPLC has only one default handler, a static mapping of UUIDs to objects, which is used for all components that are not dynamically loaded, such as those provided by XPLC itself and those linked into the main executable.
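
To make the mechanism concrete, here is a sketch of the handler interface and the lookup loop inside the service manager; the names are my assumptions, only the behaviour is taken from the design above:

    struct UUID;     // 128-bit identifier, as described above
    struct IObject;  // root interface

    // A service handler maps UUIDs to objects in whatever way it likes.
    struct IServiceHandler {
        // Return the object for this UUID, or 0 if this handler
        // does not know about it.
        virtual IObject* getObject(const UUID& uuid) = 0;
    };

    // Inside the service manager, getObject boils down to a loop over
    // the registered handlers, stopping at the first one that answers.
    IObject* getObjectFromHandlers(IServiceHandler** handlers, int count,
                                   const UUID& uuid) {
        for (int i = 0; i < count; ++i) {
            IObject* obj = handlers[i]->getObject(uuid);
            if (obj)
                return obj;
        }
        return 0;  // no handler knew this UUID
    }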

I would like to point out that the XPLC core does not read any configuration file or use any environment variable. The core itself is extremely simple, is under complete programmer control, and will not do anything it was not prompted to do (such as file or network access of any kind).

Component Categories

These are central in the extensibility of XPLC applications. The general principle is that a category is a UUID that is a list of other UUIDs rather than directly an object.

Categories are going to be implemented through a category manager component, which is a service handler. When the UUID of a category is asked for, the category manager will build up a category object containing the list of UUIDs.

Categories have a concept of a default component, which can be either the first available component in the list, or a specific component (this is so that it can be made configurable by the user of an application).

Dynamic Loading

There are two ways of doing dynamic loading included with XPLC, but keep in mind that these are not the only two possible ways. They are themselves implemented outside of the XPLC core, using only facilities also available to any other components. For example, dynamic loaders capable of handling Java, Python or Perl instead of dynamically loaded shared libraries would be possible.

The first dynamic loader is a simple component that can load a single DLL (called a shared object in the Unix world, but I will use DLL for this article to avoid confusion with the concept of "objects that are shared") and make its components available when hooked in as a service handler. Its only parameter is the filename of the DLL to load. The DLL will not be unloaded automatically when unused, only when the loader is actually destroyed. In this regard, it is very similar to the explicit dlopen or LoadLibrary interfaces.
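
Since the loader's interface is not settled yet, the following is purely hypothetical, but it illustrates the intended flow: one loader instance per DLL, hooked into the service manager as a service handler:

    struct UUID;
    struct IObject;
    struct IServiceHandler {
        virtual IObject* getObject(const UUID& uuid) = 0;
    };
    struct IServiceManager {
        virtual void addHandler(IServiceHandler* handler) = 0;
    };

    // Assumed interface for the simple loader: it is itself a service
    // handler for everything contained in the one DLL it loads.
    struct IDLLLoader : public IServiceHandler {
        virtual bool loadDLL(const char* filename) = 0;
    };

    void hookPlugin(IServiceManager* servmgr, IDLLLoader* loader) {
        // The loader's only parameter is the DLL filename; once hooked in,
        // the components inside the DLL are reachable through getObject().
        if (loader->loadDLL("plugins/svgalib-backend.so"))
            servmgr->addHandler(loader);
    }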

The second one is more complicated, covering a whole directory. When created, the component will survey the directory it has been given as a parameter to find any DLLs containing components and build up a map of the UUIDs serviced by these DLLs. When asked for a UUID, the dynamic loader will load the required DLL if needed, and will automatically unload it if it is not needed anymore. If possible, this UUID to DLL mapping information can be saved in a cache file, along with the modification time of the directory and of each DLL, so that the cache is properly invalidated if a change occurs.

Note that these two dynamic loaders are ordinary components and are not singletons like the service manager is. In fact, a single instance of the simple dynamic loader can only load a single DLL, so an application wishing to load multiple DLLs would simply create an instance of the simple dynamic loader component per DLL.

There is a potential problem with these. The service manager does not have any specific ordering in its use of service handlers, so prioritizing component modules is not possible with this design. Also, the directory-based dynamic loader could face the problem of having more than one DLL implementing the same UUID (different versions of the same component for example) in the same directory.

Ordering of service handlers within the service manager could be resolved by changing the service manager interface to have appendHandler and prependHandler methods instead of the single addHandler method.
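
The revised interface might look like this (the method names other than getObject are only suggestions):

    struct UUID;
    struct IObject;
    struct IServiceHandler;

    struct IServiceManager {
        // Ask every registered handler, in order, until one answers.
        virtual IObject* getObject(const UUID& uuid) = 0;

        // The caller chooses where the handler goes in the lookup order,
        // instead of a single order-agnostic addHandler().
        virtual void appendHandler(IServiceHandler* handler) = 0;
        virtual void prependHandler(IServiceHandler* handler) = 0;
        virtual void removeHandler(IServiceHandler* handler) = 0;
    };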

Ordering of components within a directory could be resolved by the modification time (the newest component wins). I do not find this solution particularly appealing, but it is the only one I see at the moment.

Versioning

The COM-style interface-based components are a very good basis for versioning, as has been pointed out in ComVsCorba.

I also have the intention of enabling version verification at every possible point of friction, such as the libxplc.so/XPLC.DLL itself, loadable components, and so on. This is important to allow refactoring later on without crash-and-burn as the only "notification".

Comparisons

Here are some "competitors" to XPLC, for comparison purposes:

  • XPCOM

    XPCOM, while being a good thing overall (I'd rather see XPCOM dominate than Microsoft COM), suffered from the pressure of having to release the Mozilla project. A lot of it is thrown together and it seems to lack focus. A lot of things that don't have much to do with components made it into what the Mozilla project calls XPCOM, so things are getting a bit bloated code-wise.

    While they would like it to be used in more places, they are putting their efforts into making it what Mozilla needs first. This is one of the reasons I think something meant to be shared between projects, such as a component system, might be better taken care of by an independent project.

  • CrystalSpace Shared Class Facility

    This project shares some of the goals of XPLC, most prominently being lightweight. It also shares a problem I see with XPCOM: it is part of another project.

I did not mention COM or CORBA on purpose, as they do not share enough of the goals of XPLC to be considered on the same level.

Implementation

Some notes about the implementation of XPLC as it stands now.

  • It is developed using some ExtremeProgramming. For example, the test suite accounts for roughly 30%-35% of the code, and the tests are developed before the components themselves are implemented.

  • It currently weighs in at under 2000 lines of code, and should still be below 4000 lines of code with everything described in this article implemented.

  • The stripped libxplc.so shared library weighs in at 24 kilobytes on Linux/Intel. I would be surprised to see the full implementation of XPLC reach 100 kilobytes.

  • A DLL containing XPLC components does not have to link with libxplc.so/XPLC.DLL.

As you can see, XPLC is trying hard to live up to the "lightweight" part of its name.

Conclusion

Starting from this base, I am confident that XPLC could be used to build complex and extensible systems. One of the things I like most about XPLC is its simplicity, which can then be used to build maintainable complex systems, much like Unix pipes. Unix pipes have been called the only truly successful component system, so I see sharing this important attribute with them as a good sign.

The lack of built-in transparent remoting in XPLC should not be an obstacle to distributed components, because such a remoting or messaging layer could easily be added to it, just like rsh or ssh can be used to make distributed systems out of Unix pipes, without them directly supporting remoting.

Thanks in advance to the Advogato community for your thoughtful comments!


A bit too simple?, posted 1 Jan 2001 at 18:55 UTC by hp » (Master)

It seems like most of what a component architecture should have isn't specified by your article. What about threads (to use threads, you need all your components to be using the same threading abstraction)? How are exceptions handled (C++ or CORBA style)? Do you use an IDL? Do you have dynamic reflection? If you don't have dynamic reflection, you pay in bloat every time you use a non-C++ language, because you have to compile static C++ stubs for every language you use a component from; what's the rationale for bloating and complicating things for non-C++? (Do you intend to support non-C++ languages?) MS .NET also uses dynamic reflection to automatically build a proxy for remote objects; you would need statically compiled stubs for remote objects as well. How are object lifecycles managed (refcounting, GC)? Do you define a fixed ABI or simply use the C++ compiler ABI? Do you support IDEs with "object properties" along the lines of Delphi? Do you have some infrastructure for events/callbacks (signals/slots), or does this have to be constructed ad hoc by each interface? How extensive is your type system - is it very limited like XPCOM, or more general like CORBA? If limited, how do you e.g. pass a rectangle to a method, do you always have to explode the rectangle into primitive types such as int and pass those individually? etc., there are tons of issues to consider ;-) right now it sounds like you basically just have an abstraction around dlopen() for loading C++ classes, I'm not sure that constitutes a component system...

Re: A bit too simple?, posted 1 Jan 2001 at 21:37 UTC by pphaneuf » (Journeyer)

I agree that the article only covered the bare minimum and didn't cover much infrastructure, only the general architecture.

Threads are left unspecified for the moment. I don't wish to encourage a very bad tendency in so-called "modern" software, but at the same time I realize that not a whole lot of people share my view, and that those who don't will have to be taken care of in some way. So some plan for threads definitely has to come up. I was thinking of specifying that, as far as XPLC is concerned, there are only what COM calls "free-threaded components", or something similar. That could very well be construed as a cheap cop-out though (which could be right).

Exceptions are left out, in the name of simplicity. Error handling is expected to be C-style. This too is not very well taken care of, but COM-style HRESULTs for languages that don't support exceptions are like applying razor blades to my eyes (rather unpleasant I might say). Again, this is a controversial area for me.

About IDL: while I do not use it at the moment, I intend to do so at some point. Right now, only C++ is actually supported, in order to get to a feature-complete first release of the core, but the intention is to be cross-language as well as cross-platform, as stated at the beginning of the article. There will be some level of reflection. It might not be as dynamic as I would like (I like Objective-C more than C++, but I'm rather forced into C++), but there will be some reflection. Static stubs for every supported language are definitely out of the question.

There is an interesting point with the IDL and reflection you brought up, in that they would allow the transparent remoting that I said I wouldn't support. A transparent remoting add-on would definitely be possible, but the style of the interfaces will be inhospitable to it. For example, as there are no exceptions, there is no obvious way of reporting remoting errors. The idea is that simplicity is a better feature than transparent remoting. An interface is complete in itself. All you get when calling a method is the return value; everything is obvious and explicit, nothing to forget about. If you ignore a return value, you are doing so knowingly, as it was clear in the method signature that there was a return value. No void-returning methods popping an exception on you. From my experience, handling exceptions is not much less painful than handling return values, and people who don't check return values generally don't handle exceptions either. Since exception escalation (unhandled exceptions bubbling up the call stack) is only guaranteed when a language supporting exceptions is used in the implementation, you're not getting a whole lot in exchange for horrible HRESULT-style exception handling.

I agree it does look a bit unnatural in languages supporting exceptions, but there is a cost to everything; I just made a different compromise. In my defense, I would say that a number of existing C++ libraries avoid exceptions too, and that exceptions are not very well supported across platforms and compilers (for C++). But I also know other languages have much more mature exception support.

Lifecycle is handled by reference counting, with support for weak references (inspired by XPCOM's weak references). I am not too happy about this, as I feel that weak references should actually be the default type of references (strong refs being owning refs, there should be a limited number of these), but they are next to impossible to implement transparently. There is a root interface similar to COM and XPCOM, with the three classic methods to manage the refcount and query other interfaces. I intend to have the C++ binding make this as natural as possible, with smart pointer templates and a type-safe template function wrapper around QueryInterface, for example.
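
To give an idea of what I mean, here is a sketch of that C++ sugar. The root interface with its three methods is part of the design, but the exact method names, the per-interface IID convention and the helper names are assumptions made for the example:

    struct UUID;

    // Root interface: refcounting plus interface discovery.
    struct IObject {
        virtual void addRef() = 0;
        virtual void release() = 0;
        virtual IObject* queryInterface(const UUID& iid) = 0;
    };

    // Type-safe wrapper: "give me the IFoo view of this object, or 0".
    // Assumes each interface exposes a static IID constant.
    template <class Interface>
    Interface* getInterface(IObject* obj) {
        if (!obj)
            return 0;
        return static_cast<Interface*>(obj->queryInterface(Interface::IID));
    }

    // Minimal owning smart pointer that releases its reference on scope exit.
    template <class Interface>
    class ptr {
        Interface* obj;
    public:
        explicit ptr(Interface* aObj = 0): obj(aObj) {}
        ~ptr() { if (obj) obj->release(); }
        Interface* operator->() const { return obj; }
        operator Interface*() const { return obj; }
    private:
        ptr(const ptr&);             // copying left out of this sketch
        ptr& operator=(const ptr&);
    };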

A note there: while a more complex GC system might be seen as easing the programmer's life, I think there is no such thing as a free lunch. If you don't know what you're doing, there is no GC system that will save your ass. People often see refcounting and think that everything is taken care of by the refcounting, but you still need to respect an ownership model, or else you're doomed. One of my friends commented to me that most of the time, when you actually know what you are doing, you don't need any kind of GC (refcounting or other). His opinion is also that a lot of people do stuff without knowing what they're doing, of course. That's his opinion, but I will admit that there is some truth in there (or else, there wouldn't be as many threads in the programs I use daily!)... :-)

The ABI is currently undefined; XPLC simply uses the C++ compiler ABI. Unfortunately, just about only Microsoft can impose a calling ABI on compiler makers, and I don't think a -fxplc-abi flag will be available in GCC anytime soon. I would like to define an ABI, but I doubt this is possible at all. Using the native C++ ABI of a platform (across languages) is about the best we can do, keeping in mind the high-performance requirement of XPLC (imposing a different ABI would most probably require some jumping through hoops).

Delphi is very far back for me, so this is hard to answer; I hope this will suffice: interfaces will have reflection at some point, and interfaces are all there is (there are no "properties" allowed in interfaces, only methods).

There is no specific infrastructure for events/callbacks or signals/slots. This is a basic and simple component system that aims to DoTheSimplestThingThatCouldPossiblyWork. XPLC will not save you from having to read documentation and specifically still requires the programmer to Know What He's Doing (TM).

I have to draw a line somewhere. To some people, a component is something that has an area measured in pixels. Maybe my Perl background is apparent in some places, and that might hurt some Python people's eyes, I'm very sorry. Myself, I have my own One True Way for many things (including a strong opinion on threads and a way of handling events that I think is very good), but I am trying to be realistic here and to understand that imposing my own One True Way on others might be a losing proposition. So instead of doing something I would find of dubious value, I will leave a non-negligible amount of infrastructure that is not 100% essential to others. I will definitely try my hand at this, and XPLC will probably include a set of commonly useful components that follow my ideas, but XPLC itself will not impose such things. I just hope that the included stuff becomes popular.

If I might compare with Bonobo (since that might be more where you're coming from), XPLC would be somewhere between the new GTK+ object system and CORBA I'd say. In the Microsoft world, Bonobo would be something that compares more with OLE, and XPLC with COM itself. I just abstracted away the library linking process and made it much more dynamic, I'm not pretending to go on the level of Bonobo.

The type system is currently "whatever is expressible in C++". That is going to trickle down to some subset of this that will be expressible in IDL. I intend to support structures if possible.
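
To pick up hp's rectangle example: if structures do make it into that subset, the C++ binding could pass them whole instead of exploding them into primitive parameters (the names below are invented for the example):

    struct Rect {
        int x, y, width, height;
    };

    struct ICanvas {
        // The whole structure is passed as one argument (by const
        // reference in the C++ binding) rather than as four ints.
        virtual void fillRect(const Rect& area) = 0;
    };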

There is more than just an abstraction around dlopen(), IMHO. Sharing components between applications and component libraries themselves, transparent extensibility, a calling interface between languages (when the IDL compiler is ready), are things I rate as essential to a successful component system. I agree with Rob Pike's assessment that Unix pipes are pretty much the only component system that was truly successful. And there's a good reason: it's dead simple. To hell with complex type systems and infrastructure. Add a program/script to a directory in the PATH or add a new directory to the PATH, and you've just extended the system. No built-in remoting, but it can be done in multiple ways. There is just a minimum of architecture (processes, pipes, stdin/out/err, environment variables) and it's almost infinitely extensible and readily usable. Also note there are no threads and no "bubbling" exceptions. ;-)

It's just that so many things are optional in XPLC that the core alone might seem like just an abstraction around dlopen(). If you look attentively, the dynamic loaders are not part of the core of XPLC. The XPLC core is just an abstraction around a dynamically extensible symbol table. Everything else is an optional extension. There will be a good number of extensions bundled with XPLC that will make it a serious component system, but if you want to get down with just the core, you effectively have very little. That's designed as a feature. :-)

Thanks a lot for your comments, I am very pleased that someone with a good grasp of the subject took the time to answer.

Threading, posted 2 Jan 2001 at 20:04 UTC by nymia » (Master)

I think threading is one of the most difficult parts of building frameworks. Usually, some objects need to run separately, by having a member function like run() make another member function like looper() run on a thread. Once that is done, message queues and semaphores come into the picture. After that, more things need to be added as well.

Re: Threading, posted 2 Jan 2001 at 20:35 UTC by pphaneuf » (Journeyer)

I might argue that XPLC is not a framework by itself, but more of a base to build frameworks on. Think COM underlying OLE2, or (quite similarly) CORBA underlying Bonobo.

Now, the real question is: should that be common and part of the XPLC layer? Or is the current solution (in COM parlance, XPLC having only one apartment type, the multithreaded apartment) sufficient?

Now, before thinking that providing some protection/separation for threads is needed, check out point 33 ("Beware the Free-Threaded Marshaler (FTM)", page 136) and point 35 ("STAs may need locks too", page 146) in "Effective COM".

The decision of having only the equivalent of the multithreaded apartment in XPLC is part of the high-performance goal, thus avoiding any kind of marshaling. This doesn't preclude having access to threads and all that they entail in XPLC; it just says that if you do use threads, we won't be protecting you from the monsters that go with them (which is probably all that you deserve anyway, IMHO).

Comments from e-mail, posted 2 Jan 2001 at 21:19 UTC by pphaneuf » (Journeyer)

I thought I should copy some comments I had by e-mail from people that do not have Advogato certification, so that they'd be written down along with the article.

Bjorn Reese <breese@mail1.stofanet.dk> wrote:

Hello Pierre,

I do not have sufficient privileges to post articles on Advogato (nor do I care about that), so I am replying by email instead.

First I would like to make you aware of other component models that might be worth investigating.

SOM:
  IBM's System Object Model.
  A good starting point is
    http://www-4.ibm.com/software/ad/som/library/somvscom.html

Bonobo:
  The GNOME component model built on top of CORBA
  http://www.helixcode.com/tech/bonobo.php3

JavaBeans:
  Although married to Java, you might still be able to find interesting things in JavaBeans.

The book "Component Software" (Addison-Wesley, ISBN 0-201-17888-5) by Clemens Szyperski may also be worth looking at.

Component models are essential in today's software, and on the Unix platforms we have been lagging way behind, so it is good to see that something is finally starting to happen. However, we should also try to address known problems with the current component models.

A major problem with binary compatibility is that interfaces may change quite often, so you end up with many versions. Phasing out older versions is seldom popular with your target group (and may even be impossible in commercial settings). An alternative to predefined interfaces is to use tagged parameters. In a traditional interface you define that a given function takes an integer parameter followed by a string parameter. If you add a third parameter, then you have to use different versions of the function. By tagging parameters, more flexibility in the interface is possible. The demarshaller will simply use a default value if the third tag is missing. An example of this is the XML-based SOAP specification.

Renaud Hebert <hebert@bcv01y01.vz.cit.alcatel.fr> wrote:

As I have no account on Advogato.

I've found it quite surprising that you compare your component system only with XPCOM and CrystalSpace's SCF.

What about KParts ? Have you looked at it?
--
Renaud Hebert

More e-mail comments, posted 2 Jan 2001 at 21:21 UTC by pphaneuf » (Journeyer)

Robert Findlay <fcsoft@attcanada.ca> wrote:

You might want to check out the SIMPL project at

http://www.holoweb.net/~simpl/

In particular you might want to view the section on "SoftwareICs"

http://www.holoweb.net/~simpl/simpl_softwareICs.html

where the SIMPL Send/Receive/Reply messaging is used to produce fully encapsulated reusable components that can be written in any language from C/C++ to Tcl/Tk.

Even more e-mail comments!, posted 2 Jan 2001 at 22:30 UTC by pphaneuf » (Journeyer)

David Golden <david.golden@ireland.com> wrote:

I was just wondering how XPLC compared to the Bamboo open-source component framework?
http://www.npsnet.org/~watsen/Bamboo/

Replying to e-mail comments, posted 2 Jan 2001 at 22:58 UTC by pphaneuf » (Journeyer)

SOM, maybe I didn't look hard enough, is mostly a CORBA implementation, isn't it? The linked paper is quite old (1995, DCOM didn't even exist yet and CORBA wasn't quite where it is today).

Note that SOM requires a "Direct-to-SOM" compiler to get compatible code. This must be something similar to getting the compiler to behave according to the COM binary compatibility standard (like Visual C++ does).

SOM is also not open source, so if I am wrong and it actually is something more than an implementation of CORBA, all I can do is pretty much get ideas from it.

Bonobo is interesting in a few ways.

My impression of it is that it is a clone of OLE2 on COMified CORBA. Maybe I'm wrong. I won't comment on whether this is a good idea or a bad idea, as this is not pertinent to the discussion at hand.

The important part is that Bonobo is (mostly, except for its COM look-alike interfaces) orthogonal to XPLC, and that it would maybe be possible to port Bonobo to XPLC. XPLC is more comparable to the CORBA layer that Bonobo sits on.

One of the things I also find interesting is the adoption of OLE2- and COM-style components in Bonobo. I might be wrong, but I remember some GNOME people (miguel himself?) belittling these technologies. I might be remembering that the wrong way though...

JavaBeans I will have to look at again; they had just been created when I started working on XPLC (yes, I have worked on it for that long) and there was little more than marketing drivel available at the time. My understanding is that it's a set of conventions for Java classes that enables Java IDEs to show them in a component-like way to developers, through the use of Java reflection.

I looked at KParts, and it shares a problem that I attributed to both XPCOM and CrystalSpace's SCF: it is only a part of another project. It also seems to focus on IPC a bit, but is seemingly much more efficient at this than other efforts. I think KParts is a very good thing for KDE.

Also, it is more at the level of Bonobo (but implements more of the lower level by itself rather than relying on what I call a component system, like CORBA).

SIMPL is explicitly interprocess messaging, so it is barely comparable with XPLC, which is explicitly in-process. The "software ICs" concept that it talks about could also be implemented with XPLC.

Bamboo is very close to XPLC's goals. Really close. I just downloaded it, and will be looking at it very soon. It seems to include more things than XPLC, looking at the NSPR dependency.

Some parts of its roadmap made me feel that it might be getting weird, notably the part about downloadable modules, and the mention that separation of interface and implementation (which they deem essential for versioning, and which is inspired by XPCOM) is still in the future.

The downloadable modules are an example of what you often get when such a project grows out from within another. Bamboo came from a virtual reality project, which shows in some of its roots. Again, I'm not saying that downloadable modules are bad, just pointing out that getting them is a side effect of being part of another project, which also means that some things not needed by that project might take a long time coming.

Again, feel free to correct me on any of these topics, that's what I am here for!

Yet another e-mail comment, posted 3 Jan 2001 at 05:56 UTC by pphaneuf » (Journeyer)

Rick Parrish <rfmobile@swbell.net> wrote:

Pierre,

Here is something I hope you will consider in designing XPLC. It is an opportunity to avoid a flaw that both MSCOM and XPCOM share. That flaw is aggregation. If you have ever taken a look at the macros that XPCOM uses to deal with both the ordinary and aggregated implementations of nsISupports or the Microsoft MFC and ATL templates for doing likewise in IUnknown then you know how much effort goes into dealing with this issue in both COMs.

There is a much simpler, easier way around this that offers three advantages:

1. all components are implicitly extendable without aggregation - when you implement your component you don't even have to think about whether you want to go through the extra effort to support this.
2. only one interface that supports interface discovery is required.
3. All other interfaces do not (and should not) derive from nsISupports or IUnknown. This makes every interface's vtable smaller than its equivalent counterpart in XPCOM or MSCOM.

How does this work? Easy: when you create a component, you receive a pointer to its interface discovery mechanism, and this interface is the only interface from which you can discover any other interfaces supported by this component. Other interfaces dispensed by this interface are not reference counted - only this one is. As long as you want your object and any other outstanding interfaces to remain valid, hang on to its interface discovery pointer (i.e. QueryInterface).

None of the other interfaces returned by QI should have their own QI so this is the only way a client can get to the interfaces of the component.

The big pain of MSCOM and XPCOM comes from one component that extends another where it must somehow be able to inform the inner component that has multiple interfaces that all implement QI to redirect requests for the "master" interface dispenser back to the containing component. Ugly. By completely hiding where an interface came from this becomes a non-issue. The only code that needs to know where an interface came from is the code that used the component's interface dispenser to retrieve it to begin with.

All a new component needs to do to assume and therefore extend an existing component is this: in the new component's interface-dispensing handler, requests for those interfaces it does not wish to implement are delegated to the internal component. That's it!

Something to think about.

Regards, Rick Parrish

QI'ing leads to DLL hell, posted 3 Jan 2001 at 08:20 UTC by nymia » (Master)

I think a lot has been said about this and we have seen how the use QI () leads to so many problems. IMO, components shouldn't be given the ability to QI() an interface because it leads to tight coupling. Does exposing a member function from an interface really the solution? I don't think so, one only resorts to that method only when one has no choice but C++. (Note that I am bashing C++ but only pointing out the situation where one is forced to come with a solution using one language as a reference.)

I hit the reply button too early, posted 3 Jan 2001 at 08:26 UTC by nymia » (Master)

Correction:

I think a lot has been said about this and we have seen how the use of QI () leads to so many problems. IMO, components shouldn't be given the ability to QI() an interface because it leads to tight coupling. Does exposing a member function from an interface really the solution? I don't think so, one only resorts to that method only when one has no choice but C++. (Note that I am not bashing C++ but only pointing out the situation where one is forced to come with a solution using one language as a reference.)

Re: QI'ing leads to DLL hell, posted 3 Jan 2001 at 18:40 UTC by pphaneuf » (Journeyer)

nymia: wasn't that the same thing twice? What did you change?

I am of quite a different opinion here. One of the goals of XPLC and of QI() is precisely to avoid DLL hell. For example, in Quadra, we have both Svgalib and Xlib back-ends, and thus we had to link against both. Result: Quadra wouldn't run if Svgalib wasn't installed, even if we'd never call into it even once!

Now, we'll have a component module link against Svgalib and another link against Xlib, and whatever we can load will be what is available. If Svgalib is missing, the dlopen() will fail and Quadra will still run anyway.

Again, regarding QI() itself, this is a way to avoid tight coupling. You get an object, you QI() it to something you know, and if it doesn't support it, you pass. What "you pass" means depends on the situation, of course, but it isn't any worse than a message not being understood by the receiver object, for example. In fact, you at least know that the object doesn't support the interface, and you can take special action instead, for example.

Another example where QI() avoids tight coupling: when you want to implement a feature that is orthogonal to the main goal of a component, for example persistence. Instead of requiring that a class inherits from a persistence protocol (interface), which I find a pretty tight coupling, you can dynamically check for persistence support in a looser manner.
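
A small sketch of what I mean, with invented interface names (only the QueryInterface pattern itself is the point):

    struct UUID { unsigned int data1; unsigned short data2, data3; unsigned char data4[8]; };

    // Illustrative placeholder value for the persistence interface's UUID.
    static const UUID IID_IPersistent =
        { 0x00000000, 0x0000, 0x0000, { 0, 0, 0, 0, 0, 0, 0, 1 } };

    struct IObject {
        virtual void addRef() = 0;
        virtual void release() = 0;
        virtual IObject* queryInterface(const UUID& iid) = 0;
    };

    struct IPersistent : public IObject {
        virtual void saveTo(const char* filename) = 0;
    };

    void maybeSave(IObject* component, const char* filename) {
        // Ask the component whether it supports persistence at all.
        IPersistent* p =
            static_cast<IPersistent*>(component->queryInterface(IID_IPersistent));
        if (!p)
            return;           // not persistent: we simply "pass"
        p->saveTo(filename);  // persistent: use the optional interface
        p->release();         // drop the reference that QI gave us
    }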

Of course, the downside to all these dynamic features is that they can also fail dynamically (the case where a component doesn't support a needed interface). But you have this problem everywhere you have this feature. In some languages, like Java, you don't QI(), but the Java runtime does the equivalent for you, without telling, and will throw an exception if a method is not found or something like that, which isn't better.

In fact, I like the explicit way better, because you can point at a QI() that doesn't check for NULL and say "see? you've got a bug here", whereas the Java runtime could be doing one of these dynamic queries at a point where things look safe, and throw up on you (hmm, nice analogy!).

Curious about your goals, posted 3 Jan 2001 at 20:57 UTC by sab39 » (Master)

(Note: I'm not claiming to have any particular insight into the issues you will face. I'm not offering "design peer review" because I wouldn't consider myself a "peer" of any of the people who have responded so far. I'm simply a curious onlooker...)

Why did you choose the particular goals you did? As I understand it, the goals seem to be more or less: Simple, lightweight, and fast.

While all of these things are admirable, I'm not sure that any of them will prove compelling reasons for anyone to actually *use* XPLC. COM, CORBA, XPCOM, Bonobo, KParts, and whatever OpenOffice uses are already in wide use in existing projects. You also mention a bunch of other technologies, which are also presumably already being used in the "real world".

Now, of course you should hack on whatever *you* want to hack on, and questions of "whether the world needs another component architecture" shouldn't discourage you from pursuing your own personal goals. However, if you want to persuade users of other component systems to switch to yours, you will need to provide compelling benefits over what they have. Speed and lightweightness seem not to be important goals of many projects these days, and simplicity is only good if you aren't cutting important functionality for it.

So far in your postings, it seems that your goal of simplicity has led you to explicitly exclude features that are in these other implementations (particularly exceptions, which, while unnatural and ugly in C, are essential to the point that it would be unnatural to try to code without them in many other languages, and threading, which for good or bad is widely used). You also don't mention out-of-process (but non-distributed) calls.

It seems to me that, to be compelling enough to persuade anyone to actually switch, you should aim to provide at least *some* way of dealing with issues that other systems deal with. Perhaps the single most compelling feature you could provide would be transparent interoperability with all the other systems (which I know is hard - that's what would make it compelling :) ), but that directly conflicts with performance and simplicity.

So I guess my question is, why did you choose to pursue the particular goals that you did, as opposed to the almost-opposing set of goals I have outlined here?

(By the way, I still think you should hack on whatever *you* want to, not what *I* happen to think the world "needs"...)

Re: Curious about your goals, posted 3 Jan 2001 at 22:26 UTC by pphaneuf » (Journeyer)

A few reasons...

First, there is no real component system for most Unix systems (other than the Unix pipes, that is). CORBA is being bandied about, but... (next point)

Second, all those other component systems are not in such wide use. They might seem to be widely used, but only Microsoft COM really is, and only on Windows!

The reason I think this is the case is that the barriers to entry in the component world are many: some have too much overhead and are too costly, others are too complicated to use or to code components into, etc...

COM got performance sorted out not too badly, as long as it is in-process and no marshalling occurs. But it's awfully complex to code in (do you speak Hungarian? I don't). It's also not portable.

Speed and lightweightness not important goals? Ok, do a CORBA wrapper to GTK+ and try to use that to build an application with any kind of usability. XPLC will allow something like this, which could enable swappable GUI toolkits for example (note that I said "enable", I didn't say that I'd do it or that it will even happen!). :-)

I would say that for components to be pervasive, the cost associated with them should be as low as possible, both in developer time and in runtime overhead. Thus these straightforward goals, to achieve pervasiveness by practicality and pragmatism rather than with creeping featurism.

As for too much simplicity forgoing some important features, think about this: C and C++ are probably the two most popular languages out there, and COM exceptions map to an ugly mess in both of them, even if C++ has neat native exception handling (which is broken on a significant number of platforms).

Threading is allowed in XPLC, just not promoted (which I see as a feature, but I agree that I might not be in the "winning" camp on that subject).

Out-of-process calls are not in XPLC by choice. The idea is for the per-call XPLC overhead to be known, bounded and small. The idea is that out-of-process communication is possible, but will be explicit in some manner, again by choice. You'll know that you are putting your feet in the mud when you actually do that; you won't get stuck without your knowledge (mess around with the free-threaded marshaler in COM, and you'll see what I mean).

I think that CORBA and others could easily live beside XPLC, but I target XPLC as something that would be as pervasive as the Linux ld.so. I would like binary packages to be able to support everything I have, but still work if some optional library is missing from my system. Right now, autoconf auto-detects, or lets me choose, the things that will be built or not depending on what I have on my system; then sending the result to someone who doesn't have those things doesn't work, and so on.

You don't want CORBA to solve such a small-scale problem! Imagine your web browser making an out-of-process call to the PNG library for every image row it decodes!

But on the other hand, I have a feeling that bigger things could be built with a system such as XPLC, scratching into the lower end of what is currently done with CORBA, such as the GNOME Panel for example (and many other GNOME desktop-oriented functionalities).

I don't see XPLC displacing a lot of the current component systems, I mostly intend to pick the huge slack that they leave.

I guess that the answer to your last question (why did I choose this approach) is that I don't want complicated server-oriented distributed objects; I would like everything to be configurable at runtime. That means many things have to be components, including the file I/O used by grep, so that I can add support for URLs as files at runtime. You see where this is going?

Is the simplicity there?, posted 3 Jan 2001 at 23:33 UTC by Malx » (Journeyer)

First of all:

1) Why have you chosen the MS way - the way of static class-based objects?
Look at JavaScript object model (property based):
http://developer.netscape.com/docs/manuals/js/client/jsguide/obj2.htm

2) Also, you have mentioned pipes... Look at them - you could use them with any language (same as the ENVIRONMENT of the OS :). They are outside of any language - they are used for interprocess communication.
Then why throw this away? It is simplicity you are trying to extend.
You could use the same pipes with text fields or, better, messages (they have data integrity). All you need is type converters. Something like
"imgcat *.jpg | grep size=32 | move to/dir/"
Here "imgcat" would compose an image object with the needed parameters - size/name/bpp/format/text-info etc. (but never include the image data itself in this!!! just links to the FS or URLs). If you intend to work on the image content, look for a program in GNOME for PIPE-like editing/modifying of images - ImageShaker.

I'm not an expert in any way, but I think that COM emulation is the same thing as building an MS-like desktop into (in place of) the Unix world :(

Re: Curious about your goals, posted 4 Jan 2001 at 06:46 UTC by shalabh » (Journeyer)

sab39, I do think there is a point in having a bare-minimum lightweight component system. I work in a products company - there are many places where we would like to have a component architecture. However, existing systems (COM, CORBA, XPCOM) come with *way* too much overhead and lots of functionality we just don't need. Options are to either do our own tiny component-like system (and lose time) or not use the architecture at all (and lose extensibility). Mostly we opt to save time.

pphaneuf, something like XPLC would be great for a lot of scenarios. I support your idea of enforcing a minimum and leaving the rest to the implementors. In the same philosophy, IMO, you could review the current design of XPLC in the light of feedback and see whether even the things that *are* enforced are not too much for the users.

Maybe interfaces should not have QI(), posted 4 Jan 2001 at 07:44 UTC by pphaneuf » (Journeyer)

I am discussing this idea by e-mail with Rick Parrish, and I think it is an interesting one, with good advantages but also some downsides.

The idea is that interfaces should not inherit from the IUnknown interface. Every component should support the IUnknown interface, but all refcounting and QI() would be done on that interface pointer.

The advantages are that you can implement aggregation much more easily, since you have only one QI() implementation per actual physical component. All the rules of COM transitivity don't apply anymore, because doing a QI() is one-way (if you QI an IFoo interface from an IUnknown interface, you cannot use the IFoo pointer to get another interface or to get the IUnknown back).

You can also share a single IUnknown implementation more easily with a mix-in class in C++, and have a single AddRef/Release/QueryInterface implementation in a library (if the QueryInterface is table-based), thus saving on code bloat.
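
As a rough illustration of that table-based idea (all the names are invented, this is not a settled XPLC API): a shared mix-in owns the refcount and a small table mapping interface IDs to interface pointers, so each component only has to fill in its table:

    #include <cstring>

    struct UUID {
        unsigned int   data1;
        unsigned short data2, data3;
        unsigned char  data4[8];
    };

    inline bool sameUUID(const UUID& a, const UUID& b) {
        return std::memcmp(&a, &b, sizeof(UUID)) == 0;
    }

    class InterfaceDispenser {
        enum { MAXINTERFACES = 8 };
        const UUID* iids[MAXINTERFACES];
        void*       ptrs[MAXINTERFACES];
        int count;
        int refcount;
    public:
        InterfaceDispenser(): count(0), refcount(1) {}
        virtual ~InterfaceDispenser() {}

        // The concrete component registers each interface it implements,
        // typically from its constructor (the UUIDs must be constants
        // that outlive the component).
        void registerInterface(const UUID& iid, void* iface) {
            if (count < MAXINTERFACES) {
                iids[count] = &iid;
                ptrs[count] = iface;
                ++count;
            }
        }

        // Refcounting lives only on the dispenser, as in Rick's scheme.
        void addRef() { ++refcount; }
        void release() { if (--refcount == 0) delete this; }

        // One QueryInterface implementation shared by every component;
        // the returned interfaces are not themselves refcounted.
        void* queryInterface(const UUID& iid) {
            for (int i = 0; i < count; ++i)
                if (sameUUID(*iids[i], iid))
                    return ptrs[i];
            return 0;
        }
    };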

The main disadvantage is that QI() is one-way only. If you get passed an IFoo, there is no way to know if the component also supports IFoo2 or IBar. You have to keep the IUnknown pointer around as a kind of object handle, with which you don't do actual work, but which you need in order to manage the refcount and obtain the other interfaces with which you do the actual work.

You also can't keep a reference to a non-IUnknown interface if you didn't get the IUnknown interface that goes with it, because you can't addRef the object through the non-IUnknown interface, and keeping it might be risky.

The idea is very interesting, but I think that losing the ability to do QI from any interface is a big cost. I feel this is one of the critical aspects of extensibility, where an API that once did only so much can be extended in the future and be dynamically used if available. Applying Rick's idea would mean that we wouldn't pass non-IUnknown pointers around, only IUnknown pointers, which means that every single method that wants to use the object will have to do a QI before doing so. This is losing some of the simplicity of using components.

Re: Is the simplicity there?, posted 4 Jan 2001 at 08:03 UTC by pphaneuf » (Journeyer)

Malx, I understand your feeling that I am doing a "COM emulation" (but look at Bonobo if you want real Microsoft-emulation, IMHO) and that there are better object models.

I would love to support a better, cleaner object model, something like what you linked at or something like Objective-C or Smalltalk, but the truth is that I have to support languages like C and C++, or even assembly, and that I have to do so with the smallest overhead for them, so that going from regular C/C++/assembly non-component code to XPLC components is as easy and enticing as possible.

If they had to go through some dynamic method dispatcher, even a relatively efficient one, it would be very hard to get into the performance range that C++-style virtual method tables give you. They're about the cheapest possible way to implement dynamic dispatching.

Pipes are interesting, but I am thinking of other levels of components where pipes would be inappropriate, like GUIs and widget sets for example. By the way, I am a great fan of the netpbm package; it is a great component library, and it manages to be fast enough even though it passes the image data itself (all of it often has to be read anyway; maybe avoiding the filesystem speeds things up?).

I feel that the very basic idea behind COM is one of the very few really good ideas that came from Microsoft. That, and the combo box, to me, are two great inventions. Let's face it, there are so many people at Microsoft, somebody ought to have a good idea once in a while, eh? :-)

And while I think it is a good idea, I am anything but a Microsoft lackey (ask any of my friends, I do not use Windows without death threats), and I don't want XPLC to just be monkeying Microsoft COM, like maybe XPCOM is doing. I look at MS COM and other component systems, and where I see good ideas that I could use, I will use them. I try to be pragmatic.

Re: Curious about your goals, posted 4 Jan 2001 at 08:11 UTC by pphaneuf » (Journeyer)

shalabh: another thing is that when you do your own tiny component-like system (like we could perceive Mozilla and CrystalSpace have done), you're not compatible with anything, you can't use other programs' components; you're in your own little world.

You say that you opt to save time, and that is a shame in the end, because you end up with the vast majority of programs not being components or component-using. You are right; this is exactly what I have my eye on.

In your last sentence, are you saying that you would try to see if even more things should be optional? Do you see anything yourself that you think could be made optional? The way I see this, there's only one component that you have to use: the service manager.

About KParts and XParts, posted 4 Jan 2001 at 09:21 UTC by pphaneuf » (Journeyer)

I took a look at both of these technologies. I took my information from here and there.

The first thing I notice is that while KDE used to be CORBA-everything like GNOME is, they seem to have realized that at a certain level, CORBA is overkill and just plain overhead. KParts is out to avoid CORBA completely in the most common embedding cases, DCOP to provide a simple IPC mechanism, and so on. They note that using the much more lightweight KParts instead of CORBA allows them to be much more pervasive in embedding components into applications, which is in line with my ideas.

KParts is all in-process, just like XPLC, but is not very general; it seems to be oriented toward visual/GUI components rather than general components. Still, I find this rather nice, because in-process is good for visual components in many cases. But KParts is only very slightly related to XPLC, from what I gather, and the KParts concept could be implemented over XPLC.

XParts adds out-of-process components to KParts, in a quite general way (in the KParts framework). You can use KParts that are in another process, or use anything that can embed itself in an X window (this makes me think a lot of miguel's initial presentation on Bonobo, the "Unix sucks" one (I think he's right about the title, but come on, CORBA to replace Unix pipes?)). Very nice, but only relevant in a visual environment.

As I intend to do something similar to Medusa using XPLC, being restricted to visual components is not very good.

As a side-note, XParts is an example of adding out-of-process to a component system that has been designed for in-process only. Some techniques of XParts could be carried over to adding out-of-process objects to XPLC (which I'll leave out of the core, of course, but would be available).

Re: Curious about your goals, posted 4 Jan 2001 at 13:49 UTC by shalabh » (Journeyer)

pphaneuf: IMO, XPCOM is today not really 'tiny' :-) But it does enforce a lot of mechanisms that a developer may not want to conform to. Which is why developing a system using XPCOM requires a lot more initial effort on the part of the developer than would be required if the developer chose just plain C++.

Keeping just one component (the service manager) essential is a good thing - anything less than that would obviously be nothing at all. What I meant was that you could review if it is possible to implement other features on this component system without changing the core. Things I see people mention like reflection, out-of-process, etc. As I'm not a component system guru I wouldn't be able to tell you what all I would want when I become one :-)

I believe (rather, hope) that the XPLC core will remain a tiny bare-minimum one. So even though after a while there might be a lot of components - support components (registry etc.) - the developer would still be able to use only the parts of the support system that he wants.

how much have you looked at CORBA ?, posted 4 Jan 2001 at 15:41 UTC by stefan » (Master)

From a look at the (sparse) outline of the ideas you want to implement, I have difficulty understanding what exactly you want to support and what you want to leave to a higher-level layer. In fact, it appears XPLC is heavily underspecified. As others already pointed out, it currently appears to be just a complicated way to call dlopen(). If all you want is a modular system, i.e. one that lets you load different implementations of an interface dynamically, there are plenty of (Open Source) projects that do just that. There is little need to provide an (even lightweight) framework to do just that.
However, things get more complex if you want to add some meta data (a whole type system even), or provide some level of concurrent access, or language/location/platform transparency.

You do have to be explicit about stuff like concurrency, or language independence. I.e., you need to specify how your services will act under parallel access (distributed or not). You do need to specify some form of interface in a neutral language (IDL) if you want the components to be implementable in different languages, and of course you need to talk about an adapter mechanism (marshalling) which allows different memory layouts to be bound together.

Given that, I'd really suggest you start reading some CORBA specs, and you give precise arguments against them, instead of just lamenting that CORBA is overly complex. I'm not claiming that CORBA is a silver bullet, but it does solve a lot of common problems nicely. I use CORBA heavily, especially in a colocated setting (hundreds, if not thousands, of CORBA objects in the same address space), where speed really matters. And indeed - in contrast to your statement above - colocated method invocations are little more than virtual method calls in C++.

You shouldn't look at Bonobo or KDE to learn about CORBA. Both seem to suffer from some heavy misunderstanding of the CORBA object model. While Bonobo tries to reimplement DCOM on top of CORBA, KDE made a big mistake when dropping CORBA in favor of its own (C++ and in-process only) replacement, because it abused CORBA and consequently suffered from efficiency problems.

It's sad to see the same pattern as with C++: instead of trying to grasp the technology, people complain that it's overly complex or, worse, that it's 'badly designed' (just demonstrating their ignorance) and then switch to a much less powerful alternative, essentially (badly) reimplementing the same ideas (OO in C for example).

Again: I'm not claiming that CORBA provides necessarily all you need. But I suggest that you study the CORBA architecture (the general ideas and techniques), and then provide some substance when complaining, instead of asking the whole world to read your very vague proposal and to comment. It's just the way it works: if you think you can do things better, you have to do the first steps, and provide some meat in order to tease the people.

Comments, posted 4 Jan 2001 at 17:37 UTC by nymia » (Master)

Here are some of what I think XPLC will need to address and state its position:

  • Threads - Solaris, Linux, Windows, BeOS
  • Process API - fork(), exec*(), dlopen()
  • IPC Facilities - pipes, sema4s, mutexes, message queues, shared mem, etc
  • Sockets
  • File System API
  • Object Brokers - orb, poa
  • Name Service
  • Interceptors - COM+, CORBA
  • IDL
  • QI Properties - reflexivity, symmetry, transitivity

The first five items are already given, as they are provided by the OS; deriving from them is a no-brainer. The remaining items are where XPLC will definitely have to state whether it provides them or not.

Component system complexity, posted 4 Jan 2001 at 19:12 UTC by apenwarr » (Master)

stefan: what does CORBA buy you, if all components run in the same address space? As far as I can tell, IDL is supposed to make the interface transparent (ie. no need for dlopen() and similar messes), but C++ and normal shared libraries do that anyhow. People use CORBA for cross-language, interprocess, and distributed programming, but if your ORB implements those features, then it seems to me that it really does require a lot of complexity. As far as I can tell, the cases where CORBA can be fast are exactly the cases where it doesn't buy you anything. Am I missing something?

Others:

In my opinion, the number one reason that people don't use any given system (whether a component system or whatever) is complexity of either the API or the implementation. If it's too hard to learn or too much work to use, then it falls by the wayside.

API (Interface) complexity: People like to use files. open, read/write, close, and you're done. On the other hand, people don't really like to write sockets programs in C: just to start, you need to call socket(), inet_*(), maybe gethostbyname(), one of bind, listen, or connect, and then finally you can read/write and close. Most of us here probably know how to do all that, but I bet almost all of us have some kind of function library to wrap around it, such as 'int sock = tcp_open("www.slashdot.org", 80)' or some such thing. I certainly do. Notice how so many people think Java makes network programming so much easier -- well, it does. It was never really hard, of course, but the API made it seem hard.

Implementation complexity: I'm sure anyone here can think of a library they refused to use because it would more than double the size of their program. Nowadays, statically linking my "ls" program is a bad idea because glibc is too complex.

Component systems suffer from the same two types of complexity, and IMHO that's why they've never caught on. I don't know much about COM, but XPCOM has so much overhead just to register a C or C++ component that there's no way I would ever use it. ld.so, however, makes it easy; I just include the right header and link with the library (or use dlopen(), of course, but that makes it too difficult so almost nobody does that).

CORBA and IDL solve the interface complexity issue beautifully -- in the C++ case, the IDL compiler just generates a nice header file for you, you fill in the contents of each function, and the data all goes to the right places. #include the header file and link the right libraries to use your component.

However, in my opinion CORBA fails in terms of implementation complexity. There is no such thing as a "simple, fast, stable CORBA ORB", except as compared to the bigger, slower, buggier, more featureful ones. With the huge number of CORBA ORBs available, and all of them still too big, I have to assume it's a fundamental design problem, not the implementors' fault.

You can work around the slowness and complexity of CORBA by having fewer, "bigger" function calls in your object's interface -- the Berlin project does this with great success. And of course, any real distributed system, CORBA or not, has to do the same thing because otherwise latency will kill you (as it does with remote X11).
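To make the "fewer, bigger calls" idea concrete, here is a toy C++ contrast (the interfaces are invented for illustration; they are not Berlin's actual ones): fetching a raster one pixel per call versus one region per call. The second form pays the per-call (or per-message) overhead once per region instead of once per pixel, which is what keeps latency from dominating.

    struct Rect { int x, y, width, height; };

    // Fine-grained: one virtual call (or one round trip, if remote) per pixel.
    class FineGrainedRaster {
    public:
        virtual ~FineGrainedRaster() {}
        virtual unsigned long getPixel(int x, int y) = 0;
        virtual void setPixel(int x, int y, unsigned long color) = 0;
    };

    // Coarse-grained: a whole rectangle of pixels moves in a single call.
    class CoarseGrainedRaster {
    public:
        virtual ~CoarseGrainedRaster() {}
        virtual void readRegion(const Rect &r, unsigned long *buffer) = 0;
        virtual void writeRegion(const Rect &r, const unsigned long *buffer) = 0;
    };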

The ideal component architecture, which might not exist yet, has both a simple, easy interface (like modern ELF shared libraries) and a simple, lightweight implementation that I won't be afraid to link with my programs. Hint: if I have to do anything like queryInterface(), dlopen(), or typecast objects from a strange base class, it's too complicated. Is KParts close to this?

One last thought:

If your program is linked with svgalib and Xlib, and you want it to run even if one of the libraries is unavailable, what's the minimum set of changes necessary to make it work? I bet the problem would be 99% solved if ld.so just allowed the program to execute until one of the missing functions was called, or even did a default "return -1" or something for a missing function. Do I really need the whole mess of dlopen() for each and every function in svgalib?
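For comparison, here is the kind of boilerplate dlopen() demands today, sketched for a single svgalib entry point (the function name vga_init() is real; the wrapper is illustrative). Real code has to repeat the dlsym() dance for every function it might call:

    #include <dlfcn.h>
    #include <stdio.h>

    typedef int (*vga_init_fn)(void);

    /* Try to initialize svgalib if it is present; return -1 if it is not. */
    int try_vga_init(void)
    {
        void *handle = dlopen("libvga.so", RTLD_NOW);
        if (!handle) {
            fprintf(stderr, "svgalib not available: %s\n", dlerror());
            return -1;
        }

        vga_init_fn vga_init = (vga_init_fn) dlsym(handle, "vga_init");
        if (!vga_init) {
            fprintf(stderr, "missing symbol: %s\n", dlerror());
            dlclose(handle);
            return -1;
        }
        return vga_init();
    }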

Re: Comments, posted 4 Jan 2001 at 19:56 UTC by pphaneuf » (Journeyer)

As you mention, the first five services are already provided by the OS. XPLC does not promise to be a "portable platform", only to be cross-platform itself, so initially at least, it will not come with componentized abstractions of OS-dependent features like these; applications will still have to take care of their own portability. XPLC just promises not to get too much in the way itself.

Note that I said "initially". I would like to have a package of basic XPLC components that could be relied on across platforms.

Now, I'll state my position on the remaining items:

  • Object Brokers - orb, poa

    It has neither an ORB nor a POA. Since there is no remoting, there is no need for an ORB, and methods are invoked C++ virtual method call style, more directly than through a POA.

  • Name Service

    The service manager acts as a general naming service. In comparison to CORBA, the UUIDs used by XPLC as names are guaranteed to be unique across time and space. For those thinking that name collisions don't happen that often, I ran into one just today while checking whether there was an Advogato project page for Medusa, the Python single-threaded multi-protocol server (used by Zope).

  • Interceptors - COM+, CORBA

    Not supported; method calls are as direct as possible, with constant (O(1)) overhead.

  • IDL

    I intend to have an IDL at some point, to allow scripting languages to call interfaces and implement interfaces (which will require some type information). At the moment there is none, for the sake of getting somewhere (just as XPCOM didn't have an IDL at first). Fairly soon, I will try to implement at least a minimal IDL compiler with a C++ header backend, so that we can start writing IDL instead of C++ headers for interfaces.

  • QI Properties - reflexivity, symmetry, transitivity

    The XPLC IObject::getInterface() method is required to support this classic trio of requirements, just like MS COM and XPCOM (see the sketch right after this list for how it fits together with the service manager).
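To make the above concrete, here is a minimal sketch of how these pieces are meant to fit together. Only the overall pattern (getObject() on the service manager, then getInterface() on the result, with UUIDs as names) comes from the design described in this article; the IClock interface, the identifiers and the exact signatures are invented for illustration and are not the real XPLC headers:

    struct UUID { unsigned char bytes[16]; };

    class IObject {
    public:
        // getInterface must be reflexive, symmetric and transitive, as noted above.
        virtual IObject *getInterface(const UUID &iid) = 0;
    };

    class IServiceManager : public IObject {
    public:
        // Iterates over the registered service handlers until one returns an object.
        virtual IObject *getObject(const UUID &cid) = 0;
    };

    // A hypothetical component interface, purely for the example.
    class IClock : public IObject {
    public:
        virtual long getTime() = 0;
    };

    // Placeholder identifiers; real UUIDs would be generated, not written by hand.
    const UUID CID_Clock = {{0x01}};
    const UUID IID_IClock = {{0x02}};

    long currentTime(IServiceManager *servmgr)
    {
        IObject *obj = servmgr->getObject(CID_Clock);
        if (!obj)
            return -1;

        // A plain C++ virtual call; no ORB or POA anywhere in the path.
        IClock *clock = static_cast<IClock *>(obj->getInterface(IID_IClock));
        return clock ? clock->getTime() : -1;
    }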

Again, this isn't set in stone; discussing this kind of stuff is exactly what I am here for. The plan is either to have XPLC killed because something better already exists, or for XPLC to survive with a stronger design than it first had.

Re: Comments, posted 4 Jan 2001 at 20:38 UTC by nymia » (Master)

Thanks for replying, I'm sure XPLC will move forward and get to its destination. That's one reason we are all here: to scratch our own itch and be happy with what we're seeing here. It's true, we're often not even aware that we are contributing something to this big and mysterious world of Free Software and Open Source. And we're seeing the fruits of it on a worldwide scale. Look around: we're changing our own lives and changing the lives of others as well. I think that is good!

Re: Component system complexity, posted 4 Jan 2001 at 22:05 UTC by stefan » (Master)

apenwarr: yes, you are missing something: the duality of transparency and sound design. To be more explicit: CORBA provides all the means to be location and language transparent. That doesn't mean I have to run each object in its own address space. Quite on the contrary. It means that I don't need to care. However, it is clear that for the whole to be efficient, I want to cluster objects together in a suitable way to provide fast interaction among objects that are tightly coupled together. In the berlin project that means for example that I will do everything to keep the scene graph nodes in the server, as all the traversals will run so much faster when no marshalling is required. But I can plug in a client side Graphic, if I need to. Language transparency in the context of berlin means that the protocol doesn't mandate the granularity of distributed method invocations (in contrast to, say, X). The fact that I can do all that, even dynamically at run time (lifecycle management, load balancing, etc.) shows what location transparency is all about.
An argument I like to use when explaining this is the analogy with physical vs. virtual memory. At some point people used to address memory directly. The step towards virtual memory with all the complexity involved (memory management, i.e. paging, memory protection, etc.) is IMO quite similar to the evolution we are seeing now in terms of giving up control over the physical location of objects. The analogy is quite far reaching. Even though you don't usually think about physical addressing, you may care about memory layout to make your programs more efficient w.r.t. caching. Similar arguments apply to object clustering in a distributed environment.

And yes, CORBA is pretty complex. I'm not arguing about that. I'm arguing against the myth that CORBA is slow.
The complexity in CORBA is just a mirror of the complexity of the problem domain CORBA deals with. And if you strip off most of the problems, there isn't much left to be solved in terms of a middleware framework. The fact that the complexity of CORBA shines through in C++ is something you might consider an advantage or a disadvantage, depending on your point of view. If you find it inconvenient, I suggest you have a look at some berlin demo applets I wrote in Python. It's really sweet.

Back to the point: even if distribution is not an issue, there is still language transparency. You do need some form of interface definition, together with language mappings, as well as an equivalent of GIOP. (Hoping that all languages will eventually agree on a common ABI is just an illusion.)
If you restrict yourself to in-process components, and you don't want to deal with a mess of (one-to-one) language adapters (such as Python <-> C/C++), you are bound to a single language. If even that isn't important, you are really left with a way to specify ('export') an interface that a plugin supports. As dlopen is inherently type-unsafe, you have to build your own type system around it. We do that in berlin (and yes, it is totally independent of CORBA) to provide 'kits'. All we use from CORBA in this context is the 'repoId', which is used when the client asks for a specific kit interface (but of course we could fall back to some CORBA-independent mechanism). Besides that, each module has an attribute vector (a set of properties) that can be inspected to see whether the module in question fits the client's needs. This plugin mechanism is encapsulated in two classes. I really don't see a need to build a framework around it.

In a nutshell: if you strip off all those interesting features like location or language transparency, as well as concurrency considerations, I don't see much to be written down at all (especially when dealing with C++, which already has a nice type system, in contrast to C).
If all you need is reflection or some other sort of meta data, I think that could be easily added with a handful of classes, nothing I would call a 'component system'.

Then where is simplicity?, posted 5 Jan 2001 at 00:26 UTC by Malx » (Journeyer)

You are telling me that XPLC is on a different level than pipes. I still call this the MS way of thinking.

You keep referring to the simplicity of pipes, but propose a COM-like system for programmers?! You should be comparing against the simplicity of a.out/ELF/ld.so instead! That's the same level, isn't it? :)

But you are referring to pipes as a component model, not to libraries or plugins... OK, so what are pipes, then?

Pipe IPC is an OS-managed way of transferring bytes among processes, with buffering and control of execution (stopping, restarting and killing/quitting a process depending on the presence of data).
I think that is not the thing you are referring to ;)
What people call "pipes" is really the shell. It:
  • parses the command line
  • makes substitutions according to the filesystem and the current directory
  • creates the pipes (in-process)
  • forks, to build the pipe chain of processes
  • execs
Now, what must be implemented in the programs:
  • Reading stdin / writing stdout (optionally stderr - almost nobody uses it, and almost always under "sh", not "tcsh" etc.)
    This is a very hard thing! Almost no X, SVGAlib or ncurses program does it ;) - same with MS programs
  • Interpreting command-line parameters - prog -opts file1 file2 ...
  • Quitting if something goes wrong (this is not so simple, if you look at it from the COM side ;)
It never bothered with data types... Did you say it has only one data type? Run "ls | mtvtoppm" to see that this is not true ;).
This system does have data types (the simplest being \n, \t and ' ' separation), but their management is up to the user/programmer (who is not forced to check for them - and that is great! - just as in C).

The other thing I should mention:
- The shell way (the Unix way?) - the user can quickly build a program out of blocks, in an interpreted way
( "cat bin/* | grep adobe-")
- The MS way - we do it all for you, you never need to think of anything the programmer did not think of. Oh! You are a programmer?!?? (Ask your admin?! ;)
OK, COM may give you simplicity... maybe... but only to the programmer. And only after recompilation.

Then... why do you not like pipes? :) You brought up the GUI (which you mention as a weak point of the KParts orientation?) as an example. I'll strike back with an example:

deep# ./wmres
"Resolutions" MENU
"1024x768" EXEC ./wmres 1024 768
"800x600" EXEC ./wmres 800 600
"640x480" EXEC ./wmres 640 480
"320x200" EXEC ./wmres 320 200
"Resolutions" END
This is the clear-text output of a program for Window Maker, which generates a submenu. It is GUI (same with the wm* icon-like apps).
OK, so a fast GUI can be pipe-like.

Then - plugins.
Here I could point to the GIMP - it uses executable files as its plugins.

What are the other uses of XPLC?

Is the shell the only way? No - Tcl/Tk with wish, JavaScript as a shell (from Mozilla), Lisp.
Lots of them... (Is Oberon what you need for objects? Or Plan 9?)
Isn't COM just getting rid of the OS's process/application/library management functionality?

I call it the MS way not because MS sucks :) (it is a great system, really), but because you just couldn't think that there is something else out there... :-/

one more..., posted 5 Jan 2001 at 00:31 UTC by Malx » (Journeyer)

Autochooser of VGAlib or Xlib

Unix world - you have gdb, and you have a frontend for X/emacs/KDE/... whatever you like.

Win world - you have kicq, gicq, licq - the GUI is the main part, and the functionality is a side effect of the GUI :))

Re: Then where is simplicity?, posted 5 Jan 2001 at 03:06 UTC by pphaneuf » (Journeyer)

This is hard to explain properly, there are too few examples available...

If pipes were like ELF and ld.so, if you didn't have grep on your system, you wouldn't be able to run cat or sort, or any program that could be input or output of grep.

XPLC makes it possible to use library-packaged software in an optional manner. dlopen() is complicated, but XPLC components you just drop into a directory (the same way you drop an executable binary or script into /usr/local/bin or some other directory in the PATH).
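To show what I mean by "drop into a directory", here is a rough sketch of what the loading side could look like. This is not XPLC's actual loader; the registerModule entry point and its signature are invented for illustration, and a real version would need error reporting and unloading:

    #include <dirent.h>
    #include <dlfcn.h>
    #include <string>

    class IServiceManager;                      // as in the XPLC core
    typedef void (*register_fn)(IServiceManager *);

    // Scan a directory, dlopen() every shared object found, and let each
    // module add its own objects to the service manager.
    void loadComponents(const char *dir, IServiceManager *servmgr)
    {
        DIR *d = opendir(dir);
        if (!d)
            return;

        struct dirent *ent;
        while ((ent = readdir(d)) != 0) {
            std::string name(ent->d_name);
            if (name.size() < 3 || name.compare(name.size() - 3, 3, ".so") != 0)
                continue;

            std::string path = std::string(dir) + "/" + name;
            void *handle = dlopen(path.c_str(), RTLD_NOW);
            if (!handle)
                continue;

            register_fn reg = (register_fn) dlsym(handle, "registerModule");
            if (reg)
                reg(servmgr);
        }
        closedir(d);
    }

The point is that the dlopen() mess lives in one place; the person providing or installing a component never sees it.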

You come from a web background? Maybe you know Zope? XPLC is kinda like Zope, except that it isn't just for Python, and it isn't just for the web. Okay, Zope isn't just for the web either, but let's not split hairs.

Microsoft Internet Explorer is a program that makes me bitter. I hate it, it really bites. But at the same time, the idea behind its design is so nice.

What most people don't realize is that the "location" text field is not just for URLs! If I happen to have a COM component that knows how to handle URLs starting with "foobar:", it will get passed whatever the rest of the string is after the "foobar:" and get asked to retrieve whatever content goes with it. Then, with the content comes the MIME type. It finds a component that can display the obtained MIME type and tells it to do its thing in the browser window. In the case of the HTML renderer, it does that recursively with the <IMG> tags. If they really got it right, you should be able to put an URL pointing to an HTML file as the SRC of an <IMG> tag and have the content of that HTML file display as the content of the image tag!
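Here is a toy sketch of that dispatch idea, just to make it concrete; the registries and interfaces below are invented for illustration, they are not IE's (or anyone else's) actual APIs:

    #include <map>
    #include <string>

    struct Content {
        std::string mimeType;
        std::string data;
    };

    // A component that knows how to fetch content for one URL scheme.
    class IProtocolHandler {
    public:
        virtual ~IProtocolHandler() {}
        virtual Content fetch(const std::string &rest) = 0;   // gets what follows "scheme:"
    };

    // A component that knows how to display one MIME type.
    class IViewer {
    public:
        virtual ~IViewer() {}
        virtual void display(const Content &content) = 0;
    };

    std::map<std::string, IProtocolHandler *> protocolHandlers; // "http", "foobar", ...
    std::map<std::string, IViewer *> viewers;                   // "text/html", "image/png", ...

    void openLocation(const std::string &url)
    {
        std::string::size_type colon = url.find(':');
        if (colon == std::string::npos)
            return;

        // 1. The scheme picks the component that fetches the content.
        IProtocolHandler *handler = protocolHandlers[url.substr(0, colon)];
        if (!handler)
            return;
        Content content = handler->fetch(url.substr(colon + 1));

        // 2. The MIME type picks the component that displays it.
        IViewer *viewer = viewers[content.mimeType];
        if (viewer)
            viewer->display(content);
    }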

I applaud Konqueror: they got the same thing going, thanks to KParts. Very nice job.

And get this: no recompilation and no relinking.

When I say that this is on a different level than Unix shell pipes (I switched from just "Unix pipes" to follow your logic), it is similar to the difference between user-space daemons and kernel-space services. Some things can be implemented in both spaces, like an HTTP server, with different compromises (the user-space HTTP server can be complex and run sub-programs for dynamic content, while the kernel-space one is much faster and can pipeline static content directly from the I/O drivers to the network drivers).

What I don't like about KParts (from what I understood) is the stickiness of the GUI. A KPart is about doing something on the screen it seems. I don't know if you can make one that can do nothing with the screen, say just an URL fetcher (that fetches into a memory buffer, for example).

The problem with a component system coming out of another project is that you end up with only KDE programs using the cool KHTML component.

I must say that I do not totally understand your argumentation though... Particularly the last part, about "Unix world" and "Win world".

The first project I want to do with XPLC, as soon as it is workable enough, is a finger daemon. Yeah, you've got to start small. But I don't really see this as so "Win world". Why don't I just use inetd or xinetd with in.fingerd? The idea is to push the very idea of inetd and xinetd further: one of the big overheads that keeps us from using them for all our services is that they have to fork a process, which is rather slow. What if that was all done internally? Without recompiling? With all the speed of native code!
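Here is a rough sketch of that idea, to show how little is involved once the services are in-process components; the IConnectionHandler interface and the registration map are invented for illustration, and a real implementation would select() over all the listening sockets:

    #include <sys/socket.h>
    #include <unistd.h>
    #include <map>

    // A service (a finger daemon, say) packaged as an in-process component.
    class IConnectionHandler {
    public:
        virtual ~IConnectionHandler() {}
        virtual void handle(int clientSocket) = 0;
    };

    std::map<int, IConnectionHandler *> services;   // listening fd -> handler component

    // Accept one connection and dispatch it: a plain virtual call, no fork().
    void serveOnce(int listenFd)
    {
        int client = accept(listenFd, 0, 0);
        if (client < 0)
            return;

        IConnectionHandler *handler = services[listenFd];
        if (handler)
            handler->handle(client);
        close(client);
    }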

Ok, this is stupid for just a finger daemon, but think about other things: every SSL service could share the private key, so instead of entering a passphrase multiple times, you would enter it only once (and save memory)! An SSH server could be run right from such a componentized inetd without the lag it usually incurs when generating its session key. A componentized tcp_wrappers would be able to keep a cache of DNS queries instead of starting anew at every connection (and would parse the configuration files only once).

These all seem like small wins that don't have much importance, but the thing is that all these small wins accumulate to turn a sleepy old system into a snappy one. Don't you want to save memory, or to support more connections on the same server hardware?

An example where some people did what shalabh said ("do our own tiny component-like system (and lose time)") is in Window Maker. There is now a plugin system for window decorations that can draw them at exactly the same speed as the internal ones - did you know that? Check out the libwmfun package that comes with Window Maker.

What about the "Autochooser of VGAlib or Xlib"? This is about a game, decidedly not command-line, not very "pipable". If it was done with XPLC, someone could add support for GGI, even if they didn't have the source to Quadra, and a user could use that GGI support without recompiling, just by dropping a hypothetical "ggi.so" into a directory and starting the game!

What does gdb and its various front-ends have to do with that?

Maybe you have an extremely fast computer and don't see the difference, but when I use the OSS output plugin for XMMS compared to the ESD output plugin (which goes through a pipe to the ESD daemon), I get major performance lossage: significant latency, it skips easily, etc.

I don't think I see your point very clearly, I am sorry... Anyway, as sab39 and nymia said, I'll hack on whatever I want, m'kay? I'm just here to ask you guys about ideas, and you might just convince me to do something other than XPLC, but I still think it's more a matter of getting it right.

Anti-DCE, posted 8 Jan 2001 at 15:31 UTC by lkcl » (Master)

there is a significant difference between the DCE/RPC runtime library (250,000 lines of code, 50,000 of which is the IDL compiler, a further 20,000 is marshalling / unmarshalling of basic and complex types) and the DCE environment _based_ on that rtenv ("8 million lines. 8 _million_ lines. 8 milllion _lines_" - to pseudo-quote a line by danny devito, no prizes for guessing the name of the film as i've forgotten).

microsoft did a _vastly_ significantly better job of implementing [or improving on] the original DCE 1.1 rtenv, _especially_ when i started on the samba-compatible version: i was finding bugs _literally_ at the rate of one every two weeks for _two years_, until they finally gave up and did a total rewrite for Windows 2000, and now even _i_ can't find ways to crash w2k services (except spoolss.exe, which is still a piece of shit).

yes, the documentation for DCE 1.1 totally sucks, and the code's pretty hairy too. however, if you stick at it for about two years, ignoring the docs and the code and just getting on with it, you get to quite _like_ the way DCE/RPC works.

don't worry: i'm just twisted and perverted by my close exposure to Things That Microsoft Loves Most: SMB and DCE/RPC.

Complexity, posted 8 Jan 2001 at 15:48 UTC by lkcl » (Master)

Be warned by the size of the code-base. If it's 100,000 lines of code or more, don't even THINK about saying xxxxing stupid things like,

"but it's _far_ too complicated. surely it's got to be simpler than that?" and on this basis, reject the entire code-base.

THINK. UNDERSTAND. be prepared to stare at code, write code, stare at network traces [if it's over-wire] for AT LEAST a year. THEN consider sticking your oar in.

the samba dce/rpc codebase i worked on was rejected by Andrew Tridgell because, even though he is a highly respected, specialist Unix Systems and Algorithms Programmer with a PhD, he could not grasp the necessity of the levels upon levels upon levels [and i'm not kidding: see http://lists.samba.org/pipermail/samba-technical/2000-February/006380.html which is a three-part series of messages] that were required, and rejected it out-of-hand.

i see that there is evidence of this occurring elsewhere in the open source community, which is why i specifically wanted to bring up this particularly irksome topic to, hopefully, reinforce the lessons to be learned, with another appropriate example.

to achieve certain very large goals for which a decade or more of man-years is required to implement, you WILL need to use a series of small, simple solutions which, when layered together with WELL-DEFINED interfaces, will give you extremely powerful capabilities. if you think you can do the same thing WITHOUT spending the time, you are deluding yourself very badly.

DCE and complexity, posted 9 Jan 2001 at 16:02 UTC by pphaneuf » (Journeyer)

LOL, I love the pseudo-quote! :-)

I understand your point perfectly, and while I always think it might be possible to shave a few thousand lines (off of an 8 million line project!), there is a reason for all this complexity and code size.

But I'm taking this in a wholly different view. I'm not even thinking of being compatible with MS COM, as XPCOM once did (I don't know if they still entertain that notion). I'm not (explicitly) supporting remoting. I'm not supporting exceptions. I'm not supporting this and that.

The point is a different engineering compromise than the one they did at Microsoft. I am betting that a less featureful system with a lower barrier to entry will have a better yield.

I might be wrong about this, but I'm having an awful lot of fun anyway, so the hell with it. :-)

I have been poring over papers and code for a bit more than two years now, and I understand it will take even longer before we get a really good offering (though we'll have something usable and reasonably useful pretty soon).

You are talking about small, simple solutions that can be layered to build larger things, and about well-defined interfaces. That's precisely everything that XPLC is about.

.net, posted 11 Jan 2001 at 14:28 UTC by lkcl » (Master)

You are talking about small, simple solutions that can be layered to build larger things, and about well-defined interfaces. That's precisely everything that XPLC is about.
excellent. i wish you every success. given that there have been about five separate articles all about this funny compartmentalisation / library issue, now that OpSrc is getting so large and clunky, i intuitively feel that something out there, real soon, _is_ going to fit everything together - properly - and move unix up the evolutionary tree a few branches. maybe it's microsoft's .net strategy: that'd be a hoot.

.net??, posted 20 Jan 2001 at 22:37 UTC by aaronv » (Apprentice)

the motivation for .net is sincere, and the problems it solves really *do need to be solved*. Instant binary interoperability without an IDL and automatic data marshalling are great, but .NET would be horrible for two reasons:

  • The middleware goes inside compiler-emitted machine code. This is bad, especially for debugging purposes: what happens when you're trying to profile or debug, and compiler-emitted symbols make it impossible to get clean data?
  • Support modules come from a central server, and that server sits at Microsoft. Do we really want to give Microsoft control of something so central to our lives as programmers?

I can't remember if I've voiced these same concerns before. This is a bit OT, but this whole middleware topic strikes a very loud chord: is explicit middleware good enough, or do we really want to move forward and make it automatic?

Internet Explorer, posted 23 Jan 2001 at 19:43 UTC by pphaneuf » (Journeyer)

Today I saw a Windows machine, so I thought I would check the size of the Internet Explorer executable: 60 kilobytes. I sure don't like Microsoft software most of the time, but I have to applaud them for such a modular architecture!

The DCE critics link has changed., posted 13 Jun 2001 at 20:04 UTC by pphaneuf » (Journeyer)

It is now over here.

XPCOM status, posted 2 Aug 2001 at 17:30 UTC by pphaneuf » (Journeyer)

http://mozilla.org/projects/xpcom/ was last modified on the 15th of May 2000 (look at the bottom of the page; the "last modified" stamp of 16 November 1999 is not accurate, and I'm having it removed).

That was to add a link to the Standalone XPCOM page, which was last updated on the 26th of May 2000.
