11 Jun 2010 mindcrime   » (Master)

OpenQabal, Project Shelley, Project Poe, Project Collins, WTF?

If you were paying attention to the previous post and its predecessor you should kinda have an idea of what Project Shelley is. At least, you may be thinking to yourself, "Ok, it's a Reddit clone, written in Groovy/Grails, and incomplete, what's the big deal?" And that's a perfectly fine attitude to have at the moment. The point of this post is to start clarifying where this is going, how Project Shelley relates to Project Poe and Project Collins, and how it all fits into the OpenQabal vision.

Let's start by looking at what OpenQabal is meant to be. In the past, I had described OQ as:

...an open-source social-networking and collaboration platform / suite driven by the principles of federation, composition, and openness; with a special emphasis on enabling "distributed conversations" and the "federated social graph."

So, let's start by throwing that paragraph out, while keeping some of the essence of it. This is still - in part - about social-networking and collaboration, but "distributed conversations" (especially at "Internet scope") aren't really part of what I'm interested in tackling right now. Not because it's not interesting, or because it's a solved problem, but because I don't see as much value for it in the setting(s) that I'm focusing on. But, again, if members of the community see a place to do that kind of work within the OQ umbrella, then that would be great by me. Now, the "federated social graph" bit... I wouldn't say that's being thrown out, but rather placed on the "back burner."

So, that leaves us with, what, "an open-source social-networking and collaboration platform / suite driven by the principles of federation, composition, and openness?" Well, yes. But to make the scope of intended application(s) clearer, and the way things will decompose, I'd re-word that now as something like:

"OpenQabal is an open-source suite of APIs and applications - driven by the principles of federation, composition, open protocols, and open standards - for building and enabling intelligent enterprise applications for collaboration, social-networking, knowledge management and discovery, organizational learning, Information Retrieval and decision support."

No, that definition isn't perfect, and it'll morph over time. But I think that gets closer to the heart of things. "Intelligent enterprise applications" is really what interests me right now. And what I mean by that is using technologies like collaborative filtering, tagging, social-graph mining, explicit semantics, data mining, machine learning, etc. to build (or integrate) enterprise applications in a way that makes them "smarter," or better able to help humans find the information they need, when they need it (even if they don't know they need it yet!). I've said before that I don't really like the label Enterprise 2.0, but for lack of a better term, you could say that that's what this is. Except we should probably call it Enterprise 3.0 just to one-up the competition, eh?

With that said, what OpenQabal becomes is sort of an "umbrella" or over-arching structure for sub-projects that handle different parts of this vision. Think of how Glassfish has become an "umbrella" project for a series of related projects: OpenMQ, OpenESB, SailFin, Portal Server, etc.

With OQ as the overall structure, sub-projects will provide the various APIs and/or applications / subsystems that are part of this overall "intelligent application" vision. A number of the pieces we talked about in the old OpenQabal model will live on now, pretty much as they always would have. I still see the need for an API / system for managing tags, something for doing recommendations, something for managing a social graph and allowing queries against it, etc. Of course, as before, in a lot of areas open-source code already exists to do these things. In that case, we may (within the letter and spirit of the respective licenses) just "borrow" existing code, possibly wrapping or modifying it to fit the model of what we're doing here. Other bits will have to be written entirely from scratch, and that's OK.

So, what about these sub-projects? Well, here's what I think I know about them so far:

  • Project Shelley - An interface that builds on the APIs for tagging, ranking and recommending items, to provide a platform for sharing and discovering useful links, documents, people, etc. What we have right now is pretty simplistic, but there's a LOT of room for growth in this. Expect another post soon just dealing specifically with "what's coming" for Project Shelley.
  • Project Poe - a sort of "mini Facebook" like social-networking interface. Builds on the APIs for social-graph management, activity-stream, activity profiling, tagging, etc. Provides the front-end for managing connections and for letting users provide information about themselves, their interests, etc. But unlike Facebook, no silly Pirates vs. Ninjas or Farmville stuff.
  • Project Collins - This one we haven't talked about before. It may not come into existence as a standalone project, but the general vision is a search engine that leverages all of the various bits of information from Project Shelley, Project Poe, and "TBD" to provide better / deeper / more insightful search results than you can get from simple document content analysis. Intranet search in the Enterprise usually sucks, largely because page-rank type algorithms don't work well due to the lack of links between documents. But by supplementing the content analysis with scoring based on tags, social graph connections, activity-stream information, etc., it should be possible to do a much better job. This is not, by the way, a totally novel idea. It's something people are referring to as Social Search.
  • GraphEngine - as before, an API for storing and managing the "social graph stuff." Doing this on a large scale is still a problem, and I'm intrigued by the idea of using an incremental evaluation system approach for this, but haven't done much on this yet.
  • ProfileEngine - name says it all, really.
  • RecommenderEngine
  • ActivityEngine
  • TagEngine
  • Etc.
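
To make the Project Collins idea a little more concrete, here is a minimal sketch of what "supplementing the content analysis" might look like: a plain text-relevance score blended with tag overlap, social-graph proximity, and an activity-stream signal. Nothing here is actual OpenQabal code. The class, the function names, the weights, and the in-memory graph are all assumptions made up for illustration, and a real system would presumably get these signals from something like TagEngine, GraphEngine, and ActivityEngine.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A searchable item. Every field name here is hypothetical."""
    doc_id: str
    content_score: float  # text relevance from a content index, assumed in [0, 1]
    tags: set = field(default_factory=set)
    author: str = ""
    recent_views: int = 0  # count assumed to come from an activity stream


def tag_overlap(query_tags, doc_tags):
    """Jaccard similarity between the query's tags and the document's tags."""
    if not query_tags or not doc_tags:
        return 0.0
    return len(query_tags & doc_tags) / len(query_tags | doc_tags)


def social_proximity(user, author, graph):
    """1.0 for a direct connection, 0.5 for a friend-of-a-friend, else 0.0."""
    friends = graph.get(user, set())
    if author in friends:
        return 1.0
    if any(author in graph.get(friend, set()) for friend in friends):
        return 0.5
    return 0.0


def activity_signal(recent_views, saturation=100):
    """Scale a view count into [0, 1], capped at `saturation` views."""
    return min(recent_views, saturation) / saturation


def social_score(doc, user, query_tags, graph, weights=(0.5, 0.2, 0.2, 0.1)):
    """Weighted blend of content, tag, social, and activity signals.

    The weights are arbitrary placeholders, not tuned values.
    """
    w_content, w_tags, w_social, w_activity = weights
    return (w_content * doc.content_score
            + w_tags * tag_overlap(query_tags, doc.tags)
            + w_social * social_proximity(user, doc.author, graph)
            + w_activity * activity_signal(doc.recent_views))


def rank(docs, user, query_tags, graph):
    """Return documents ordered from highest to lowest blended score."""
    return sorted(docs,
                  key=lambda d: social_score(d, user, query_tags, graph),
                  reverse=True)
```

The point of the sketch is just that a document with a weaker text match can still outrank a stronger one when it is tagged appropriately, written by someone close to the searcher in the social graph, and actively being viewed. How the signals should really be weighted is an open question.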

Of course, since this is all intended for an enterprise setting, a big focus will be on integrating with other systems (see the point about "open protocols" and "open standards" above). Notice that the description for Project Shelley mentions discovering documents and people, despite the screen shots only showing stuff about links so far. That's where integration comes into play... part of the vision is to integrate with, for example, a document management system like Alfresco, a forum system like JForum, a groupware / calendar system, and/or an HRM system like OrangeHRM (or Project Poe). That, of course, also plays into the vision for Project Collins, which is all about searching across all the different domains, and using the knowledge aggregated across all of them, to enhance our search capability.

And we haven't even had time to talk about the event correlation stuff, personalization (filters, attenuators and amplifiers) or social-network-analysis, or prediction markets. Whew.

Anyway, that's the quick and dirty on how some of this fits together. As far as status: Project Shelley already exists and has a lot of features implemented, as seen from the screen-shots. There is already Project Poe code as well, but it's much less feature complete. Some of the backend API code already exists, but it needs to be refactored, munged, moved around, etc.

Also, one last note... Nothing about this new direction or "roadmap" precludes the possibility of doing a "distribution" as we talked about before. If there's demand/need, bundling one or more, or all, of the OQ "pieces" with another "piece" like Roller or JavaBB or Alfresco, etc., is certainly doable. It might even make sense to do some pre-configured bundles that make everything work together seamlessly. But my focus at the moment is really on the new bits. And more documentation (in the form of blog posts, at least) if you hadn't noticed. Don't forget to follow OpenQabal on Twitter.

Syndicated 2010-06-11 01:34:58 (Updated 2010-06-11 02:58:16) from openqabal
