Older blog entries for mindcrime (starting at number 46)

26 Jul 2010 (updated 20 Aug 2012 at 02:46 UTC) »

ScrewPile update for 07-25-2010

So, what's new since last time? Well, not everything one might have hoped for, but a few Neddick bugs have been fixed at least, and progress continues on. We still hope to have Neddick TPR2 out by the end of August, and maybe a solid start on Heceta.

Specifically regarding Neddick, since last time, the bugs that have been resolved are:

Bug #12 - Make "tag" box work when pressing ENTER - FIXED

Bug #8 - Get arrows for up/down vote links - FIXED

Bug #25 - Create scheduler mechanism for asynchronous updates - FIXED

Bug #35 - Need scheduled job to rebuild entrycache on some periodic basis - FIXED

The scheduled task stuff turned out to be pretty simple, thanks to the excellent Quartz Plugin for Grails. Now the EntryCache gets refreshed once a minute. Additionally, even though it wasn't in the bug list as a discrete bug, a problem with recalculating the score of an entry after an upvote was removed has been fixed.
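For readers who want a feel for the two fixes, here is a minimal sketch in Python. It is not the actual Neddick code (which is Groovy/Grails using the Quartz plugin); the names Entry, EntryCache and schedule_refresh are made up for illustration, and the standard-library timer merely stands in for a real scheduler. The score is recomputed from the votes that currently exist, so a removed upvote can never linger in the total.

```python
import threading


class Entry:
    """A submitted link with per-user votes (hypothetical model)."""

    def __init__(self, title):
        self.title = title
        self.votes = {}  # user name -> +1 or -1

    def vote(self, user, value):
        self.votes[user] = value

    def remove_vote(self, user):
        self.votes.pop(user, None)

    @property
    def score(self):
        # Derived from the current votes every time it is read.
        return sum(self.votes.values())


class EntryCache:
    """Keeps a score-sorted snapshot of entries, rebuilt on refresh()."""

    def __init__(self, load_entries):
        self._load_entries = load_entries  # callable returning entries
        self._lock = threading.Lock()
        self._entries = []

    def refresh(self):
        fresh = sorted(self._load_entries(), key=lambda e: e.score, reverse=True)
        with self._lock:
            self._entries = fresh

    def top(self, n=10):
        with self._lock:
            return list(self._entries[:n])


def schedule_refresh(cache, interval_seconds=60.0):
    """Refresh now, then re-arm a daemon timer for the next refresh."""
    cache.refresh()
    timer = threading.Timer(interval_seconds, schedule_refresh,
                            args=(cache, interval_seconds))
    timer.daemon = True
    timer.start()
    return timer
```

Calling schedule_refresh(cache) once at startup keeps the snapshot roughly a minute fresh; cancelling the returned timer stops the cycle.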

As to what's coming next... mainly work on the next Neddick TPR, so see the Roadmap for details. The bugs on the roadmap page are now sorted (more or less) by the order in which we intend to attack them, so it should be pretty easy to tell where things are going.

And that's a wrap, for this week.

Syndicated 2010-07-26 03:36:00 from ScrewPile Dev

What's new with ScrewPile?

Since the last post, all sorts of stuff has gone on! Things are progressing nicely, especially with Neddick. Since last time we've registered each of the ScrewPile subprojects with sites dedicated to F/OSS, such as Ohloh.net, Advogato.org, and Freshmeat.net. Source code has been pushed to GitHub and the Google Code SVN repository. We've gotten Bugzilla and IceScrum installed and configured, and we have a live Neddick demo site up and running. Additionally, documentation has been written on our development process, and some roadmap documentation has been put together. We've also put together a site at Fogbeam.org dedicated to Fogbeam's open source activities.

Not a bad week's work, eh?

So, for your browsing convenience, here are links to all the new hotness:

Neddick demo site

http://www.fogbeam.org/
http://dev.fogbeam.org/
Bugzilla
IceScrum

Development Process Doc
Neddick Roadmap


Neddick on GitHub
Quoddy on GitHub
Heceta on GitHub

Neddick on Freshmeat
Quoddy on Freshmeat
Heceta on Freshmeat

Neddick on Advogato.org
Quoddy on Advogato.org
Heceta on Advogato.org

Neddick on Ohloh.net
Quoddy on Ohloh.net
Heceta on Ohloh.net

I think that's about it. Whew...

Anyway, what's coming next? Well, all of the stuff from the Neddick roadmap, and we also need to get a proper Ant build setup, some tests written, and Hudson installed and configured to do continuous integration and automated testing.

The Neddick screencast should also be forthcoming shortly, which should be fairly enlightening.

And that's about it. Any questions, just leave a comment here, or join the ScrewPile-dev or ScrewPile-discuss Google Group and post there.

Syndicated 2010-07-14 01:29:00 from ScrewPile Dev

11 Jul 2010 (updated 20 Aug 2012 at 02:49 UTC) »

Welcome to ScrewPile

(edit: everything below is true, with the addendum that "ScrewPile" has in turn been renamed to "Fogcutter". The reason is that "ScrewPile" is just a bad name, that looks and sounds ugly unless you're one of the 15 people in the world who knows what a "screwpile" lighthouse is.)


Welcome! Some of you may have known this project under its old name OpenQabal. We've recently decided to rename the project amidst some other dramatic changes, and this new blog is now the home for news and updates about the project.

If you made it here by accident, or don't know exactly what ScrewPile is, let us explain:

What is it?



ScrewPile is an umbrella project, sponsored by Fogbeam Labs, which is building a suite of powerful, open-source tools for knowledge management and collaboration. ScrewPile was formerly known as OpenQabal but has been renamed, as part of an effort to develop a consistent naming pattern for related projects.

Why "ScrewPile?"



A Screwpile is a particular type of lighthouse. Fogbeam'ers like lighthouses, so we're using lighthouse-related terms for all of our open source activities.

Tell Me More



The 65,000 foot overview description would go something like this:


"ScrewPile is an open-source suite of APIs, components and applications - driven by the principles of federation, composition, open protocols, and open standards - for building and enabling intelligent enterprise applications for collaboration, social-networking, knowledge management and discovery, organizational learning, Information Retrieval and decision support."


ScrewPile proper, then, is an "umbrella project" or over-arching structure for sub-projects that handle different parts of this vision. Think of how Glassfish has become an "umbrella" project for a series of related projects: OpenMQ, OpenESB, SailFin, Portal Server, etc.

With ScrewPile as the overall encompassing framework, sub-projects will provide various APIs and/or applications / subsystems that are part of this overall "intelligent application" vision.

Subprojects will deal with things like: an API / system for managing tags, a recommendations engine, a mechanism for dealing with voting/ranking things, something for managing a social graph and allowing queries against it, etc.

Of course, as before, in a lot of areas open-source code already exists to do these things. In that case, we may (within the letter and spirit of the respective licenses) just "borrow" existing code, possibly wrapping or modifying it to fit the model of what we're doing here. Other bits will have to be written entirely from scratch, and that's OK.

Tell Me About These Subprojects



Well, here's what we think we know about them so far:


  • Neddick - An interface that builds on the APIs for tagging, ranking and recommending items, to provide a platform for sharing and discovering useful links, documents, people, etc. What we have right now is pretty simplistic, but there's a LOT of room for growth in this.


  • Quoddy - a sort of "mini Facebook" like social-networking interface. Builds on the APIs for social-graph management, activity-stream, activity profiling, tagging, etc. Provides the front-end for managing connections and for letting users provide information about themselves, their interests, etc. But unlike Facebook, no silly Pirates vs. Ninjas or Farmville stuff.

  • Heceta - This may not actually come into existence as a standalone project; but the general vision is a search engine that leverages all of the various bits of information from Neddick, Quoddy, and "TBD" to provide better / deeper / more insightful search results than you can get from simple document content analysis. Intranet search in the Enterprise usually sucks, largely because page-rank type algorithms don't work well due to the lack of links between documents. But by supplementing the content analysis with scoring based on tags, social graph connections, activity-stream information, etc., it should be possible to do a much better job. This is not, by the way, a totally novel idea. It's something people are referring to as Social Search.

  • GraphEngine - as before, an API for storing and managing the "social graph stuff." Doing this on a large scale is still a problem, and I'm intrigued by the idea of using an incremental evaluation system approach for this, but haven't done much on this yet.

  • ProfileEngine - name says it all, really.

  • RecommenderEngine

  • ActivityEngine

  • TagEngine

  • Etc.
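To make the Heceta idea a bit more concrete, here is a rough Python sketch of blending plain content matching with tag and social-graph signals. Everything in it is an assumption for illustration: the Document fields, the three scoring functions, and especially the weights, which are arbitrary. It is not Heceta code, and a real system would lean on a proper search library for the content part.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    text: str
    tags: set = field(default_factory=set)
    author: str = ""


def content_score(terms, doc):
    """Fraction of the document's words that match a query term."""
    words = doc.text.lower().split()
    if not words:
        return 0.0
    return sum(1 for w in words if w in terms) / len(words)


def tag_score(terms, doc):
    """Fraction of the document's tags that match a query term."""
    if not doc.tags:
        return 0.0
    return len(terms & {t.lower() for t in doc.tags}) / len(doc.tags)


def social_score(searcher, doc, connections):
    """1.0 if the author is directly connected to the searcher, else 0.0."""
    return 1.0 if doc.author in connections.get(searcher, set()) else 0.0


def social_search(query, searcher, docs, connections, weights=(0.5, 0.3, 0.2)):
    """Rank docs by a weighted blend of content, tag and social signals."""
    terms = set(query.lower().split())
    w_content, w_tag, w_social = weights
    scored = []
    for doc in docs:
        score = (w_content * content_score(terms, doc)
                 + w_tag * tag_score(terms, doc)
                 + w_social * social_score(searcher, doc, connections))
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score > 0]
```

With two documents whose text matches equally well, the one written by somebody in the searcher's social graph comes out on top, which is the whole point of supplementing content analysis.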



Of course since this is all intended for an enterprise setting, a big focus will be on integrating with other systems (see the point about "open protocols" and "open standards" above).

Previous discussion of Neddick (formerly code-named "Project Shelley") mentions discovering documents and people, despite the screen shots only showing stuff about links so far. That's where integration comes into play... part of the vision is to integrate with, for example, a document management system like Alfresco, a forum system like JForum, a groupware / calendar system, an HRM system like OrangeHRM and/or a social-networking application like Quoddy (formerly code-named "Project Poe").

That, of course, also plays into the vision for Heceta, which is all about searching across all the different domains, and using the knowledge aggregated across all of them, to enhance our search capability.

Get involved by visiting the project page on Google Code.

Syndicated 2010-07-08 03:01:00 from ScrewPile Dev

8 Jul 2010 (updated 20 Aug 2012 at 02:50 UTC) »

OpenQabal renamed, development moved, etc.

(edit: everything below is true, with the addendum that "ScrewPile" has in turn been renamed to "Fogcutter". The reason is that "ScrewPile" is just a bad name, that looks and sounds ugly unless you're one of the 15 people in the world who knows what a "screwpile" lighthouse is.)



As we get very (very) close to a release of "The project formerly code-named Project Shelley," big things are at hand in OpenQabal world. First... the project is renamed. Say goodbye to "OpenQabal" and hello to "ScrewPile."

Why ScrewPile?


A Screwpile is a particular type of lighthouse. Fogbeam'ers like lighthouses, so we're using lighthouse-related terms for all of our open source activities.


Also, development has been moved from the old Java.net site onto Google Code and GitHub. The new ScrewPile Project on Google Code is already available for your perusal. A new ScrewPile Dev blog is available as well. GitHub repos are coming soon-ish.

Why Google Code? And why Google Code *and* GitHub?


Google Code is just a little easier to work with. The project pages can be edited wiki-style and it has nice integration with Google Groups, etc. And since all of these big changes are coming, now seemed like a good time to switch. As to why both GC and GitHub... because Git is the future, and we expect the Git repo(s) to be the main basis for development, with code pushes to the Google Code SVN repo as a convenience for people who don't know Git, don't have access to Git, etc.


The other big news... "Project Shelley," "Project Poe" and "Project Collins" all have real names now! Say hello to Neddick, Quoddy and Heceta respectively.

Why "Neddick," "Quoddy," and "Heceta?"


Because we like lighthouses. Neddick is named after the famous Cape Neddick "Nubble" Lighthouse, Quoddy after the West Quoddy Head Lighthouse and Heceta after the Heceta Head Lighthouse.


There are still a few things to be worked out over the next day or two, but we expect to be totally up and running on the new system(s) very soon, and the first code pushes from the new codebase will be out very soon. Stay tuned...


And as always, follow @ScrewPile on Twitter for super handy news blurbs and updates.



Syndicated 2010-07-08 02:29:08 (Updated 2010-07-08 02:44:18) from openqabal

Capability Case #2: Technology Radar

Following on from yesterday's post about Project Shelley Capability Case #1: Collaborative Filtering for Information Retrieval, here is the next detailed capability case for Project Shelley, Capability Case #2: Technology Radar. Also available in (pdf) and (odt) format.


Name:

Technology Radar

 

Intent:

              Identify technological developments - which may present either a threat to the enterprise, or a groundbreaking new opportunity - as early as possible. 

 

Description:

New technologies are being developed at a dizzying pace. Worldwide, private enterprises, academic researchers, and open-source hackers are all constantly pushing the envelope, developing new approaches and tools. Some of these advancements may represent a huge threat to your organization, perhaps by enabling a competitor to cannibalize your existing business model with a much less expensive alternative. Others may represent an opportunity to break new ground with products, product features, or services that can represent sizable new revenue streams. It is advantageous to identify these advances as soon as possible, in order to outmaneuver the competition and take maximum advantage of new developments.

 

As Downes and Mui point out in their book Unleashing the Killer App, this kind of awareness requires a technology radar consisting of a fat pipeline, a sensitive radar screen and sophisticated intelligence.

 

Solution Story:

              At MegaCorp, developers of the market leading Flozzit product, leaders are constantly jousting with rival HyperCorp, each striving to release the most advanced product in order to steal customers from the other. Recently, HyperCorp has released multiple new versions with features that no one at MegaCorp had considered, or believed possible at the time. After the most recent release, MegaCorp leaders dug in and discovered that HyperCorp had integrated advanced technology developed by researchers at Miskatonic University. “Why,” asked MegaCorp CEO Howard Phillips, “did we not know about this sooner? This is actually a better fit for our product... If we had done this first, we could have taken a huge chunk of HyperCorp’s market share, instead of letting them jump further in front of us.” No one had an answer.

              In order to address this lack of awareness of emerging technologies, MegaCorp decide to implement a Technology Radar.  An “emerging technologies” channel is created, where every member of the organization can submit links to documents, articles and news-feeds that touch on technologies related to MegaCorp’s industry.  Users throughout the organization vote, tag and comment on each submission, allowing the collective intelligence of the organization to filter the less important items, while pushing the key ones to the top.  Product Managers and executives begin to make browsing the latest ‘top items’ on the channel a routine habit... and some users configure the system to send them a dynamic alert via instant messaging when an entry reaches a certain score.

              A few months later, the Flozzit Product Manager receives such an instant message... the link is to an article published by Arkham University, announcing the development of a  new algorithm which solves a problem that MegaCorp engineers have been struggling with.   AU has released the source code under a permissive open-source license, and MegaCorp begin integrating the new approach, and also recruit two of the students from AU who worked on the project.

              Using the new technology, MegaCorp are able to deliver a new release of Flozzit with several features which they believe that HyperCorp cannot match. Amazingly, the new HyperCorp release comes out six weeks later, and is almost totally equivalent to the new Flozzit.

CEO Phillips talks to his managers and explains why he’s happy with developments: “We weren’t able to leap-frog them this time... but if we hadn’t rolled that new stuff out when we did, their new release would have been a dagger into our heart. Now we’ve shown them, and the market, that they aren’t always the ones on the forefront of technical advancements. And the two new guys we hired from Arkham are already hard at work on some stuff that’s going to blow everybody away.”

             

Vintage:

Mature Commercialization             

 

Challenges:

Corporate culture which fosters a “Not Invented Here” syndrome. 

Lack of incentives for participation in the system.

Lack of belief in the utility of the system.

Lack of participation in the system by executives and other decision makers.

 

Forces:

              TBD

 

Business Results:

              Better awareness of technological advances which are significant to the organization.

              Ability to gain early mover advantage over competitors by incorporating advances sooner.

              Lower risk of being one-upped by the competition with a significant technical advancement.

 

Capabilities:

              Share links to web-sites, documents and other items of interest

              Categorize links by topics using channels

              Tag links with specific keywords

              Rank items by voting them “up” or “down”

              Search and filter by topic, keyword, and/or score

              Sort view by various statistical measures, such as “all-time score”, “hotness”, and “controversiality.”

Dynamic alerting, via email, instant messaging, etc.,  when items reach certain thresholds.
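The dynamic alerting capability could be sketched roughly as follows. This is a Python illustration under stated assumptions, not Project Shelley code: AlertRule, ThresholdAlerter and the notify callback are hypothetical names, and actual delivery by email or instant messaging is left to whatever callable is passed in. The rule fires only when a score crosses the threshold, so a user is not pinged again on every later vote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    user: str
    channel: str
    threshold: int


class ThresholdAlerter:
    """Notifies subscribers when an entry's score crosses their threshold."""

    def __init__(self, notify):
        self._notify = notify  # callable(user, message); delivery is up to it
        self._rules = []

    def subscribe(self, user, channel, threshold):
        self._rules.append(AlertRule(user, channel, threshold))

    def score_changed(self, channel, title, old_score, new_score):
        for rule in self._rules:
            # Fire on the upward crossing only, not on every later vote.
            crossed = old_score < rule.threshold <= new_score
            if rule.channel == channel and crossed:
                self._notify(
                    rule.user,
                    f"'{title}' reached {new_score} in {channel}",
                )
```

In the solution story above, the Flozzit Product Manager would be a subscriber to the "emerging technologies" channel with some score threshold of their choosing.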

 

Typical Use Scenarios and Guidance:

              A technology radar is established to pull in information from many disparate sources: RSS feeds, Twitter streams, email lists, and user-submitted links to websites, documents and articles. Collaborative filtering through collective intelligence is used to filter the lower value submissions, while ensuring the relevant information gains visibility.

              Employees throughout the organization view the radar through the “emerging technologies” channel and take advantage of the information.

 

In some cases this may represent a “bottom up” scenario, such as an engineer finding an interesting new library which enables a feature the engineer likes... he quickly knocks out a prototype, shows it to senior management, and it is eventually adopted into a product release. In another case, this may be a “top down” scenario, where a senior leader discovers a new technology, and issues a mandate that R&D investigate its applicability to their product.

 

Applicable Technologies:

              Fogbeam Labs “Project Shelley”

Other corporate knowledge repositories (blog servers, forums software, document management systems, HR management systems, etc.)

Existing Data Warehouses / Databases / Knowledgebases

External information sources (web pages, databases, etc.)

 

Implementation Effort:

              TBD

 

Integration:

              Project Shelley can easily integrate any knowledge source which can be accessed via HTTP and which exists in a format which can be parsed into text tokens for indexing by Lucene. Where text extraction is not possible, location through metadata is still possible (e.g., MP3 audio files, video, etc.).
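As a rough illustration of that integration idea, the Python sketch below turns a fetched HTML body into text tokens and files them in a tiny inverted index, falling back to metadata when there is no extractable text. It is a stand-in only: TinyIndex, extract_text and tokenize are hypothetical names, the index is nothing like Lucene's real analysis and storage, and the HTTP fetch itself is assumed to have happened already.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, ignoring the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_text(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Lower-case alphanumeric tokens; a crude substitute for an analyzer."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TinyIndex:
    """Maps each token to the set of URLs whose text or metadata contain it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, url, body="", metadata=None):
        tokens = tokenize(extract_text(body))
        # Items with no extractable text can still be found via metadata.
        for value in (metadata or {}).values():
            tokens.extend(tokenize(str(value)))
        for token in tokens:
            self.postings[token].add(url)

    def search(self, query):
        terms = tokenize(query)
        if not terms:
            return set()
        matches = [self.postings.get(term, set()) for term in terms]
        return set.intersection(*matches)
```

An audio file, for example, would be added with an empty body and a metadata dictionary such as a title and artist, and would then be located by those words alone.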

 

 

Integration Mechanism:

              RSS feeds, HTTP, OpenSearch

 

Integration Status:

              TBD

 

Syndicated 2010-06-12 17:02:14 (Updated 2010-06-12 17:03:15) from openqabal

12 Jun 2010 (updated 12 Jun 2010 at 04:09 UTC) »

So, what's a "Capability Case" and why should I care?

One of the things that we're doing with Project Shelley (and all of the OpenQabal projects, really) is expressing the initial requirements in terms of capability cases. That link explains capability cases in more detail, but the gist of it is this: A capability case is a business problem, linked to a set of technological capabilities, through a scenario. A capability case could be considered somewhat similar to a use case, but capability cases are more specifically about linking the scenario to a business problem and envisioning a solution, expressed as required capabilities.

To illustrate the point, and to get the ball rolling with describing Project Shelley in terms of capability cases, here's our first Project Shelley Capability Case. (pdf), (odt).


Name:

Collaborative Filtering for Information Retrieval

 

 

Intent:

              Use voting/ranking by individual users to tap into the “wisdom of crowds” effect to filter / select the most relevant information in a given context.

 

Description:

Knowledge workers - and especially executives - face nonstop demands on their time, and have to make key decisions in ever decreasing time spans, in order to adapt to the rapidly changing business environment. Balanced against the need for rapid decision making is the need to consider and evaluate as much available information as possible before making a decision. The information needed to make correct decisions often exists, either within your enterprise - often locked away in the collective, accumulated wisdom of every member of the organization - or somewhere outside your enterprise. In either case, it can be nearly impossible to solicit the correct information before risking a strategic mistake. In any medium to large organization, it simply is not possible to review every document and poll every employee, customer, partner and vendor before executing a decision. Even if time were available to do this, a small nugget of essential information could easily be lost in the sea of noise. However, technological tools make it possible to rapidly filter, rank and correlate various sources of knowledge, helping to ensure that what is important makes it to the people who need it, regardless of its origin.

 

A group of individuals can often be “smarter” than any one member of the group. By aggregating the wisdom of individuals via voting / ranking / correlation using collective intelligence it is possible to tap into the wisdom of crowds effect within your organization. Collective intelligence ensures that relevant information is seen by those who need to see it, even if it “bubbles up” from an otherwise obscure source.

 

Solution Story:

 

              At MegaCorp, the worldwide leader in enterprise software with their flagship Flozzit product, sales were down and managers were scrambling to increase revenues. A group of managers decided that the solution was to create a new, feature-enhanced Flozzit 3.0, which would add missing features and solve long-standing issues that were resulting in lost sales.

 

              Lacking sufficient resources to add every desirable feature, it was critical that the Flozzit Product Manager identify the features which would most directly impact sales. So the Product Manager began scouring enhancement requests in the bug database, and scanning old emails from account managers, field reps and engineers. After a few weeks' work and several meetings, the PM thought she had a pretty good handle on which features should go into Flozzit 3.0.

 

Before committing resources to the new roadmap however, she decides to peruse the “Flozzit” channel on the PS portal, and look for items tagged “complaint” or “enhancement.”  At the top of the list is a report written by a customer support representative (who the PM had never met, or even heard of; she wasn’t even sure if he was still with the company) titled “Why BigCorp hates Flozzit.”  Intrigued, she examines the filtering metadata and sees that nearly every CSR in the company has upvoted the report, as well as one or two of the engineers.   She downloads the report and digs in, to find a detailed summary of the top issues that end-users at BigCorp (the largest customer of Flozzit!) had complained about when talking to the CSRs.  The language was detailed and some of it was not kind to Flozzit.  After reading the report, the PM arranges to meet with the CSR who wrote the report, and identifies 5 top issues which had never been discussed in the many meetings held to identify the new Flozzit 3.0 roadmap.  She then calls her top contact at BigCorp to discuss the issues and the first thing he says about issue #1 is “Yes, our users have been very concerned about that.  We noticed that HyperCorp is releasing that feature in their 4.0 product and might consider switching if MegaCorp doesn’t answer soon.”

 

Armed with this new information, the PM polls a sample of other Flozzit customers about the 5 issues identified and finds that 3 of them are so important that they must go into Flozzit 3.0.

 

Six months later Flozzit 3.0 ships with the 3 new features and a slew of bug fixes.  BigCorp immediately commits to an upgrade, and are so happy with the new version that they purchase another 50 licenses a few months later.

 

Vintage:

Mature Commercialization             

 

Challenges:

              Corporate culture which stifles dissent. 

Lack of incentives for participation.

Lack of belief in the utility of the system.

Unsupported document formats, databases with proprietary formats which are difficult to integrate.

 

Forces:

              TBD

 

Business Results:

              Better identification of actionable news and information which might otherwise remain lost in the sea of information inside the enterprise, leading to better decision making at both the strategic and tactical levels.

 

Capabilities:

              Share links to web-sites, documents and other items of interest

              Categorize links by topics using channels

              Tag links with specific keywords

              Rank items by voting them “up” or “down”

              Search and filter by topic, keyword, and/or score

              Sort view by various statistical measures, such as “all-time score”, “hotness”, and “controversiality.”
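The three sort measures named above are not defined in this capability case, so the Python sketch below just picks one plausible formula for each. These are assumptions for illustration, not what Project Shelley actually computes: "all-time score" is upvotes minus downvotes, "hotness" is that score decayed with a made-up 12 hour half-life, and "controversiality" rewards items with many votes that are evenly split.

```python
def all_time_score(ups, downs):
    """Net votes over the whole life of the item."""
    return ups - downs


def hotness(ups, downs, age_hours, half_life_hours=12.0):
    """Net score that halves every half_life_hours (assumed decay model)."""
    return all_time_score(ups, downs) * 0.5 ** (age_hours / half_life_hours)


def controversiality(ups, downs):
    """High when there are many votes and they are close to evenly split."""
    if ups == 0 or downs == 0:
        return 0.0
    return (ups + downs) * min(ups, downs) / max(ups, downs)


def rank(items, measure):
    """Sort item dicts (keys: ups, downs, age_hours) by a chosen measure."""
    return sorted(items, key=measure, reverse=True)


items = [
    {"title": "old favorite", "ups": 100, "downs": 10, "age_hours": 72.0},
    {"title": "fresh", "ups": 12, "downs": 1, "age_hours": 1.0},
    {"title": "divisive", "ups": 30, "downs": 28, "age_hours": 24.0},
]

by_score = rank(items, lambda i: all_time_score(i["ups"], i["downs"]))
by_heat = rank(items, lambda i: hotness(i["ups"], i["downs"], i["age_hours"]))
by_fuss = rank(items, lambda i: controversiality(i["ups"], i["downs"]))
```

The same three items come out in a different order under each measure, which is why offering several views of a channel is useful.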

 

 

 

Typical Use Scenarios and Guidance:

              Knowledge workers view channels of topical concern to their jobs, or of general interest, on a regular basis, voting and tagging existing items, commenting on existing items, and submitting new items, creating a view of what’s important - and adding to the corporate memory - using collective intelligence.

              Knowledge workers discover relevant information through casual browsing; and through directed searching by tag, channel, submitter, score, or other attribute, when specific topics are under review. By limiting casual browsing to the most highly ranked items, an employee can maintain a “finger on the pulse” of what is considered important at a point in time, without reviewing every item. But directed search makes all of the other items accessible when they are relevant to a topical query.
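The two modes of discovery described here can be sketched in a few lines of Python. The Item fields and the function names top_items and directed_search are assumptions made for the example, not Project Shelley's data model: casual browsing looks only at the highest-ranked items, while directed search filters the full set by whichever attributes are supplied.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    title: str
    channel: str
    submitter: str
    score: int
    tags: frozenset = field(default_factory=frozenset)


def top_items(items, limit=10):
    """Casual browsing: only the most highly ranked items."""
    return sorted(items, key=lambda i: i.score, reverse=True)[:limit]


def directed_search(items, tag=None, channel=None, submitter=None, min_score=None):
    """Directed search: every item matching all of the given attributes."""

    def matches(item):
        return ((tag is None or tag in item.tags)
                and (channel is None or item.channel == channel)
                and (submitter is None or item.submitter == submitter)
                and (min_score is None or item.score >= min_score))

    return [item for item in items if matches(item)]
```

A low-scoring item never shows up in top_items, but directed_search(items, tag="complaint") still finds it when that topic is under review.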

 

Applicable Technologies:

              Fogbeam Labs “Project Shelley”

Other corporate knowledge repositories (blog servers, forums software, document management systems, HR management systems, etc.)

Existing Data Warehouses / Databases / Knowledgebases

External information sources (web pages, databases, etc.)

 

Implementation Effort:

              TBD

 

Integration:

              Project Shelley can easily integrate any knowledge source which can be accessed via HTTP and which exists in a format which can be parsed into text tokens for indexing by Lucene. Where text extraction is not possible, location through metadata is still possible (e.g., MP3 audio files, video, etc.).

 

 

Integration Mechanism:

              RSS feeds, HTTP, OpenSearch
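As an example of the RSS mechanism, the Python sketch below pulls the title and link out of each item in an RSS 2.0 document so they could be submitted to a channel as links. The function name parse_rss_items and the dictionary shape are assumptions for illustration, and fetching the feed over HTTP is assumed to have been done elsewhere; only the standard-library XML parser is used.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(xml_text):
    """Return a list of {'title', 'link'} dicts, one per RSS item with a link."""
    root = ET.fromstring(xml_text)
    submissions = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        if link:  # an item without a link cannot be submitted as a link
            submissions.append({"title": title, "link": link})
    return submissions
```

Each returned dictionary could then be handed to whatever code creates a new entry in the chosen channel.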

 

 

Integration Status:

              TBD

 

Syndicated 2010-06-12 02:03:44 (Updated 2010-06-12 03:40:08) from openqabal

11 Jun 2010 (updated 11 Jun 2010 at 03:08 UTC) »

OpenQabal, Project Shelley, Project Poe, Project Collins, WTF?

If you were paying attention to the previous post and its predecessor you should kinda have an idea of what Project Shelley is. At least, you may be thinking to yourself, "Ok, it's a Reddit clone, written in Groovy/Grails, and incomplete, what's the big deal?" And that's a perfectly fine attitude to have at the moment. The point of this post is to start clarifying where this is going, how Project Shelley relates to Project Poe (namesake) and Project Collins (namesake) and how it all fits into the OpenQabal vision.

Let's start by looking at what OpenQabal is meant to be. In the past, I had described OQ as:

...an open-source social-networking and collaboration platform / suite driven by the principles of federation, composition, and openness; with a special emphasis on enabling "distributed conversations" and the "federated social graph."

So, let's start by throwing that paragraph out, while keeping some of the essence of it. This is still - in part - about social-networking and collaboration, but "distributed conversations" (especially at "Internet scope") aren't really part of what I'm interested in tackling right now. Not because it's not interesting, or because it's a solved problem, but because I don't see as much value for it in the setting(s) that I'm focusing on. But, again, if members of the community see a place to do that kind of work within the OQ umbrella, then that would be great by me. Now, the "federated social graph" bit... I wouldn't say that's being thrown out, but rather placed on the "back burner."

So, that leaves us with, what, "an open-source social-networking and collaboration platform / suite driven by the principles of federation, composition, and openness?" Well, yes. But to make the scope of intended application(s) clearer, and the way things will decompose, I'd re-word that now as something like:

"OpenQabal is an open-source suite of APIs and applications - driven by the principles of federation, composition, open protocols, and open standards - for building and enabling intelligent enterprise applications for collaboration, social-networking, knowledge management and discovery, organizational learning, Information Retrieval and decision support."

No, that definition isn't perfect, and it'll morph over time. But I think that gets closer to the heart of things. "Intelligent enterprise applications" is really what interests me right now. And what I mean by that is, using technologies like collaborative filtering, tagging, social-graph mining, explicit semantics, data mining, machine learning, etc. to build (or integrate) enterprise applications in a way that makes them "smarter" or better able to help humans find the information they need, when they need it (even if they don't know they need it yet!) I've said before that I don't really like the label Enterprise 2.0, but for lack of a better term, you could say that that's what this is. Except we should probably call it Enterprise 3.0 just to one-up the competition, eh?

With that said, what OpenQabal becomes is sort of an "umbrella" or over-arching structure for sub-projects that handle different parts of this vision. Think of how Glassfish has become an "umbrella" project for a series of related projects: OpenMQ, OpenESB, SailFin, Portal Server, etc.

With OQ as the overall structure, sub-projects will provide various APIs and/or applications / subsystems that are part of this overall "intelligent application" vision. A number of the pieces we talked about in the old OpenQabal model will live on now, pretty much as they always would have. I still see the need for an API / system for managing tags, something for doing recommendations, something for managing a social graph and allowing queries against it, etc. Of course, as before, in a lot of areas open-source code already exists to do these things. In that case, we may (within the letter and spirit of the respective licenses) just "borrow" existing code, possibly wrapping or modifying it to fit the model of what we're doing here. Other bits will have to be written entirely from scratch, and that's OK.

So, what about these sub-projects? Well, here's what I think I know about them so far:

  • Project Shelley - An interface that builds on the APIs for tagging, ranking and recommending items, to provide a platform for sharing and discovering useful links, documents, people, etc. What we have right now is pretty simplistic, but there's a LOT of room for growth in this. Expect another post soon just dealing specifically with "what's coming" for Project Shelley.
  • Project Poe - a sort of "mini Facebook" like social-networking interface. Builds on the APIs for social-graph management, activity-stream, activity profiling, tagging, etc. Provides the front-end for managing connections and for letting users provide information about themselves, their interests, etc. But unlike Facebook, no silly Pirates vs. Ninjas or Farmville stuff.
  • Project Collins - This one we haven't talked about before. It may not come into existence as a standalone project, but the general vision is a search engine that leverages all of the various bits of information from Project Shelley, Project Poe, and "TBD" to provide better / deeper / more insightful search results than you can get from simple document content analysis. Intranet search in the Enterprise usually sucks, largely because page-rank type algorithms don't work well due to the lack of links between documents. But by supplementing the content analysis with scoring based on tags, social graph connections, activity-stream information, etc., it should be possible to do a much better job. This is not, by the way, a totally novel idea. It's something people are referring to as Social Search.
  • GraphEngine - as before, an API for storing and managing the "social graph stuff." Doing this on a large scale is still a problem, and I'm intrigued by the idea of using an incremental evaluation system approach for this, but haven't done much on this yet.
  • ProfileEngine - name says it all, really.
  • RecommenderEngine
  • ActivityEngine
  • TagEngine
  • Etc.
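
To make the Project Collins idea a little more concrete, here is a minimal Python sketch of blending a content-relevance score with tag and social-graph signals. Everything in it is an assumption made for illustration: the Doc class, the social_search_score function, the weights, and the idea of treating "the author is one of my contacts" as the social signal are all hypothetical, and none of it is taken from actual OpenQabal code.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical search hit; content_score stands in for a full-text score in [0, 1]."""
    doc_id: str
    content_score: float
    tags: set = field(default_factory=set)
    author: str = ""


def social_search_score(doc, query_tags, searcher_contacts,
                        w_content=0.6, w_tags=0.25, w_social=0.15):
    """Blend content relevance with tag overlap and a crude social-graph signal.

    The weights are arbitrary placeholders, not tuned values.
    """
    # Fraction of the query's tags that also appear on the document.
    tag_overlap = len(doc.tags & query_tags) / len(query_tags) if query_tags else 0.0
    # 1.0 if the document's author is directly connected to the searcher.
    social = 1.0 if doc.author in searcher_contacts else 0.0
    return w_content * doc.content_score + w_tags * tag_overlap + w_social * social


def rank(docs, query_tags, searcher_contacts):
    """Return docs ordered from highest to lowest blended score."""
    return sorted(
        docs,
        key=lambda d: social_search_score(d, query_tags, searcher_contacts),
        reverse=True,
    )
```

The point of the sketch is only that a document with a slightly weaker content match can still outrank another one when it carries the right tags and comes from someone in the searcher's social graph.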

Of course, since this is all intended for an enterprise setting, a big focus will be on integrating with other systems (see the point about "open protocols" and "open standards" above). Notice that the description for Project Shelley mentions discovering documents and people, despite the screen shots only showing stuff about links so far. That's where integration comes into play... part of the vision is to integrate with, for example, a document management system like Alfresco, a forum system like JForum, a groupware / calendar system, and/or an HRM system like OrangeHRM (or Project Poe). That, of course, also plays into the vision for Project Collins, which is all about searching across all the different domains, and using the knowledge aggregated across all of them, to enhance our search capability.

And we haven't even had time to talk about the event correlation stuff, personalization (filters, attenuators and amplifiers) or social-network-analysis, or prediction markets. Whew.

Anyway, that's the quick and dirty on how some of this fits together. As far as status: Project Shelley already exists and has a lot of features implemented, as seen from the screen-shots. There is already Project Poe code as well, but it's much less feature complete. Some of the backend API code already exists, but it needs to be refactored, munged, moved around, etc.

Also, one last note... Nothing about this new direction or "roadmap" precludes the possibility of doing a "distribution" as we talked about before. If there's demand/need, the idea of bundling one or more, or all, of the OQ "pieces" with another "piece" like Roller or JavaBB or Alfresco, etc., could certainly be doable. It might even make sense to do some pre-configured bundles that make everything work together seamlessly. But my focus at the moment is really on the new bits. And more documentation (in the form of blog posts, at least) if you hadn't noticed. Don't forget to follow OpenQabal on Twitter.

Syndicated 2010-06-11 01:34:58 (Updated 2010-06-11 02:58:16) from openqabal

10 Jun 2010 (updated 11 Jun 2010 at 02:07 UTC) »

More on what Project Shelley does

In case last night's post didn't sate your appetite for Project Shelley information, here's another morsel. These screen shots illustrate the various ranking views that are available and demonstrate the RSS feed support.

This is the "hot" view. "Hotness" is basically a metric that combines age and activity... if something is both new and has votes (or comments), it's hotter than if it's older and has fewer votes/comments. In other words, hotness decays with age (on a logarithmic basis) from the moment an entry is posted, and hotness rises in response to votes or comments.
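
For the curious, here is a rough Python sketch of what a metric with those two properties could look like. This is not the formula Shelley actually uses: the hotness function, the choice to simply add votes and comments together, and the use of a base-10 logarithm of the age in hours are all assumptions made purely for illustration.

```python
import math


def hotness(score, comments, age_hours):
    """Illustrative hotness: activity pushes it up, age pulls it down logarithmically.

    score     -- net votes on the entry (upvotes minus downvotes)
    comments  -- number of comments on the entry
    age_hours -- hours since the entry was posted
    """
    activity = score + comments
    # log10(1 + age) is 0 for a brand-new entry and grows slowly afterwards,
    # so the penalty for age keeps increasing but at an ever slower rate.
    decay = math.log10(1.0 + max(age_hours, 0.0))
    return activity - decay
```

With a shape like this, a single vote is worth roughly "one order of magnitude" of age, which is why one vote can matter so much on a quiet test system.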


Here we see an older entry has been voted up, and with the limited activity on the test system, that 1 vote was enough to move it into the top part of the Hot view.

This is the "Top" view, which simply shows the entries with the highest score, over all time. Since this is a test system, there aren't a lot of votes, and nothing has a score higher than 1, but you can see how all the 1's are at the top, above all the 0's. An entry starts with a score of 0, and upvotes add 1 to the score, while downvotes subtract 1.
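
The scoring rule is simple enough to show in a few lines of Python. This is just a sketch of the arithmetic described above; the function names and the idea of storing votes as a list of +1 / -1 values are assumptions for illustration, not how Shelley stores anything.

```python
def entry_score(votes):
    """Score starts at 0; each upvote (+1) adds 1 and each downvote (-1) subtracts 1."""
    return sum(votes)


def top_view(votes_by_entry):
    """Return entry titles ordered by score, highest first.

    sorted() is stable, so entries with equal scores keep their original order.
    """
    return sorted(
        votes_by_entry,
        key=lambda title: entry_score(votes_by_entry[title]),
        reverse=True,
    )
```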

Viewing one of the RSS feeds in RSSOwl:

More looking at RSS feeds.

Syndicated 2010-06-10 19:49:39 (Updated 2010-06-11 01:59:55) from openqabal

10 Jun 2010 (updated 10 Jun 2010 at 06:07 UTC) »

Introducing Project Shelley (with lots of screenshots!)

So, if you follow the OpenQabal Twitter stream or my personal Twitter stream, you've heard some chatter about something called "Project Shelley." You may assume it has something to do with my recent post about restarting OpenQabal development, and you would be correct.

So, what is this "Project Shelley?" Well, for starters, Project Shelley is just a temporary code-name. The project will get a better name later (maybe we'll have a contest or something) but it works for now. Someday I'll explain how the name came about... for now I'll just mention that it's a nod to our friend Mary Shelley, the author of Frankenstein.

Now with that out of the way, let's get down to the nitty-gritty. Project Shelley is the first bit to come out of rethinking the direction of OpenQabal. In the past, I was more focused on the broader social-networking aspect, thinking about decentralized, federated social-networks. But my personal interest was always more in enterprise applications of this kind of technology, and NOT in trying to build a "Facebook killer" à la Diaspora or whoever.

So, while working for a semi-well-known self-publishing company in the Raleigh area back in 2008, I started playing with the Open Reddit code, looking at how that type of technology could be a complement to some other things "behind the firewall" vis-a-vis knowledge management. Some people showed interest, but no real champion ever stepped up to push its use. Then the economic collapse happened and side-projects became less emphasized, and I left the company in 2009. But that experience planted a seed, and so the first new OpenQabal sub-project is - essentially - a very Reddit (or Digg if you prefer) like application that uses voting, tagging, sharing, filtering, etc. of articles and documents.

In its present form it looks a lot like "just a Reddit clone," but the intention is to move beyond that, and I'll talk more about the more advanced features later. (To be fair, Shelley already has things that Reddit doesn't, but it also lacks things that Reddit does have.) Tackling something that starts off as "Reddit-like" gave me a chance to get started with a well-known problem domain, and a chance to get something tangible out the door to start poking and prodding and playing with.

The current version has a lot of functionality, but will need a fair amount of "cleanup" work to be anywhere near production ready. The intent here really was to blast through as much as possible in a short period of time just to make this project feel real again.

So... with no further ado, here are screenshots and details about what this stuff does.


The front page, which is the default view of submitted links for the "default" channel. ("Channels" in Shelley lingo are like "sub reddits" in Reddit lingo)

The Login page, awaiting login:

The front page after login. Notice the new tab in the upper right hand corner. That takes a user to their user profile "stuff."

Sharing a link. It's kinda hard to make out, but the first entry in that field is a plain email address (it's mine, don't spam me, ok?) and the second is the same address but prefixed with "xmpp:" Yes, we support XMPP messaging.
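
Conceptually, the only thing that distinguishes the two kinds of recipients is that "xmpp:" prefix. Here's a small Python sketch of how such routing could work; the classify_recipient function and the example addresses are hypothetical, and the real Shelley code may well do this differently.

```python
XMPP_PREFIX = "xmpp:"


def classify_recipient(address):
    """Return (transport, address) for one recipient of a shared link.

    Addresses prefixed with "xmpp:" go out as XMPP messages (with the prefix
    stripped); anything else is treated as a plain email address.
    """
    address = address.strip()
    if address.lower().startswith(XMPP_PREFIX):
        return ("xmpp", address[len(XMPP_PREFIX):])
    return ("email", address)


def route_recipients(addresses):
    """Group recipient addresses by the transport used to reach them."""
    routed = {"email": [], "xmpp": []}
    for raw in addresses:
        transport, address = classify_recipient(raw)
        routed[transport].append(address)
    return routed
```

So sharing with both "someone@example.com" and "xmpp:someone@example.com" would produce one email and one XMPP message to the same address.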

As you can see here, I did receive an XMPP message about the shared link.

And now we're looking at tagging support. It's primitive right now, but this page shows a user a list of all the tags they've used:

And clicking on said tag displays a list of the posts you've tagged with that tag. What hasn't been done yet is any work on dealing with tags on a "global" basis (e.g., can I see links that somebody else tagged with a given tag?)
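
To show the difference between the per-user lookup that exists today and the "global" lookup that doesn't, here's a minimal Python sketch of a tag index that supports both. The TagIndex class and its method names are invented for this example, and lower-casing tags is an assumption; none of this reflects how the real tagging code is organized.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory index from tags to entries, per user and across all users."""

    def __init__(self):
        self._by_user = defaultdict(lambda: defaultdict(set))  # user -> tag -> entry ids
        self._by_tag = defaultdict(set)                        # tag -> entry ids (global)

    @staticmethod
    def _normalize(tag):
        # Assumption: tags are case-insensitive and ignore surrounding whitespace.
        return tag.strip().lower()

    def tag(self, user, entry_id, tag):
        tag = self._normalize(tag)
        self._by_user[user][tag].add(entry_id)
        self._by_tag[tag].add(entry_id)

    def tags_for_user(self, user):
        """All tags this user has ever applied (the "my tags" page)."""
        return sorted(self._by_user.get(user, {}))

    def entries_for_user_tag(self, user, tag):
        """Entries this user tagged with the given tag."""
        return sorted(self._by_user.get(user, {}).get(self._normalize(tag), set()))

    def entries_for_tag(self, tag):
        """Entries anyone tagged with the given tag (the "global" view)."""
        return sorted(self._by_tag.get(self._normalize(tag), set()))
```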

Notice here that the Search dialog has been filled in with the string "information retrieval".

And here we get our search results:

Click the "comments" link for a given entry, and you come to this page, which lets a user view and add comments to an entry, and also shows other similar links. Right now the "recommended links" stuff is built just using the MoreLikeThis class from Lucene Contrib, but this is one of the areas that's going to get some interesting work in the future. In particular, I want to supplement this by using knowledge of tags and social-graph connections, to (hopefully) get better results than from just a strict content similarity score.
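
If you haven't run into content-similarity scoring before, the following Python sketch shows the general flavor. To be clear, this is NOT how Lucene's MoreLikeThis works internally and it doesn't use Lucene at all; it is a stand-in technique (cosine similarity over raw term counts) chosen because it fits in a few lines. The function names and sample titles are made up for illustration.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lower-case the text and count its alphanumeric terms."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_entries(target_text, other_texts, limit=3):
    """Return up to `limit` texts that share terms with the target, most similar first."""
    target = term_counts(target_text)
    scored = [(cosine_similarity(target, term_counts(t)), t) for t in other_texts]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:limit]]
```

A score like this only knows about words, which is exactly the limitation that tags and social-graph connections are meant to help with.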

About to enter a comment:

With a new comment added:

The "User Profile" page. It's pretty spartan right now:

The "Saved entries" page. Things go here if you click the "save" button under an entry.

Adding a tag to an entry. The input field is hidden until you click the "tag" link under an entry.

Same thing, after putting some tag text in.

So there you go... a quick look at what Project Shelley is, currently. Coming later, more on what it will become in the future, and some info on the even more mysterious "Project Poe."

Syndicated 2010-06-10 04:46:32 (Updated 2010-06-10 05:17:08) from openqabal
