RSS Disposition Hinting Proposal

Posted 21 Aug 2005 at 19:34 UTC by tnt

Really Simple Syndication (RSS) is a popular HTML-like XML-based data format used for syndication. RSS has a sordid history and a multitude of different, incompatible versions (some based on RDF, but most based only on XML). Regardless, RSS is an extremely popular format for syndicating news, blog posts, IPradio, and IPTV, with an amazing amount of momentum.

The RSS MIME type has previously been agreed on[1] and is:

application/rss+xml
(following the conventions laid out by RFC 3023). This was done for many reasons. The reason relevant to this article is that it allows web browsers to recognize a document as an RSS feed and handle it properly without having to probe (or understand) the file (as would be the case if the "application/xml", "text/xml", or "text/plain" MIME types were used). For example, on receiving a file with the "application/rss+xml" MIME type, a web browser could "hand off" handling of the file to an external program (such as an RSS aggregator).

In the past RSS was dominated by the syndication of "text" media; the syndication of news websites, the syndication of blogs, the syndication of weather reports, the syndication of stock prices, the syndication of sports scores, etc. Today, this is no longer the case.

Today, there is a varied landscape of media types syndicated by RSS. IPTV uses RSS to syndicate "video" media. IPradio (also known as podcasting) uses RSS to syndicate "audio" media. Software update systems use RSS to syndicate "software" media.

Handling all RSS feeds with a single RSS aggregator is no longer desirable. IPTV RSS feeds may be handled by one piece of software. IPradio RSS feeds may be handled by another piece of software. And news and blog RSS feeds may be handled by yet another piece of software.

Today branding an RSS feed with the "application/rss+xml" MIME type is no longer sufficient to allow a web browser to handle an RSS feed properly without having to probe (and understand) the file. More metadata is needed. Metadata that hints at what is being syndicated.

The important point to keep in mind is that it is desirable to allow software (like web browsers) to know how to "handle" an RSS feed without having to probe (or even understand) the RSS feed. And although there are many ways of achieving this goal, this article only proposes one such method.

I propose that we keep the "application/rss+xml" MIME type as the only MIME type of RSS feeds, but create a community agreed upon tweak that is perfectly legal under RFC 2045 and RFC 2046. I propose that we use a "Content-Type" parameter to tell us what is being syndicated.

For example, "plain text" documents have the MIME type

text/plain
Sometimes, however, there are parameters appended to this MIME type. For example:
text/plain; charset=us-ascii

For RSS, we could also make use of a "Content-Type" parameter. For example, we could have:

application/rss+xml; disposition-type=text

application/rss+xml; disposition-type=sound

application/rss+xml; disposition-type=moving-image

application/rss+xml; disposition-type=x-windows-update
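
As a rough illustration of how a consumer might read such a header, here is a minimal Python sketch that uses the standard library's email.message.Message to separate the media type from its parameters. The function name read_disposition is hypothetical, and "disposition-type" is only the parameter name proposed in this article, not a registered one.

```python
from email.message import Message


def read_disposition(content_type):
    """Return (media_type, disposition) for a Content-Type header value.

    disposition is None when the proposed "disposition-type" parameter
    is absent.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    media_type = msg.get_content_type()              # lowercased type/subtype
    disposition = msg.get_param("disposition-type")  # None if missing
    return media_type, disposition


print(read_disposition("application/rss+xml; disposition-type=sound"))
print(read_disposition("application/rss+xml"))
```

A feed served without the parameter still parses; the caller simply gets None and can fall back to its existing behaviour.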

Suppose the RSS community accepts the use of a "Content-Type" parameter for hinting at the "disposition type" of an RSS feed (i.e., accepts that we use a "Content-Type" parameter to tell us what is being syndicated). Then there are certain things that need to be agreed upon:

  1. What the name of the parameter is.
  2. What standard values are allowed for the parameter.
  3. How we allow for non-standard parameter values.

For #1 I propose "disposition-type" be used (as was illustrated in the example). For #2 I propose that the initial standard set be pulled from the Dublin Core Metadata Initiative's Type Vocabulary; namely: text, still-image, sound, software, service, physical-object, moving-image, interactive-resource, image, event, dataset, and collection. For #3 I propose we use the "x-" prefix.
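
The three points above could be checked mechanically. The following Python sketch is one possible reading of them, not part of the proposal: the names STANDARD_TYPES and classify_disposition are hypothetical, the value list is copied from the paragraph above, and the "token" pattern follows the RFC 2045 rule that unquoted parameter values may not contain spaces or tspecials.

```python
import re

# Standard values proposed in this article (from the Dublin Core Type
# Vocabulary, spelled as in the text above).
STANDARD_TYPES = frozenset({
    "text", "still-image", "sound", "software", "service",
    "physical-object", "moving-image", "interactive-resource",
    "image", "event", "dataset", "collection",
})

# RFC 2045 "token": printable US-ASCII except space and tspecials.
TOKEN = re.compile(r"[!#$%&'*+\-.^_`{|}~0-9A-Za-z]+")


def classify_disposition(value):
    """Classify a "disposition-type" value.

    Returns "standard", "non-standard" (x- prefixed), "unknown"
    (well-formed but neither), or "invalid" (not an RFC 2045 token).
    """
    if not TOKEN.fullmatch(value):
        return "invalid"
    if value in STANDARD_TYPES:
        return "standard"
    if value.startswith("x-") and len(value) > 2:
        return "non-standard"
    return "unknown"


for candidate in ["sound", "x-windows-update", "podcast", "moving image"]:
    print(candidate, "->", classify_disposition(candidate))
```

Whether values should be compared case-sensitively is left open by the proposal; this sketch compares them exactly as given.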

(Given RSS' history of having drafts and proposals being implemented, I feel compelled to explicitly say that this is not a standard in any way and should not be implemented. It is a proposal meant to prompt discussion that may lead to a community accepted standard.)


"disposition-type" in practice, posted 21 Aug 2005 at 20:22 UTC by tnt » (Master)

I just ran a test. My personal weblog's RSS feed at:

http://changelog.ca/feed/rss
previously returned the MIME type:
application/rss+xml
but now returns:
application/rss+xml; disposition-type=text
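
The post does not say how the server was configured, so the following is only a sketch of one way a feed could be served with the extra parameter: a minimal Python WSGI application. The name feed_app and the placeholder feed body are assumptions made for illustration.

```python
def feed_app(environ, start_response):
    """Minimal WSGI app serving a placeholder RSS feed."""
    body = (
        b'<?xml version="1.0"?>\n'
        b'<rss version="2.0"><channel><title>Example</title>'
        b'<link>http://example.org/</link>'
        b'<description>Placeholder feed</description></channel></rss>\n'
    )
    headers = [
        # The proposed parameter rides along on the usual media type.
        ("Content-Type", "application/rss+xml; disposition-type=text"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

For a local trial, the app can be passed to wsgiref.simple_server.make_server from the standard library (for example on port 8000) and the response headers inspected with any HTTP client.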

I ran it through Feed Validator and it went through fine. (Feed Validator checks the MIME type of the RSS feed.)

(Obviously, to see the impact of this proposal it would be useful to test this on many of the popular web-based and desktop RSS aggregators as well, to see whether they handle MIME types properly, i.e., whether they ignore "Content-Type" parameters they don't understand. But it is nice to see that it goes through Feed Validator just fine.)

Comment 1, posted 23 Aug 2005 at 16:01 UTC by robocoder » (Journeyer)

On the surface, the proposal appears to limit an RSS feed's content to a single content type (i.e., to preclude an RSS feed that contains multiple content types). One possibility would be to extend the proposal to handle this. (If a list, then order might be significant.)

On the other hand, perhaps disposition-type (by content type) is too granular. Have you instead considered categorizing the RSS feed (i.e., news headlines, music playlists, search results, etc)?

Suggestion on , posted 23 Aug 2005 at 17:40 UTC by icepick » (Journeyer)

application/rss+xml; disposition-type=sound
application/rss+xml; disposition-type=moving-image 

Rather than "sound" and "moving-image" why not use "audio" and "video", the same names given in the MIME Media Types?

Oops..., posted 23 Aug 2005 at 17:44 UTC by icepick » (Journeyer)

Now I actually read the article rather than skimming. Since you are further defining a MIME type why not keep to the MIME vocab rather than pull in the Dublin Core Metadata Initiative's vocab?

This is a great idea BTW.

Reply-To: robocoder; Re: Comment 1, posted 23 Aug 2005 at 20:45 UTC by tnt » (Master)

robocoder: As I wrote this proposal I did consider the problem of RSS feeds that syndicate different media. (For example, a single RSS feed's <item>s could be characterized as a mix of IPTV, IPradio, and text media.) I started off trying to work them into this proposal, but later decided that was out of scope. In other words, this proposal does not try to handle those cases, and only tries to handle the simple case. (Either this proposal would need to be extended later to handle them, or another proposal would need to be made.) Off the top of my head, it could be something like:
application/rss+xml; disposition-type=mixed; disposition-list=moving-image.sound.sound.text.moving-image
(Note, I used periods -- "." -- as a delimiter for the list, and did not put any spaces in the "disposition-list". It looks ugly but RFC 2045 does not allow you to put any spaces or use a comma in a value [unless it is in a quoted string]. But that's a detail that could be modified to "look" better.)

Or something like that. One could even somehow encode the list (or bag or set or whatever group-type is appropriate) into the "disposition-type" value. (But, like I said I wasn't trying to deal with this situation in this proposal.)
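
To make the period-delimited idea concrete, here is a small Python sketch that reads such a list back out of a header written with the "=" parameter syntax of RFC 2045. Both the "disposition-list" parameter and the function name read_disposition_list are hypothetical; nothing here has been agreed on.

```python
from email.message import Message


def read_disposition_list(content_type):
    """Return the per-item dispositions from the hypothetical
    "disposition-list" parameter, or [] when it is absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    raw = msg.get_param("disposition-list")
    if not raw:
        return []
    # Periods separate the entries, as in the example above.
    return raw.split(".")


header = ("application/rss+xml; disposition-type=mixed; "
          "disposition-list=moving-image.sound.sound.text.moving-image")
print(read_disposition_list(header))
```

One consequence of this choice of delimiter is that an individual value could not itself contain a period, even though RFC 2045 tokens permit one.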

One of the main reasons I made this "out of scope" for this proposal is because I was trying to think of the actual use cases. (Which could end up being short sighted in the long run. But I also find it helpful in making something useful for today.) If a web browser got a Content-Type of "application/rss+xml; disposition-type=moving-image" then it would know to hand this RSS feed over to whatever application or web service handles your IPTV. Likewise, if a web browser got a Content-Type of "application/rss+xml; disposition-type=sound" then it would know to hand this RSS feed over to whatever application or web service handles your IPradio/podcasts. (And similarly for the other disposition-types.) But what would happen if you got a "mixed" disposition-type? What would the web browser hand the RSS feed over to? The main thing is that I don't know if I have enough insight into the "mixed" media situation to create a proposal that would be useful. (My idea was to wait for real use cases to come up for this situation -- for the mixed media situation -- before coming up with a proposal for it.)
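
The hand-off described here amounts to a lookup table keyed on the parameter value. The Python sketch below shows that idea under stated assumptions: the handler names are made up, the header uses the "=" parameter syntax, and falling back to a generic aggregator for "mixed" or unrecognized values is just one possible answer to the open question, not something the proposal specifies.

```python
from email.message import Message

# Hypothetical handler names; a real browser would map these to
# whatever applications or web services the user has configured.
HANDLERS = {
    "text": "news-aggregator",
    "sound": "podcast-client",
    "moving-image": "iptv-player",
}


def choose_handler(content_type, default="generic-rss-aggregator"):
    """Pick a handler from the Content-Type alone, without reading the feed."""
    msg = Message()
    msg["Content-Type"] = content_type
    disposition = msg.get_param("disposition-type")
    # Missing, unknown, or "mixed" values all fall back to the default.
    return HANDLERS.get(disposition, default)


print(choose_handler("application/rss+xml; disposition-type=moving-image"))
print(choose_handler("application/rss+xml; disposition-type=mixed"))
```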

Your categorization names -- news headlines, music playlists, search results -- are actually how I think of these things in my head. I chose to go with the Dublin Core names for a few different reasons. #1: It currently seems to be the preferred way of dealing with types. #2: Everyone else is using them; so there's infrastructure already in place to deal with this. #3: The people who created the Dublin Core standards are librarians and specialists who have a lot more insight into categorizing and real metadata usage than I (or a lot of Computer Scientists and Software Engineers) probably do.

I'm not opposed to using another categorization (other than Dublin Core's). (And not opposed to creating one from scratch.) But if that is done it is going to take some work. (Which is NOT a bad thing.) And it will require a lot of involvement from a lot of others in varying fields to find an initial set that will be useful to the greatest number of people.

And then maybe get things like:

application/rss+xml; disposition-type=iptv

application/rss+xml; disposition-type=ipradio

application/rss+xml; disposition-type=music-playlist

application/rss+xml; disposition-type=search-results

application/rss+xml; disposition-type=news-headlines

or something like that.

Reply-to: icepick; Re: Oops..., posted 23 Aug 2005 at 20:50 UTC by tnt » (Master)

icepick: The reason that I chose the Dublin Core terminology instead of the MIME terminology (laid out in RFC 2046) of "text", "image", "audio", "video", and "application" is that I feel that the Dublin Core classifications are better; and since the Dublin Core standards already have widespread use I went with that.

Although, if there is a lot of resistance on this issue, I would not be opposed to using the MIME terminology in RFC 2046.

Could be used for container files... like Ogg, BitTorrent, etc, posted 28 Aug 2005 at 19:33 UTC by tnt » (Master)

This type of thing could also be used for (pretty much) any type of container file. For example, Ogg files.

Ogg files can contain almost any kind of data and have a Content-Type of:

application/ogg

(Which doesn't tell you what is inside.)

They can contain audio Ogg Vorbis data or video Ogg Theora data. So the Content-Type for (audio) Ogg Vorbis files or streams could be:

application/ogg; disposition-type=sound

And the Content-Type for (video) Ogg Theora files or streams could be:

application/ogg; disposition-type=moving-image

(Given that we stick with the "sound" and "moving-image" labels and don't adopt some other nomenclature.)

You could even consider this for BitTorrent too. For example, torrents have a Content-Type of:

application/x-bittorrent
Which again tells you nothing about what is inside.

Say, for example, the only thing in the torrent was a single MPEG movie file. Then you could use the Content-Type of:

application/x-bittorrent; disposition-type=moving-image

(Again assuming that we stick with the "moving-image" label and don't adopt some other nomenclature.)
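
Since the parameter is independent of the media type it is attached to, the same helper could label any container format. A minimal Python sketch follows, assuming the "disposition-type" name from this proposal; with_disposition is a hypothetical function, and the check only enforces that the value is an unquoted RFC 2045 token.

```python
import re

# RFC 2045 "token" characters; values outside this set would need quoting.
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`{|}~0-9A-Za-z]+")


def with_disposition(media_type, disposition):
    """Append the proposed "disposition-type" parameter to a media type."""
    if not _TOKEN.fullmatch(disposition):
        raise ValueError(f"not a valid parameter token: {disposition!r}")
    return f"{media_type}; disposition-type={disposition}"


for media_type, disposition in [
    ("application/ogg", "sound"),
    ("application/ogg", "moving-image"),
    ("application/x-bittorrent", "moving-image"),
]:
    print(with_disposition(media_type, disposition))
```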

License for Proposal, posted 28 Aug 2005 at 19:40 UTC by tnt » (Master)

Just to make this explicit, because I've been reading "A Buyer's Guide to Standards" and have seen some people complain that some of the RSS extensions out there are NOT open standards.

And even though this is NOT a standard, and just a proposal... just to say it, this is licensed under the Creative Commons Attribution-ShareAlike 2.5 License.

Reply from Rogers Cadenhead, posted 28 Aug 2005 at 20:10 UTC by tnt » (Master)

I got an e-mail from Rogers Cadenhead. He wasn't able to respond to this article here, but instead replied to it on his blog. Here's the link to his reply:

http://www.cadenhead.org/workbench/news/2717

Re: Reply from Rogers Cadenhead, posted 28 Aug 2005 at 20:26 UTC by tnt » (Master)

Hello Rogers,

You said:

RSS 2.0 makes no recommendation, and there's disagreement over whether it's better to use text/xml instead of application/rss+xml so an RSS feed can be viewed in browsers.

Given this (and without trying to choose sides on this issue) this proposal could be modified to allow for:

text/xml; disposition-type=sound

application/rss+xml; disposition-type=sound

application/x-rss+xml; disposition-type=sound

Or whatever MIME type is used for RSS. (Given that this proposal sticks with the "sound" label and doesn't adopt some other nomenclature.) The important part to take from this is the disposition-type Content-Type parameter.

The proposal could (eventually) be worded in a way that is indifferent to what MIME type is used for RSS.

Re: Reply from Rogers Cadenhead; In Regards to "RDF MIME type", posted 28 Aug 2005 at 20:41 UTC by tnt » (Master)

Hello Rogers,

You said:

RSS 1.0 has a recommended MIME type of application/xml, but it may change to application/rdf+xml in the future.

I don't know of any recommendation for this. But following what I believe to be the spirit of RFC 3023, I'd rather see the following as the MIME type for RDF-based versions of RSS:

application/rss+rdf+xml

As far as I can tell, this is totally compliant with RFC 3023. And conveys much more information. In fact, I'd like to see all RDF-based documents adopt this style of MIME type. For example, a FOAF document would be:

application/x-foaf+rdf+xml

while it is not registered. And then:

application/foaf+rdf+xml

after it is registered.

However, as I said, I do not know of any RFC or any other document that makes this kind of recommendation. (But I think there should be one.) And I don't know how Mozilla, IE, Opera, etc. will handle them. Although, given that they implement RFC 3023, these MUST be recognized as XML MIME types. However, I haven't tested whether they implemented RFC 3023 correctly or not.
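
For reference, the "+xml" suffix convention from RFC 3023 can be expressed as a very small check. The Python sketch below is an approximation of that rule and says nothing about what any particular browser actually does; is_xml_media_type is a hypothetical name.

```python
def is_xml_media_type(media_type):
    """Rough RFC 3023-style check: is this media type XML-based?"""
    # Drop any parameters and normalise case before comparing.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in ("text/xml", "application/xml"):
        return True
    # RFC 3023 convention: subtypes ending in "+xml" are XML-based.
    return essence.endswith("+xml")


for candidate in [
    "application/rss+rdf+xml",
    "application/x-foaf+rdf+xml",
    "application/rss+xml; disposition-type=sound",
    "application/ogg",
]:
    print(candidate, "->", is_xml_media_type(candidate))
```

Under this check the multi-suffix names above are treated as XML simply because they end in "+xml"; the intermediate "+rdf" carries no special meaning to it.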
