RSS Disposition Hinting Proposal
Posted 21 Aug 2005 at 19:34 UTC by tnt
RSS (Really Simple Syndication) is a popular HTML-like XML-based data format used for syndication. RSS has a sordid history and a multitude of different incompatible RSS versions. (Some are based on RDF, but most are based only on XML.) Regardless, RSS is an extremely popular format used for syndicating news, blog posts, IPradio, and IPTV, with an amazing amount of momentum.
The RSS MIME type has previously been agreed on. It is "application/rss+xml" (following the conventions laid out by RFC 3023). This was done for many reasons. The reason relevant to this article is that it allows web browsers to recognize a document as an RSS feed and handle it properly without having to probe (or understand) the file (as would be the case if the "application/xml", "text/xml", or "text/plain" MIME types were used). For example, on receiving a file with the "application/rss+xml" MIME type, a web browser could "hand off" handling of the file to an external program (such as an RSS aggregator).
In the past RSS was dominated by the syndication of "text" media; the syndication of news websites, the syndication of blogs, the syndication of weather reports, the syndication of stock prices, the syndication of sports scores, etc. Today, this is no longer the case.
Today, there is a varied landscape of media types syndicated by RSS. IPTV uses RSS to syndicate "video" media. IPradio (also known as podcasting) uses RSS to syndicate "audio" media. Software update systems use RSS to syndicate "software" media.
Handling all RSS feeds with a single RSS aggregator is no longer desirable. IPTV RSS feeds may be handled by one piece of software. IPradio RSS feeds may be handled by another piece of software. And news and blog RSS feeds may be handled by yet another piece of software.
Today branding an RSS feed with the "application/rss+xml" MIME type is no longer sufficient to allow a web browser to handle an RSS feed properly without having to probe (and understand) the file. More metadata is needed. Metadata that hints at what is being syndicated.
The important point to keep in mind is that it is desirable to allow software (like web browsers) to know how to "handle" an RSS feed without having to probe (or even understand) the RSS feed. And although there are many ways of achieving this goal, this article proposes only one such method.
I propose that we keep the "application/rss+xml" MIME type as the only MIME type of RSS feeds, but create a community-agreed tweak that is perfectly legal under RFC 2045 and RFC 2046. I propose that we use a "Content-Type" parameter to tell us what is being syndicated.
For example, "plain text" documents have the MIME type "text/plain".
Sometimes, however, there are parameters appended to this MIME type. For example:
For RSS, we could also make use of a "Content-Type" parameter. For example, we could have:
Suppose it is accepted (by the RSS community) that a "Content-Type" parameter be used for hinting at the "disposition type" of an RSS feed (i.e., that we use a "Content-Type" parameter to tell us what is being syndicated). Then there are certain things that need to be agreed upon:
1. What the name of the parameter is.
2. What standard values are allowed for the parameter.
3. How we allow for non-standard parameter values.
For #1 I propose "disposition-type" be used (as was illustrated in the example). For #2 I propose that the initial standard set be pulled from the Dublin Core Metadata Initiative's Type Vocabulary; namely: text, still-image, sound, software, service, physical-object, moving-image, interactive-resource, image, event, dataset, and collection. For #3 I propose we use the "x-" prefix.
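The three points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the proposal: it assumes the parameter is written in the attribute=value form that RFC 2045 defines for parameters, it treats values as case-insensitive, and the function names parse_content_type and classify are made up for this example.

```python
from email.message import Message

# Standard values proposed in the article (Dublin Core Type Vocabulary names).
STANDARD_TYPES = frozenset({
    "text", "still-image", "sound", "software", "service", "physical-object",
    "moving-image", "interactive-resource", "image", "event", "dataset",
    "collection",
})


def parse_content_type(header_value):
    """Split a Content-Type value into (mime_type, disposition_type or None)."""
    msg = Message()
    msg["Content-Type"] = header_value
    hint = msg.get_param("disposition-type")
    # Assumption: values are compared case-insensitively.
    hint = hint.lower() if isinstance(hint, str) else None
    return msg.get_content_type(), hint


def classify(hint):
    """Label a disposition-type value as standard, non-standard, or unknown."""
    if hint in STANDARD_TYPES:
        return "standard"
    # The "x-" prefix marks non-standard values, as proposed for #3.
    if hint is not None and hint.startswith("x-") and len(hint) > 2:
        return "non-standard"
    return "unknown"
```

A feed served without the parameter simply yields no hint, so existing feeds would keep working unchanged.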
(Given RSS's history of drafts and proposals being implemented, I feel compelled to say explicitly that this is not a standard in any way and should not be implemented. It is a proposal meant to prompt discussion that may lead to a community-accepted standard.)
I just ran a test. My personal weblog's RSS feed at:
previously returned the MIME type:
but now returns:
I ran it through Feed Validator and it went through fine. (Feed Validator makes checks of the MIME type of the RSS feed.)
(Obviously, to see the impact of this proposal it would be useful to test this on many of the popular web-based and desktop RSS aggregators as well, to see whether they handle MIME types properly, i.e., whether they ignore "Content-Type" parameters they don't understand. But it is nice to see that it goes through Feed Validator just fine.)
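For what "ignoring parameters they don't understand" could look like in practice, here is a minimal sketch of a consumer that keeps only the parameters it knows about. The set of known parameters and the function name effective_type are assumptions made for this example, and the header is assumed to use the RFC 2045 attribute=value form; real aggregators may behave differently, which is exactly what would need testing.

```python
from email.message import Message

# Hypothetical: an older aggregator that only understands "charset".
KNOWN_PARAMS = frozenset({"charset"})


def effective_type(header_value, known=KNOWN_PARAMS):
    """Return (mime_type, understood_params), silently dropping the rest."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_params() lists the MIME type itself first, so skip that entry.
    params = dict(msg.get_params()[1:])
    understood = {name: value for name, value in params.items() if name in known}
    return msg.get_content_type(), understood
```

A consumer written this way sees the same MIME type whether or not the feed carries a "disposition-type" parameter.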
Comment 1, posted 23 Aug 2005 at 16:01 UTC by robocoder
On the surface, the proposal appears to limit an RSS feed content to a single content type (i.e., preclude an RSS feed that contains multiple content types). One possibility would be to extend the proposal to handle this. (If a list, then order might be significant.)
On the other hand, perhaps disposition-type (by content type) is too granular. Have you instead considered categorizing the RSS feed (i.e., news headlines, music playlists, search results, etc)?
Suggestion on , posted 23 Aug 2005 at 17:40 UTC by icepick
Rather than "sound" and "moving-image" why not use "audio" and "video", the same names given in the MIME Media Types?
Oops..., posted 23 Aug 2005 at 17:44 UTC by icepick
Now I actually read the article rather than skimming. Since you are further defining a MIME type why not keep to the MIME vocab rather than pull in the Dublin Core Metadata Initiative's vocab?
This is a great idea BTW.
: As I wrote this proposal I did consider the problem of RSS feeds that syndicate different media. (For example, a single RSS feed's <item> elements could be characterized as a mix of IPTV, IPradio, and text media.) I started off trying to work them into this proposal, but later decided that was out of scope. In other words, this proposal does not try to handle those cases and only tries to handle the simple case. (Either this proposal would need to be extended later to handle them, or another proposal would need to be made.) Off the top of my head, it could be something like:
application/rss+xml; disposition-type=mixed; disposition-list=moving-image.sound.sound.text.moving-image
(Note, I used periods -- "." -- as a delimiter for the list, and did not put any spaces in the "disposition-list". It looks ugly, but RFC 2045 does not allow you to put any spaces or use a comma in a value [unless it is in a quoted string]. But that's a detail that could be modified to "look" better.)
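To make the constraint concrete, here is a small sketch that checks a value against the RFC 2045 "token" rule (printable US-ASCII, no spaces, none of the "tspecials" characters) and then splits a period-delimited list. The "disposition-list" parameter is only the off-the-cuff idea from this comment, and the function names are made up for the example.

```python
import re

# RFC 2045 "tspecials": characters that may not appear in an unquoted value.
TSPECIALS = '()<>@,;:\\"/[]?='
# Printable US-ASCII excluding space (0x21 through 0x7E).
PRINTABLE_RE = re.compile(r"[!-~]+")


def is_token(value):
    """True if value is usable as an unquoted RFC 2045 parameter value."""
    if not PRINTABLE_RE.fullmatch(value):
        return False
    return not any(ch in TSPECIALS for ch in value)


def split_disposition_list(value):
    """Split a period-delimited list such as 'moving-image.sound.text'."""
    if not is_token(value):
        raise ValueError("not a valid unquoted parameter value: %r" % value)
    return value.split(".")
```

The period passes the token check while the comma and the space do not, which is why the list looks the way it does.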
Or something like that. One could even somehow encode the list (or bag or set or whatever group-type is appropriate)
into the "disposition-type" value. (But, like I said I wasn't trying to deal with this situation in this proposal.)
One of the main reasons I made this "out of scope" for this proposal is because I was trying to think of the actual
use cases. (Which could end up being short sighted in the long run. But I also find it helpful in making something
useful for today.)
If a web browser got a Content-Type of "application/rss+xml; disposition-type=moving-image" then it would know to hand this RSS feed over to whatever application or web service handles your IPTV. Likewise, if a web browser got a Content-Type of "application/rss+xml; disposition-type=sound" then it would know to hand this RSS feed over to whatever application or web service handles your IPradio/podcasts. (And similarly for the other disposition-types.)
But what would happen if you got a "mixed" disposition-type? What would the web browser hand the RSS feed over to?
The main thing is that I don't know if I have enough insight into the "mixed" media situation to create a proposal
that would be useful. (My idea was to wait for real use cases to come up for this situation -- for the mixed media
situation -- before coming up with a proposal for it.)
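The hand-off described above could be sketched roughly as follows. The handler names are purely hypothetical placeholders (a real browser would consult the user's configured applications), the header is assumed to use the RFC 2045 attribute=value form, and returning None for "mixed" just mirrors the open question: there is no obvious single handler to pick.

```python
from email.message import Message

# Hypothetical handler names, for illustration only.
HANDLERS = {
    "moving-image": "iptv-player",
    "sound": "podcast-client",
    "text": "news-aggregator",
}


def choose_handler(header_value, default="news-aggregator"):
    """Pick a handler from the Content-Type alone, without reading the feed."""
    msg = Message()
    msg["Content-Type"] = header_value
    if msg.get_content_type() != "application/rss+xml":
        return None  # not an RSS feed as far as this sketch is concerned
    hint = msg.get_param("disposition-type")
    if not isinstance(hint, str):
        return default  # no hint: fall back to a general-purpose aggregator
    # "mixed" and unrecognized hints have no entry, so the result is None.
    return HANDLERS.get(hint.lower())
```

Nothing here opens or parses the feed itself, which is the point of putting the hint in the Content-Type.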
Your categorization names -- news headlines, music playlists, search results -- are actually how I think of these things in my head. I chose to go with the Dublin Core names for a few different reasons. #1: It currently seems to be the preferred way of dealing with types. #2: Everyone else is using them, so there's infrastructure already in place to deal with this. #3: The people who created the Dublin Core standards are librarians and specialists who have a lot more insight into categorizing and real metadata usage than I (or a lot of Computer Scientists and Software Engineers) do.
I'm not opposed to using another categorization (other than Dublin Core's). (And not opposed to creating one from scratch.) But if that is done it is going to take some work. (Which is NOT a bad thing.) And it will require a lot of involvement from a lot of others in varying fields to find an initial set that will be useful to the greatest number of people.
And then maybe get things like:
application/rss+xml; disposition-type=iptv
application/rss+xml; disposition-type=ipradio
application/rss+xml; disposition-type=music-playlist
application/rss+xml; disposition-type=search-results
application/rss+xml; disposition-type=news-headlines
or something like that.
: The reason that I chose the Dublin Core terminology instead of the MIME terminology (laid out in RFC 2046) of "text", "image", "audio", "video", and "application" is that I feel that the Dublin Core classifications are better; and since the Dublin Core standards already have widespread use I went with that.
Although, if there is a lot of resistance on this issue, I would not be opposed to using the MIME terminology in RFC 2046.
This type of thing could also be used for (pretty much) any type of container file. For example, Ogg files.
Ogg files can contain almost any kind of data and have a Content-Type of:
(Which doesn't tell you what is inside.)
They can contain audio Ogg Vorbis data or video Ogg Theora data. So the Content-Type for (audio) Ogg Vorbis files or streams could be:
And the Content-Type for (video) Ogg Theora files or streams could be:
(Given that we stick with the "sound" and "moving-image" labels and don't adopt some other nomenclature.)
You could even consider this for BitTorrent too. For example, torrents have a Content-Type of:
Which again tells you nothing about what is inside.
Say, for example, the only thing in the torrent was a single MPEG movie file. Then you could use the Content-Type of:
(Again assuming that we stick with the "moving-image" label and don't adopt some other nomenclature.)
RSS 2.0 makes no recommendation, and there's disagreement over whether it's better to use text/xml instead of application/rss+xml so an RSS feed can be viewed in browsers.
Given this (and without trying to choose sides on this issue) this proposal could be modified to allow for:
Or whatever MIME type is used for RSS. (Given that this proposal sticks with the "sound" label and doesn't adopt some other nomenclature.) The important part to take from this is the disposition-type Content-Type parameter.
The proposal could (eventually) be worded in a way that is indifferent to what MIME type is used for RSS.
RSS 1.0 has a recommended MIME type of application/xml, but it may change to application/rdf+xml in the future.
I don't know of any recommendation for this. But following what I believe to be the spirit of RFC 3023, I'd rather see the following as the MIME type for RDF-based versions of RSS:
As far as I can tell, this is totally compliant with RFC 3023. And conveys much more information. In fact, I'd like to see all RDF-based documents adopt this style of MIME type. For example, a FOAF document would be:
while it is not registered. And then:
after it is registered.
However, as I said, I do not know of any RFC or any other document that makes this kind of recommendation. (But I think there should be one.) And I don't know how Mozilla, IE, Opera, etc. will handle them. Given that they implement RFC 3023, these MUST be recognized as XML MIME types, although I haven't tested whether they implemented RFC 3023 correctly.
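As a rough sketch of what "recognized as XML MIME types" means here: RFC 3023 names five XML media types and a "+xml" suffix convention for everything else, so the check can be made on the type name alone. The function name is made up for this example, and the stacked-suffix type in the usage note is a hypothetical stand-in, not a registered type.

```python
# The five media types that RFC 3023 itself defines.
RFC3023_TYPES = frozenset({
    "text/xml",
    "application/xml",
    "text/xml-external-parsed-entity",
    "application/xml-external-parsed-entity",
    "application/xml-dtd",
})


def is_xml_mime_type(header_value):
    """True if the Content-Type names an XML type per RFC 3023 conventions."""
    # Drop any parameters and compare case-insensitively.
    mime_type = header_value.split(";", 1)[0].strip().lower()
    return mime_type in RFC3023_TYPES or mime_type.endswith("+xml")
```

Under this check "application/rss+xml" and "application/rdf+xml" both qualify, and so would a made-up stacked name such as "application/x-example+rdf+xml", since it still ends in "+xml".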