Infomancy n. 1. The field of magic related to the conjuring of information from the chaos of the universe. 2. The collection of terms, queries, and actions related to the retrieval of information from arcane sources.

Cataloging the blogosphere

June 15th, 2005 by infomancy

I have this idea that has been going through my head for a while. It started when one of my staff and I were talking about my starting my MLS with cataloging as my first course; I am enjoying the class, but it does rather start the ride already in high gear. We were talking about how a certain very large company bought Web Feet, previously a small company that provided library catalog records for quality, educationally appropriate websites. The idea is that when a student searches for “dogs” in the catalog, it shows a link to the American Kennel Club website in addition to books on dogs. Alas, the small company was bought up, and the price more than doubled.

So, in talking about creative alternatives we could offer our member libraries, I had this crazy idea. LII has an RSS feed for their New This Week updates. Note: 1) RSS feeds are written in XML. 2) Dublin Core is a very well-known library cataloging schema that is commonly expressed in XML. 3) As such, there exist many tools to transform XML records into MARC (the standard schema used in most library automation systems). Somehow, in less time than it took you to read this – and much less time than it took to type this – I had this flash of brilliance. Edit: Umm…with less caffeine and more sleep this doesn’t sound so good. What I meant to say is, as I begin to learn more about cataloging, I find I like the subject even if I am not very good at it. I just had an interesting idea that, to my newly attained cataloging knowledge base, seemed pretty cool. We could create an automated script that would process an RSS feed and convert each item into a MARC record to upload into our catalog! (Hopefully more to follow if I can find the time to open a conversation with LII about the possibility of doing this.) And thanks, KGS, for pointing out my bass ackwards approach here, sorry.
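To make the idea a little more concrete, here is a minimal sketch of what such a script might look like, using only the Python standard library. Everything in it is an assumption for illustration: the sample feed is made up (it is not LII's actual feed), the helper names are hypothetical, and the field mapping (RSS title to MARC 245 $a, description to 520 $a, link to 856 $u) is just one plausible crosswalk. It writes MARCXML rather than binary MARC and leaves out the leader and control fields that a real catalog load would need.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Library of Congress MARCXML schema.
MARC_NS = "http://www.loc.gov/MARC21/slim"

# Made-up RSS 2.0 feed for illustration only (not a real feed).
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>American Kennel Club</title>
      <link>http://www.akc.org/</link>
      <description>Breed information and events for dog owners.</description>
    </item>
  </channel>
</rss>"""


def add_datafield(record, tag, ind1, ind2, subfields):
    """Hypothetical helper: append one MARCXML datafield with its subfields."""
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": code})
        sub.text = value
    return field


def item_to_marcxml(item):
    """Map one RSS <item> to a bare-bones MARCXML record (assumed crosswalk)."""
    record = ET.Element(f"{{{MARC_NS}}}record")
    title = (item.findtext("title") or "").strip()
    summary = (item.findtext("description") or "").strip()
    link = (item.findtext("link") or "").strip()

    if title:
        add_datafield(record, "245", "0", "0", [("a", title)])   # title statement
    if summary:
        add_datafield(record, "520", " ", " ", [("a", summary)])  # summary note
    if link:
        add_datafield(record, "856", "4", "0", [("u", link)])     # electronic location (HTTP)
    return record


def rss_to_marcxml(rss_text):
    """Return a list of MARCXML record elements, one per RSS item."""
    root = ET.fromstring(rss_text)
    return [item_to_marcxml(item) for item in root.iter("item")]


if __name__ == "__main__":
    for rec in rss_to_marcxml(SAMPLE_RSS):
        print(ET.tostring(rec, encoding="unicode"))
```

A real version would still need subject headings, link checking, and (as the comments below make clear) a conversation with the feed's owner before anything gets loaded into a catalog.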

The more I think about this, the more I like the idea. When I shared it with my cataloging professor, she found it a fascinating concept that she hadn’t previously heard of. What better way to help continue to promote the legitimacy of blogs than by having them included in library catalogs…

Edited: June 20, 2005

9 Responses to “Cataloging the blogosphere”

  1. Lori Gluckman Says:

    I just heard about this post through Catablog and find it insightful and right on target. As a recent MLS grad, I am working as a content manager for a dot com in a very non-traditional role. With fantastic support from my supervisors, I am in the process of implementing standardization and cataloging principles into our developing systems, including Dublin Core and controlled vocabularies. Your comment about RSS and MARC as applied to LII resources is just the kind of application which needs to happen to increase access and interoperability. Additionally, it is crucial that librarians take the lead in such endeavors and demonstrate our value. Bravo!! Keep us posted and let us know if any assistance on this project is needed.

  2. K.G. Schneider Says:

    Cool idea, but why not start by opening the conversation with LII first and find out if we’re already doing something like this? We’re not hermits in a cave, or brains in jars. We’re real people. Plus we have an interesting project in work with California Digital Library that is similar to what you’re doing and shows that others (er, like LII itself) have been “taking the lead” as well. We’ve already mapped and described fields for an LII-to-MODS crosswalk, for example.

    Also, on entering LII items into catalogs, let me tell you 100 reasons why that sucks. It’s an idea that was explored in the mid-1990s.

    We’re totally open to you doing interesting stuff with our feed, but can I recommend that in your flashes of brilliance you communicate your ideas to the people whose content you are interested in to see if they’re up to something or PLANNING A MAJOR MIGRATION IN ABOUT TWO WEEKS THAT WILL LIKE SO TOTALLY MUCK UP YOUR SCRIPT… ;-) Not to mention letting you know why we recommend X but not Y. Particularly with our (ahem) copyrighted content.

    Talk to us, really, it’s a cool idea. Our items are already MARCish and we cross-walk to DC. Communication: catch the fever!

  3. Christopher Harris Says:

    Thanks for the feedback. Just to eat my crow a bit more publicly, my apologies for the lack of clarity in the above post that led to a miscommunication on my part. My intention was simply to begin exploring in my mind an idea that I had. LII’s feed was used as an example of a neato application and was a way for me to wrap my mind around what I am learning in class. I assure you nothing has been or would be started without full communication and full respect for LII’s copyrighted material.

    There is no project in the works at this point, but rather a thought from my point of view that there are some great ideas being published in the blogosphere. I explored this more in a failed attempt at creating a podcast – I can’t get it off the recording device because I was silly about where I recorded it. But, since I am not necessarily thinking clearly as I drive home late at night after three hours in class, you haven’t been able to hear my thoughts.

    Let me clarify a bit more here: I enjoy reading a number of blogs dealing with professional issues in education, instructional technology, and library science. These are not “journal” type weblogs, but almost more like self-published professional content serials. Yet, there have recently been a number of negative articles in the mass media that downplay the importance of blogs (Gormangate among others). So my thought was, how could blogs, and in particular RSS feeds from blogs, interact more closely with libraries and library catalogs?

    One of the challenges of adding internet material to catalogs I have seen is the concern that web resources are not “selected” by a librarian and are thus outside of her/his selection control. But then I got thinking about RSS feeds. With RSS and syndication, I am very actively selecting a group of resources. Based on good stuff from a particular feed, I choose to continue to receive future good stuff from the same feed. This seems to be similar to an academic library purchasing all books from a certain university press, or a school library buying a series of non-fiction books from a reseller.

    There is a very long history of librarians selecting resources that lends a certain legitimacy to them. While sites like Infopop and Technorati help index the blogosphere, they aren’t providing the same level of expert review. So, I wondered if RSS and catalogs couldn’t somehow work together to populate a collection of reviewed and cataloged web resources.

    Now this is just a basic idea, and may very well not be at all possible. I don’t claim any sort of expertise, and I know it doesn’t even begin to look at the issue of link validation, weeding, etc. Still, I thought it was a fascinating thought of what RSS could provide…

    Again, many thanks for the feedback/politely worded kick in the pants. I will try to be more careful about what I post and how it sounds.

  4. M J Barczak Says:

    Regarding Mr. Gorman’s rant on Blogs and the reaction thereto, one has to fathom: what all those bloggers are doing takes a lot of time especially the web site blogs set up … and are they blogging on the job or are they out of work or do they blog in their spare time … who’s minding the library?


  5. Infomancy » RSS and CataBlog Says:

    [...] real access) following some of the discussion regarding a recent post I wrote about using RSS to populate catalogs with web resources. My use of LII as an example of a very relevant RSS feed for sch [...]

  6. Steven Harris Says:

    The idea of including RSS feeds in MARC records is brilliant. LII is not, however, the kind of site that came instantly to my mind. Many newspapers, magazines, and journals now have RSS feeds. The application of this technology within the publishing industry will undoubtedly grow. It only stands to reason that libraries ought to provide access to these feeds, especially if the publication in question is of interest to a particular library’s patrons or is subscribed to in some other format. So, I’m thinking of this as a way of including a link to the RSS feed as part of the MARC record for something that may already be in your catalog (which may be something different than what you originally proposed).

    The need for communication between interested parties is clear in this case (in order to maintain a link to the proper feed, etc.), but it is unclear to me (as K.G. Schneider implies) that it would be necessary for a library to seek permission to include RSS links in its catalog. Do I have to ask permission to copy the feed into my feed reader? I don’t think a catalog record is an infringement of copyright. Obviously, publishers can choose to limit access to their feeds to subscribers or licensees, if they want. I’m just thinking of the catalog as another way of disseminating the feed, another way for patrons to discover the feed. Surely content providers would like that!

  7. Donald McMorris Jr Says:

    Excellent Idea Indeed! Do you know anybody who started a project such as this yet? All my searches for RSS to MARC bring up comments about this page (Hey, you’re famous!).

    I am working on building a library for a non-profit organization (which I am also working on building). I thought it would be neat to provide the ability to search e-Book sites via the catalog. The way I thought I would do this is create separate databases for each eBook source, and run a script that converts the eBook source’s RSS index to MARC format. Then, they would automatically be inserted into the database. The main PAC would then be able to search these databases for eBooks. The idea of separate databases is so I could have those several thousand records offloaded to another server.

    Also, do you know of a “Spider” that outputs in MARC format? For example, if I run a mirror of some resources (i.e., eBooks), I think it’d be cool if a “Spider” could index my local mirror and output the results in MARC format.

    Your idea is a great one! Evidence of this from the links of many many other commenters. Good luck in classes.


  8. Christopher Harris Says:

    I know there are some people working on this idea, but I haven’t had a chance to pursue it further. Actually what I am more interested in anymore is not just the development of MARC records from RSS, but more importantly the development of MARC records as a way to record some of the information concepts being developed in blogs.

  9. Infomancy » All Your _____ Are Belong to Us Says:

    [...] human-checked exemption, I forged ahead without that search phrase. Is this an answer to cataloging the blogosphere? I think it could be, especially when someone comes up with a handy WordPress plu [...]