Infomancy n. 1. The field of magic related to the conjuring of information from the chaos of the universe. 2. The collection of terms, queries, and actions related to the retrieval of information from arcane sources.

Introducing: FISH

January 4th, 2007 by Christopher Harris

Are you satisfied with your OPAC? In all likelihood, the answer is no. Actually, the answer will probably always be no. You can probably provide me with even more examples of how the OPAC sucks especially well in your library, but instead let’s engage in some root cause analysis.

Given: OPACs Suck
But why? I would not be alone in noticing that by their very definition, OPACs no longer meet our needs. They were designed and named for an era of computing that has come and gone.

Online – because until then electronic catalogs had been offline with terminal access only.
Public Access – as opposed to private university mainframes with usage accounting that tracked every second of access.
Catalog – a set of MARC records.

If you are trying to find a better OPAC, the three concepts within the very name are not even base expectations today. Remember, the “Online” of the first OPACs meant Telnet; later it was upgraded to something like Gopher. And then you have that word “catalog.” To mangle the Bard, “What’s in a name? That which we call a catalog by any other word would help users find resources as easily.” While you know that “catalog” is the correct terminology for a set of crafted records that replicate in miniature the essence of a collection to facilitate finding, many people just remember pulling out long drawers of cards. So please join me in stipulating that OPACs…well, suck is the wrong word. It demeans our profession’s imaginative (and quite groundbreaking) work to develop an online finding mechanism that foreshadowed the massively powerful directories and search engines that power the Internet today. Let us instead say that:

The OPAC has left the building.

By Any Other Name
Okay, so we just did away with OPACs. Wasn’t so hard, just a few words to type. Guess I can go grab a second cup of tea and…eh? What’s that? We need a replacement finding tool?

Well, we might just have something that could work. Our library system has spent the last three years looking for a new Union Catalog/ILS. Long story short, we haven’t found what we need on the market. [Hint: The ILS needs to leave the building!] We looked at a number of open source offerings as well, but again, there was always something in the ILS package that didn’t work. Overall, we aren’t that unhappy with our current vendor’s circulation module. It has nice screens, and does the job without a lot of fuss and mess. Cataloging isn’t the worst thing in the world either. But the search? We have some serious concerns here. Poor layout, poor design, default sort is by last cataloged date, no online component possible within a distributed server configuration, etc. So what we decided to do was tackle one thing at a time. Modularize the system and rely on APIs instead of “integration” to make everything work together. Up first was a replacement for our aging union catalog.

FISH
FISH: Free (as in kittens) Integrated Search Handler. But never, ever tell your patrons that (unless they want to know). Backronym it to “FISH Isn’t So Hard” or “Finding Instead of Searching Helps.” But let’s break down the original meaning:
Free (as in kittens): This is a concept that I first heard from another SLS person, Pat Neal. While the open source/free software community commonly talks about the differences between free (as in speech – meaning that the source code is open, readable, and editable) and free (as in beer – meaning that there is no cost for the software), Pat talked about free (as in kittens – meaning that while there may not be a cost to the software there are vet bills, food costs, time for cuddling and playing, and dedication as the cat grows over time). FISH is built using only free software (speech and/or beer) but is certainly something that, like a kitten, has other costs in the form of hardware, staff time, and training.
Integrated: While I may have just finished saying that the ILS needs to leave the building, the different modules of our system will certainly need to talk to each other. The difference is that we are developing from the very beginning around the concept of APIs. API stands for Application Programming Interface, and is really just a computer-code way for two different programs to exchange data quickly and smoothly – see Wikipedia for more information. Integrated also comes into play with the hope that eventually we will be able to incorporate more and more co-search capabilities in the form of sidebars or direct integration into search results.
Search: Here we get to the heart of the concept. What we have done is built a front-end for an API-based search. By using the incredibly powerful IBM OmniFind Yahoo Edition, a free (beer) search engine built using Apache’s Lucene technology, we are able to provide relevancy-ranked responses that are displayed in a friendly environment [With a big thanks to the University of Rochester's eXtensible Catalog presentation for verifying that we were on the right track here]. In other words, we aren’t doing the heavy lifting here. We extracted MARC records and then, thanks to the efforts of one of the system’s Specialists of Library Technologies, Andy Austin, parsed the MARC data into a MySQL database. The MySQL database is then used to build a series of dynamic webpages – one for each ISBN as well as pages for authors, illustrators, subjects, etc. [hat tip to Casey Bisson's WPopac here!]. We then asked OmniFind to crawl all of those pages into a cached index to allow searching. Long story short, it works like a charm! The searching is fast, and we are able to tweak relevancy, add synonyms and spelling errors, even highlight “featured results” – in other words, we have our own search engine.
Handler: While we could have stopped there, a search engine that spits back search engine results, we went a step further by using the open source content management system Drupal to “handle” the results.
Instead of this:
basic search results
We wanted something that would display covers and additional information on the initial results page. This is an early beta announcement to share the proof of concept as we continue through development, but you can see the basic idea here:
handled search results
Approaching FISH as a search handler has also let us spend more time on developing our results display, and less time on the mechanics. By which I mean we can focus on finding instead of searching. Again, this is still very much in development, but we wanted to kick off the new year by sharing our progress. The system’s other Specialist of Library Technologies, Michael Nyerges, is bringing a wealth of experience to the development of our results interface. Here is a basic screenshot of where we are now with handled results.
handled book
We will be adding integration with our book review module as well as sidebars to link with other searches and resources. The goal is to build on some of the ideas that have been put out there in the library world, but tweaked for the school environment. Our interface is going to look very different because we have different customers. One of the great things about using Drupal as a handler is that we can have entirely different interfaces based on the user’s login.
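
For readers who want to picture the Search step described above (parsed records stored in a database, then one page per ISBN for the search engine to crawl), here is a rough sketch in Python. It is not the FISH code. Several things are assumptions made purely for illustration: the records are plain dictionaries standing in for parsed MARC data, Python's built-in sqlite3 stands in for MySQL, and the table name "items", the function names, and the "/isbn/" path layout are all hypothetical.

```python
import html
import sqlite3

# Hypothetical, already-parsed records. Real MARC parsing is out of scope here.
SAMPLE_RECORDS = [
    {"isbn": "9780000000001", "title": "A Sample Title", "author": "Doe, Jane"},
    {"isbn": "9780000000002", "title": "Cats & Kittens", "author": "Roe, Richard"},
]


def load_records(conn, records):
    """Store parsed records in a table keyed by ISBN (sqlite3 stands in for MySQL)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items "
        "(isbn TEXT PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO items (isbn, title, author) "
        "VALUES (:isbn, :title, :author)",
        records,
    )
    conn.commit()


def render_page(isbn, title, author):
    """Build a minimal HTML page for one item, escaping every field value."""
    template = (
        "<html><head><title>{t}</title></head>"
        "<body><h1>{t}</h1><p>Author: {a}</p><p>ISBN: {i}</p></body></html>"
    )
    return template.format(
        t=html.escape(title), a=html.escape(author), i=html.escape(isbn)
    )


def build_pages(conn):
    """Return a dict mapping one page path per ISBN to that page's HTML."""
    rows = conn.execute("SELECT isbn, title, author FROM items ORDER BY isbn")
    return {
        "/isbn/{}.html".format(isbn): render_page(isbn, title, author)
        for isbn, title, author in rows
    }


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    load_records(connection, SAMPLE_RECORDS)
    for path in build_pages(connection):
        print(path)
```

In a setup like this, a crawler-based search engine would only ever see the generated pages, which is why the search side can be swapped out without touching the database.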

Go Forth and Play
If you are interested in taking a closer look at what we are doing, we would welcome any feedback. Be warned, however, that as a working project the beta site may or may not be up at any given time and it will be changing. FISH is located at http://fish4info.org/union.

In closing, this is not my program. I had a crazy idea one day, but the credit goes to the excellent team I work with – especially Andy Austin and Michael Nyerges. Many, many thanks also go out to the other teams from libraries that are innovating and leading: North Carolina State University (Go Wolfpack! I actually got my first master’s in education/instructional technology there), the University of Rochester, and Casey Bisson. Most importantly, since this project is built using free and/or open source software we certainly want to thank everyone who contributes to the common good of software development. A special thank you (even if it never reaches them) to IBM and Yahoo for releasing IBM OmniFind Yahoo Edition as free software with a great API that finally made this idea possible.

FISH will be released as a Drupal module in the not too distant future.

5 Responses to “Introducing: FISH”

  1. M Says:

    This post was a Ringmaster’s Pick for the 62nd Carnival of the Infosciences, 7 January 2007. This installment of the Carnival may be found at: http://marklindner.info/blog/2007/01/07/carnival-of-the-infosciences-62/

  2. Karen Says:

    Chris – This looks really, really nice. You, Andy and Michael have done a great job!

  3. Stewart Says:

    Fantastic! This shows real promise.

  4. Christopher Harris Says:

    Thanks for the feedback. We are continuing to work on this, and hope to have a final version ready over the summer.

  5. Alejandro Says:

    We did a small test (9,800 items) using Drupal and the Faceted Search module. We first converted MARC into a Tab-sep file, then loaded onto a CCK field using the node_import module. It works great! Although, it took a LONG time to load up.

    We broke out the LCSH into facets, the item type is another facet, as well as the author and publication year. The faceted search module helps the user narrow the search quite easily. We also have item recommendations based on those tags.

    Try it out! (Not guaranteed to stay put for long). http://enlinea.mty.itesm.mx