Homework in Taxonomy

In my experience, business people often refer to IA as the process of building a sitemap. If that were true, then Information Architect would probably be one of the most lucrative jobs in the marketplace. (Thank you for paying me $80 thousand a year to crank out a sitemap on the back of a napkin.)

You may not have heard the term Taxonomy before, but in IA circles, it is an extremely important word. First, let’s define Taxonomy. According to WordNet, a taxonomy is a classification of organisms into groups based on similarities of structure or origin etc. Let’s apply this to users on the web looking for information: they search based on a pre-existing knowledge of words or phrases in their own language. If we stay focused on the English language for a minute, we can see how complex finding information is. If I were to search for “services”, should that be understood as the services offered by a company or as a reference to the professional services industry?

For the Revolve Nation project, I needed to take this understanding of taxonomy and build a controlled vocabulary. I find MMI’s definition of controlled vocabulary the most useful: A vocabulary is a set of terms (words, codes, etc.) that are used in a specific community. The key here is “a specific community”, and that is what I needed to tackle.

So, in what ways can entrepreneurs and investors find information that is relevant to them? We decided to break it down into two areas: 1) industry classification, and 2) functional classification. Let’s start with the functional classification, since it is easier than industry. The goal of this controlled vocabulary is to give the users of Revolve Nation a quick way to browse for information pertaining to their industry.

For functional classification, I needed to think of how American companies are typically structured: marketing, finance, IT, engineering, and so on. Building this controlled vocabulary is relatively easy since there are not many ambiguous terms or phrases, with the exception of the word operations, which can be a catch-all for many departments. Done…moving on!

Comparing industry classifications from Yahoo, Google and DMOZ

For the industry classification, I needed to see how the long-tenured online directories classified industries. I picked Yahoo, DMOZ and Google to see how they set up their controlled vocabularies and then began jotting down the terms that seemed to be consistent across all three. Moreover, this community was never going to see the likes of someone from Aerospace or Agriculture, so by process of elimination, I was able to narrow it down to 15 good candidates.
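I did this comparison by hand, but the same two steps (keep what all three directories share, then drop what this community will never use) can be sketched in a few lines of Python. The category lists below are made-up placeholders, not the real Yahoo, Google or DMOZ data, and the function names are my own invention for illustration.

```python
# Placeholder category lists for illustration only; NOT the real directory data.
yahoo = {"Aerospace", "Agriculture", "Finance", "Healthcare", "Technology", "Retail"}
google = {"Aerospace", "Finance", "Healthcare", "Technology", "Real Estate"}
dmoz = {"Agriculture", "Aerospace", "Finance", "Healthcare", "Technology", "Retail"}


def shared_terms(*vocabularies):
    """Return the terms that appear in every vocabulary (case-insensitive)."""
    normalized = [{term.strip().lower() for term in vocab} for vocab in vocabularies]
    return set.intersection(*normalized)


def candidates(vocabularies, out_of_scope):
    """Shared terms minus the industries this community will never see."""
    excluded = {term.strip().lower() for term in out_of_scope}
    return sorted(shared_terms(*vocabularies) - excluded)


result = candidates([yahoo, google, dmoz], out_of_scope={"Aerospace", "Agriculture"})
print(result)
```

Lowercasing before comparing is a deliberate choice here: the directories do not agree on capitalization, and I only care whether the term itself is shared.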

Okay, now I needed to see how this vocabulary was going to work. Some troubling areas already started to pop up regarding disambiguation, particularly with the phrase “Professional Services”. Does a financial planner consider themselves a consultant when thinking of industry, or do they immediately look for a category called “Finance”? My guess would be the latter. The same applies to a lot of freelancers, who are theoretically “consultants”. I realized that I was heading down a slippery slope with such a broad category. Instead of trying to figure out a phrase that could handle such a large population of varying industries, why not eliminate it altogether and allow the user to suggest an industry? With enough users suggesting an industry, there will be enough demand to warrant an additional term in the controlled vocabulary.
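To make the “suggest an industry” idea concrete, here is a rough sketch of how the counting could work. Everything in it is an assumption on my part: the class name, the threshold of three suggestions, and the automatic promotion are hypothetical, and in practice I would want a person to review a term before it enters the vocabulary.

```python
from collections import Counter


class SuggestionBox:
    """Hypothetical tally of user-suggested industries."""

    def __init__(self, vocabulary, threshold=3):
        # threshold is an assumed value: how many suggestions count as "enough demand"
        self.vocabulary = {term.lower() for term in vocabulary}
        self.threshold = threshold
        self.suggestions = Counter()

    def suggest(self, term):
        """Record a suggestion; return True if it was just promoted into the vocabulary."""
        key = " ".join(term.lower().split())  # normalize case and whitespace
        if not key or key in self.vocabulary:
            return False  # empty input or a term we already have
        self.suggestions[key] += 1
        if self.suggestions[key] >= self.threshold:
            self.vocabulary.add(key)
            del self.suggestions[key]
            return True
        return False


box = SuggestionBox({"Finance", "Technology"}, threshold=2)
box.suggest("Legal")             # first request, not enough demand yet
promoted = box.suggest(" legal ")  # second request reaches the threshold
```

Normalizing the suggestion before counting matters more than the threshold itself; otherwise “Legal” and “legal ” would be tallied as two different industries.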

With my final list of terms in hand, I am ready to start incorporating the taxonomy into the CMS and start tagging content.

For more information on controlled vocabularies, see the article, “What is a controlled vocabulary?” by boxesandarrows.

