Tag clouds of today are so yesterday

Anyone living a Web 2.0 lifestyle – using applications like Flickr or Technorati – is sure to have seen a tag cloud. Typically, a tag cloud depicts how frequently each tag has been used within a system: the larger the word, the more times that tag has been used. The idea is that tag clouds are a good indicator of community behavior, but that idea is very misleading.

[Image: Example tag cloud from Flickr]

Tag clouds do provide a navigation interface that requires almost zero learning. Relative size seems like a universal concept for importance. Clicking on a tag usually shows a collection of items tagged with that word. The cloud is often contextual to a given page, so digging into a collection is as simple as clicking tags in the cloud.

Tag clouds as we know them are actually not very useful. They are, in fact, a tease – so easy to use, and communicating just enough to be interesting. The problem is that a tag being used frequently does not indicate what a community finds interesting. A tag cloud is literally a display of how often a tag is used within a given context. In many cases the word flower is even counted separately from flowers, although the two are obviously very similar in spelling and meaning, so one topic ends up split across several smaller entries.
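To make the point concrete, here is a minimal sketch of how a frequency-driven cloud might be computed. Everything in it is an assumption for illustration: the function names `tag_cloud_sizes` and `naive_singular`, the pixel range, and the sample data are invented here, and none of it reflects how Flickr or Technorati actually build their clouds.

```python
from collections import Counter

def tag_cloud_sizes(tags, min_px=12, max_px=36, normalize=None):
    """Map each tag to a font size scaled linearly by how often it was used."""
    if normalize is not None:
        tags = [normalize(t) for t in tags]
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid dividing by zero when all counts are equal
    return {
        tag: round(min_px + (n - lo) / span * (max_px - min_px))
        for tag, n in counts.items()
    }

def naive_singular(tag):
    """Hypothetical, deliberately crude plural folding: drop one trailing 's'."""
    return tag[:-1] if len(tag) > 3 and tag.endswith("s") else tag

# Invented sample of tag uses.
uses = ["family"] * 6 + ["flower"] * 4 + ["flowers"] * 4 + ["sky"] * 2

raw = tag_cloud_sizes(uses)
merged = tag_cloud_sizes(uses, normalize=naive_singular)

print(raw)     # "flower" and "flowers" are separate, mid-sized entries
print(merged)  # folded together, "flower" becomes the largest entry
```

In this made-up sample, "family" dominates the raw cloud only because the flower photos are split across two spellings. Once the two are folded together, the picture of what the community is tagging changes.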

A friend and colleague pointed me to a blog posting discussing how clusters were introduced to Flickr – groupings of related tags without the use of a high-level label or facet. If clustering can be done well, it offers a more interesting possibility for tag clouds. Instead of simply reflecting the use of a given tag, tag clouds could display the activity for a given cluster. It might very well be that the community is interested in food, but because more people are using the individual tags “family”, “friends”, and “porn”, you as a user would never know. The use of clusters is an opportunity to reveal the higher-level topics a community is actively working with.

Flickr has chosen not to label its clusters. You can actually explore Tags / clusters / clusters. However, all you are really doing is browsing clusters of tagged images that are also tagged “clusters”. A less confusing example is exploring Tags / summer / clusters, which limits the clustering to all images tagged with summer. That tag can be seen as a facet – a folk-facet. Certainly the URL convention makes it feel like a facet, but is it really?
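A cluster-driven cloud could be sketched roughly as follows. This assumes the clusters already exist as plain groups of tags; how Flickr derives its clusters is not covered here. The function `cluster_activity`, the cluster labels, and all of the counts are hypothetical and chosen only to illustrate the food example.

```python
from collections import Counter

def cluster_activity(tag_counts, clusters):
    """Sum tag usage per cluster; tags that belong to no cluster are ignored."""
    activity = Counter()
    for label, members in clusters.items():
        activity[label] = sum(tag_counts.get(tag, 0) for tag in members)
    return activity

# Invented usage counts: no single food tag comes close to "family" or "friends".
tag_counts = {
    "family": 40, "kids": 5,
    "friends": 35, "party": 10,
    "pizza": 20, "sushi": 18, "dessert": 15, "bbq": 12,
}

# Hypothetical clusters; the labels exist only to make the output readable.
clusters = {
    "family": ["family", "kids"],
    "friends": ["friends", "party"],
    "food": ["pizza", "sushi", "dessert", "bbq"],
}

activity = cluster_activity(tag_counts, clusters)
print(activity.most_common())
```

Here the most-used single tag is "family", yet the most active cluster is "food", which is exactly the kind of interest a per-tag cloud hides.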

[digression starts]

Taxonomists seem to be fascinated with the emergence of folksonomies, but are quick to remind you that they are very different things. So, a facet that is applied by a user is different from one used by someone versed in the science of classification. Again, at first glance, Flickr appears to be automatically identifying facets, but it is really just generating clusters from leftover tags. For example, look for clusters on Bergdorf Goodman and you get items tagged nyc, newyorkcity, newyork, etc. A typical faceted browse might show a breakdown within that category by clothing line (e.g. mens, womens, childrens). Instead you are left with a breakdown of whatever the collection happens to have been tagged with. So, while it is possible to select a tag that might be a facet, for Flickr it is not a facet; it is a tag. However, I would maintain that a user could intend to apply a facet through tagging, but the usefulness of that tag as a facet is lost because the system itself is unaware of the higher-level categorization. Additionally, if clustered browsing were going to return a faceted collection, the tags would need to be limited to those expected from a faceted collection (in the case of Bergdorf: mens, womens, or childrens). Otherwise, the faceted browsing would be a mess, worse than not having the ability at all, because there would be no science to the tagging.

Taxonomies and formal faceted browsing are different activities from tagging and need separate user experiences. In that last example, it is clear that a user could intend a tag to have a higher-level grouping property, but unless the system understands this, it is in fact just a tag. If categorization were important in Flickr, I would expect an interface separate from tagging to facilitate the classification.

I often ask what the difference is between content that is tagged “dog” and content that is categorized “dog”. If I were surfing facets, I might have different breeds under “dog” and maybe items tagged “dog” without a breed. If I were surfing tags, I would expect all dogs and, if interested in a breed, I would enter it as a tag. Combine the two and I might browse a collection by facet, “dog”, and then filter by tag, “flatfaced.” Is this any different from compound tagging, where one tag is ANDed to another tag (i.e. dog AND flatfaced)? Is the difference limited to the intention of the user who assigned “dog” as a facet and not as a tag? What happens when a user applies the same word to both the category and tag? Are they plugged into formal classification enough to distinguish the differences in meaning? Should they be? Should they care?
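One way to poke at that question is to write both lookups down. The sketch below assumes a made-up data model in which each item has a single optional `category` (the facet) stored separately from its free-form `tags`. The field names, the functions, and the items are all hypothetical.

```python
# Invented items: a category is the system's facet, tags are whatever users typed.
items = [
    {"id": 1, "category": "dog", "tags": {"flatfaced", "pug"}},
    {"id": 2, "category": "dog", "tags": {"beach"}},
    {"id": 3, "category": None, "tags": {"dog", "flatfaced"}},
    {"id": 4, "category": "cat", "tags": {"flatfaced"}},
]

def by_facet_then_tag(items, category, tag):
    """Browse by facet first, then filter the result by a tag."""
    return [i["id"] for i in items if i["category"] == category and tag in i["tags"]]

def by_compound_tag(items, *tags):
    """Compound tagging: keep items that carry every requested tag (AND)."""
    wanted = set(tags)
    return [i["id"] for i in items if wanted <= i["tags"]]

facet_result = by_facet_then_tag(items, "dog", "flatfaced")
tag_result = by_compound_tag(items, "dog", "flatfaced")

print(facet_result)  # [1]
print(tag_result)    # [3]
```

Under this assumed model, the two queries return different items only because the system keeps the category apart from the tags. If “dog” were stored as just another tag, the two lookups would collapse into the same AND query, which is more or less the situation described above.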

[digression ends]

Someone needs to birth version 2.0 of the tag cloud, where the visualization is driven by the clustering of tags and not just by how frequently a tag is used. Maybe someone has. I can even imagine layering the activity of users browsing tagged items on top of the activity of users tagging. A fabulous side project!