Folksonomies in the GLAM context

Folksonomies, also known as crowd sourced vocabularies, have repeatedly proved useful for deepening user experience and accessibility, especially in cultural heritage institutions.  Often termed GLAMs (Gallery, Library, Archive and Museum), these institutions can use folksonomies to tap into the knowledge base of their users and make their collections more approachable.

For example, when one user looks at a record for Piet Mondrian’s Composition No. II, with Red and Blue, she may assign the tags colour blocking, linear and geometric.  Another user may add the tags De Stijl, Dutch and neoplasticism.  By combining both approaches to the painting, the museum ensures that those without contextual knowledge have just as much access to its collections as those with an art historical background.  Linking tags allows users to access and search collections at the comfort level or granularity they need.  It also frees up time for the employees handling those collections, since they can depend, at least in part, upon the expertise and observations of their public.
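To make the idea of linked tags concrete, here is a minimal sketch of how tags from different users might be pooled into one searchable index.  Everything in it is an assumption for illustration: the record IDs, the second record, and the function names `build_tag_index` and `search` are hypothetical and do not describe any particular museum's catalogue software.

```python
from collections import defaultdict


def build_tag_index(taggings):
    """Map each normalized tag to the set of record IDs that carry it."""
    index = defaultdict(set)
    for record_id, tags in taggings:
        for tag in tags:
            # Normalize case and whitespace so "De Stijl" and "de stijl" match.
            index[tag.strip().lower()].add(record_id)
    return index


def search(index, term):
    """Return the sorted record IDs tagged with the given term."""
    return sorted(index.get(term.strip().lower(), set()))


# Hypothetical tagging events: (record ID, tags assigned by one user).
taggings = [
    ("mondrian-composition-ii", ["colour blocking", "linear", "geometric"]),
    ("mondrian-composition-ii", ["De Stijl", "Dutch", "neoplasticism"]),
    ("hypothetical-record-2", ["geometric", "cubism"]),
]

index = build_tag_index(taggings)
print(search(index, "geometric"))
print(search(index, "de stijl"))
```

Because both users' tags land in the same index, a search on a descriptive term such as "geometric" and a search on an art historical term such as "De Stijl" can lead to the same record.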

The diversity of tags also makes online collections easier to find for educational uses.  If an elementary school teacher wants to further her students’ understanding of movements like cubism, she can just open up the catalogue of her local museum to explore the paintings tagged with ‘cubism.’  This matters more and more as field trips become less available to public schools and larger class sizes demand more class prep than ever.  With the linked folksonomic vocabulary, the teacher need not fight for the field trip nor dedicate even more time to personally assembling a sample of the online collection to display.

Crowd sourcing knowledge can also go beyond vocabularies and prove especially useful in an archival context.[1]  When limited information is known about a collection or object and those with institutional knowledge are unavailable (a persistent problem plaguing archives), another source of information is needed.  The best way to find one is to tap those who have the necessary expertise or experience.  For example, Wellesley College’s archive began including digitized copies of unidentified photographs from the archive in the monthly newsletters emailed to alumni.  With those photos, the archive sent a request for any information that alums could provide.  In this way, the archive has recovered a remarkable amount of knowledge about the historical happenings around the college.

But folksonomies and other crowd sourcing projects are only useful if institutions incorporate the generated knowledge into their records.  Some institutions gamify the crowd sourcing process in order to engage users, but then lack the impetus to follow through and incorporate the results.  Dropping the ball in this way may be due in part to the technical challenges of coordinating user input with the institution’s online platform.  It may also stem from the same fear that many educators hold for Wikipedia: What if the information provided is WRONG? Fortunately, both anecdotal and research evidence are proving those fears largely unfounded.[2]  Participating with institutions in crowd sourcing is a self-selective process that requires users to be quite motivated.  Those interested in that level of participation are going to take the process seriously, since it requires mental engagement and the sacrifice of time.  As for enthusiastic but incorrect contributions, institutions may rest assured that the communities that participate in crowd sourcing efforts are quite willing to self-police their fellows.  If something is wrong or misapplied (e.g. Stephen Colbert’s Wikipedia antics), another user with greater expertise will make the necessary alterations, or the institution can step in directly.

GLAMs are experiencing an identity shift with the increased emphasis on outreach and engagement.[3]  The traditional identity of a safeguarded knowledge repository no longer stands.  GLAMs now represent knowledge networks with which users can engage.  If obstacles exist to hinder that engagement, the institution loses its relevance and therefore its justification to both the bursar and the board.  These open sourced projects can break down the perceived barriers between an institution and its public.  Engaging with users at their varying levels and then using that input shows that the institution envisions itself as a member of that community rather than a looming, inflexible dictator of specific forms of knowledge.  Beyond that, though, institutions can use all the help they can get.  As at Wellesley, an organization may be lacking in areas of knowledge, or it may just not have the resources to deeply engage with all of its objects at a cataloguing or transcription level.  Crowd sourcing not only engages users, but it also adds to an institution’s own understanding of its collections.  When an institution mines its community’s knowledge, everyone benefits.

From a more data science-y perspective: Experimenting with folksonomies in an image based collection

For a class on the organization of information, we read an article covering an experiment analyzing the implementation of user generated metadata.[4]  For those in image based institutions looking to build on the attempts of others in the field of crowd sourcing, this experiment is a solid place to begin.  Some alterations, however, may prove helpful.  Firstly, I would recommend a larger pool of participants from a broader age range, at least 25-30 people between the ages of 13 and 75.  This way the results may be extrapolated with more certainty across a user population.  Secondly, for the tag scoring conducted in the second half of the experiment, I would use the Library of Congress Subject Headings hierarchy in tandem with the Getty’s Art & Architecture Thesaurus, so as to compare the user generated data to discipline specific controlled vocabularies.  Retain the two tasks assigned to the users, including the random assignment of controlled vs. composite indexes and the 5 categories for the images (Places, People-recognizable, People-unrecognizable, Events/Action, and Miscellaneous formats).  In terms of data analysis, employ a one-way analysis of variance, since it provides a clear look at the data, even accounting for voluntary search abandonment.  By keeping these analytical elements, namely the one-way ANOVAs and the scoring charts for the pie charts, you can readily compare your findings with those in the article to see if there’s any significant difference in representations of index efficiency (search time or tagging scores) between general cultural heritage institutions and GLAMs with more image based collections.
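For anyone who has not run a one-way ANOVA before, the sketch below shows the arithmetic behind the F statistic when comparing search times across index types.  The numbers are invented purely for illustration and are not taken from the article, and the function name `one_way_anova` is my own.  The sketch also leaves abandoned searches out entirely; how to record them is a decision you would need to make for your own study.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of group name -> values."""
    samples = list(groups.values())
    all_values = [v for sample in samples for v in sample]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(s)) ** 2 for s in samples for v in s)

    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up search times in seconds, one list per randomly assigned index type.
search_times = {
    "controlled": [42.0, 51.0, 39.0, 47.0, 55.0],
    "composite": [31.0, 36.0, 29.0, 40.0, 34.0],
}

f_stat, df_b, df_w = one_way_anova(search_times)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A larger F value means the group means differ more than the spread within each group would suggest.  To turn F into a p-value you need the F distribution, which a statistics package can supply (for instance, `scipy.stats.f_oneway` computes both in one call).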
