Folksonomies in the GLAM context

Folksonomies, also known as crowdsourced vocabularies, have proved their usefulness time and again in deepening user experience and accessibility, especially in cultural heritage institutions. Often termed GLAMs (Galleries, Libraries, Archives and Museums), these institutions can use folksonomies to tap into the knowledge base of their users and make their collections more approachable.

For example, when one user looks at a record for Piet Mondrian’s Composition No. II, with Red and Blue, she may assign the tags colour blocking, linear and geometric. Another user may tag the terms De Stijl, Dutch and neoplasticism. By combining both approaches to the painting, the museum ensures that those without contextual knowledge have just as much access to its collections as those with an art-historical background. Linking tags allows users to access and search collections at their needed comfort level or granularity. It also frees up time for employees handling those collections, since they can depend, at least in part, upon the expertise and observations of their public.
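The way both users’ vocabularies end up pointing at the same painting can be sketched as a simple inverted index from tags to records. This is only a minimal illustration; the record ID, function names, and tags below are hypothetical, not drawn from any real catalogue:

```python
from collections import defaultdict

# Inverted index: normalized tag -> set of record IDs carrying that tag.
tag_index = defaultdict(set)

def add_tags(record_id, tags):
    """Attach each user-supplied tag (lowercased, trimmed) to a record."""
    for tag in tags:
        tag_index[tag.strip().lower()].add(record_id)

# One user tags by visual observation, another by art-historical context.
add_tags("mondrian-comp-2", ["colour blocking", "linear", "geometric"])
add_tags("mondrian-comp-2", ["De Stijl", "Dutch", "neoplasticism"])

# Either vocabulary now retrieves the same painting.
assert "mondrian-comp-2" in tag_index["geometric"]
assert "mondrian-comp-2" in tag_index["de stijl"]
```

A real system would layer this folksonomy index alongside the institution’s controlled vocabulary rather than replace it, so searches can match either.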

The diversity of tags also increases findability for educational uses of online collections. If an elementary school teacher wants to deepen her students’ understanding of movements like cubism, she can simply open the catalogue of her local museum and explore the paintings tagged ‘cubism.’ This becomes ever more important as field trips grow unavailable to public schools and larger class sizes require more class prep than ever. With a linked folksonomic vocabulary, the teacher need not fight for the field trip nor dedicate even more time to personally constructing a sampling of the online collection to display.

Crowdsourcing knowledge can also go beyond vocabularies and prove especially useful in an archival context.[1] When limited information is known about a collection or object and those with institutional knowledge are unavailable (a persistent problem plaguing archives), another source of information is needed. The best approach is to tap those who have the necessary expertise or experience. For example, Wellesley College’s archive began including digitized copies of unidentified photographs from the archive in the monthly newsletters emailed to alumni. With those photos, the archive sent a request for any information that alums could provide. In this way, the archive has recovered a remarkable amount of knowledge about historical happenings around the college.

But folksonomies and other crowdsourcing projects are only useful if institutions incorporate the generated knowledge into their records. Some gamify the crowdsourcing process in order to engage users, but then fail to follow through on incorporating the results. Dropping the ball in this way may be due in part to the technical challenges of coordinating user input with the institution’s online platform. It may also stem from the same fear that many educators hold toward Wikipedia: what if the information provided is WRONG? Fortunately, both anecdotal and research evidence has proven those fears largely unfounded.[2] Participating with institutions in crowdsourcing is a self-selective process, requiring users to be quite motivated. Those interested in that level of participation are going to take the process seriously, since it requires mental engagement and the sacrifice of time. As for enthusiastic but incorrect contributions, institutions may rest assured that the communities that participate in crowdsourcing efforts are quite willing to self-police their fellows. If something is wrong or misapplied (e.g. Stephen Colbert’s Wikipedia antics), another user with greater expertise will make the necessary alterations, or the institution can step in directly.

GLAMs are experiencing an identity shift with the increased emphasis on outreach and engagement.[3] The traditional identity of a safeguarded knowledge repository no longer stands; GLAMs now represent knowledge networks with which users can engage. If obstacles hinder that engagement, the institution loses its relevance and therefore its justification to both the bursar and the board. Crowdsourced projects can break down those perceived barriers between an institution and its public. Engaging with users at their varying levels and then actually using their input shows that the institution envisions itself as a member of the community rather than a looming, inflexible dictator of specific forms of knowledge. Beyond that, though, institutions can use all the help they can get. Like Wellesley’s, an organization may be lacking in areas of knowledge, or it may simply not have the resources to engage deeply with all of its objects at a cataloguing or transcription level. Crowdsourcing not only engages users but also adds to an institution’s own understanding of its collections. By mining a community’s knowledge, everyone benefits.


From a more data science-y perspective: Experimenting with folksonomies in an image-based collection

For a class on the organization of information, we read an article covering an experiment analyzing the implementation of user-generated metadata.[4] For those in image-based institutions looking to build on others’ crowdsourcing attempts, this experiment is a solid place to begin. Some alterations, however, may prove helpful. First, I would recommend a larger pool of participants from a broader age range, at least 25-30 people between the ages of 13 and 75, so that the results may be extrapolated with more certainty across a user population. Second, for the tag scoring conducted in the second half of the experiment, I would use the Library of Congress Subject Headings hierarchy in tandem with the Getty’s Art & Architecture Thesaurus, so as to compare the user-generated data to discipline-specific controlled vocabularies. Retain the two tasks assigned to the users, including the random assignment of controlled vs. composite indexes and the five categories for the images (Places, People-recognizable, People-unrecognizable, Events/Action, and Miscellaneous formats). For data analysis, employ a one-way analysis of variance, since it provides a clear look at the data, even accounting for voluntary search abandonment. By maintaining these analytical elements (the one-way ANOVAs and the scoring behind the pie charts), it would be easy enough to compare your findings with those in the article and see whether there is any significant difference in index efficiency (search time or tagging scores) between general cultural heritage institutions and GLAMs with more image-based collections.
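As a minimal sketch of the recommended analysis, the one-way ANOVA’s F statistic can be computed directly from the per-condition measurements. The search times below are invented placeholders, not data from the article; a real replication would substitute the participants’ recorded times under the controlled vs. composite index conditions:

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return the F statistic for a one-way analysis of variance."""
    k = len(groups)                       # number of groups (index conditions)
    n = sum(len(g) for g in groups)       # total observations
    grand = mean(x for g in groups for x in g)
    # Between-group sum of squares: variation explained by index type.
    ssb = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: residual variation among participants.
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical search times in seconds per participant, per index condition.
controlled = [41.2, 38.5, 52.1, 47.8, 39.9, 44.3]
composite  = [33.7, 29.4, 36.2, 31.8, 40.1, 28.9]
print(f"F = {one_way_anova_f(controlled, composite):.2f}")
```

The resulting F would then be compared against the F distribution with (k−1, n−k) degrees of freedom (or handed to a statistics package) to obtain the p-value, mirroring the article’s significance tests.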


[1]  L. Carletti, G. Giannachi, D. Price, D. McAuley, “Digital Humanities and Crowdsourcing: An Exploration,” in MW2013: Museums and the Web 2013, April 17-20, 2013.

[2]  Brabham, Daren C. “Managing Unexpected Publics Online: The Challenge of Targeting Specific Groups with the Wide-Reaching Tool of the Internet.” International Journal of Communication, 2012.

Brabham, Daren C. “Moving the Crowd at iStockphoto: The Composition of the Crowd and Motivations for Participation in a Crowdsourcing Application”. First Monday, 2008.

Brabham, Daren C. “Moving the Crowd at Threadless: Motivations for Participation in a Crowdsourcing Application”. Information, Communication & Society 13 (2010): 1122–1145. doi:10.1080/13691181003624090.

Brabham, Daren C. “The Myth of Amateur Crowds: A Critical Discourse Analysis of Crowdsourcing Coverage”. Information, Communication & Society 15 (2012): 394–410. doi:10.1080/1369118X.2011.641991.

Lakhani et al. “The Value of Openness in Scientific Problem Solving.” 2007. PDF.

Saxton, Oh, and Kishore. “Rules of Crowdsourcing: Models, Issues, and Systems of Control.” Information Systems Management 30 (2013): 2–20. doi:10.1080/10580530.2013.739883.

[3] “Throwing Open the Doors” in Bill Adair, Benjamin Filene, and Laura Koloski, eds. Letting Go?: Sharing Historical Authority in a User-Generated World, 2011, 68-123.

[4] Manzo, Kaufman, Punjasthitkul, and Flanagan. “By the People, For the People: Assessing the Value of Crowdsourced, User-Generated Metadata.” Digital Humanities Quarterly 9, no. 1 (2015).


Further Reading on Crowdsourcing, Social Media & Open Access (drawn from JJ’s class syllabus for 11 April)

4 thoughts on “Folksonomies in the GLAM context”

  1. Hey Elizabeth,

    While I think I have heard the term “folksonomy” buzzing around a little bit before, your use of it here might be my official introduction, so thanks! I’m surprised it has not come up in any of my classes so far, especially Intro to Archives or Organization of Information. I haven’t really spent much time tagging work for fun, but I can see how just a few minutes here and there by a number of museum nerds can really free up some employee time. Your example from Wellesley College’s collection of alumni brings up the point JJ made in class about the rareness of someone serendipitously spotting their uncle in a photo in the collection. With a more focused project, though, the alum herself or her family would know what they were looking for and perhaps be motivated to help in a way that benefits both alumni and the archive, so that sounds like a perfect use of this type of crowdsourcing.

    1. Hey Lauren—
      Glad to introduce you to folksonomies officially! They were what got me through Organization of Information with Losee. I think why they’re so delightful is that (if planned properly) they ground the gamification movement in tangible, usable products. We’ve spoken a good deal in the past few class sessions on relinquishing control. Most GLAMs already have controlled vocabularies in place within their catalogue, but the more free-form folksonomy vocabularies can behave as support to the more controlled model. It doesn’t have to be all control or all chaos. The real struggle, though, is that most places don’t account for the amount of planning on the back end between the user interface and the IT department’s coordination, which is why so many of the gamified tagging programs that intended to incorporate results are defunct. This is why we need more digitally literate librarian-types spearheading these projects!

  2. Elizabeth,

    This is a great overview of the current state of the field for folksonomies. You do a great job of covering the many benefits for participants, institutions, and end-users of content that this crowdsourcing technique promises. You are right to identify one major obstacle: the great deal of work that folksonomy games and other interfaces require to remain sustainable. Even though these crowdsourcing efforts do bring in a lot of volunteer labor for institutions, they also require a lot of work in house so that they don’t become defunct. However, I would argue that labor and cost saving should NOT be one of the main reasons that institutions pursue these kinds of efforts, so additional upkeep in house should not be a huge deterrent. Rather, institutions need to pursue these kinds of crowdsourcing efforts for the many other reasons you mention, such as increased engagement by a community of users and wider access to resources for users with varied backgrounds.

    There is one other potential obstacle to these projects, though, that I think is important to consider: actually gaining a critical mass of participants so that the project is able to get off the ground. Wikipedia, as you discuss, is a great example of such a critical mass that allows for information to be fact checked and kept up to date. Although there are gender, race, and class biases within this community of users that also need to be addressed, there are a sufficient number of contributors to keep the project viable. This is often not the case with smaller scale crowdsourcing efforts at lower profile institutions. There are certainly still benefits for smaller institutions launching these kinds of projects, but the staff will need to do more legwork to get them off the ground, including advertising their projects over social media platforms or targeting a very specific community (as with the Wellesley project that you mention).

  3. Hi Elizabeth,

    Like Colin and Lauren, I agree that this is a useful and well-balanced overview of folksonomies in crowdsourcing. I especially like your example of Wellesley’s use of the alumni newsletters as a way for the archive to collect information about specific photographs–not only because alumni connected to the photos can add needed metadata for the collection descriptions, but also because it’s a way to promote the archives’ holdings and develop user interest among alumni who are not necessarily connected to the chosen photographs.

    I also appreciate your bringing in the article on analyzing user-generated metadata, with suggestions for refinement of the experiment. As you suggest, it would be an interesting research project to compare the results with other experiments of a similar nature–not just in image-based institutions, but across the GLAM world.
