• Skip to main content
  • Skip to secondary menu
  • Skip to primary sidebar

FromThePage Blog

Crowdsourcing, transcription and indexing for libraries and archives

  • Home
  • Interviews
  • crowdsourcing
  • how-to
  • Back to FromThePage
  • Collections

Feature: Subect Graphs

July 13, 2007 by Ben Brumfield

Inspired by Bill Turkel's bibliography clusters, I spent the last week playing with GraphViz. Here is a relatedness graph generated by the subject page for my great grandfather:

One big problem with this graph is that it shows too many subjects. Once I add subject categorization, a filtered graph becomes a way of answering interesting questions. Viewing the graph for "cold" and only displaying subjects categorized as "activities" could tell you what sort of things took place on the farm in winter weather, for example.

Another issue is that the graph doesn't effectively display information. Nodes vary based on size and distance. Size is merely a function of the label length, so is meaningless. Distance from the subject is the result of the "relatedness" between the node and the subject, measured by page occurrences. This latter metric is what I'm trying to calculate, but it's not presented very well. I'll probably try to reinforce the relatedness by varying color intensity using the Graphviz color schemes, or suppress the label length by forcing nodes into a fixed size or shape.

Of course, common page links are only one way to relate subjects. It's easier, if less interesting, to graph the direct link between one subject article and another. Given more data, I could also show relationships between users based on the pages/articles/works they'd edited in common, or the relationships between works based on their shared editors. A final thing to explore is the Graphviz ImageMap output format, which apparently results in clickable nodes.

I'll put up a technical post on the subject once I split the dot-generation out into Rails' Builder — currently it's just a mess of string concatenation.

Filed Under: subject links Tagged With: features

Reader Interactions

Comments

  1. Iestyn Lewis says

    July 19, 2007 at 2:34 am

    Ah, graphviz! In ’99 I used graphviz to display biological pathways for a contract job. I spent at least one sleepless night trying to figure out a way to get an arbitrary graph with no intersecting lines, before reading somewhere that that was actually impossible.

Primary Sidebar

What’s Trending on The FromThePage Blog

  • How Do You Know Whether AI Is "Good Enough"?
  • Introducing Gemini 3.0 Support in FromThePage
  • Start Reading Old Handwriting: Some Recommended Books
  • How to Handle Racial or Ethnic Slurs &…
  • Guide to Digitizing Your Archives
  • How Do I Read Old Handwriting?

Recent Client Interviews

An Interview with Candice Cloud of Stephen F. Austin State University

An Interview with Shanna Raines of the Greenville County Library System

An Interview with Jodi Hoover of Digital Maryland

An Interview with Michael Lapides of the New Bedford Whaling Museum

An Interview with NC State University Libraries

Read More

ai artificial intelligence crowdsourcing features fromthepage projects handwriting history iiif indexing Indianapolis Indianapolis Children's Museum interview Jennifer Noffze machine learning metadata newsletter ocr paleography podcast racism Ryan White spreadsheet transcription transcription transcription software

Copyright © 2025 · Magazine Pro on Genesis Framework · WordPress · Log in

Want more content like this?  We publish a newsletter with interesting thought pieces on transcripion and AI for archives once a month.


By signing up, you agree to our Privacy Policy and Terms of Service. We may send you occasional newsletters and promotional emails about our products and services. You can opt-out at any time.  We never sell your information.