Back in 2019, before the COVID-19 pandemic, Ben and Sara Brumfield sat with Karen Trivette and Geof Huth, husband and wife, and partners who run the podcast An Archivist's Tale. In this podcast, Ben and Sara discuss the origin story of FromThePage, their own backgrounds leading up to working in transcription software, motivations for the work they do, notable FromThePage projects, and more. You can listen to the podcast below, on Spreaker. It's also available on Apple Podcasts and on Spotify. The podcast transcript is provided below.
Karen Trivette: Good morning, everyone. I'm Karen Trivette and I'm an archivist, and this is another episode of An Archivist's Tale, a podcast created by my husband, Geof Huth, and me. Good morning, Geof.
Geof Huth: Good morning, Karen. We are still in Austin.
Karen: Yes, we are.
Geof: It's a little warm, but it's not too bad.
Karen: Not too bad. It's early, though.
Geof: It may get warmer.
Karen: We are in town for the Society of American Archivists' annual meeting.
Geof: Right. Which for us pretty much starts today, because we have meetings today.
Karen: Yes, we do. And speaking of today, it is Friday, August 2, 2019 and the time is 9:24 Central Time.
Geof: Oh, that's true. Welcome to Austin, ladies and gentlemen.
Karen: Well, we are excited today, as we always are, to greet our guests. And it's a special occasion because it's another part of our 2X2 series, where we interview a couple, be they business partners or life partners.
Geof: Or both.
Karen: Or both, as is the case today, and we are delighted to have you with us. If you'll introduce yourselves, please.
Sara Brumfield: My name is Sara Brumfield. I'm a software engineer and a small business owner.
Ben Brumfield: My name is Ben Brumfield. I'm a software engineer and a part of a small business owner.
Karen: Welcome to the broadcast.
Karen: Well, podcast.
Karen: Okay, very good, very good. So I'm just going to dive right in with our first question, and that is, tell us your archivist support origin story.
Ben: Sure. So my archivist support origin story really starts off as a family papers story. Really my introduction to this world was through a set of historic documents that were a set of diaries kept by my great-great-grandmother. On her death, the diaries were distributed to all of her grandchildren and so--and I see Karen wincing here. Exactly, right?
Ben: So the collection was completely dispersed, and my family inherited two diaries. And in 1992, my father transcribed one of those diaries on WordPerfect 5.1, or something like that. And printed them out and had them copied at Kinko's. And for years after that, he sent them off to relatives in that small part of rural Virginia that he was from. They became a kind of samizdat hit.
We would go to that area, and people we didn't know would come up and talk about how their nephew had made a photocopy of that diary. And you'd have 80-year-old women coming up and saying, "Whenever I get up and I have a bad day, I go and I read one of the diary entries, in which here's this 80-year-old woman who's getting up and she's feeding the chickens, and she's cooking dinner, and she's making a quilt, and all those kinds of things. And that just kind of keeps me going." And the power of these primary sources, to kind of bring meaning to the public, right? These particularly elderly rural public members, was really inspiring. So, wow, this is fantastic. I need to do the same thing.
So I tried to transcribe the other diary...
Karen: Oh, yeah.
Ben: Several years later. And I discovered that I couldn't do it alone. I don't know that much about tobacco agriculture. I don't know the people and the places who were involved in the project. Years had passed, and I'm a different generation. So a lot of the terminology I didn't really get. Around this time, Wikipedia was getting started. And I really loved the Wiki model of collaboration. When you have volunteers who can work on the same kinds of documents, at the same time, but on their own time, without somebody saying, "Well, you're doing this piece by then. Here you go, I'm assigning this chapter to you or you."
I decided that that was what I needed for my own family papers project. So I built the tool that we eventually...
Sara: Since we're software engineers, right?
Ben: Ah-ha! And frankly it was a lot easier to write software than it was to transcribe documents, right?
Ben: These are hard things, right?
Ben: So I released that as open source. Other people started using it. But other people in my family started using that as well. We started seeing people who I didn't know work on transcribing the diaries, and they would work on an entry, a few entries a day. And they would call up their elderly relatives that were involved. And the word got out, and the collection started coming together.
Ben: I would get packets in the mail that would have a diary in them, with just a note saying, "It looks like you'll do more with this than I will."
Karen: Oh, wow.
Ben: So yeah, so that was really how we got started. And once we released the tool open source, it was used by libraries, archives, and museums, and historians and documentary editors, to work on their own collections.
Karen: Very cool.
Sara: Ben went to the very first THATCamp: The Humanities and Technology unconference. And I think that was a big piece of like starting to transition from this personal project to...
Ben: To something that could be more generally useful by people. And in fact, we hosted the second... The very first regional THATCamp was associated with SAA, in 2009, here in Austin.
Sara: Ten years ago, yeah.
Ben: We and one instructional technologist at the University of Texas, and two archivists who were going to be in town, put the first regional THATCamp unconference together. And of course that was primarily for archivists. So that's our introduction to the archives world.
Karen: That's just incredible. I'm curious to make sure that listeners know exactly what your enterprise is.
Sara: Ah, right.
Ben: What is the software we're talking about here?
Sara: Maybe who are we, we should have started with that. Our software is called FromThePage. It's a collaborative or a crowdsourcing transcription platform. It's released open source, so it's installed generally at universities that have larger IT staffs, [who] install it for pedagogy and research purposes. And then we also run it as a software as a service, so an online service that archives, libraries and museums pay us a fee to run their projects on.
Ben: And to be clear, it's not a digitization outsourcing service. There are many of those in which you have OCR, you have documents to be scanned. Our tool is a platform for people who already have their material digitized. Often already online somewhere. And they have, in many cases, volunteers who are coming in and doing work with the institution. But, you know, they're going on vacation or, particularly for state institutions, there are a lot of people who are interested in working with the state archives, who don't live in the capital. This is a way for them, for the archivists to reach out to those volunteers, and those interested members of the public, and engage them working on transcribing documents that have already been digitized.
Karen: Beautiful, beautiful. And it's no secret to our listeners that you are a very generous sponsor of--
Karen: ...An Archivist's Tale.
Karen: So thank you for that support.
Sara: We like the idea of being in people's ears, right?
Karen: Oh, yes.
Sara: It's a neat model.
Karen: Well, Sara, tell us a little about your background, how you migrated to this place.
Sara: Right. So I'm a software engineer, I worked for IBM and a number of startups for over 20 years. And I was always around when we were working on FromThePage, and doing some things with it, but not really doing most of the work. And then we got to a point where we decided to really invest. We actually hired another developer to help us make it look great, because we're not very good at the looking great side of software development. We're good at the writing good code, but not the making it look pretty.
Sara: And we started thinking about how we could move this to the center of our lives, because it's a lot more fun than working for a corporation. Which we were both doing, right?
And so, I'm trying to think what... I love business, small business. I grew up in a family business, my parents bought and sold heavy equipment, construction equipment. And that's a model that really appeals to me. We have two daughters, I like having business discussions with them and making-- The act of earning a living is kind of real when you're running your own business, right?
Sara: That speaks to me, in a real values-based way. So let's see, Ben quit his day job?
Sara: 2012. And I quit mine about four years ago, so 2015, to do FromThePage, and then kind of ancillary work. We do a lot of consulting for digital humanities work as well. So, is that enough of an origin story? It's not really a good origin story, but it's a little--
Karen: I think it's fine.
Geof: We don't have any specific requirements.
Karen: No, exactly, exactly.
Ben: No radioactive spiders, or something.
Geof: Nothing's going to go wrong.
Geof: It's going to be great. So one of the things--and we talked about this a little yesterday in our last podcast--is that we might have records, but people want a kind of access that was not before possible, and they want it all the time. And so we used to think, "Well, it's not reasonable for you to expect us to make all of this so available." But now, it's like, well, if you want to get to your customers, you have to have that digital access. And even if they just put up digital images, well you can see them, but you can't search them. And if you cannot read older handwriting, as more and more people cannot, you actually can't access them in any useful way.
Geof: And so this kind of way of dealing with archives, which was totally antithetical to us say 20 years ago, is now really central. Not everybody has agreed that it is, but it really became [so].
Geof: Just to tell you one little story about my previous employer. I worked for the New York State Archives, and New York State Archives has large collections of Dutch colonial records. The one problem is, they were mostly burned in this giant fire in 1911. But they're still around, they were transcribed to some degree ahead of time, and they've been translated.
Geof: They had a project which put up an image of every side of every sheet of paper. And then gave you a transcription of it into Middle Dutch. And then gave you a translation into modern English.
Sara: How much did that cost them to put all that together?
Geof: It's hugely costly, exactly. The one thing is, part of it was funded by the government of the Netherlands, because the Netherlands was trying to essentially stitch together their heritage of, let's say, conquests, because that's essentially what it was. Which spans the Caribbean, Indonesia, New York, and other places around New York.
Geof: Some place in Africa?
Karen: Yeah, yeah.
Geof: I couldn't remember. And of course they included themselves too, in the Netherlands. And so there was this source to it, for this country that's a fairly small country, but did really have a very wide reach in the world at one time.
Geof: And so it sometimes matters what you want to say about yourself, when you do these things. So what do your, to the people that you work with... What are the things they're trying to do with this? How is this allowing them to fulfill their archival mission?
Ben: It varies from project to project, really.
Sara: Very much, yeah. So we have seven of the state archives, running projects on FromThePage. And I think they're very public oriented, because they're part of state governments, and because they know people are looking for their records. They're looking for family history, they're even just looking up like Maryland has a project to do marriage records. Because they're like, "This is the thing that people come and look up the most, right? If we can get this online, it's a big public service."
Sara: But we have a lot of like ... You want to tell the Meredith story?
Sara: I think that's a fun public one.
Ben: Meredith is a digital archivist at the Alabama Department of-
Sara: This is Meredith Mc-
Ben: McDonough, right.
Sara: McDonough, yeah.
Ben: The Alabama Department of Archives and History. They have this combination of a very frequently used collection, which was a set of World War I service cards. So there's one index card for each soldier, sailor or marine, who served in World War I, from the state of Alabama. And it mentions where they enlisted, race, like where they were born. Things that are of interest to genealogists, but you also have unit and things like that.
Ben: They only had them cataloged essentially by county and by first letter of surname.
Sara: Which is better than what--
Ben: Right, I mean you can get somewhere. But you can get somewhere, presuming you know what county the person you're researching was from, right? Otherwise, you kind of have to look for all of these. And so they really wanted item level, card level transcription and indexing, so that they could index them in a more complete way.
Ben: She and Steve Murray approached us, and the Council of State Archivists funded some additional features in our tool, to allow field based transcription for these kinds of form records.
Karen: Forms, right, instead of diaries.
Ben: Right, instead of diaries and letters and that.
Ben: They rolled out this project, to transcribe 111,000 index cards. One of the interesting things about the way that they did this project though, is that Meredith in particular was very concerned that the people who are mentioned, the counties that are mentioned, have the first crack at working on this, right? This isn't about getting free labor as cheaply as possible. This is also about connecting people with their heritage, through the state archives and those documents.
Ben: So she had people contact her, and she would assign them material based on where they were from.
Sara: It's a private transcription project. We're like, "You're crazy, you want to do this private?"
Ben: You're going to limit your volunteer pool? That's ridiculous.
Sara: It is.
Ben: That's a lot of work. And no, actually, archivists know more than we do.
Yeah, so it was really very successful. You had people who were, in some cases, would pair with each other, and they'd work on a set of counties that they were interested in. There were a handful of people from out of state who contacted her who said, "Well I don't live there anymore, but I grew up here, and my ancestors are from here. Please can I work on Coffee County," or things like that. I think it was very successful, and they ended up-
Sara: It was our most successful project ever.
Ben: Right. They ended up going through the 111,000 index cards in about two-and-a-half months.
Karen: Oh wow.
Ben: It actually threw kind of a kink in the works, because they had all these plans to publicize the project and make it open, including print publications and quarterly magazines. And when the announcement came up in the quarterly magazine saying, "Hey, members of the public, come work on this," and members of the public started contacting her, it was over.
Sara: But that is a testament to Meredith, and the fact that she thinks of herself as a public archivist, right?
Sara: And she's very good. She is very good at outreach, she knows all the local county genealogy societies and historical societies. She's done other projects where she's taken photography collections out to the neighborhoods, where the photographs were taken, and gotten identification by talking to people. By doing community events and things.
Sara: I think that model of public archivist is one that is very powerful. We work with public historians, like the Kentucky Historical Society, this Patrick Lewis, he's moved on now. But he has this very well thought out sense of himself as a public historian. And that is, I think, a very compelling model for historians and archivists both, when you start talking about interacting a lot more with the public.
Sara: And how do we have a respectful and useful and-
Ben: It's not really the thing ... You know, it's not a one-size-fits-all kind of perspective, right? We have projects in which this is absolutely the wrong thing to do. And a lot of them nowadays, I think that there's four or five projects using our software that are working with indigenous documents, and indigenous communities. And access there, and limiting access, and trying to assert authority over the materials, is really, really important to these communities. You have situations, and there's-
Sara: So, example.
Ben: Right, I mean a good situation is the Howitt and Fison papers. Howitt and Fison were a couple of Australian anthropologists during the Victorian period, who went out among the Aboriginal communities in Australia, and collected stories and rituals and all this. Folklore, a lot of linguistic records, all kinds of useful and interesting material. These communities have been shattered during that process, and since then, and you know there was some demographic collapse.
Ben: The Dieri communities are very interested in reclaiming this heritage. And they're looking at this material, and in many cases, the material that they're working with from Howitt and Fison, may represent stories that have been lost among the Dieri communities. But they're also finding things that perhaps should not be public. There are specific rituals that are insider only. There's a lot of gendered kinds of stories or rituals that are really only supposed to be done by this group of people. Whether that's men, or members of some particular group.
Ben: And so limiting access to those is important, and in many cases, they're only finding out that the access needs to be limited as they transcribe. So they're transcribing some of these materials and saying, "Okay, let's take this one down."
Sara: Private, yeah.
Ben: "We didn't know what was in here when it was scanned, and now that we know, we don't want it showing up in Google results."
Ben: One of the interesting things about that though, is that you have these colonial records that indigenous communities are working with. And a lot of them are taking the opportunity to talk back to the people-
Sara: The archives.
Ben: ... making the colonial records. So you'll have pages that are transcribed, recounting some story, and then the notes underneath it, by the Dieri people who are working on this. They will say, "Well, you know, my great-uncle also told me this story, and this is wrong. This is what he actually said."
Ben: Or they'll say, "Well this guy got this completely wrong, here's what's going on." So they are recontextualizing those. And that's really just the ones that we have access to, that we know about. There's a handful of projects that have installed the tool privately. And they are working on taking colonial records and creating new metadata that represents their own values and their own perspective. Transcribing it, using it within their communities and classrooms. And I will never see that. And that's fine, right?
Ben: That's not the goal.
Sara: Although that's what Ben's goal actually is. Not to see the restricted material, but he really loves that we get to see this broad swath of history. And a new project will come on that is working in Mixtec, or Nahuatl, or something. He'll be like, "Ooh, I'm going to go buy a book on."
Geof: And this is an interesting example of what we usually call decolonial, colonial, somebody else say it.
Geof: Thank you.
Decolonizing I like better, because I can say it. Because if they're not indigenous created records, but they're indigenous created rituals that are recorded in there. And so it is interesting, especially in Australia, which did have some particularly bad situations. I mean I would say the United States was particularly bad too, but Australia still seems more painful to me when I look at what happened there. That they're saying, "Okay, we're going to essentially give up this stuff that the white people have written down, and give it back to the control of the people who actually made the original ideas that were in there."
Especially since these are rituals. And if you have rituals that are supposed to be in an inner sanctum that nobody from the outside can get to, it makes a lot of sense. But it also requires us to think differently about access, and to do things that we don't like to do. As always having been an archivist in public institutions, my default is always you should have access.
Karen: Yes, same here.
Geof: You know, there's almost no time that I stop-
Sara: Yeah, and we're open source and open access software people, right?
Sara: We think this is important too, but yeah.
Ben: We were starting our careers in the 90s, and so you know, this utopian, everything should be free thing is, it's really hard to shake.
Sara: And it's okay to not shake it, that is a noble model as well. But, yes.
Ben: But you know, you have to realize that it's not the only answer.
Sara: Right, it is not the only answer.
Ben: Right. And that's something that ... And there may be derivatives of some of these that we will have access to. We're just starting a project with the Standing Rock Sioux Tribe, that is working on transcribing documents that are written in Dakota and Lakota, as, privately, right? As part of a primary source-based approach to language revitalization. So they're going to incorporate this and these documents as part of their language instruction, language revitalization tools. And trying to bring those in, well they have to be transcribed first, so they come to us.
Ben: There's a decent chance that five years from now, I might be able to sit down and learn Lakota, using this. But that's only after it's been vetted and controlled by the community.
Geof: Right, and it's an important issue, because people don't worry about this as much as they worry about things like let's say climate extinction, to use a dramatic term. But language extinction is real, and it's actually speeding up. I mean in China, it is national policy to kill off everything except for Mandarin. So this is problematic, because when you lose a language, you lose part of the cultural richness of the world. And it's hard to recreate a language.
Geof: If you say, "Where did any people ever really truly regain their language?" The Jews of Israel is probably the only answer. I've never come across anything else. I've seen many attempts, but it's really problematic if you cannot keep it alive and move it by voice.
Geof: From generation to generation.
Ben: Right. It's interesting that you mention climate extinction and those kinds of events, because there's another interesting example of that that I'm not sure it relates to archives, but it is an example of a case in which first-world assumptions about the way that information flows and the way information is used can bite us. So some of the material that we work with is scientific material. Field books, field notes. That is really important nowadays to get transcribed, because you can't point a weather satellite at a given area a hundred years ago, right? There are no time machines, [crosstalk 00:26:03].
Geof: Not yet.
Ben: ... as we study those. And habitat loss and things like that. So the original journals kept by early naturalists are pretty important. So we do some work with people doing citizen science efforts.
Ben: And there was a big issue in the citizen science world about a year ago. There's a really popular program for bird watchers out of Cornell, called eBird. It's a great program. In which modern birdwatchers record whatever birds they see, whatever species they see, and it's all uploaded to a database, and the database is all free to use. Which really is in keeping with open science and open access.
Sara: And if you're going to do citizen science-
Ben: Crowdsourcing. Yes.
Sara: ... and crowdsourcing, then you need to make the results open.
Ben: I've given a speech about making sure your volunteers can actually use the data that they contribute, right?
Karen: Of course, yes.
Ben: And that's great. In North America.
Ben: Well so in the Indian state of Kerala, there is a really bad poaching problem with endangered species. And the idea of there being an open database of bird sightings that anyone can download, and that there are people in Kerala who are well-intentioned bird watchers who go out and they snap a picture of something fantastic, and they enter it into the database. And the next day, it's gone, is a big challenge. I mean I feel like it's not just a matter of language extinction and culture, but I mean even with species extinction, we have to be careful about these assumptions.
Geof: Right. And we need to make changes based on such facts. And that's good. That's good. I wonder if what we call the Lab of O, the Lab of Ornithology at Cornell, I wonder if they're working internationally to try to help with this. Because it's one of the big ornithology places in the world that had what was supposed to be the last recording of ... What's that woodpecker? Ah, that woodpecker that was extinct? Very famous woodpecker.
And so I held that in my hands. All that is, is an acetate, you know, tape. But still, it's like, "Wow," to think about that. So it's a great institution. But we have to change the way we do things, because everything is now dangerous. And we have to figure it out.
What we figured out from the internet is we were able to make the world more dangerous, because ... This is a [unclear 00:28:39].
Ben: [inaudible 00:28:39] think about it.
Sara: Yes, yes.
Geof: If you can pull together so much information, sometimes you can do things worse than were possible before. And it was problematic because if you grow up in this western tradition, which then grew to access to everything that you could get. Information should be free, which was the model of the internet.
Karen: And eventually, it's like, "Well, maybe we've got to shut something down." It protects us also financially. It's so many ways in which things can go wrong. So I hope you guys aren't adding to this.
Ben: Well I mean the problem is, it's entirely possible that we are. And all we can do is listen and react.
Sara: Because we also believe that what we're adding to the world is valuable, right?
Sara: It's not ... You know, you don't throw the baby out with the bath water.
Ben: But there are cases in which we are careful to see problems with things that maybe shouldn't be public. And we are pretty careful. There are a few projects that we have, two in particular, that start with redacted documents. Now we make sure that those redactions are on-image redactions, because you don't ... You know, you don't want somebody to download them and say, "Oh, well let's turn off this filter." And then suddenly you have names of migrants and detention facilities, who are appearing online.
Ben: That's not what we want.
Geof: Yeah, the world is more complicated now, but it's more interesting.
Geof: So here's an interesting question. You guys are, as you call yourselves, software engineers.
Geof: But you're surrounded by, I mean physically surrounded by, in this room, in your restroom. I had to go to the restroom, ladies and gentlemen. You know, in this house, with books that show a real connection to the humanities.
Sara: Oh yes.
Geof: More than anything else.
Geof: As a matter of fact, except for the genome book, I'm having a hard time finding a book that isn't [inaudible 00:30:53].
Sara: You know, business and science is about half of one of the bookshelves. But most of the bookshelves are humanities things. So we both have degrees ... So we have degrees in computer science, undergraduate degrees in computer science. We both did second majors. So I have a degree in the study of women and gender, which I loved. Because it was navel-gazing 101, when you're 18 years old. Especially when you are one of the only women in your computer science classes.
Sara: I wrote a paper on this, 10% of my graduating class were women.
Karen: Oh my.
Geof: Seems high.
Sara: No, no, no.
Geof: At the time.
Sara: That was about the bottom. I mean it never got that much lower than about 10%. But 10% is not very high.
Geof: That's what I mean.
Sara: But it was also great, because you got to sample. I mean I took classes in religious studies, I took classes in history, I took classes in sociology. You know, I got to take classes all over the humanities and social sciences both, and they were great. Because I didn't have to specialize within this one lens of thinking. Which to me was always a lens of thinking outside the box. Because people who have looked at things, especially so I was in college in the 90s. At that time, we're trying to look at kind of traditional knowledge from a different perspective. And I think that, as a training ground, has really helped me think about things like business, or technology, or living life, from a different perspective. And being more thoughtful about that.
Ben: And then my background, my other major was in linguistics, and specifically historic linguistics. So I spent a lot of time taking one or two semesters of different languages, just kind of to see how they work. So that's been very handy, in this work, dealing with texts and textual scholarship. But issues of transcription and translation and transliteration, these are all things that we had to grapple with, when you're dealing with cuneiform documents, for example. And having that background has even been helpful when we're dealing with twentieth century American archival material.
Geof: Wow, that's kind of interesting. So you got into this because Ben had a great-great-grandmother, or great-grandmother, who had these diaries. And that is a cultural record. It might be a [crosstalk 00:33:19] culture, if you want to think it that way. But it represents the broader culture of that area.
Geof: Is your sort of bifurcated, or let's say conjoined interest in computing and the humanities, is this what sort of makes your business work? Is this why you think you are actually trapped in this world?
Sara: Well I mean, lucky to be in.
Ben: Yes, this is why we do it. I mean nobody goes from industry into archives and history for the money. Right? And the idea that we're able to do this, combined with the autonomy of having our own business, of being able to work from anywhere, that's the draw. And being able to come into a community that is frankly underserved by technologists. There's software, it does its job. But there's not ... We don't have a lot of peers who are having conversations about what people working with historic documents need, who have already developed all these tools and ...
Sara: The feedback we get on our software is, "It's so beautiful. It works so well." I'm like, really? I mean, it's okay. By our standards, it's like, "You kidding me, that's five years behind the times." But it's not 10 or 15.
Ben: It's not like an app on your phone or something like that, right?
Ben: But being able to engage with different pieces of history, with different kinds of documents. We've done a lot of work recently with financial records relating to slavery. Before that we were doing work with a lot of colonial records in Nahuatl and geographic records of early Latin America. And being able to go from all those things. That's amazing. That's so much fun.
Ben: And not just the documents, but the people. Working with the people.
Sara: We, I mean, we have the best customers and collaborators. Archivists and historians are nice, they have interesting material. They are enthusiastic about the projects they want to do.
Ben: And we learn so much from them.
Ben: It's just, it's incredible. It's lots of fun.
Sara: We're so lucky.
Sara: Because it gives us an excuse to go learn all these other things.
Sara: Which we talk to people who like to go learn all these other things for fun.
Geof: It's good to have that. You seem like you're happy.
Sara: I think we are.
Geof: But just a small little dip into the world that you showed us, and I'm here like, "Wow, they get to do so many interesting things." Because you can touch the entire world, and not have to leave Austin. And still have a big effect on things. So I mean it's not just making money, which you need to because as I keep telling people, "People have to eat."
Sara: Yes. We have two daughters. I'd like them to go to college someday.
Geof: So I'll tell you my story about vendors. I used to have my staff, they'd say, "Oh those vendors, you can't trust vendors."
Geof: And I said, "Okay, ladies and gentlemen, let me just say one thing, okay? How do you think we're going to do this work if nobody helps us? And who's going to do this for us for free? They have to eat." There's nothing bad about this, you just have to accept this. This is how we get things done, and this is how we achieve our goals, that otherwise we can't.
Geof: If you have a bad vendor, which has occurred occasionally, I have run into them. I just cut them off, that's fine.
Geof: So, it's just like real life, they're actual real people. Sometimes I go to conferences, and nobody will talk to the vendors afterwards, because, you know, they're just a little worried. We have one friend, she said, "Oh, you want to go to dinner with us? Oh, nobody wants to do that."
Geof: I'm here like, "I'm not going to buy anything from you, but you're good to talk to."
Geof: It's like, I like thinking a little humanistically sometimes.
Geof: But just amazing, the things you guys have done. So how many continents are you working on?
Geof: Can't be more than seven.
Sara: No, can't be more than seven. So definitely lots in North America.
Sara: We've done a couple projects in Australia.
Ben: Right. The National Archives of the UK is starting up [crosstalk 00:37:50].
Sara: Plus, yeah, the British Library did a project.
Ben: And the Slovakian National Gallery, so that's Europe.
Sara: What, Slovakian? Slovenian?
Sara: Slovakian, I always get those messed up.
Geof: They are different.
Sara: Yeah, I know.
Geof: I've never been to Slovakia.
Sara: So yeah, decent amount in Europe. Victoria and Albert. Lots in the UK, right? But we do support other languages, but not in our-
Ben: In our interface.
Sara: You could transcribe in other languages, you can't [inaudible 00:38:15]. We haven't internationalized our interface yet.
Ben: There is a user, someone who's installed the software in Belgium, who is really, really interested in launching an African project. Again, colonial records from the Belgian Congo, and trying to decolonize those in a way. And the challenge that they have and that we have, is that in order to reach those communities, you need to be able to reach them in French. An English language interface is insufficient.
Ben: Really that's the next big step for the software, is to try to internationalize the interface so that it can handle Japanese, Spanish. I might add French.
Sara: We have people interested in Japan.
Sara: We don't have any projects in Japan yet, do we?
Ben: Not live, because we don't have the interface.
Sara: Not live, because we don't have an interface, right? And we do some collaborations with ... Where's Gimena? She's in-
Ben: Oh, she's in Argentina, yes.
Sara: Argentina. And that's actually interesting, because those are projects that are collaborative between a digital humanities center in Argentina, and the LILAS Benson Library, so at the University of Texas at Austin. So they have a lot of colonial records and we did a project this past year, where it was a ... What do you call?
Ben: Pelagios funded thing, was that what you call the grant?
Sara: Yeah, but no, no, no. What's the document?
Ben: Oh, it was a gazetteer, right. Latin American gazetteer.
Sara: A gazetteer, yeah. A list of all the places in different parts of Latin America.
Karen: Wow, yeah.
Sara: And doing, this was an OCR correction project. And trying to correct that and then tag it. And be able to export places and then map them. So that was a big complicated project.
Karen: That's cool.
Sara: But it was neat, because it had collaborators all over the world. There was somebody in Europe, two people, two groups in Austin, and Argentina.
Ben: That was one where we got to use our, the whole family business thing. Was when we got our 14-year-old working, doing OCR correction. So we now have a 14-year-old who knows probably more than any other 14-year-old about Latin American place names, beginning AIA through AL.
Karen: Wow. All right.
Sara: She'll be at the booth at SAA.
Karen: Oh good.
Sara: On Monday, if you would like to meet her.
Geof: Wow, well that will be interesting. I'll have a lot of questions for her.
Sara: You know, it's actually fascinating to watch the kids. Because you know, we don't push what we do from the business thing. But they both work on their cursive handwriting. Our 14-year-old will write in cursive, because she's figured out it's faster, which is true.
Ben: And she's taught herself, right?
Sara: Yes, totally self-taught, because they've learned, you know, had two weeks of it in third grade or something, and that was about it. And the third grader has started playing around with learning cursive.
Karen: Oh, that's cool.
Sara: Because they see it as something valuable.
Geof: It is faster. The thing that I like is, I was talking to Karen about this just the other day. Is that when I was in kindergarten, in a German school in Portugal, I had to learn how to handwrite. So I had very, very careful handwriting, and it was very beautiful. So that really affects everything in my life. As a matter of fact, I make art that's handwritten. It's not even calligraphy, it's just handwritten. But it's based on the fact that I have, deep in my bones, all of this.
And so sometimes I use a European p because that's the p, an open p, because that's the cursive p that I learned. Because it sort of says something about my past. All these things, the way we make marks. The way we save information, says stuff about us, and makes the world real again, when it's not even there anymore. And that's what I like about these projects.
Karen: Well I'm just curious, what's the day-to-day like, for FromThePage?
Ben: So for the software system, or for our own life, right? A lot depends, right, because the software system, we-
Sara: [crosstalk 00:42:12] talk about the software system. [crosstalk 00:42:13] our life, yeah.
Ben: Okay, it really depends on the project, because we will see ... When we wake up in the morning, we will look online and we will see one user who we guess must be in England? Who has been working solidly on Indiana World War I service cards for hours.
Ben: And then more people in North America wake up, and they start working on these projects, and we start seeing other projects come in. There's a slavery and race reconciliation project at Sewanee, where we'll see some people working on that.
Sara: We're pretty sure those are students that they've ... They're working with.
Ben: Right, part of it is it's during the day.
Sara: It's during the work day.
Ben: Part of it's, they're kind of slow. It also depends, well, I mean ... To be fair, the material varies, right? There is a huge difference between transcribing typewritten index cards, in which you're just typing maybe a dozen fields. And, worst-case scenario, working on a medieval manuscript that's written in Insular minuscule, and is in Anglo-Saxon, with lots of abbreviations, right? There are huge differences. There was one point in which both the Alabama World War I card and the Indiana World War I card projects were active, and we were seeing 150, almost 200 pages an hour being transcribed.
Ben: By volunteers on the site.
Sara: By volunteers, yeah.
Ben: So I mean sometimes the speedometer kind of gets pegged.
Karen: Yeah, wow.
Ben: Right. So that's kind of a-
Sara: Yeah, so that's kind of the software and what's happening with it. But our day-to-day life, it really varies. We try to come in and we get an email overnight that kind of shows us all the new things on the system, and try to respond to that if there are customers who need an email that, you know, "Looks like you're almost done. Do you need help uploading new stuff," or things like that. New users coming in that aren't transcribing things. That's a problem we're trying to capture, is sometimes we'll have people sign up for an account, and not actually transcribe anything. Okay, what are we doing wrong? How do we fix the software to capture those people?
New trials coming in, or people asking for information about FromThePage. We'd like to have more of that, but not always. So we kind of do that for the first maybe hour or two of the day. We still do a lot of contract work, for software development in the digital humanities. So some days we'll concentrate on that for the day. Other days we'll concentrate on FromThePage, we have FromThePage Fridays. Which is when we choose to concentrate on FromThePage.
Karen: Oh, very interesting. Yeah.
Sara: We do a lot of ... So about once a month we send a marketing newsletter to our transcribers. We don't call it a marketing newsletter, but it's like, "Hey, here's the cool new projects that are on FromThePage." Or, "Here's something you might be interested in." I try to come up with a theme.
So like the one that we sent out this week was Eye Candy. And it was the prettiest things that we have. So there's a field book from an orchidologist that the Chicago Botanic Library is transcribing. And then some of the medieval manuscripts from the Parker Library at Corpus Christi College, Cambridge, they're beautiful, right? And then this Dickens manuscript.
Ben: Yeah, Deciphering Dickens, which is ... It's almost the opposite of beautiful. Charles Dickens did so much in the way of scratch throughs and obliterations. When he changed something, I mean he just clobbered out whatever it was that he had written.
Sara: The most complicated stuff.
Ben: But it almost looks like a ... What is the name of that, the kinds of poems where somebody blanks out everything except for a few words?
Sara: Oh, a black-out poem?
Geof: Cancellation poem?
Ben: Cancellation, yeah.
Sara: Oh, I've never heard that phrase.
Ben: The Dickens manuscripts almost look like cancellation poems.
Karen: Oh boy.
Geof: Wow, that's pretty great. Yeah, my actual field of expertise is visual poetry, which includes cancellation poems.
Sara: Ah, okay.
Geof: And I actually was working on a dictionary of terminology in it for years, and then decided I'd do some other things for a little bit.
Geof: But that's kind of an interesting life. But what I'm wondering is, what do you know about these transcribers? Even if, sometimes, you know, in Alabama, you knew that they were either from the county or used to be from the county. But what about ... Why do people do this? Do you have any idea in the other situations? Where they're not necessarily tied together, like some guy in England doing Indiana?
Ben: So it's a little bit hard for us to say about the end users, because most of the interactions are done by the people at the archives and the institutions who are running the projects. We do know that there are a lot of volunteers who spend a lot of time working on projects, who are really motivated by subject matter, and they love the process. Because it's a deeply immersive process to transcribe something.
Ben: And those volunteers we'll see, will jump from a project to a project, right? They'll spend several months working on Alabama, then they'll switch to Indiana, then they'll switch to Maryland, or something like that. Often within the same either type of document or the same subject matter.
Sara: World War I.
Geof: Right, World War I.
Sara: There's a lot of people who love World War I, because there's been like three projects or more.
Ben: You know, if there is any document relating to 19th Century Texas? I know one volunteer who will work all weekend, if we let him know. Because that is something that he just loves.
Sara: Some of the very early transcribers we had on the Julia Brumfield project.
Ben: Right. Which is a project that we run. The ones that we run, we know a little more about.
Ben: And from talking to other people running these projects, these are not atypical. There will be people who do have a connection with the material of one sort or another. Sometimes, a family connection. So one of the distant cousins who was working on the Julia Brumfield diaries, she had had one of the diaries in her possession. She was the widow of the diarist's grandson. Sorry, great-grandson. And he had borrowed the diaries from his mother, and they were snowbound in North Carolina for almost a whole week. And they just sat in bed, reading entries in the diary to each other. Then he had died of a heart attack at the age of 40.
Ben: And when these diaries went online, and she started transcribing them, she would work through a few of them, and then she would call her mother-in-law, and talk through what she had read. Because her mother-in-law is mentioned in them as a small child growing up in the household. And that kind of connection, it's this really important part of her life.
Ben: At one point, I saw her go from transcribing about 20 or 30 pages a day, to five, and then to three. And I asked her what was going on. And she said, "Well, I keep hoping that you're going to upload another one for me to work on. So if I can kind of stretch this out until you do, then I won't lose my routine."
Sara: Her momentum, her routine.
Ben: And this is true for other people. There's another transcriber who had done a little work for Family Search Indexing. Did a vanity search on his name. He found our site, and realized that the diarist's mailman was who he was named after, was his namesake.
Sara: His great-uncle.
Ben: Yeah, it was his great-uncle. So he starts transcribing that, and he would have a regular routine in which every day at his lunch break, he would log on, he would read two or three pages of the previous day's work he'd done. He'd work for about 45 minutes.
Sara: He'd edit them, update them.
Ben: Right. Work for about 45 minutes transcribing. And then stop and then read ahead the next two or three days' worth. And just every day, for an hour, during his lunch break, that's what he'd do.
Sara: What's interesting about this guy, his name is Wooding. And when we finished the Julia Brumfield diaries, he's like, "What else do you have? I'd like to keep working on this sort of stuff." We're like, "Well, you know, the only other project we really have going on is this Ornithology Field Book project at the," was it Vertebrate Zoology in Berkeley.
Ben: Museum of Vertebrate Zoology.
Sara: He was like, "Well, you know, I've done fish my entire life," because it turns out he was a statistician for the water board in, I don't know, something in Virginia that included water. Counting fish and the environmental impact. "I think I'll try birds." And so he worked transcribing on their bird material, it was field books. But he also helped them develop ways of doing markup for species names and thinking about the data, right? If you think of collections as data, and things like that. He was able to take his professional experience, plus his transcription experience. Work on this new project, and bring value not just as a transcriber, but as a collaborator.
Sara: When that project was done, he moved on to other projects.
Sara: He's now one of the most valued transcribers at the Library of Virginia. He's in Virginia, and the Library of Virginia is kind of the state archive plus the state library. And they've been doing transcription projects for years. They have experience on like three or four different platforms, and there's not that many of them.
Sara: And we do a lot of work with them. But yeah, but Nat shows up there, and there's a volunteer profile of him there. So people love this work, right? It's generally retire- ... You know, Nat's a retiree, right? And you're looking for meaningful work, especially meaningful work you can do if you're home bound or that you don't have to travel, you don't have to be in a particular place to do. And so they go from project to project, doing different things. I think he did a lot of court records work.
Sara: Like historical court records work for Virginia.
Sara: And he found really interesting stories there too.
Geof: Well, I find it all fascinating.
Geof: There are all these worlds out there, that we don't know anything about. And then we stumble upon them, and there's all this deep riches and these streams of ore that bring beauty and riches into the world, and we just don't know about them. But it's great to get this kind of access, which I really do think is essential.
I should tell you, I have actually transcribed a diary. My mother's Bolivia diary. Which I transcribed. It was in a little plastic bound notebook, it was just a note- ... It wasn't even ... It was a notepad where you're supposed to rip off a page and throw it away.
Geof: And so there are little squares. And what I realized is, my mother never could spell as well as I could. I mean never. So it's completely transcribed accurately, so I do not respell anything.
Sara: Mm-hmm. Yes.
Geof: Everything is just absolutely there. There are no (sics), nothing, it's just this is the way it is. It ends without a period. So it has a little note about transcription, etc. But I still have, I do have my grandfather's diaries and my grandmother's, which were put into calendars of different types. And so eventually I'll do those, but they're very brief.
Sara: That's bonkers.
Geof: And my grandmother, you know, she doesn't recognize my birthday, because she didn't keep a diary for that. I mean I have at least 50 years of diaries for her.
Geof: So there's a lot.
Karen: Oh my.
Geof: And so you can see, and really the only thing she does is say, "I bought some food, I had lunch. Went to a party." Almost no words. And then at the bottom it'll say, "Jeanine born", which means one of my sisters was born. Pretty dramatic.
Sara: Julia Brumfield's diaries are not that much more interesting, but at least you get a record of like what's going on with the agricultural planting.
Karen: Oh wow, yeah.
Sara: And the mailman's car broke down. Oh boy.
Ben: Like that, it's a little [unclear 00:54:27].
Geof: My mother's was a lot better, but she only kept hers for a short time. But also, it's like, I went up into the mountains, this is when we lived in the Andes. And then I got home late, and then I got bitten by a dog, and my father was so angry at me. He said, "I hope I get rabies," and I said, "Wow, I didn't remember this [crosstalk 00:54:50] stuff." But you know. But it's kind of inter- ... The good thing is, the ability to remember things that even you experienced, that now have disappeared out of your head, because they weren't important enough.
I only know that because I transcribed it, otherwise I wouldn't have. And the transcription helps, because you're in it. As you said, you're in it so deep, because you know, I've got to get the letters right, I've got to make sure everything's accurate, so you learn it really well.
Karen: Yeah. Well I was just going to add, we're about to celebrate our 75th anniversary at the Fashion Institute of Technology.
Sara: Oh wow.
Karen: And we have been just feverishly going through our historical photographs.
Karen: And we're about to launch a site with our alumni affairs unit, to say help us identify these people.
Sara: Yes, yes.
Karen: So, I mean it's a little different, but.
Ben: No, it's ...
Sara: It's fascinating and interesting, yeah.
Ben: But it's still the same kind of crowdsourcing. It's not transcription like we do, but-
Karen: Exactly, yeah.
Ben: ... [crosstalk 00:55:50], yes.
Karen: So we're really excited about that.
Geof: There's still typing involved.
Karen: Oh yes.
Karen: Oh yes, definitely.
Geof: That's right.
Karen: So I think it's time to ask our next question.
Karen: Which is, if you each could tell us what keeps you passionate about this work you do? You've got so much invested in it. Tell us.
Sara: I think for me, it's seeing successful ... Okay, you create something. We create software, right? We're software engineers, we've created this software, and we've created the business to help make the software successful. And it is. Like we look at it in the morning and we're like, "Look, people are doing meaningful volunteer work, contributing to the world, in a very small way, but by transcribing these documents. And like it's working, and people like it, and people like us!"
I mean not so much us. But you know, when you ... So the challenge when you create things, businesses and software, is like when it doesn't work? It's personal, right? It's so personal. And as you struggle to make it successful, you often feel unsuccessful, and like you're not. It's just, you know what, this isn't working.
Then there are these moments that you look at it and you're like, "But this is working." People can use the software. They're enjoying it, they're contributing. And that's just pretty awesome.
Karen: Yes. I would imagine so. The closest I've ever come to that experience is working on full commission.
Karen: In a sales position.
Sara: Oh, yes.
Karen: And you just, you have to have your whole body, heart and soul invested in the work.
Karen: But once it starts to-
Sara: It's emotional, right? Hard.
Karen: It is, very hard. But once you see a successful transaction, it buoys you and it makes you ready for the next one. So very good. Ben?
Ben: Well for me it's the tantalizing dream of each new project to come.
Karen: Oh boy. [crosstalk 00:57:48]
Ben: I just get so excited. I mean, you were talking about us being able to see a lot of different aspects of the world and of history. It's very similar to what the two of you do in An Archivist's Tale, of going around and interviewing different people. And that's so exciting, right?
Karen: It really is.
Ben: And that, for me, I get really, really into these things as Sara said. You know, I'll spend a year trying to teach myself Nahuatl, because we're working on a Nahuatl language project.
Sara: He doesn't need to do that.
Ben: Well, yeah, [crosstalk 00:58:19].
Sara: It's not a requirement.
Ben: It's like [crosstalk 00:58:20], I'll say, "Ooh, this project is a lovely excuse to work on Manchu." Learn about financial records of slavery, or things like that, right?
Karen: Yeah, yeah, yeah.
Ben: And so there's that. But there's also some of what Sara's talking about. For me, I've always had this vision that the public could contribute meaningfully to history. And it just, as somebody who started off doing family history, it just killed me when I would see people who meant really well, and they'd say, "Well you know, I have these primary sources from a relative. And what should I do with them? I'm going to write a book."
And so you have a Civil War diary, and rather than transcribing that diary, and putting it in a format in which it could be used by scholars, you end up with an amateurish, lost cause narrative, again. That kind of was the model, 10-15 years ago, for a lot of amateur history among the public. And I've always thought, if you could get people and channel some of that-
Ben: ... enthusiasm and that effort, for people who have documents, or are willing to work with documents that are in institutions. And get them to use as much scholarly rigor as possible.
Ben: And turn those into things that can be reused. Perhaps by their relatives, who are interested in reading Great Grandpa's diaries. But perhaps as a transcript and a finding aid. Or as a source for a researcher. You know, that that kind of collaboration could actually work. And seeing it work, seeing this form of crowdsourcing and this form of collaboration between professionals in cultural heritage, and between the public, actually show some fruits.
Watching emails by people saying, "Here's what I found in the documents I was transcribing. I learned these things about the antebellum south that I thought were not a big deal. It turns out they are real and they are a big deal." "I learned this thing about the way that lighthouses work on the Oregon Coast." And you know, these kinds of things that people come up with, are really powerful. That gives me a real sense of pride.
Karen: That's incredible. It's like a constant new open window experience.
Karen: Very good.
Geof: And the thing to keep in mind is this. Archivists sometimes worry that people are taking over their work, you know, and like people are doing it for free. But the fact is, no archives would take this on.
Geof: No archives would say, "We're going to take our professional staff, and we're going to make them transcribe hundreds of thousands of pages." Because the costs would be too great. And when you have volunteers who really want to produce something for you, because they're tied into the goal, they're tied into the subject. They're tied into this little view that they're going to get. And that they want to do something. It ends up having this positive effect for everyone, including the people it's being made for, these users who we don't know yet. We don't know who they are.
Geof: But this makes stuff so much more useful. And this is why, for instance, you have such things as military records, and court papers, and diaries transcribed, because it gets to so many people. Meaning, everybody's family is recorded in record somewhere. So if you get records of genealogical value, that especially opens up the world.
But you're also talking about records of cultural value to cultures who have almost no access to their recorded culture. And might not have even, in some cases, created it themselves. So it does, this rich thing, it allows humans to be humans. And get to the core of our lives, which is to remember the past, to live it, understand the world, and to believe there's some reason we're here. And so, the way you guys talked about it, I just-
Karen: Yeah, yeah, absolutely.
Geof: It was really quite an interesting time, so thanks a lot. [crosstalk 01:03:02]
Sara: Our pleasure, yeah.
Geof: This is why I like to do these, because we learn so much. And the question about passion is there, because we believe that the people who do the best work are doing it because they're driven by passion.
Geof: And that's what we see in you guys. Especially Ben, as he tried to learn a new language every [crosstalk 01:03:21].
Ben: I might be doing that anyway.
Karen: This gives you a little more of a push.
Sara: A reason, a justification.
Karen: A justification.
Ben: Don't tell the IRS, but it means I get to expense the language books that I buy.
Ben: Only if it's related to a client.
Geof: We can tell them that. They might have access to it.
So ladies and gentlemen, that brings us to the end of another episode of An Archivist's Tale. Another great time with people trying to make archives all that they can be. Thanks for listening, and do not wail, do not beat your chest, do not cry, do not worry, do not pout. Next week, 10:00am Eastern Time, yes, there will be another episode. So until then, bye-bye.