LPC Blog

The Library Publishing Coalition Blog is used to share news and updates about the LPC and the Library Publishing Forum, to draw attention to items of interest to the community, and to publish informal commentaries by LPC members and friends.

Transitions is an occasional series where community members reflect on the things they have learned while moving from one institution to another or one role to another. 

By Monica Westin, Google Scholar partnerships lead / technical program manager

In the spring of 2014, I left a PhD program in classical rhetoric to try out a career in scholarly communication. I was immediately hooked by what I saw as unsolved problems in the ecosystem and the potential impact of making academic research easier to access. Except for a brief stint at HighWire Press, I spent the following four years in the institutional repository and library publishing space, first at bepress and then at CDL’s eScholarship, the University of California’s system-wide repository and publishing platform. 

One Monday in November 2018, three days after leaving my job as publications manager for the library publishing program at the CDL, I started a new role as the program manager for partnerships at Google Scholar. The past two and a half years have been eye-opening.

I have three strong memories from my first week. The first is knowing I had made the right decision to take the job when my new boss, Google Scholar co-founder and director Anurag Acharya, described the mission of Scholar to me in our first meeting: “no matter the accident of your birth,” he told me, you should be able to know about all the papers written in any research field you might want to enter. What you did with that knowledge was up to you.

My second memory is the expression on Anurag’s face when I admitted I didn’t really understand what robots.txt instructions did. “Goal: be more technical!” I wrote in my notebook that afternoon after spending hours looking up basic web indexing protocol information on Wikipedia. I don’t think he looked quite as disappointed as I remember, but I knew that I could no longer get away with not knowing how things worked. 

The third memory I have is something both Anurag and Alex Verstak, Google Scholar’s other co-founder, kept telling me the first week: “This job is big.” The scale of the outreach we do at all levels was a huge surprise to me. At one point, my predecessor in outreach and partnerships at Scholar, Darcy Dapra, measured tens of thousands of Google Scholar external partnerships, from homegrown single-journal sites to campus-based institutional repositories and OJS instances, and publishers and publishing platforms of all configurations, from open source to proprietary, nonprofit societies to major commercial systems and multinational conglomerates. Each one of these partners creates publications indexed by Scholar, and each partner involves outreach from the Scholar side at some point.

I hadn’t realized before I moved into this role how proactive the Google Scholar team is about ensuring all of these sites are indexed well. Before my job at Scholar, I had no sense of whether, or how much, anyone working at Google Scholar might be looking at sites like institutional repositories and campus-published journals. I had the feeling that they would care far more about commercial sites and platforms in general, which is simply not the case. Now I know that if you haven’t heard from us about your library-published journal site or repository, it’s likely because no news is good news.

Just in the repository and library publishing space, I send about a hundred emails each month related to the indexing of individual sites and have regular conversations with all repository and library publishing platforms, open source and proprietary. I realized quickly that even though Scholar indexing and error detection are automatic processes, each error report creates a new project. In each case, I pick up the site’s error report and try to find a person working on the site who can help fix the errors.

I devoted a large part of my first year, at least half my working hours, to helping improve indexing and inclusion for repositories and library-published or campus-published journal sites. So far, I have worked directly with over six hundred of these sites, almost all the platforms they use for infrastructure, and dozens of user and community groups. My first major project at Scholar was a series of webinars for DSpace repositories and OJS journal instances globally, working with EIFL, DuraSpace, and dozens of local partners all around the world. I produced webinars about these platforms in collaboration with local partners for Indonesia, Costa Rica, Argentina, Brazil, Ecuador, Spain, Mexico, Peru, Nigeria, Belarus, Ghana, Kenya, Tanzania, Uganda, Ukraine, and Zimbabwe.

One thing I learned through these webinars, and my outreach work in general over the last two and a half years, is that the success of a library publishing program, at least in terms of technical stability and global visibility and discovery, has very little to do with using shiny new software. I have seen publishing programs run on expensive proprietary sites riddled with errors that prevent indexing altogether, while some of the best campus publishing programs run on uncustomized DSpace and OJS software implementations in countries where basic internet access is often unstable.

In fact, now that I am a few years into my job, I worry increasingly about the sustainability of the library publishing space when prestigious grants seem to be largely earmarked for the creation of new platforms, instead of helping to fortify, maintain, and build on the platforms we already have. Fortifying and building on existing systems is always less exciting than new ventures, but it is a crucial part of a healthy ecosystem. 

For example, automating metadata management and error detection for large repositories would allow sites with many articles to make their authors’ research as globally visible as possible. As repositories continue to grow, this kind of maintenance will become crucial, especially given the increasingly limited resources within libraries. Many of the newer repository platforms lack basic indexing setups, and while they might have sophisticated interfaces, they don’t actually serve the library community the way that the older open source systems do. While there are some fantastic new library publishing platforms out there, the reason we recommend OJS, DSpace, Digital Commons, and EPrints at Google Scholar is that they work out of the box, without additional effort by under-resourced librarians.
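
To make the idea of automated error detection a little more concrete, here is a minimal sketch of the kind of check a repository could run over its own article pages. It is not Google Scholar's tooling, only an illustration. It looks for a few of the bibliographic meta tags (such as citation_title and citation_author) that Scholar's inclusion guidelines describe. The choice of which tags to treat as required, and the names MetaTagCollector and missing_citation_tags, are assumptions made for this example.

```python
from html.parser import HTMLParser

# Assumption: the set of tags treated as "required" here is illustrative;
# a real audit would follow the current inclusion guidelines.
REQUIRED_TAGS = ("citation_title", "citation_author", "citation_publication_date")


class MetaTagCollector(HTMLParser):
    """Collects names of <meta name="..."> tags that have non-empty content."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        # Skip tags with a missing name, or with missing or blank content.
        if name and content and content.strip():
            self.found.add(name.lower())


def missing_citation_tags(html_text):
    """Return the required tags that are absent or empty in one article page."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return [tag for tag in REQUIRED_TAGS if tag not in collector.found]
```

Run across every article page in a large repository, a check like this could produce a short list of pages to fix instead of leaving a librarian to discover problems one report at a time.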

Ruth Tillman’s hilarious and scathing blog post Repository Ouroboros captures my observations about what isn’t working in this space, where a primary focus on developing new infrastructure can undermine the work of generous, energetic, and visionary library technologists.