Topic: Privacy and Analytics
The rise of usage analytics presents a variety of challenges and opportunities for library publishing. While services such as Google Analytics allow publishers and authors to better understand how readers are finding, using, and sharing publications, tracking also raises questions of patron privacy and ethical data usage. As universities increasingly use analytics—usage statistics, altmetrics, bibliometrics, etc.—to measure “productivity” through Current Research Information Systems (CRIS), publishers must consider the broader information ecosystem of publishing analytics.
Privacy is a complex issue that varies widely in its conceptualization and legal implications. For the purposes of this document, we primarily focus on U.S. (and occasionally U.K.) examples that affect reader privacy. The context of privacy norms and laws may be different in other countries.
Patron privacy is a cornerstone of library practice. The American Library Association Intellectual Freedom Committee states, “In a library, user privacy is the right to open inquiry without having the subject of one’s interest examined or scrutinized by others” (ALAIFC, 2014). With the post-9/11 expansion of mass surveillance in the U.S. through legislation like the Patriot Act, many libraries have reaffirmed their commitment to protecting potentially sensitive information. Organizations like the Library Freedom Project have created resources to teach librarians about surveillance and how digital tools can be used to safeguard privacy.
At the same time, the publishing business model has increasingly shifted to incorporate the collection, aggregation, and analysis of usage statistics. The uses of this data can include:
- Personalization, potentially including reading recommendations and/or saved content
- Reporting to university administrations on researcher publications and “productivity”
Publishing programs have an interest in collecting readership data because it can help demonstrate the value of the program and help library staff understand how to improve these services. However, this data collection can run counter to a library’s commitment to protecting patron privacy, and it may jeopardize relationships with faculty who are resistant to efforts to measure researcher impact. Publishing programs must determine how they will balance their need for assessment with reader privacy. A library publishing site should point to the existing privacy policies on the library’s website and follow them in addition to any considerations specific to its role as a publisher.
This topic includes the use of HTTPS, the use of reader analytics, and the tracking of usage with personally identifiable information (PII), including sending that information back to vendors, especially over insecure channels. Also in scope are making it clear to readers what is being tracked and having opt-out options in place.
This section introduces relevant resources on the topic, and provides context and guidance that will help library publishers to use them effectively.
Using the HTTPS protocol by default on websites has become standard since 2016. The Library Freedom Project calls HTTPS “a privacy prerequisite, not a privacy solution” (LFP, n.d.). HTTPS is good not only for privacy but also for Google rankings: since August 6, 2014, Google has given HTTPS sites a small boost in rankings, and in December 2015 it began to prefer indexing HTTPS pages over HTTP pages. Google Chrome now displays a “not secure” warning for all HTTP pages. The institution’s central information technology department or the vendor of a hosted service should be able to set this up for a library publisher. Remember that HTTPS only prevents eavesdropping on the connection between reader and site; it does not limit what the site itself or its third-party services collect, so it is only a small step toward privacy.
- Bahajji, Z. A. (December 17, 2015). Indexing HTTPS pages by default [Blog Post]. Retrieved from https://security.googleblog.com/2015/12/indexing-https-pages-by-default.html
- Bahajji, Z. A., & Illyes, G. (August 6, 2014). HTTPS as a ranking signal [Blog Post]. Retrieved from https://webmasters.googleblog.com/2014/08/https-as-ranking-signal.html
- Internet Security Research Group. (n.d.). Let’s Encrypt. Retrieved from https://letsencrypt.org/
“Let’s Encrypt is a free, automated, and open certificate authority (CA) for implementing HTTPS, run for the public’s benefit.”
- Library Freedom Project. (n.d.a). Retrieved from https://libraryfreedomproject.org/
- Library Freedom Project. (n.d.b). The library digital privacy pledge. Retrieved from https://libraryfreedomproject.org/ourwork/digitalprivacypledge/
- Schechter, E. (April 27, 2017). Next steps toward more connection security [Blog Post]. Retrieved from https://blog.chromium.org/2017/04/next-steps-toward-more-connection.html
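As a rough illustration of what “HTTPS by default” means in practice, the sketch below computes where a plain HTTP request should be redirected. It is only a minimal example, not a recommended implementation: in most cases the web server or the hosting vendor handles this redirect, and the function name `https_redirect_target` and the example hostname are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def https_redirect_target(url: str) -> Optional[str]:
    """Return the HTTPS equivalent of an HTTP URL, or None if no redirect is needed.

    Hypothetical helper for illustration; real sites usually configure this
    redirect in the web server rather than in application code.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() != "http":
        # Already HTTPS (or not a web URL at all): nothing to do.
        return None
    netloc = parts.netloc
    if netloc.endswith(":80"):
        # Drop the explicit default HTTP port so the HTTPS default (443) applies.
        netloc = netloc[: -len(":80")]
    return urlunsplit(("https", netloc, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # journals.example.edu is a placeholder hostname.
    print(https_redirect_target("http://journals.example.edu/article/1?view=pdf"))
```

A server would typically answer the original request with a permanent redirect to the returned URL, so that readers and search engines end up on the encrypted version of the page.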
Social Media Sharing Buttons
Buttons to allow easy sharing of content on social media are quite popular on websites. There are concerns that they may slow down websites and may send information to advertisers, allowing individuals to be tracked across different sites. If sharing buttons are used on a site, there are options that do not set cookies. In his post about library tracking and privacy, Eric Hellman states, “Libraries need to carefully evaluate the benefits of these widgets against the possibility that advertising networks will use [a patron’s] search history inappropriately” (Hellman, 2015). American Library Association (ALA) privacy guidelines for websites state “Libraries should carefully evaluate the impact on user privacy of all third-party scripts and embedded content that is included in their website” (ALAIFC, 2016).
- Hellman, E. (June 16, 2015). Toward the post-privacy library? Public policy and technical pragmatics of tracking and marketing. American Libraries. Retrieved from https://americanlibrariesmagazine.org/2015/06/16/toward-the-post-privacy-library/
- Kmetko, L. (June 30, 2015). A big test of social media buttons – performance, privacy, features [Blog Post]. Retrieved from https://www.xfive.co/blog/social-media-buttons-test-performance-privacy-features/
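One cookie-free option is to replace embedded sharing widgets with ordinary links, so that nothing is sent to a social network until a reader actually clicks. The sketch below builds such links on the server. The share endpoints shown are the commonly used public URLs for each service and should be treated as assumptions that may change; the function names and the example article URL are hypothetical.

```python
from html import escape
from urllib.parse import quote, urlencode


def share_links(page_url: str, title: str) -> dict:
    """Build plain share URLs that need no third-party script or cookie."""
    return {
        # Assumed public share endpoints; verify them before relying on them.
        "Twitter": "https://twitter.com/intent/tweet?"
        + urlencode({"url": page_url, "text": title}),
        "Facebook": "https://www.facebook.com/sharer/sharer.php?"
        + urlencode({"u": page_url}),
        # mailto links expect %20 rather than + for spaces.
        "Email": "mailto:?"
        + urlencode({"subject": title, "body": page_url}, quote_via=quote),
    }


def render_share_links(page_url: str, title: str) -> str:
    """Render the links as a simple HTML list with no embedded widgets."""
    items = [
        f'<li><a href="{escape(href)}" rel="noopener noreferrer">{escape(label)}</a></li>'
        for label, href in share_links(page_url, title).items()
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


if __name__ == "__main__":
    print(render_share_links("https://journals.example.edu/article/1", "Open Access & Privacy"))
```

Because the output is static HTML, the reader’s browser contacts the social network only after a deliberate click, which avoids the cross-site tracking concern raised above.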
There are many different ways that publishers may collect reader analytics. Perhaps the best known is Google Analytics, which many libraries use to obtain aggregated data about how our websites and publishing platforms are used. This information can help us improve the website and focus on content that is of greater interest to our readers. However, by using Google Analytics, we are providing Google with information about our readers. In 2016, Google altered their default terms (with an opt-out) so that one’s web activity may be associated with personally identifiable information (PII), allowing DoubleClick’s ads to provide relevant/customized advertising. When we use Google Analytics on our publishing sites, our readers are being tracked for advertising purposes. According to Eric Hellman’s spring 2016 study of ARL libraries, 72% of them used Google Analytics. While there has not been a similar study of library publishers, it is likely that the use of Google Analytics is also prevalent. Privacy issues with Google Analytics were also addressed by Patrick OBrien and Scott W. H. Young at the 2016 Digital Library Federation Forum. ALA’s privacy guidelines state: “Careful consideration should be given before using a third party to collect web analytics (e.g. Google Analytics) since the terms of service often allow the third party to harvest user activity data for their own purposes” (ALAIFC, 2016).
Library publishers should use services that have opt-out policies. However, the prerequisite for this is that readers know that such a service is being used, is tracking them, and that opting out is an option. Google Analytics U.S. terms of service state:
- Google, Inc. (2016). Google Analytics terms of service. Retrieved from https://support.google.com/analytics/answer/7124332?hl=en
- Hellman, E. (May 23, 2016). 97% of research library searches leak privacy… and other disappointing statistics. [Blog Post]. Retrieved from https://go-to-hellman.blogspot.com/2016/05/97-of-research-library-searches-leak.html
- OBrien, P., & Young, S. W. H. (2016). No such thing as a free lunch: Google Analytics and user privacy [PowerPoint Slides]. Retrieved from https://scottwhyoung.com/talks/google-analytics-web-privacy/
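To make the opt-out idea concrete, the sketch below shows one way a site might decide whether to include an analytics snippet for a given request. It is an assumption-laden illustration rather than a description of any particular platform: honoring the browser’s `DNT` and `Sec-GPC` request headers is a local policy choice, and the cookie name `analytics_opt_out` and the function name are hypothetical.

```python
from typing import Mapping

# Hypothetical cookie a site could set when a reader clicks an "opt out" link.
OPT_OUT_COOKIE = "analytics_opt_out"


def analytics_allowed(headers: Mapping[str, str], cookies: Mapping[str, str]) -> bool:
    """Return False when the reader has signaled that they do not want tracking."""
    # Header names are case-insensitive, so normalize before comparing.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    if normalized.get("dnt") == "1":  # Do Not Track request header
        return False
    if normalized.get("sec-gpc") == "1":  # Global Privacy Control request header
        return False
    if cookies.get(OPT_OUT_COOKIE) == "1":  # site-specific opt-out (assumed name)
        return False
    return True


if __name__ == "__main__":
    print(analytics_allowed({"DNT": "1"}, {}))
    print(analytics_allowed({}, {}))
```

A page template would call this check and omit the analytics script whenever it returns False. Readers still need a visible notice explaining what is tracked and how to opt out.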
General Data Protection Regulation (GDPR)
The European Union’s (EU) General Data Protection Regulation went into effect on May 25, 2018. In his report on the GDPR, Barmak Nassirian explains that the regulation “cover[s] all facets of information management including the collection, retention, deletion, breaches, and disclosures of personal data” (Nassirian, 2017). Library publishers may have authors, editors, and reviewers in the EU, so they must consider how they handle these individuals’ personally identifiable data. The Public Knowledge Project (PKP) has recently released the GDPR Guidebook for PKP Users. Bepress commits to ensuring that Digital Commons will be compliant by May 25, 2018. The exact impact on library publishers outside the EU is not yet clear. [Editor’s note: This topic is developing rapidly, and will be further revised in future versions of the Framework.]
- MacGregor, J. (April 30, 2018). GDPR Guidebook for PKP Users, Version 1.0. Vancouver, BC, Canada: Simon Fraser University. Retrieved from http://docs.pkp.sfu.ca/gdpr_pkp_guide.pdf
- Nassirian, B. (August 28, 2017). The General Data Protection Regulation explained. EDUCAUSE Review. Retrieved from https://er.educause.edu/articles/2017/8/the-general-data-protection-regulation-explained
- Trunomi. (n.d.). GDPR key changes: An overview of the main changes under GDPR and how they differ from the previous directive. Retrieved from https://www.eugdpr.org/key-changes.html
If a library publishing program includes student works, consideration should be given to the ethical and legal implications of making student work public. U.S. publishers should familiarize themselves with FERPA, the law governing student privacy rights, and obtain publishing waivers where necessary. Publishers should also consider their ethical responsibilities to students and consider if a student may be at risk if their work is published.
- Office of the Chief Privacy Officer, U.S. Department of Education. (n.d.). Protecting student privacy. Washington, D.C.: U.S. Department of Education. Retrieved from https://studentprivacy.ed.gov/
Omnibus site dedicated to helping stakeholders understand and uphold student privacy regulations.
- U.S. Department of Education. (n.d.). Family Educational Rights and Privacy Act (FERPA). Washington, D.C.: U.S. Department of Education. Retrieved from https://www2.ed.gov/policy/gen/guid/fpco/ferpa/index.html
Journals collect information on authors and reviewers to support article submission and review. Unless these processes are completely open, publishers must ensure that author and reviewer information, logins, and the content of reviews are kept secure. Library publishers may also allow readers to submit comments, which may require authentication. Publishers may also keep lists of individuals for marketing and outreach purposes, and these too should be kept securely. Library staff should review contracts with external vendors to ensure that they are familiar with any analytics these platforms may collect.
Most library-published content is open access, but libraries that publish subscription-access content will also need to maintain lists of subscribers, which could be linked to payment information. This vastly increases the complexity of keeping information secure; a third-party system to manage these accounts may provide better security than managing them in an ad hoc manner. To keep login information secure, it would be best for the library publisher to rely on institutional identity management systems, such as Shibboleth. Library publishers should discuss these issues with their central IT departments to follow local recommendations and get support from experts.
As with any personal information that is collected, it is important to not collect more than is needed, to not retain it longer than necessary, and to make sure the information is kept secure. The Federal Trade Commission’s advice for mobile health app developers is good to keep in mind: “if you don’t collect data in the first place, you don’t have to go to the effort of securing it” (FTC, 2016).
Library publishers should also be aware of what is being logged and what log files are being retained. Again, working with institutional IT experts will be helpful.
- Federal Trade Commission. (2016). Mobile health app developers: FTC best practices. Washington, D.C.: Federal Trade Commission. Retrieved from https://www.ftc.gov/tips-advice/business-center/guidance/mobile-health-app-developers-ftc-best-practices
- Federal Trade Commission. (2015). Start with security: A guide for business: Lessons learned from FTC cases. Washington, D.C.: Federal Trade Commission. Retrieved from https://www.ftc.gov/tips-advice/business-center/guidance/start-security-guide-business
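The points above about logging and retention can be sketched in code. The example below truncates reader IP addresses before they are stored and flags log entries that have passed a retention window. The /24 and /48 truncation lengths and the 90-day default are illustrative assumptions, not values drawn from any of the resources above, and the function names are hypothetical; local IT staff should set the actual policy.

```python
import ipaddress
from datetime import datetime, timedelta, timezone


def anonymize_ip(address: str) -> str:
    """Zero the host portion of an IP address so it no longer identifies one reader."""
    ip = ipaddress.ip_address(address)
    # Assumed truncation: keep the first 24 bits of IPv4, the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def is_expired(logged_at: datetime, now: datetime, retention_days: int = 90) -> bool:
    """Return True when a log entry is older than the (assumed) retention period."""
    return now - logged_at > timedelta(days=retention_days)


if __name__ == "__main__":
    print(anonymize_ip("192.0.2.123"))
    now = datetime(2018, 6, 1, tzinfo=timezone.utc)
    print(is_expired(datetime(2018, 1, 1, tzinfo=timezone.utc), now))
```

Truncated addresses still support coarse usage reporting, such as counts by network or region, while the retention check gives a scheduled cleanup job a simple rule for deleting old entries.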
New Resources Needed
This section highlights gaps in the landscape of ethical publishing resources, and suggests areas where development of new resources could have a significant impact.
- Further research is needed on the kinds of tracking analytics used by library publishers, e.g. Google Analytics.
- Clear options for analytics, should library publishers choose to use them
- Clarification on what library publishers outside the EU must do to comply with GDPR
The recommendations in this section draw on the resources above to provide guidance to library publishers looking for concrete, actionable steps they can take in this area. They are by no means the only place to start, and they may not be feasible or appropriate in all situations, but they may provide a good starting point for many libraries.
- Any library publisher that is not using HTTPS by default should work to make the change immediately.
- Disclose any analytics services your site uses. Check whether the analytics services you use have opt-out policies, and if so, be sure to publicize the opt-out options to readers.
- Make sure you keep all PII secure and that you do not collect or retain any that you do not need.
- Rely on institutional solutions for personal logins, such as Shibboleth.
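The recommendation not to collect or retain unneeded PII can be approximated with an allow-list applied before anything is stored. This is a minimal sketch under stated assumptions: the field names in `ALLOWED_FIELDS` and the function name `minimize_record` are hypothetical, and each program would need to decide which fields its workflows actually require.

```python
from typing import Any, Dict, Iterable

# Hypothetical set of fields a submission workflow actually needs.
ALLOWED_FIELDS = frozenset({"name", "email", "affiliation"})


def minimize_record(
    record: Dict[str, Any], allowed: Iterable[str] = ALLOWED_FIELDS
) -> Dict[str, Any]:
    """Keep only allow-listed, non-empty fields; everything else is never stored."""
    allowed_set = set(allowed)
    return {
        key: value
        for key, value in record.items()
        if key in allowed_set and value not in (None, "")
    }


if __name__ == "__main__":
    submitted = {
        "name": "A. Reviewer",
        "email": "reviewer@example.edu",
        "affiliation": "",
        "date_of_birth": "1980-01-01",  # not needed, so it is dropped
    }
    print(minimize_record(submitted))
```

An allow-list fails safe: a new form field is discarded by default until someone decides it is needed, which is the opposite of a block-list that stores everything not explicitly excluded.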
This section lists additional resources on this topic that may be of interest to library publishers.
American Library Association. (2014). Privacy: an interpretation of the library bill of rights. Washington, D.C.: American Library Association. Retrieved from http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy
American Library Association Intellectual Freedom Committee. (2014). Questions and answers on privacy and confidentiality. Washington, D.C.: American Library Association. Retrieved from http://www.ala.org/advocacy/privacy/FAQ
American Library Association Intellectual Freedom Committee. (2016). Library privacy guidelines for library websites, OPACs, and discovery services. Washington, D.C.: American Library Association. Retrieved from http://www.ala.org/advocacy/privacy/guidelines/OPAC
bepress. (January 23, 2018). Behind the scenes at bepress: Improving customer experience with investments in infrastructure and security [Blog Post]. Retrieved from https://www.bepress.com/behind-scenes-bepress-improving-customer-experience-investments-infrastructure-security/
JISC. (2018). General Data Protection Regulation (GDPR). https://www.jisc.ac.uk/gdpr
Lynch, C. (2017). The rise of reading analytics and the emerging calculus of reader privacy in the digital world. First Monday, 22(4). https://doi.org/10.5210/fm.v22i4.7414
Marden, W. (2017). Third-party services in libraries. In B. Newman & B. Tijerina (Eds.), Protecting patron privacy: A LITA guide, pp. 57–83. Lanham, MD: Rowman & Littlefield.
Newman, B., & Tijerina, B. (Eds.) (2017). Protecting patron privacy: A LITA guide. Lanham, MD: Rowman & Littlefield.
NISO. (2015). NISO consensus principles on user’s digital privacy in library, publisher, and software-provider systems (NISO Privacy Principles). Retrieved from https://www.niso.org/publications/privacy-principles
Peterson, A. (October 3, 2014). Librarians won’t stay quiet about government surveillance. The Washington Post. Retrieved from https://www.washingtonpost.com/news/the-switch/wp/2014/10/03/librarians-wont-stay-quiet-about-government-surveillance/
Smith, K. (2015). Where does FERPA fit? [Blog Post]. https://blogs.library.duke.edu/scholcomm/2015/02/23/where-does-ferpa-fit/
U.S. Department of Health & Human Services. (n.d.). Health Information Privacy. Retrieved from https://www.hhs.gov/hipaa/index.html