Digital Privacy in the Community College Library
By Megan Kinney, City College of San Francisco
The full-time equivalent (FTE) enrollment for the 114 California community colleges in the 2016-2017 academic year was 1,126,709.3.¹ In our libraries, there are too few of us for the number of students we serve, we wear too many hats to meet their needs, and growing our staffing is a challenge. That said, we must be incredibly careful in our service. Our students belong to our most vulnerable populations, and in their daily lives they are heavily surveilled. In our system, 43% of the students we serve are first-generation.² In very recent times, a number of our students willingly divulged detailed information about themselves and their families in the Deferred Action for Childhood Arrivals (DACA) application process. We have seen the erosion of trust as DACA has come under attack by the new administration. A large body of their personal data now exists in one place and can be used in ways they did not imagine when applying.
As a Library Freedom Institute participant, I was given the time and space to think through the privacy challenges we face as librarians in our current digital reality. Things have changed quite a bit since I learned about librarians pushing back on requests for patron checkout histories. We are asked to prove student success as tied to library usage through card swipe entry tracking, information literacy session attendance, book checkout counts, and other learning analytics. Despite this, we have many professional guiding documents related to protecting user privacy. Their guidance includes being transparent about user information in libraries (including how it is collected, shared, and used), paying attention to data retention policies, and considering what happens with users’ data once it leaves the library. Third parties are privy to user data when our communities use vendors’ sites to do research or to book study rooms, and sometimes we share data with them for the particular purpose of parsing our analytics for us.
Our professional ethics can be called into question as we reach for funding and attempt to prove our efficacy at our colleges. I recall my first attempt during a campus Equity funding application process (proposing the library’s contribution to equity efforts on our campus), and struggling with one question in particular. How could we prove we were moving the needle for our school’s equity target populations? How would we measure it? My colleagues and I could not think of a safe way to measure our equity target groups without doing so on an individual level, and we were unwilling to forgo our professional ethics. We instead used figures from available academic research to make the case for funding our privacy-conscious route. Why would we study our students when the data exists elsewhere? This did not work.
At regional workshops, we started asking our peers how they were going to handle this. We found libraries gaining significant funds for textbook loaning programs, equity librarian positions, and more. When we asked how they were proving the efficacy of these efforts, we heard about sign-in sheets (no titles, but names + student IDs), and other system workarounds (counts downloaded on the hour from patron records identified with demographic codes, then presented in aggregate without student names + IDs).
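The aggregate-only workaround described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of what any of these libraries actually ran: the field names (`name`, `student_id`, `demographic_code`), the function `aggregate_by_code`, and the `min_cell` threshold are all hypothetical. Folding small groups into one bucket is an added precaution that is not mentioned in the workshop accounts.

```python
from collections import Counter


def aggregate_by_code(records, min_cell=5):
    """Count records per demographic code, ignoring names and student IDs.

    Groups with fewer than min_cell records are folded into a single
    "suppressed" bucket so very small groups are not reported on their own.
    (min_cell is a hypothetical threshold chosen for illustration.)
    """
    # Only the demographic code is read; identifiers never enter the report.
    counts = Counter(record["demographic_code"] for record in records)

    report = {}
    suppressed = 0
    for code, n in counts.items():
        if n >= min_cell:
            report[code] = n
        else:
            suppressed += n
    if suppressed:
        report["suppressed"] = suppressed
    return report


# Hypothetical sign-in records: six visits coded "X1", one coded "Y2".
sign_ins = [
    {"name": "student", "student_id": str(i), "demographic_code": "X1"}
    for i in range(6)
]
sign_ins.append({"name": "student", "student_id": "99", "demographic_code": "Y2"})

print(aggregate_by_code(sign_ins))
```

Even a report like this still depends on identifiable records being collected in the first place, which is the ethical tension the sign-in sheets raise.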
In recent years, we’ve also seen our districts invest in retention software. I was invited by student services at my last institution to attend Hobsons University, in the midst of our campus pilot program utilizing Starfish. Starfish is an “early alert” system, allowing instructors to flag students with needs (such as financial aid intervention and counseling assistance), and give them ‘kudos!’ It also allows for referrals to get research help from librarians. In one session, a community college explained how they track student retention over several semesters. The tracking includes what the student does in the system (accessing the library, tutoring, counseling, etc.). The presenters said they had found a great partner in the library, but then asked, “Did you all know librarians have a special code?” I assume the “code” is the ALA Code of Ethics, as they went on to explain that the librarians at the college would not use the system to track students individually, but worked to find another method for getting their aggregate data into the retention tracking tool.
The issue of retention software is tricky because there are some convincing use cases, as well as some examples of my worst fears realized. In some of these systems, it is possible to use “predictive” analytics to determine whether a student might be successful, based on patterns that “emerge” from their time at the school or from transcript data. This can be used to create targeted interventions to help students overcome their struggles, but schools can also use it to decide whether they will let students in at all. My favorite part of being a community college librarian is that WE ACCEPT EVERYBODY.
It would be nice if tools could help support us in the work we do without being discriminatory, but we are all familiar with the daily examples we see in the news about how personal data is being used for nefarious purposes. Even when the data appears to show that retention systems work, other factors may contribute to improvements. For example, in a talk for the Berkman Klein Center for Internet & Society at Harvard University, Virginia Eubanks (author of Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor) noted:
...Georgia State University in 2012 moved to a predictive analytics in their advising... Their retention rate went up something like 30%.
But the part of the story that gets buried over and over again every time that it's written about is that at the same time they moved to predictive analytics, they went from doing 1,000 advising appointments a year to doing 52,000 advising appointments a year. They hired forty-two new full-time advisors. And that always ends up in paragraph 17 of these stories.
Forty-two new full-time advisors! Granted, their FTE is 44,819,⁴ but what a dream to hire the people power you need to effect change like that.
If we must engage in analytic parsing because of the institutions we serve, trust is an important component to consider when managing large sets of personal data. One interesting research project in the academic library sphere at the moment is the Data Doubles project. The project digs into "student perspectives of privacy issues associated with academic library participation in learning analytics (LA) initiatives." At ACRL 2019, researchers Kyle M. Jones and Michael Perry reported that students indicated trust in libraries. In focus groups, they and their co-researchers found that “[even] though students were positive about library learning analytics, they did express a number of questions about the practices, especially since they had never been informed that their library had access to or was analyzing certain types of data...”⁵ What do we do with this trust?
We may feel like we are doing our best to serve students in every possible way we can (through our stacks, at our desks, online), but what does it mean when we send students to other entities’ websites to meet their research needs? What kind of information is passed to the vendor? How are vendors using it for their own purposes? Read Mimi Calter’s editorial for The Scholarly Kitchen to learn more about “silent sharing” by vendors.
One response from libraries is to draft or sign on to privacy-related statements. As stated in The Role of the Library Faculty in the California Community College, which was passed via resolution 16.01 by the Academic Senate for California Community Colleges, “Privacy of users is inviolable, and community colleges should make certain that policies are in place to maintain the confidentiality of library records and library use data.”⁶ The Stanford Statement of Patron Privacy and Database Access is one example of such efforts. A privacy statement from our Council of Chief Librarians is underway.⁷ In addition, it would behoove individual libraries to have privacy and data management policies that are publicly available.
There may be concern that the language of policies is inaccessible to some users, or that users simply don’t read them. Consider how you can fold privacy topics into library orientation sessions. For example, while teaching about database searching, spend a minute on the authentication process. You probably already distinguish between on-campus access and logging in while off campus. Explain briefly how IP addresses work and how students are surfing via the school IP address when logged in with their student ID. This can help students feel safer searching for sensitive topics. (But, oi, what about those proxy logs?) You can also talk about cookies when teaching the web searching process. Let students know what information is being communicated to the search engine, that algorithms decide what results they get served, and that some search engines are advertising companies. Cookies can be used to identify individual users and to serve very targeted advertising. The data they generate is collected by companies online, who then turn around and sell it.
I was inspired to join this profession by the librarians who came before me and by their strong convictions about privacy. My concerns grow daily, most recently driven by news related to the Santa Cruz public library system and LinkedIn. I’m sure yours do, too. What else can we do about it? [Seriously, let’s talk.]