2024 Collaborative Book Review: Interview with Catherine D’Ignazio and Lauren Klein, authors of Data Feminism (2020)

On May 20, 2024, HASTAC Scholars Abby Cole, Stella Fritzell, Hamida Khatri, and Parisa Setayesh met with Catherine D’Ignazio and Lauren Klein to discuss their 2020 book, Data Feminism, which was also one of the subjects of the 2024 HASTAC Scholars collaborative book review project. The following is a corrected transcript of their conversation.


Abby Cole:
Could you briefly walk us through the process that led you to write the book, such as what inspired you, when you knew you wanted to take on this project, or anything else that lends context to that?

Lauren Klein:
Catherine and I connected over our shared interest in feminist data visualization. We had each been thinking about the topic independently. For me, this involved looking at some early feminist examples of data visualization and thinking about the ways in which not just women visualization designers, but all sorts of feminist thinking, might allow us to expand the range and capability of visualization in the present. (I actually just finished a full draft of this project a couple of weeks ago at very long last.) Meanwhile, Catherine had been thinking about the same topic in a contemporary context, and I’ll let her talk about that a little bit more. But when we first got together we thought we would just write a short paper about visualization, which we did. But then we were approached to write a whole book about feminist data visualization. Initially, we thought the book was just going to be like the paper, but longer. But as we began to think about what we would need to cover, we realized that you couldn’t talk about data visualization without talking about the entire process that leads you to the point where you can create a visualization. This involves analyzing the data, but before that, creating the data set, and before that, identifying the research question, and before that, building relationships with communities. So that was what took us to the framing of what would become Data Feminism. Catherine, do you have anything to add to that?

Catherine D’Ignazio:
At the moment where we came together, I had fairly recently graduated from the MIT Media Lab and it was a hype moment, similar to the hype moment of AI that we’re living in right now. The hype moment of that era was big data. I think one of the things that I was most shocked by was that there was so little discourse around critical approaches to data, like: Who collects the data? Which data are not collected? Why was the data collected? Who owns the data? Who gets to use the data? Who even has the capacity to use the data? So many of these questions weren’t being asked or answered at the time. I think that’s shifted now because there’s so much emerging and critical work in critical data studies, critical internet studies, AI ethics and more. But, at the time, there weren’t that many people asking those questions. So it was really like a breath of fresh air to meet somebody else who was also asking those questions, and to come together and then try to think through how to frame an alternative approach to data and data science that was critical. Also, we wanted an approach that was not a full-on refusal in the sense of no data or no AI, but instead one that asks, how do we refashion some of these technological practices that aren’t serving us and aren’t serving many people, particularly those people who are harmed by structural inequality?

Lauren Klein:
Maybe I’ll just add one more thing to that. While it was true that people in tech spaces were not having these conversations about critical approaches to data, one thing that was true, I think, for both me and Catherine, was that we had seen these conversations and participated in these conversations elsewhere. For me—and I’m making this explicit because of HASTAC—I had spent years in digital humanities, which was a place where we were thinking about how to adopt a critical, informed, contextualized, historicized approach to data and digital technologies more broadly, and still use those tools and technologies in intentional, informed ways. So it’s not that there weren’t examples. It’s just that the people who were advancing the hype at that point around data science were not looking far and wide enough to see examples of better data practices. And a large part of Data Feminism was to bring those examples to a broader audience.

Hamida Khatri:
How did you decide on the methodology for collecting and analyzing the data for your book? Were there any methodologies that you avoided for ethical reasons?

Catherine D’Ignazio:
We didn’t do a lot of primary source data collection for the book. It was mainly through literature review, case studies of existing work and some of our own work. But we did conduct some original interviews for the book. So I guess that was a form of qualitative data, but fairly standard. It was mainly to get more background and depth on some of the examples that we use in the book as cases or illustrations of some of the principles.

Lauren Klein:
One other part of our methodology that is maybe interesting in this context is how we selected our examples. We had to make a series of decisions about how we would select our examples—which examples we would choose and how we would bound their scope—and we spent a really long time thinking about this. In terms of background and positionality, the perspectives that Catherine and I brought to the table were pretty similar to each other. We’re both white. We both have multiple academic degrees. We’re both moms. We’d both worked in similar tech spaces. And we were aware that we couldn’t represent a wide range of perspectives by drawing only from our personal experience. So we knew from the jump that this project had to be one that centered experiences and perspectives that were not our own. We also both felt very strongly, both ethically and intellectually, that because of our place in the United States, we had an obligation to center experiences of race and racial oppression alongside questions of gender and gender oppression. And we were also schooled in intersectional thinking. So we very deliberately sought to feature examples and experiences of people with more direct connections to the range of forms of oppression that we were talking about in the book.

The other thing that is worth talking about in this context is our decision to bound the book around the United States. At a certain point, we asked, should this be a global project? Should we aim for proportional representation of all places in the world in terms of the examples that we drew from and the people we talked to? But we realized that we couldn’t do that. Not only would it be very logistically challenging to write a history of data practices from everywhere in the world, but the other consideration was that we were not the individuals in the best positions to do that. At that point neither of us had practices that reached much further than the North and South American continents. So we adopted an approach of gesturing outward to projects elsewhere in the world, while making clear that we were not claiming to speak on behalf of all data practices everywhere. That way, ideally, other people who were interested in data practices elsewhere in the world would take up similar projects and continue to expand the work that we had done.

Stella Fritzell:
I particularly appreciated the auditing process that you included in the book, which I believe was conducted by Isabel Carter, regarding certain metrics that you wanted to meet and certain values that you wanted to express. In your reflections on the audit, you indicated that the final copy of Data Feminism actually fell short in meeting the goal metrics that you had originally set. Would you mind elaborating on this some more, thinking, perhaps, about how you might now remedy some of those gaps, or if those are even gaps that could be remedied?

Catherine D’Ignazio:
It was a really interesting experiment, I would say. We thought a lot about it in retrospect, and I’m not sure if I would ever do it again in that same way. The basic idea was that when we wrote the book, we set up these metrics around different axes of oppression, and asked what we were going to do in terms of citational justice and what our goals were for whose voices we were going to try to center—a certain percentage of women authors, a percentage of trans authors or voices or cases, or a percentage from the Global South. So we set those goals up when we wrote the first draft. Izii, who was our research assistant at the time, and a student of mine, went through and cataloged everything. And then we revised, and revised, and revised. We published a community peer review draft and got a ton of feedback. A lot of our revision was citational revision, because people told us to look at this citation, to read this or that thing. In the end, our citational justice numbers went down, which we reflect on in the piece that accompanies the audit.

Since then, I’ve been thinking about this peer review process as a function of academic structures. It was the academic people who largely weighed in on the text and normed us into, “Why aren’t you citing these people?” A lot of the suggestions were for folks whom we were deliberately not citing, because they were already part of the canon that we were trying to avoid. We had lots of conversations where we were both irritated, like, “Why did they think we don’t know this person?! Maybe we just have to cite this person to prove to people that we know them?” So it provoked a lot of those interesting conversations, and in certain cases, we did make the decision to integrate a handful of those people. Though I think we still managed to avoid citing Foucault. But I’m not sure, because Foucault was recommended at least 50 times.

I think it’s fascinating, actually, that we were normed into the canon, and we also allowed ourselves to be normed into it. So we didn’t meet our goals. But then, I also think we wouldn’t have known that we fell short of our goals if we hadn’t been quantifying them.

But I still have reservations about this method. There is something that feels really reductive about it, too. You sort of pigeonhole someone, where this author is representative of trans people, or this author is representative of women, and this author is representative of the Global South, or whatever. So you’re basically putting people into these buckets and then trying to game the buckets, or something like that. And that doesn’t feel right to me as a citational justice process. Yet if we hadn’t done it, we would probably just have cited more of the same white, Global North, academic, cisgender, Christian, Anglo, etc, etc, people. I still don’t have an answer about what’s the right thing to do, but it was an interesting experiment at the time and it still provides me with a lot of food for thought. We have talked at certain points about doing a new edition or revision to Data Feminism, but would we do the same thing? I don’t know. I’m actually really curious to hear what Lauren has to say about this, too.
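As an aside for readers who want to picture the kind of audit Catherine describes, here is a minimal sketch of how goal percentages might be tallied against a cataloged citation list. The categories, goal figures, and entries below are hypothetical placeholders, not the actual metrics or data from the Data Feminism audit, and coding people into buckets like this carries exactly the reductiveness Catherine flags.

```python
# Minimal sketch of a citation audit tally.
# Categories, goals, and entries are invented for illustration only.

from collections import Counter

# Each cataloged citation is coded with the identity categories it was counted under.
citations = [
    {"id": "author_a", "categories": {"woman"}},
    {"id": "author_b", "categories": {"woman", "global_south"}},
    {"id": "author_c", "categories": set()},
    {"id": "author_d", "categories": {"trans"}},
]

# Hypothetical goal shares set before drafting.
goals = {"woman": 0.75, "trans": 0.20, "global_south": 0.20}

def audit(citations, goals):
    """Compare the actual share of citations in each category against its goal."""
    total = len(citations)
    counts = Counter(cat for c in citations for cat in c["categories"])
    report = {}
    for category, goal in goals.items():
        actual = counts[category] / total if total else 0.0
        report[category] = {"goal": goal, "actual": round(actual, 2), "met": actual >= goal}
    return report

for category, row in audit(citations, goals).items():
    print(category, row)
```

Rerunning a tally like this after each round of revision is what makes it possible to notice, as Catherine describes, that a draft has drifted away from the goals it started with.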

Lauren Klein:
What I have to say is that I agree with everything that Catherine said. It definitely made us more conscious of our choices. But in the end, the numbers don’t speak for themselves, to quote ourselves back at us, and this also helps support a larger point, which is that you can’t get the full picture without looking at other things in addition to the data. For example, at a certain point, we had a choice about whether we wanted to separate out the footnotes from the main content. And I think—Catherine, you can correct me if I’m wrong—we decided not to do that because we didn’t want people to feel like we were playing with the numbers. But I do think that if we had separated out the main content from the footnotes, we would have had better measures. This is because a lot of the choices that we made were rhetorical choices, like who to foreground when you’re telling a story or giving a genealogy. You have a lot of power as the author to say, “My story starts here,” and those were the moments in the book when we chose with a lot of intention the person who we wanted to voice the main concept.

I increasingly spend a lot of time thinking about what you can contribute as a scholar who writes long form scholarship, and who’s not always out in the streets. And where I’ve landed is that what we do in our jobs, what we are trained to do, is to tell compelling stories. One of the powers that we have as scholars is to shape how people perceive the unfolding of events. And it’s really important to think about the people you choose to center, the voices you choose to feature, and the examples you choose to foreground. That’s something that has been an evolution for my scholarship, going from dissertation, to first book, to Data Feminism, to the stuff I’m working on now. But it’s something that I think everyone could spend more time thinking about.

Hamida Khatri:
So speaking of storytelling and giving voices, what advice would you give to young scholars or data scientists who want to incorporate feminist principles into their work?

Lauren Klein:
Short answer, do it!

But longer answer: one of the things that I really love about a feminist methodological approach is that, if properly enacted, it has the effect of increasing all that we know. With a feminist lens, you gain more knowledge, you gain richer knowledge, you gain more voices, and therefore greater understanding. When you frame it in those terms, feminism is something that anyone should be on board with.

One of the interesting things that happened after we published Data Feminism is that a lot of people said to us, “Why did you call it Data Feminism? You’re just describing good research practices.” But it was really important to us to name feminism as the intellectual and ideological origin point for this work, because that was what taught me and Catherine the importance of this more inclusive and expansive approach to generating knowledge. I think it’s true that feminism has been responsible for a lot of good research practices becoming more mainstream, and it’s important to articulate that, especially when the stereotypical version of feminism that people have in their minds is: “Oh, they’re always complaining or criticizing.” But, in fact, feminism is one of the most generative lenses that you can take to any idea or process or action.

Catherine D’Ignazio:
Yeah, I concur with all that, and I would hope that we’re at a point where people aren’t facing too much pushback for naming their theoretical frameworks or their methods or their motivations as explicitly feminist. But I think it really seems to vary field by field — you still encounter folks who feel very constrained by the norms of their field and wouldn’t be able to be openly feminist in their field. But I think there’s a way to do feminism from every position, even if the feminism has to be undercover sometimes.

Abby Cole:
Yes, so that way of thinking really informs my own personal research. My work looks specifically at emerging journalism, and considers a feminist approach to journalism that informs social justice. So I was very excited, because you connect a lot with that in the book as well. So I really support journalists embracing a feminist standpoint, such as strong objectivity, through their work. The book also suggested that journalists should consider their positionality and interpret the data for the readers. So my question is, considering that this goes against the traditional ideal of neutrality in a reporter, could you elaborate more on how you feel that journalism, specifically, should embrace this positionality? And what might that look like in practice? And then, second to that, how might that specifically inform journalism that’s grounded in social justice?

Catherine D’Ignazio:
So I taught for a number of years in a journalism program at Emerson College. I find journalism to be this very interesting space, because, on the one hand, so much of the data-driven work that’s going on right now that is oriented towards social justice is coming out of journalism. There are really strong data journalism practices, and journalists are conducting a lot of really interesting original investigations. They are collecting some of their own data, and producing their own databases and data sets and then reporting from those. There are some super interesting models, like ProPublica for example. I’ve always found it interesting how they would do a big story and make a national database, and then local outlets could use their data and report in their own context and add to and expand on the story.

I totally hear you on the norms part, but I feel like there are bridges that could be built. There were some conversations around solutions journalism that I found interesting, for example, or community-based journalism. At least we were beginning to develop alternate models for how news happens, and to ask what the standpoint of the person doing the reporting is. It’s a different thing when you’re a “neutral observer” or when you’re positioned as somebody who wants to help solve a problem, like this solutions-oriented journalism, versus somebody who’s based in a community and doing community-led journalism. So some of those are potential pathways towards reconceptualizing, and I think breaking down, some of the objectivity norms.

In fact, the objectivity norms lead us down some really harmful paths in terms of the “both sides-ism” and false equivalencies that we see now. Finally, journalists have stopped addressing climate change from “both sides,” for the most part. But this is such an ingrained thing, where you teach journalists that if something’s controversial, you have to get quotes from both sides, which I think is hogwash, to use a good word. I don’t think there are ever two sides, right? And there are also these ways in which journalists, and those reporting norms, have actually been weaponized by right-wing extremists to end up elevating very extremist and radical points of view as the “other side” of something.

So the use of objectivity norms is actually lending aid to the rise of right-wing ideologies and right-wing extremism in that way. I think there is a path to understanding that, but it has to be through education, and also providing people with new models and new modes. What does it look like, for example, to work in solidarity? I think also in academic research we don’t have a great model for that and we get challenged. I get challenged around using solidarity as a method in my work. But I also think that when it’s a question of human rights, I’m not going to be like, well, there are both sides. There aren’t two sides.

Lauren Klein:
I agree with all of that, and maybe I can expand a few points from a historical perspective in light of what we know about the construction of objectivity. You could think of Lorraine Daston and Peter Galison[1], or Shapin and Schaffer[2], or any of these canonical texts on the history of science. What they tell us is that objectivity itself is a rhetorical construct that emerged in order to grant authority and facticity to newer modes of knowledge making.

I’m not an expert in the history of journalism, but one can assume that the same patterns we saw in earlier forms of knowledge making have come to apply to journalism. So this idea that there is a neutral, objective version of things that is not filtered through human experience is in itself a false construct. But this is not to say that everything is relative, and that we should not believe the news. Rather, you need to ask who is doing the telling, whose perspective you are hearing, and whose interests any particular article is serving. Answering those questions will give you a better sense of what it is that you’re reading.

In addition to the substantive examples that Catherine was talking about, there’s been really interesting work on actual word choice, and how there’s no context-free language, and that is just how we communicate. There’s been a quantitative study of words used to describe the coverage of Israel versus Palestine in the New York Times, and it found widely divergent word choices. Things like this may not even be conscious. It might be, but maybe not. The point is that there is perspective encoded even at the level of words, and I think the world would be so much better, and that we would have such a richer understanding of nearly everything, if we understood that fact. I mean, all information comes from a context and paying attention to the context is how we get a more nuanced picture of what’s actually going on.

Stella Fritzell:
A question that popped out to me as a reader of Data Feminism, particularly in chapter three, which I believe is also spurring Abby’s thoughts about journalism, objectivity, and neutrality, is one of interpretation. Namely, who, in the process of data collection and representation, is permitted or responsible for the interpretation of that data? I’m not sure I ever reached a satisfactory answer as I thought about that question personally, so I’m hoping that you can reflect a little bit on that topic. Who should ultimately be responsible for interpreting data, or who is allowed into that process? And at what stage? Is it the analyst, the visualization designer, the publication editor, the readers themselves, and/or any other actors?

Lauren Klein:
I think the answer to your question is all of those people and at all stages, but we are more deeply attuned to certain moments of interpretation and not others. For example, anyone who’s ever collected data knows that the minute you decide how to collect your data, you’ve locked in many choices. What types of data are you going to collect? What categories will structure it? These decisions are interpretations, right? They represent your view about what kind of information can be captured by the data, and that ends up shaping how the data can then be analyzed. But when you get a data set from someone else, you often feel like your data analysis starts at that moment. I mean, that’s the part of the process that’s called analysis, right? So one takeaway is just to recognize all of the moments when active interpretation is taking place, and account for it when you are drawing conclusions.

But there is a danger here—which is that identifying all of the different ways in which data is shaped by people gives other people grounds to throw out the information that is gained. One of the toughest rounds of questions we ever got was when we presented our work to the CDC shortly after the introduction of the Covid vaccine. They were really angry with us because they had worked so hard just to get the public to believe the data. And then we were there suggesting that they should also explain that the data was more complicated, and that there might be more to interpret along the way. The CDC just wanted people to take away a clear message, which was, “You should get vaccinated because it will save your life.”

At the time, I didn’t have a great answer to that comment. Their point seemed very valid. But now, reflecting on it a bit, I think what you could ask along the lines of Data Feminism is, “Who is doing the data collection, the analysis, and the communication here?” And the answer is experts trained in epidemiology. So if you can say, “Okay, it’s experts who are trained in a field, with actual public health degrees,” then as viewers we should be able to trust that information, certainly more than something on the internet where there’s no way to figure out where the interpretations are coming from and what agendas they were informed by.

Catherine D’Ignazio:
I also remember that moment. And actually, it relates to a debate that is going on right now in visualization research around depicting uncertainty. And the debate is firstly about how to represent uncertainty visually in a medium that uses mostly clean lines, geometric shapes, and so on—so how to represent uncertainty in a medium that appears to convey certainty. And then, secondly, how much uncertainty should be represented? At what point when you start to show uncertainty will people actually start mistrusting the message or the messenger? There’s this really interesting paper that we read for my spring Data Visualization class, co-taught with Arvind Satyanarayan, where they went out and interviewed practitioners about how much uncertainty they put in. And for the most part, data visualization practitioners really avoid showing any uncertainty, because they don’t want to detract from the facticity or perception of truthfulness of the message. So it’s a really interesting conundrum, actually.

I also agree with everything that Lauren said in terms of the interpretation, particularly the interpretive nature of many steps in the data processing pipeline that we don’t really think of as being interpretive. Maybe we think about visualization as just a visual thing. But in fact, the laying out of data, which columns you use in your data set, that’s an act of interpretation. It’s actually a huge act of interpretation, because you’ve decided which variables are included and which variables are not included. So even that—what is the available data and what are the sorts of features or variables associated with it—hugely affects everything downstream. It’s the same with the analysis and the same with exploratory visualization, where you’re trying to figure out what the messages are. Each of these involves active interpretation. And who gets to participate in that is very much worth thinking about. One of the things, for example, that I talk about in my new book, which builds on Data Feminism, is the progressive levels of distance. Often the people working in mainstream data science get data that comes out of these original interpretive acts of actually collecting data about the world; they’re downstream of those acts. Usually you’re so many levels removed from that active interpretation when you’re actually doing analysis or modeling or visualization. So there’s a lot of loss of context and information along the way. And it’s especially dangerous when you don’t even understand the fact that you’re accessing an interpretation. You’re not actually accessing “raw” knowledge about the world.
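To make the uncertainty question Catherine raises above a bit more concrete, here is a minimal sketch of the design choice in question: the same series plotted as a bare trend line versus with an explicit uncertainty band. It is a generic illustration written for this review, not an example from the paper or the class she mentions, and all of the data and interval widths below are invented.

```python
# Generic illustration of hiding vs. showing uncertainty in a line chart.
# The data and interval widths are invented for demonstration purposes.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.arange(2010, 2025)
estimate = 50 + 2.0 * (x - 2010) + rng.normal(0, 1.5, size=x.size)
half_width = np.linspace(3, 8, x.size)  # uncertainty grows for more recent estimates

fig, (ax_certain, ax_uncertain) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# Left: the "clean line" convention that appears to convey certainty.
ax_certain.plot(x, estimate, color="tab:blue")
ax_certain.set_title("Point estimates only")

# Right: the same estimates with an explicit uncertainty band.
ax_uncertain.plot(x, estimate, color="tab:blue")
ax_uncertain.fill_between(x, estimate - half_width, estimate + half_width,
                          color="tab:blue", alpha=0.2, label="uncertainty band")
ax_uncertain.set_title("Estimates with uncertainty shown")
ax_uncertain.legend()

plt.tight_layout()
plt.show()
```

Whether to draw the band at all is the trade-off Catherine describes: the band is more candid about what is known, while the bare line tends to look more authoritative.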

Parisa Setayesh:
Connected to that, I have a question about environmental data, which can be very complex, especially because it is prone to misinterpretation or miscommunication, and it deals with similar challenges in communicating uncertainty while ensuring accuracy and inclusivity. In the process of selecting cases for the book, or in your other work, have you encountered instances of work with environmental data that you considered using?

Catherine D’Ignazio:
We didn’t talk a ton about environmental data specifically in Data Feminism. And it’s actually something we’re talking about right now in a new paper about Data Feminism for AI, where we are thinking about the environmental costs of large language models, like what it takes from an environmental perspective to actually have something like ChatGPT.

One example that comes to my mind, that we didn’t talk about in Data Feminism, is this really interesting journalism project that uses data in the form of qualitative stories. It’s called ISeeChange, and it is a website and a storytelling platform. It’s based on this idea that people on the ground are seeing climate change in their everyday lives, particularly those in rural communities and those who work on the land, but the current atmosphere of partisan discourse on climate change may cause them to say that they don’t believe that climate change is happening. And yet they’ll simultaneously talk about all of these ways in which they’re seeing change happen on the ground in their lifestyles and their livelihoods. So it’s premised on this idea of collecting stories of what people on the ground are seeing happen in their own backyards and everyday environments. What I took away from the project is the focus on reaching rural communities in particular. I think it’s a powerful example of trying to meet people where they are and then scaling up from that. The reason it becomes data is because it’s done at scale, and there are many of these stories that are collected. Even if you may be predisposed to not fight for climate legislation or something like that, you can start to hear stories from people like yourself and potentially make your own connections, rather than being told this top-down by an authority.

Stella Fritzell:
In the introduction of Data Feminism, you propose a very broad audience for your book, one that some might say is quite ambitious. My own encounters with your book, and those of my colleagues, have all taken place within the university library setting. Have you, as authors, encountered this book in unexpected places, or heard reports of such encounters outside of the university library? Do you feel that the book has reached the audiences that you set out to reach when it was published four years ago?

Catherine D’Ignazio:
I think it went way further than we had expected. We were being very ambitious—I can’t remember exactly what the full list for the intended audience was—but I do think that Data Feminism reached far more people than we ever expected it to reach. At the time, I was thinking of myself as speaking to both the practitioner communities and the scholarly communities that I was a part of—critical data studies, sociology of information, information scholars, some arts people, data journalism people, some designers—basically an interdisciplinary, but kind of niche audience. That was the world that I hoped the book would impact. But we’ve definitely encountered a broader audience. We’ve had all manner of invitations from diverse places; for example, CAUSE (the Consortium for the Advancement of Undergraduate Statistics Education) and the European Parliament both invited us to speak with them. The book has traveled in ways that we couldn’t have imagined; I’m not sure we ever meant it to travel so far. It’s been really exhilarating to see it picked up in these different places. But it sometimes puts us in awkward situations, because people write to us to tell us, “I’m working on feminist foreign policy, and I’d love your input on something, and I’ve been using Data Feminism.” And we think that’s great, but we can’t offer much more than what’s within the book. I’m definitely not an expert in feminist foreign policy, for example.

Lauren Klein:
Just to add another recent story that really inspired us, we recently received an email from someone named Jose Ribas Fernandes, who works at a Canadian gas and oil pipeline regulator. He wrote to let us know that our work had been taken up in that context, despite what we might expect from the oil and gas industry. And he proceeded to tell us about this amazing series of projects that his group had undertaken, having gone really systematically through our book. For example, in Canada there’s a test employees take on their views, similar to the US Federal Employee Viewpoint Survey. So they plotted the demographics of who was doing well on the test and who wasn’t, and used it to advocate for changing HR practices. And then they did another project where they looked at testimonies recorded during pipeline hearings from First Nations, Métis, and Inuit peoples, who had seen their land being impacted by resource extraction, the loss of traditional species, pollution, and so on. They discovered these fascinating tribal histories that were contained within the testimonies. But they were not identified as such because only the pipeline company was in the metadata. The group realized that they could stitch together these personal accounts and find stories that they thought had been lost. And so they’ve been engaging in dialogue with the nations’ own archives to repatriate the accounts of the people whose land was taken from them. To hear about that was really fascinating and compelling.

As far as the audience I was writing for, I thought a lot about reaching students, since at that point I had been teaching at Georgia Tech for almost a decade. I now work at Emory, but when I was writing Data Feminism I was still at Georgia Tech in an interdisciplinary humanities department. So I had a lot of experience in the classroom of trying to explain humanistic concepts to engineers and finding contemporary examples. I think I had a good sense of what examples would reach people in more technical spaces. But there’s one other thing that was really motivating for us—and I’m speaking for Catherine here, but I think this is true for both of us—and it’s that we were both certain that we were right. We just believed in what we were writing. There were some claims that we would develop to a strong conclusion, and we wouldn’t necessarily have the exact citation yet. But we’d do some research to back up the claim and find citations immediately. For example, take the representation of women and racialized minorities on the boards of tech companies and in executive leadership. We generally had the sense that representation was bad, because the decisions being made by these giant tech companies clearly seemed to be happening because they didn’t have representation from any perspective other than a white, male, Western-educated, able-bodied one. But we didn’t have the data at our fingertips. Then we went digging into the data and it was even worse than we thought. For example, the number of women CEOs at the top tech companies was zero. There were no women. No one was there. And there was a certain confidence that came with experiences like this. In academic writing, you are often encouraged to present your contribution in a way that is very modest. But in this case we felt so strongly about what had to be said, and the fact that no one had said it yet, that our confidence and tone came from that.

Catherine D’Ignazio:
I totally agree with that, and it’s actually one of the liberating things about writing in a book format versus writing in a paper format. We could make those claims and illustrate them rhetorically. I would say it’s a very rhetorical book, in the sense that it’s putting forward a lot of claims, but it does try to substantiate those claims and use available evidence, and so on. Many of the claims are strong and they’re expansive, but we’re trying to think expansively. I think that’s something inherently available to you in the book format and it’s exciting. This came out of the collaborative process, too, because then we could talk about things and make sure that we were aligned on what we were saying. So there was a check in terms of having another person there in the process. At the same time, if we’re tackling really big issues, like structural inequality, we do need narrowly focused studies that provide evidence for certain things, but we can’t be limited to only doing that work. Because then you spend all of your time just proving that inequality exists in hundreds and hundreds and hundreds of ways, but you never actually take action. So we need to be able to think, and act, and speak expansively about these things, because these are expansive structural issues.

Hamida Khatri:
Speaking about availability, you chose to make Data Feminism an open access digital publication. Can you discuss the importance of accessibility in academic work, and how it aligns with the principles discussed in your book? And do you have any advice for HASTAC scholars who might be weighing the benefits of traditional versus open access publication, and who are new to the world of publication in general?

Catherine D’Ignazio:
One of the reasons was that we didn’t have to request it: Data Feminism is actually part of a series called Strong Ideas that was already planned to be open access. And it very much aligned with our goals, and we did do other things along the way, like the open peer review process for the draft, which I think are very aligned with feminist goals. Often, if you are publishing a book, you either need to fight a lot to get it open access, or you need to pay a lot to get it open access. That was the case with my second book, for which I actually applied for and got a grant to make it open access, but it was quite pricey to do that. If open access is a path that you have available to you, I would encourage everybody to take that path. The knowledge can get out to more people who may be in a position where they couldn’t necessarily pay for it, right? They might be based anywhere around the world and may not necessarily have access to books and bookstores. And also for students: I think it’s one of the huge things that made Data Feminism something that people assigned in classes, because there was an open access version. Students didn’t need to actually buy the book. So I think open access would increase your impact.

We were also very fortunate to work with a press that is experimental and is experimenting with open access books and forms of knowledge. But, Lauren, you have more experience with publishers other than MIT Press.

Lauren Klein:
I think most presses know by now that it’s a different audience who buys books and who reads things online. I had come to MIT from years of working with University of Minnesota Press on Debates in the Digital Humanities, which has always been open access for the exact same reasons that Catherine described. UMP did not find a drop-off in sales because of the open access versions, so that’s a really important message for presses to hear and for authors who are worried about lost readership. At this point, if someone can’t find your book online and they can find another book online, they’re going to read that other book. The place where this message has not yet come home is mostly tenure-granting departments, which sometimes think that open access is not as professional or scholarly, especially in cases when you’re given the choice of releasing the book either online or in print. Both MIT and Minnesota are hybrid. But not all presses are. And that’s where junior faculty members end up in a trickier spot, because they know that the open access version will be more widely read. But oftentimes there’s pushback from the department because of the idea that not having a physical book is somehow less rigorous. So that’s where examples like Data Feminism or Debates in the Digital Humanities can really help.

With that said, it’s important to recognize that not everyone has the opportunity to publish things open access. So I get a little uncomfortable when people try to ascribe virtue to us for making Data Feminism open access. Because, as Catherine mentioned, this was a feature of the series already, and while we would probably have negotiated to make the book open access had it not been, it does cost money, and only people at certain institutions have funds to pay for OA fees. My previous institution was a public institution, and they would not have given me the ten-thousand-plus dollars that it takes to pay for a book to be open access. But institutions with greater resources, usually private, are able to do that more readily. So, should we work towards an ecosystem in which all research is more easily accessible? Absolutely! But there are a lot of considerations that come into play, so it’s important to recognize that it’s not always up to the author themselves, at least in this moment in time.


Citations:
[1] Daston, L., & Galison, P. (2007). Objectivity. Zone Books.
[2] Shapin, S., & Schaffer, S. (1989). Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life, Including a Translation of Thomas Hobbes, Dialogus Physicus de Natura Aeris by Simon Schaffer (1st Princeton pbk. printing with corrections). Princeton University Press.

As interviewers, we would like to thank Catherine and Lauren for their time and thoughtful responses. Data Feminism has been an exciting and inspirational text to engage with as part of the HASTAC Scholars collaborative book review process. By sharing this interview, we hope to encourage further conversations with friends, family, and colleagues about the importance of intersectionality, positionality, context and history in data science practices.

You are welcome to comment on this blog post to share your own thoughts and impressions. All comments should be civil in tone and refrain from using profanity or intentionally incendiary speech.