Data Feminism Catherine D’Ignazio & Lauren F. Klein, Data Feminism. The MIT Press, 2020. Pp. vii, 314. ISBN 9780262044004, 9780262547185. Review by Stella J. Fritzell, Bryn Mawr College. sfritzell@brynmawr.edu
This volume by Catherine D’Ignazio and Lauren F. Klein sets out to join the feminist belief in equality of the sexes with the practicality of activist work by laying out a way of working with data informed by the traditions and legacies of feminist activism and critical thought. The authors put forth a seemingly lofty goal for their book—namely, “to take a stand against the status quo” of a world that benefits some at the expense of others (18). In stating this aim, the book situates itself more as a manifesto than a practical guide. Yet the information within its pages provides firm examples of how such work might be conducted, keeping this manifesto from straying too far into the realm of the philosophical. D’Ignazio and Klein articulate a secondary aim for the volume, which is demonstrably more concrete and measurable in scope, and robustly accomplished. Data Feminism is successful in showing “how governments and corporations have long employed data and statistics as management techniques to preserve an unequal status quo … [and] how the power of data can be wielded back” (17). While the authors draw upon the scholarly and activist work of numerous individuals, they acknowledge a particular debt to Black feminist scholars—including bell hooks, Kimberlé Crenshaw, and Patricia Hill Collins—whose important work on intersectionality and systems of power has changed the face of modern feminism.
Data Feminism consists of seven chapters, with an introduction and conclusion, three appendices—a statement of shared values and measurable metrics, a reflection on auditing these metrics, and an acknowledgement of two community organizations—figure credits, sixty-eight pages of notes, and two indices. A table of contents is printed at the end of this review.
The introduction defines “data feminism” as “a way of thinking about data, both their uses and their limits, that is informed by direct experience, by a commitment to action, and by intersectional feminist thought” (8). Here the authors put forth the premise that data is any type of information that can be systematically collected, organized, and analyzed. This point is reiterated throughout the following chapters, which also rearticulate the fact that standard or traditional data science processes tend to operate with an inherent “privilege hazard” and rarely present truly diverse data. The work of data feminism, then, “is first to tune into how standard practices in data science serve to reinforce [] existing inequalities and second to use data science to challenge and change the distribution of power” (8-9).
Each of the volume’s seven chapters is structured around a core principle, which the authors have developed from the foundations of intersectional feminist thought, and provides real-world examples demonstrating how that theoretical principle might be put into action. While interconnected, each chapter approaches the question of data science and feminism from a slightly different position. These may be roughly separated into pairs (with one exception), where one chapter introduces a particular problem within data science, and the other elaborates on its possible solutions or considers the same challenge from a separate angle. Chapters One and Two, for example, are both concerned with issues of power—who has it, how it is used, and who benefits from it. Chapter One, appropriately dubbed “The Power Chapter,” introduces Patricia Hill Collins’s idea of the “matrix of domination” as a means of examining how power operates within the field of data science and demonstrates that this work frequently echoes and reinforces real-world power structures that render minority groups invisible, uncounted, and unrepresented.
Chapter Two discusses methods of mobilizing data science to challenge these same extant power structures through the collection of counterdata, through data analysis and data auditing that shed light on opaque processes, by reorienting the end goal of data science around co-liberation, and through the engagement and education of new data stakeholders.
Classification, that of both the subjects and of the practitioners of data science, is the issue at stake in Chapters Four and Five. Chapter Four discusses the process of classification as an important but problematic component of data collection. While the chapter is limited by its overwhelming focus on the binary categories of sex and gender, it does effectively argue that careful consideration must be paid to the circumstances surrounding acts of classification in data science. Chapter Five introduces distinct categories to describe how data science professionals are imagined and how their labor is valued, both within the field and in the world at large, and argues for a shift towards intersectional models of data work as a means of remedying the negative externality that arises when data professionals are too far removed from the contexts of the data which they study. In Chapters Six and Seven, D’Ignazio and Klein discuss the importance of creating transparency within data science work, both by centering the contexts of data itself, and by extending proper acknowledgement and attribution to the many hands that shape the data through processes of collection, analysis, and publication.
Of these interrelated chapter pairs, Chapter Three is, of course, the partial outlier.[1] Centered on the principle of elevating emotions and embodiment, this section concerns itself with the traditional and contradictory positioning of data science as a detached and dispassionate field, which nevertheless relies upon human actors—living, feeling individuals—in order to carry out its work.
In the opening sentences of the chapter, D’Ignazio and Klein push their readers to question the apparent neutrality and objectivity of data visualizations, which are arguably the most popular medium for disseminating analyzed data to a public audience. They are, of course, right to do so. This apparent neutrality, conveyed by the minimalist approaches to design espoused in the best practices of data science, may be characterized as what feminist philosopher Donna Haraway dubs a “god trick”, a method of data presentation that, whether intentionally or unintentionally, “masks the people, the methods, the questions, and the messiness that lies behind clean lines and geometric shapes” (76). Some persuasive agenda is at work even in the plainest of visualizations, or, to frame it differently, a degree of bias will always be present due simply to the human element(s) that collects data, defines categories, directs algorithmic analyses, and so on. To debar emotion from data studies is, then, the authors suggest, at its most innocent, a pointless endeavor and, at its most malicious, a deliberate act of exclusion. In questioning the objectivity of data, D’Ignazio and Klein beg us to consider whether visual minimalism is truly a more neutral means of presenting data, whether it should continue to be an ideal for data visualizations, and how the emotion implicit in any data presentation might be leveraged to aid in education and communication. Rather than valorizing the ideal of neutrality, the authors suggest that a data product ought to be formed and framed using feminist objectivity and positionality. To do so is to make clear the human contexts of data collection and visualization. The significance of context is discussed in more detail in Chapters Six and Seven; here, however, it is not immediately clear how this self-evident means of combating illusory neutrality ought best to be applied to data visualization practices.
Data “visceralization,” or data representation that can be experienced both emotionally and physically, presents an interesting possibility for leveraging and embodying emotion for data engagement. Yet, as D’Ignazio and Klein point out, most examples of such multi-sensory data representations have been produced only within research labs, galleries, and museums. Multi-sensory data representation is particularly appealing for its potential uses in widening access to data science for both able and disabled individuals. Most examples included in this chapter, however, are limited and suggest a performative inclusivity. The description provided of A Sort of Joy, which took place at the Museum of Modern Art (MoMA) in New York, for example, illustrates a highly affective performance piece, which relied upon gendered performers, spoken word, and kinetic movement to demonstrate the discrepancy between male and female artists represented in the museum’s collections (85-87). A non-sighted person, however, would not have been able to appreciate the kinetic aspect of this performance and may have had difficulty discerning the gendered aspect of the performers, given the natural acoustic variances of the human voice. The examples listed in the following pages are less impressive in this respect—even as physical or tactile elements, such as clothing, are incorporated into a data “exhibit”, the presentation of the data itself remains wholly visual (cf. 87-88). This does not serve the goal of accessibility. If, in making a data representation multi-sensory, the new sensory elements do not in some way replicate as well as reinforce the original (often visual) element, we then render the data fully inaccessible by making it illegible to assistive devices that might otherwise make use of descriptive alt-text. Of course, “a design choice made in one context or for one audience does not translate to other contexts or audiences” (91).
Still, I am sure the authors would agree that accessibility is important, particularly as we move to reintroduce and rebalance assumptions about human emotion and reason within data science. Underscoring the question of emotion and the approaches towards leveraging emotion in data science that are introduced here is the additional issue of who is ultimately responsible for interpreting data or who is allowed into that process. Is it the analyst? The visualization designer? The publication editor? Or the readers themselves? This chapter suggests that such responsibility lies with the designer and editor. These are the individuals whose attempts at neutrality, whose “god tricks” deceive and mislead their audiences. But this also undervalues the ability of the layperson to critically evaluate the information presented to them. There is certainly a rhetoric at work in data presentation, but if one were to do away with the rhetorical devices of standard data visualization practices—two-dimensional viewpoints, clean layouts, geometric shapes, and source citation—frequently associated with objectivity, one also disrupts the processes through which audiences are accustomed to engaging with data. Not every reader will evaluate a data visualization beyond skimming its title or caption for context, but some will, relying on previous encounters with simple design principles to inform their framework of interpretation. Many readers will not engage with citations beyond recognizing them as signifiers of legitimacy, but some will, locating data sources in order to better examine the truth of the graphic or out of pure curiosity. If it is necessary to disrupt these frameworks in order to take a stance with data, in order to leverage emotion, we must also ask where emotion enters into the equation. 
Are designers and editors simply making room for emotion in their visualizations, inviting a human response from their audiences, or are they seeking a particular reaction and making rhetorical decisions accordingly? When a publication pushes for a particular response, assuming how an audience will necessarily feel when viewing a data visualization, it inadvertently turns away from inclusion, homogenizing its audience rather than acknowledging its diversity and the prospect of varied individual reactions. The answer to this question has significant bearing on fields in which a conscious attempt at neutrality is essential, such as journalism. In such work it is crucial to make room for emotion—to acknowledge that the human subjects and audiences of research are emotional beings. But it is equally critical that a researcher not impose their own emotional positionality onto those connected with their work, as this is the point where bias, whether unconscious or well-meaning, is introduced. Any attempt by a researcher to effect a particular response, even if developed in juxtaposition to other biased or reactionary data representations, is particularly concerning in today’s political environment, in which misleading and intentionally incendiary content is so pervasive. Unfortunately, this issue goes largely unaddressed in the book.
D’Ignazio and Klein envision a broad reach for Data Feminism. Their intended audience includes data scientists, feminists, professionals in all fields that make data-driven decisions, entire communities that seek to resist and/or mobilize data, and, more broadly, “everyone who seeks to better understand the charts and statistics … [and] everyone who seeks to communicate the significance of such charts and statistics to others” (19). As previously established, however, this volume is not a textbook, which would introduce audiences to common statistical visualizations and inform them of the contexts of their use; it is a manifesto.
Nor is it likely to be picked up by data scientists or busy professionals for whom traditional methods of data collection, analysis, and visualization are well-established features of daily work and possibly a condition of continued employment. This is to say nothing of community members and organizers requiring counsel on data collection and representation, as, despite the many examples it puts forth, this volume is not a guidebook. Those most likely to come into contact with this book, recognize its utility as a resource, and recommend it to others are those same intermediaries whom the authors cite in Chapter Six as the most likely individuals to take on the arduous task of creating “data user guides” and collating the contexts of open data sets (170-171). Indeed, drawing upon personal observation, it is data librarians, Digital Scholars, and Digital Humanists working within library systems and higher education institutions who, being familiar with the book, are able to recommend it to others and thus establish a limited nexus of dissemination. Its clear explanations, engaging images, and real-world examples will prove readily accessible for college-aged or even high-school students who are beginning to critically examine the representations of data around them.
With the exception of a few minor typos—such as the capitalization error on p. 75—this book is well served by clear writing and typesetting. Potentially unfamiliar terminology is visually marked with italics and the reader is consistently presented with summary explanations of each idea. While the occasional usage of colloquialisms such as “for reals!” on p. 100 may come across as off-putting, the language of the volume is clearly wide-reaching and the use of academic jargon is limited. The book is well served by the number of visuals included in each chapter.
Although the images are helpful for illustrating some of the research presented, it is not always clear how some of them fit within the context of the discussion. These would be well served by more substantial captioning or explanation in the endnotes.
Overall, Data Feminism is an excellent book that clearly argues for the need to critically examine the uses, methods, and contexts of data science.
Contents:
Acknowledgements (ix)
Introduction: Why Data Science Needs Feminism (1)
1. The Power Chapter (21) Principle: Examine Power
2. Collect, Analyze, Imagine, Teach (49) Principle: Challenge Power
3. On Rational, Scientific, Objective Viewpoints from Mythical, Imaginary, Impossible Standpoints (73) Principle: Elevate Emotion and Embodiment
4. “What Gets Counted Counts” (97) Principle: Rethink Binaries and Hierarchies
5. Unicorns, Janitors, Ninjas, Wizards, and Rock Stars (125) Principle: Embrace Pluralism
6. The Numbers Don’t Speak for Themselves (149) Principle: Consider Context
7. Show Your Work (173) Principle: Make Labor Visible
Conclusion: Now Let’s Multiply (203)
Our Values and Our Metrics for Holding Ourselves Accountable (215)
Auditing Data Feminism, by Isabel Carter (223)
Acknowledgement of Community Organizations (225)
Figure Credits (227)
Notes (235)
Name Index (303)
Subject Index (307)
Notes: [1] In Chapter Seven, D’Ignazio and Klein also passively acknowledge this seeming disconnect. In summarizing how each chapter introduces possibilities for pushing back against the power of global capitalism, Chapter Three is notably absent: “incorporating an examination of power into a data analysis project (chapters 1 and 2); pushing back against false binaries and hierarchies (chapter 4); including multiple and marginalized voices in the design process (chapter 5); and contextualizing data so that they are not imagined to ‘speak for themselves’ (chapter 6)” (184).