The book manuscript is to be drafted by the end of 2025. Research results will be presented at
interdisciplinary academic conferences, such as FAccT or AIES. Some results will also be
published as journal articles. Journals that offer affordable open-access publication
will be prioritized. The project budget includes funds for open-access publication. Additionally,
drafts and accepted publications will be made available on personal websites and repositories.
Special attention will be paid to making the documents accessible to people with disabilities (e.g., by
providing file formats and formatting suitable for screen readers).
The final products align with the project’s goals in two ways. First, the book manuscript allows
for the intended larger synthesis and the analysis of the entire data science project cycle. Second, open-
access journal articles, draft publications, and conference presentations serve important dissemination
functions to advance the impact and the contribution of the project to the humanities.
Since no data collection or analysis is undertaken, and given the norms of academic integrity
and attribution, as well as legal obligations stipulated in author agreements required by publishers, no
significant risks to privacy, confidentiality, or intellectual property are anticipated. Research notes
and manuscript files are stored on proprietary cloud services (esp. OneDrive, iCloud Drive, and Zotero
storage), which can be considered reasonably safe.
Book Outline: Topics and Research Questions
The book identifies and examines ethical challenges at each step of the data science work cycle, which
consists of (1) project conception, (2) data acquisition, (3) data processing, (4) modelling, (5) evaluation,
and (6) deployment.
Preliminary research topics, questions, and hypotheses have been identified. The three challenges
of justice, data dominance, and pluralism frame the discussion throughout.
The first step, project conception, raises the challenge of finding the right problem: To what
questions is data science the answer? One chapter of the book demarcates the limits of data science and
explains how data scientists are responsible for the consequences of their work. A further chapter
discusses which ethical values, if any, should guide data science, such as freedom, neutrality, welfare,
equity, social justice, or the common good.
On data acquisition, one chapter argues that data acquisition starts with a conception of the
social world. Should, for example, gender be represented as binary? Such representations are subject
to ethical evaluation just as actions are (Longino, 1995; Basu, 2019; Johnson, forthcoming). Further
chapters investigate the ethics of data, in particular data ownership and data stewardship.
After data are acquired, the next step is data processing. Data scientists remove data that are
incomplete, invalid, or otherwise erroneous (Ilyas & Chu, 2019). They aim for accuracy: to represent
reality correctly (Olson, 2003). But this aim is increasingly chimerical as data scientists work with
“soft” data (Akerlof, 2020). Soft data capture social, emotional, or psychological properties, such as how
suspicious a person looks in a video feed or what emotional timbre their voice has in a recording. Such
data do not represent reality but interpret it. One chapter pursues the research question of what notion
of data quality beyond accuracy should guide data processing.
Once processed, data are used in modelling and evaluation, that is, in building mathematical
representations and using statistical methods to predict, explain, and understand events. Two chapters
investigate how data scientists’ modelling choices impact decision-making. Even technical choices
are value judgments, for example, when assessing whether being a woman causes lower wages. When a
model separates the broader context of gender socialization (e.g., occupational preferences) from the
gender variable, the estimated effect of gender on wages is reduced (Hu & Kohler-Hausmann, 2020).
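The wage example can be made concrete with a minimal Python sketch. It rests on stated assumptions: the eight records are synthetic and invented solely for illustration, and the occupation labels and the helper `gap` are hypothetical. It is not project data and not the method of the cited authors. The sketch compares the wage gap when occupational sorting counts as part of gender with the gap when occupation is held fixed.

```python
from statistics import mean

# Synthetic, hand-made records (occupation, gender, hourly wage).
# These numbers are invented for illustration only; they are not real data.
workers = [
    ("engineering", "man", 40), ("engineering", "man", 40),
    ("engineering", "man", 40), ("engineering", "woman", 38),
    ("care", "man", 22), ("care", "woman", 20),
    ("care", "woman", 20), ("care", "woman", 20),
]


def gap(records):
    """Mean wage of men minus mean wage of women in the given records."""
    men = [wage for _, gender, wage in records if gender == "man"]
    women = [wage for _, gender, wage in records if gender == "woman"]
    return mean(men) - mean(women)


# Choice A: occupational sorting is treated as part of what gender means,
# so the comparison is made across the whole sample.
raw_gap = gap(workers)

# Choice B: occupation is separated from the gender variable, so the
# comparison is made within each occupation and then averaged.
occupations = sorted({occupation for occupation, _, _ in workers})
within_gaps = [gap([r for r in workers if r[0] == o]) for o in occupations]
adjusted_gap = mean(within_gaps)

print("gap across the whole sample:", raw_gap)
print("average gap within occupations:", adjusted_gap)
```

In this toy sample the second gap is much smaller than the first, although the data are identical. Which of the two figures counts as the effect of gender depends on whether occupational sorting is regarded as part of gender, and that is a value-laden modelling decision rather than a purely technical one.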
The final part of the book, on project deployment, covers the ethics of putting data science to
use. The chapters recommend practices that responsible data scientists should engage in.