My PhD dissertation aims (1) to reconstruct the structure of the context of discovery of
‘data-driven’ (big data, data-intensive) biology and (2) to compare it to traditional
molecular approaches. Within the current debate in philosophy of science, ‘traditional
approaches’ in molecular biology should be understood as the discovery and heuristics
strategies identified by mechanistic philosophers such as Carl Craver and Lindley Darden.
Therefore, key questions of my thesis are: what is the structure of discovery of data-driven
biology? Is the methodology of data-driven biology different from that of traditional molecular
biology?
The reason for doing such an analysis comes from a recent controversy among
biologists. In particular, the two sides disagree on whether high-throughput sequencing
technologies are stimulating the development of a new scientific method somehow
irreducible to traditional approaches. I will try to disentangle the debate by reconstructing
and comparing data-driven and traditional methodologies. The dissertation is composed
of five chapters.
The first chapter deals with methodological issues. How do I compare data-driven
and traditional molecular biology structures of discovery? Mechanistic philosophers have
extensively characterized the discovery structure of traditional molecular biology.
However, there is no such analysis for data-driven biology. To lay the groundwork for one, I will
critically revise the discovery/justification distinction. The debate on
discovery/justification has provided valuable tools for understanding how discovery strategies might be
conceived, and it is clearly one of the main forefathers of recent philosophical discussions
on scientific methodologies in biology and physics.
In Chapter 2 I shall try to infer a full-fledged account of discovery for data-driven
biology by means of the philosophical tools developed in Chapter 1. This analysis
will be done in parallel to the investigation of key examples of data-driven biology,
namely genome-wide association studies and cancer genomics. In Chapter 3 I analyze the
epistemic strategies enabled by biological databases in data-driven biology. In Chapter 4,
I will show how the discovery structure of ‘traditional molecular biology’ can be more
efficiently rephrased through the same theoretical framework that I use to characterize data-driven biology.
Since data-driven and traditional molecular biology seem to adopt the same
discovery structure, one might consider the controversy motivating my research ill posed.
However, in Chapter 5 I shall argue that there is still a substantive reason for disagreement
between the sides. In fact, data-driven and traditional molecular biology endorse
different cognitive values, which provide the criteria for evaluating models and findings as
adequate or not. Here one might say that, although the structures of discovery (i.e. how reasoning and experimental strategies are structured and depend on each other) of the
two sides are the same, the contexts of discovery (i.e. the set of both
reasoning/experimental strategies and epistemic values/background assumptions that
motivate discovery) are different. Therefore, in this last chapter I shall pinpoint the
cognitive values behind traditional and data-driven biology, and how these commitments
stimulate the heated disagreement motivating my research.