Recently, I began seeing citations to Daston’s early work on mathematical probability cropping up in new research on algorithmic decision systems. It turns out that her first book in particular, Classical Probability in the Enlightenment (1988), sets the stage for current debates over quantification and judgment. Its core questions — What is rationality? Why formalize it? For whom? — have only become more pertinent.
Her new projects include a history of rules, and a study of the relationship between moral and natural orders. We spoke over the phone earlier this fall to discuss the history of calculation, the various emergences of formal rationality, and the importance of interdisciplinarity in the social sciences. The conversation has been edited for length and clarity.
JACK GROSS: I’d like to start by asking about your new book project on rules — “all of them, everywhere,” as you said in a lecture this past winter. How did you begin working on this project? How did you become interested in examining rules from a broad historical perspective?
LORRAINE DASTON: It started with a collaborative project on Cold War rationality, when I realized there was a clear difference between traditional philosophical ideas of reason and a form of rationality that came to characterize the American human sciences in the mid-20th century.
Characteristic of Cold War rationality was the view that certain forms of reasoning, core forms of reasoning, could not only be reduced to rules, but to a certain kind of rule in particular — namely, the algorithm: a rule that can be executed mechanically.
For most of their extremely long history, rules have been formulated to allow a great deal of suppleness in their application. That’s because of an ancient philosophical and still unsolved problem: no universal can foresee all the particulars that it will encounter.
In legal contexts, this situation is familiar: sometimes applying the letter of the law betrays the spirit of the law — think of The Merchant of Venice as a paradigmatic case. Rules that are less august in their domain of application than justice — rules, for example, defining the arts and crafts — are always formulated with an eye toward exceptions. They are meant to be tweaked in the face of experience, adjusted to specific circumstances. Of course, algorithmic rules also have a long history: the rules of arithmetic are the ur-algorithmic rules, but they constitute a tiny minority of all rules. Starting in the mid-19th century, when mechanical calculation genuinely becomes practically possible, the ambit of algorithmic rules grows, and by the middle of the 20th century, they become a form of rationality very much associated with the Cold War.
The motivating question of the project was to understand how rules that intentionally involved judgment and adjustment to particular cases gave way to an understanding of rules as rigid, mechanical, and universally applicable.
It seems that there’s a thread that ties together calculation, Cold War rationality, and mathematical probability, which is the rationalization and formalization of various aspects of human judgment. This seems to follow a logic of automating material production, where you look at the total process, disassemble it into component parts, and then deskill and automate it. Is that accurate?
I think that’s correct, but I didn’t initially think of these projects as being linked. It is startling to see how consistent these interests are across my work. There is, however, a major difference between attempts to formalize reasoning from, say, 18th-century probability theory, and more modern notions of algorithmic rationality. In the 18th and early 19th centuries, probability theory was always subject to the judgments of reasonable people. If the results derived from the mathematical theory of probability diverged from the conclusions a reasonable person might have reached, it was probability theory that had to be revised and not the reasoning processes of a small elite of “reasonable people.”
That really changes over the course of the 20th century. You see it most vividly in the research done by Amos Tversky and Daniel Kahneman in their famous work “Judgment under Uncertainty,” in which human reasoning is seen to be intrinsically flawed as measured by the standards of Bayesian probability theory.
In your 2017 paper “Calculation and the Division of Labor,” you recount both an intellectual and a labor history of what sounds to the contemporary ear like a mechanical activity: calculation. There are striking parallels between the story you tell and the contemporary development of artificial intelligence, which often requires immense amounts of unseen human labor. Could you describe this history and how you came to see the development of calculation as a story about the division of labor?
The first attempts to create mechanical calculators date to 17th-century thinkers like Pascal and Leibniz. The problem was that they didn’t work very well — Pascal suggested that you should check the results of the calculating machine by hand, which sort of defeated its purpose.
At that time, most heavy-duty calculation took place either in government administration or in astronomical observations. In both of these sites, a great deal of calculation had to be done by hand. Already in the 18th century at the Royal Observatory in Greenwich, forms had been developed that divided the task of a complex calculation into small enough steps — so that, at least at the lower end, the Observatory could employ very cheap schoolboy labor in order to complete them economically.
Over the course of the 19th century, schoolboys were increasingly replaced by women as the cheapest and most reliable form of labor. We can read interesting correspondences from astronomers at Oxford and Harvard recruiting the first generation of women college graduates to perform calculations for half the wages of men. By the late 19th century, the Bureau of Calculation at the Paris Observatory was entirely feminized.
One part of this story, then, is the association of calculation first with the division of labor, and then with very cheap labor. This is a simple economic story: an employer divides labor as finely as possible in order to employ the cheapest labor available on the market — schoolboys in this case, and then the highly educated, extremely conscientious labor of women.
There’s another element, however: the actual improvements in mechanical calculation, which allowed for the smoother working of a calculating machine. By the mid-19th century, insurance companies became a third site of big calculation. Prudential Insurance, for example, had a workshop entirely devoted to repairing broken-down calculating machines. Technical improvements strengthened the relationship between the notions of mechanical and calculation. Until the late 19th century, calculation was called mechanical labor, but no machines were involved. After about 1870, it was mechanical in a double sense: machines were involved, and the people who did the labor were considered mechanical.
You’ve used the term “merely mechanical” to describe the connection between people laboring under particular conditions, and a kind of labor that can be executed algorithmically. Where does this phrase come from?
Originally, in ancient Rome, the term “mechanical” referred to simple machines like a pulley or a lever, but, by the Middle Ages, it had already developed derogatory associations. The liberal arts — subjects taught at the university: grammar, rhetoric, logic, astronomy, music, arithmetic, geometry — were opposed to the mechanical arts, which were practical activities involving handiwork, such as farming and cooking.
The word mechanical begins also to denote the kind of person who does that work. In A Midsummer Night’s Dream, the bumbling amateur actors Bottom and his fellow artisans are called rude mechanicals, which suggests both their bumptiousness and the kind of work they do. So, already by Elizabethan times, at least in English, the word had a very pejorative, socially hierarchical meaning attached to it. Those negative connotations deepen in the 18th century. Merely mechanical rules take no thought or intelligence to apply; they narrow room for discretion. We see a hierarchy of intellectual faculties, in which those forms of human activity that are allegedly mindless are described as “merely mechanical,” even if no machines are involved.
One of the great calculating projects of the late 18th century is the French attempt to recalculate logarithms on a base 10 system as a tribute to the metric system of the French Revolution. This was a Herculean task accomplished in record time by people not trained as mathematicians, who recalculated hundreds of thousands of logarithms by hand. Legend has it that many of them were unemployed cooks and hairdressers previously from noble households that were dissolved when their masters were guillotined or fled.
This labor was called “mechanical,” even though it was done only with quill pen and paper. There was not a machine in sight. These two senses of “mechanical” converge with Charles Babbage, the mathematician who designed the first computers. Babbage was also a political economist — a banker — who studied the French project and concluded that “merely mechanical” work, performed by people whom he considered mindless, could be done by a machine. At least in his telling, that was the insight that led him to the Difference Engine and the Analytical Engine.
There’s another aspect to your initial question: the analogy to the current day. The people behind the curtain in modern AI projects are of two sorts: those who are thinking about how to divide a very complicated task into the tiniest possible steps, very much in the tradition of the history of mechanical calculation; and those whose work is compensatory for algorithmic systems — Facebook moderators, for example, who monitor objectionable content missed by the algorithms meant to eliminate it automatically. It’s really important to distinguish between these two groups of people, because they are performing two very different tasks.
The continuity I was after was between people who work under similar labor conditions as content moderators — often low-wage workers in the Global South, contracted to large American tech firms — but who are performing tasks more closely akin to those of calculators: they are the machine behind the machine, tagging images for computer vision programs and the like. Is the moderation labor that you bring up, which compensates for the limits of machines, new?
If you look at the details of, for example, the British Nautical Almanac, you can see that there have always been people like the moderators who are tasked with cleaning up — people behind the scenes checking and proofreading calculations, and identifying where things go wrong.
The important point is the extent to which this labor has been erased from the triumphal histories of calculation. The histories are in many ways rightfully triumphal — there has clearly been extraordinary technological advance from the clumsy machines of the 17th century to today — but it’s always been the case that effective functioning has been dependent on high-level judgment, often performed by those on the low end of the hierarchical division of labor.
I wanted to ask a general question about your first book, Classical Probability in the Enlightenment. Unlike the work of Ian Hacking and Barbara Shapiro, who were examining the emergence of the concept of probability, your project focused squarely on the emergence of mathematical probability. How did you become interested in probability, and specifically in the story of quantification?
My college education in the 1970s coincided with the first phases of what we might call quantiphrenia: the beginning of academic evaluation in the form of not only grade point averages, but also citation indices, and so on. I became interested in quantification abstractly, but also quite concretely — as I was the object of some of these attempts to quantify judgment.
But in terms of actually working on probability, it was Ian Hacking’s The Emergence of Probability that changed my path — literally overnight. Historians of science are schooled to look for continuity, but Hacking framed his problem in terms of discontinuity. In his telling, mathematical probability hit with the suddenness of a thunderbolt. As he put it, in his characteristic style: before 1654, there was no concept of probability; it was an unthinkable concept. This was a story of radical novelty, in which suddenly, like a volcano erupting, something completely new appeared in the intellectual landscape.
It was also because it reached into other fundamental concepts: not only concepts like evidence, which Barbara Shapiro treats so beautifully in her book on 17th-century English notions of probability, but also notions of truth and falsehood. For millennia, at least in the Western philosophical tradition, logic was a matter of yes-or-no: a statement was either true or false. To have a spectrum, in which there were degrees of probability between the extreme values of truth and falsehood, was a gobsmacking upheaval. These claims in Hacking’s book continue to grip me.
Your book makes a strong case that the quantification pursued by the early probabilists was founded on legal concepts, like “degrees of uncertainty” and “probabilistic expectation.” This comes as a corrective to a common notion, which is that probability theory emerged from gambling or games of chance. How did the early probabilists draw on the law for the concepts that they thought needed quantification?
To rehearse the obvious argument against gambling as the triggering factor: people have been gambling since time immemorial. If probability theory were a kind of natural evolution from gambling, it should have emerged in many places, in many epochs, long before the 17th century. It’s a nonstarter as a historical hypothesis, even if some of the early applications did deal with gambling and chance.
As for the law, there are two aspects of legal thought that turned out to be seminal for probability: degrees of proof, and aleatory contracts.
The first stems from a much ridiculed doctrine of the arithmetic of proof, highly developed in 16th-century legal and casuist thinking, in which the testimony of different witnesses and different kinds of evidence are assigned different weights. For example, a suspect seen emerging from the scene of a murder with an unsheathed bloody sword delivers a high degree of evidentiary probability that the suspect is the culprit; whereas the testimony of a maid who saw her master in an agitated state several hours later is granted a very low degree of probability. This is both because the maid is a woman in a dependent relation to her master and because of the remote inferential link between committing the crime and the observed psychological state of the subject.
Jurists, without having any kind of measurable scale, had already assigned fractions to these so-called indices of proof. This is a form of reasoning which is quite familiar to us, even if its quantification by an arithmetic of proof now seems strange: weighing and combining strong evidence with weak evidence in order to come to a reasoned judgment.
The second aspect of legal theory that underpinned early formulations of mathematical probability was the so-called aleatory contract, which depends on an uncertain future outcome. In commodities markets, purchasing wheat or pork belly futures for the next year would be an example. The buyer and seller are betting on whether the commodities in question will be more or less expensive than they are now. Aleatory contracts were particularly important because they were one way to get around prohibitions against lending at interest, traditionally seen as usury. Aleatory contracts opened a loophole in these prohibitions: taking a risk was considered to be a form of work that entitles the risk-taker (for example, someone who lent a merchant money for a long voyage) to collect interest on an investment. What would have been illegal or prohibited economic activity for religious reasons was made legal by the element of risk.
These two streams of legal thought influenced the early probabilists. And, crucially, the early figures of mathematical probability — Pierre de Fermat, Leibniz, and Pascal — all had extensive experience in the law.
There’s a striking phrase that you use to describe their goals, which is that they “were seeking to quantify the intuitions of an elite of reasonable men.” They were thinking about formalizing legal concepts for business activities, and about how to rationalize certain aspects of the justice system, but they were also seeking to quantify their own cognitive capacities as “reasonable” people.
Right. And the question is why. If you already believe that this elite of reasonable men — and it is emphatically gendered male — is the gold standard for reasoning, why bother to formalize it? Why not just follow their lead? I think there are two explanations: one minor, and one major, and both are still very much with us. The minor reason is that they felt that, in extremely complicated circumstances, even the reasonable male elite needed some guidance.
The major reason is that the vast majority of people did not belong to the reasonable male elite. This was crucial to the Enlightenment project. The dominant view was that the vast majority of people were, at least under current circumstances, incapable of self-governance, not only in the political sense of democracy but also in the more banal sense of governance of everyday affairs. Enlightenment thinkers held that this form of reasoning must be made accessible to everyone. It is therefore no accident that one of the campaigners for probability theory — the Marquis de Condorcet, who was a leader of the early phases of the French Revolution — made it his task to introduce mathematics, including probability theory, into universal schooling for all French citizens. Condorcet’s curriculum featured instruction in probability — a reasonable calculus — to give future citizens a kind of road map for decision making.
Some very early textbooks, like Condorcet’s textbooks for the Écoles Centrales, were suffused with the view that grasping fundamental mathematical concepts could serve as a model for political decisions: it would teach future citizens what a clear and distinct idea is — and thereby inoculate them against political demagoguery. Condorcet advised against learning the multiplication tables by rote because that would be merely mechanical.
There seems to be a tension between the idea that mathematical probability can encourage the development of sovereign reasoning and the notion that we need to quantify and formalize the faculty of judgment.
It returns to the problem of consistency. The minor reason I mentioned — needing an aid for the elite group of reasonable men encountering complex circumstances — was intimately tied up with the problem of enforcing consistency in legal judgments. The challenge for judges is that what may seem just in a particular circumstance may be unjust if generalized. If there’s inconsistency in judgment, it’s prima facie evidence for injustice. The goal is to make one’s judgment consistent over the long run.
There are some perils to this approach: no universal ever fits the particulars. Never in the history of human rulemaking have we created a rule or a law that did not stub its toe against unanticipated particulars (just ask the Facebook moderators who must clean up after the algorithms). I think machine learning presents an extreme case of a very human predicament, which is that the only way we can generalize is on the basis of past experience. And yet we know from history — and I know from my lifetime — that our deepest intuitions about all sorts of things, and in particular justice and injustice, can change dramatically.
What makes your research compelling for readers far outside your discipline is that you historicize areas of science whose language or mode of reasoning has become so commonplace that we forget they have origins in methodology. This holds through all of your projects — from probability to observation to objectivity. What motivates you to identify the histories of scientific methods that have become generalized as intuitions?
I think the best short description of what I do is the history of the self-evident. There’s an obvious and perhaps a less obvious attraction to doing this kind of history. The obvious one is the eruption of intellectual novelty, so fascinatingly presented in Hacking’s work. This is the moment when the unthinkable becomes thinkable.
But this leads to another, perhaps less obvious question: How does the once-unthinkable become not only thinkable but self-evident? How does the unthinkable become something we cannot think away anymore? To get at that, we have to tell a history not only of emergence, but of the formation of the deepest levels of our intuition. I have a strong sense that our best reasoning takes place at a semi-conscious level, and I’m very interested in how that lower stratum of reasoning — the intuitions that drive our thinking in philosophy, mathematics, and the rest — is established.
A final question I want to ask is about your discipline. In your lecture for the Social Science Research Council, you spoke about your research on rules as bringing together the “hard” and “soft” social sciences. How do you see the history of science in relation to the social sciences more broadly?
History of science lies in the middle of a triangle defined by the natural sciences, social sciences, and humanities. Many of us are trained in the sciences, while our methods are very much those of the historian. But the history of science has also been quite permeable to the methods of the social sciences. Historians of science, I think fortunately, have been unclassifiable with respect to the three great divisions of knowledge in our current classification system. I think it’s striking, if you take the longue durée view of the evolution of the sciences, to recall where these divisions of knowledge we take for granted come from. It’s salutary to work in a language like German, which, unlike English, did not narrow the meaning of “science” in the 19th century to mean only the natural sciences. Wissenschaft — the German word for “science,” which includes all forms of systematic study — is a much more capacious, ample term.
Something as simple as shifting from one language to another can prompt you to think historically about how all these forms of knowledge are intertwined. It also leads you to think about their shared epistemic virtues. To use an example from the social sciences: if you are looking for causal mechanisms, often only a detailed ethnography will reveal what exactly is the cause of some observed pattern in behavior. And it can work in the other direction — a hypothesis developed from ethnographic work may require statistical testing. These two modes of inquiry, so often opposed to each other, seem to me to work hand-in-glove, at least from the standpoint of the goals of scientific explanation.
Historically, it is actually something of an anomaly in the human sciences that these two approaches have split apart. I think it’s an artifact of the educational system. I find the qualitative/quantitative division to be both puzzling and provincial — provincial in the double sense of being linked to a particular language and also to our particular historical moment. There’s nothing epistemologically inevitable about it — quite the contrary.
Jack Gross is a writer, editor, and researcher in New York. This interview was produced in collaboration with Phenomenal World.