Artificial Judgment
From Max Weber and Herman Melville to AI Bar Exams
When the State Bar of California admitted last week that it had used AI to help write a portion of the multiple-choice questions on its February 2025 bar exam, the legal world erupted in a rare moment of shared incredulity. The outcry was swift: “Having the questions drafted by non-lawyers using artificial intelligence is just unbelievable,” said one professor. Others called it a “debacle,” a “staggering admission.”
But the intensity of the reaction, while understandable, obscures a more interesting set of questions. Not simply: should AI be used to write bar questions? But: what do we think law is? And who, or what, do we believe is capable of administering it?
I. What Is Law For?
Let’s begin at the beginning. Roman law, perhaps the earliest example of a rational legal system designed for scale, was built on the idea of ius, a system of rights and obligations designed to maintain order among free men. But it was also profoundly performative: a matter not only of rules but of rituals. Law was spoken aloud in public forums. The role of the jurisconsult was not to codify, but to interpret.
Similarly, when Blackstone compiled the Commentaries on the Laws of England, he argued that common law was a reflection of natural reason. His innovation was to see law not merely as command but as culture. To be educated in law, then, was not just to memorize, it was to inhabit a worldview, a way of being. Peter Thiel reflects, “My ambition to be a lawyer was less a plan for the future than an alibi for the present.”
So what is being tested by a bar exam multiple-choice question? Knowledge? Judgment? Induction into a guild? A proxy for character? A hazing ritual? Or is it simply a credentialing mechanism, stamping and sorting the next batch of knowledge workers from the legal factory? If so, why should the bar exam not be automated?
II. Positivism, Procedure, and the Specter of Formalism
Legal positivism, as articulated by figures like H.L.A. Hart in The Concept of Law, holds that law is not necessarily moral but is binding by virtue of its social recognition. Hart developed his thesis in lectures he began giving in 1952.
Applying Hart’s view, the bar exam is legitimate because of its social function, its optics. If AI-written bar exams lessen that legitimacy, they could pose a problem; but there is no intrinsic reason to regard AI-generated legal content as disqualifying. This yields a somewhat Straussian application: it’s OK to use AI in law, but only if nobody knows.
It matters less what AI tests than that it tests, and that it does so according to agreed-upon norms.
But the backlash against the State Bar’s use of AI reveals how fragile those norms are. If people begin to believe that the test is not authored by humans, or worse, not by lawyers, then its legitimacy is undermined.
The irony, of course, is that legal reasoning has always flirted with automation. Jeremy Bentham dreamed of a calculus of felicity, and Leibniz of a universal legal code. The rise of legal formalism in the 19th century, with its emphasis on rules over judgment, created a vision of law that might well be implemented by machine.
Our cultural fear is not that AI wrote 23 questions poorly. It’s that it might write them well, and what that would mean for us.
III. Melville’s Billy Budd
Herman Melville’s Billy Budd, Sailor, a novella he wrote toward the end of his life, stages a moral drama aboard a British naval vessel in which law and justice collide fatally.
The protagonist, Billy Budd, is a young sailor impressed into service. He is innocent, beloved, and fatally flawed: he stutters under stress. When he is falsely accused of mutiny by the malevolent master-at-arms Claggart, Billy reacts not with a speech but with a blow (a metaphor for those who cannot speak the official language of the law yet are nonetheless in the moral right). Claggart dies instantly. The ship’s captain, Vere, convenes a drumhead court to try Billy. The case is intuitively clear (Claggart lied; Billy is good-hearted) but legally complex: Billy struck and killed a superior officer, a capital offense under the Articles of War.
What follows is not a story about lawless cruelty, but about legal fidelity in the face of human complexity. Captain Vere insists the court convict and sentence Billy. He argues that duty, not conscience, must guide them. “We proceed under the law of the Mutiny Act,” he says. Even as he weeps for Billy, he upholds the law’s impersonal logic.
Melville offers no easy resolution. The story becomes a meditation on the limitations of legal codes. Law in Billy Budd is not corrupt; it is coherent. But that very coherence is its curse. It excludes mercy, nuance, and context. It reveals the peril of substituting rules for judgment. Notably, long before the advent of LLMs, it stages the very problems that critics today leverage against generative AI.
Melville anticipates the tension that modern legal systems face: how to be just when justice and law are not synonymous. How do we ensure that machinery doesn’t sacrifice context on the altar of cohesion and efficiency? Is Billy Budd’s case simply a case of limited machinery, limited epistemology, which science and technology can solve, or is it a structural problem, endemic to every system, every procedural machine?
This is what the backlash to AI-written bar questions surfaces. The outrage is not just about who wrote the questions, but about the fear that our systems of legal training and evaluation have drifted toward the world of Captain Vere: rule-bound, efficient, and indifferent to the human stakes beneath the surface. I would argue, though, that AI is being scapegoated for a drift intrinsic to law, and to legal bureaucracy, itself.
IV. Max Weber on the Legal Profession
Max Weber famously described law as part of a society’s rationalization process. Modern legal systems, he argued, are characterized by the replacement of tradition and charisma with rational-legal authority. But that authority must be perceived as legitimate; in this, he positions Hart’s theory of law within a sociological context. Hart’s positivism is itself a function of our turn away from the baroque and byzantine. Charisma is that which breaks the machine, makes a rule of the exception.
An AI-written bar exam, then, is not a scandal; it simply exposes the trend Weber described. The moment law became rational and procedural, as opposed to a matter of discernment, taste, and moral intuition, it readied itself for this moment. It was only a matter of time before legal reasoning took a form AI can now plausibly imitate. (One might trace the trend back even further, to scholasticism, the theological version of computer science.)
V. A Question of Judgment
Judgment—phronesis, as Aristotle called it—is not rule-following. It is the art of applying rules in context. It requires discernment, experience, even imagination.
Can a language model possess judgment? Not yet. But bar exams, and most legal decisions, don’t require judgment in this broad sense.
So here is a question:
If law is more than logic, and judgment more than statistics, how do we preserve the human meaning of legal reasoning in an age when machines can pass the test?

If we misunderstand charisma as mere eloquence, confidence, or the ability to impress, then some might see AI as charismatic. Just a thought.
Perhaps one answer is to encourage dedicated, deep human deliberation on troubling cases such as Billy Budd’s, documenting alternative contexts and readings; a comprehensive investigation of alternatives could take a form akin to Rabbinic discourse in the Talmud. In today’s hectic legal profession, such exhaustive and wide-ranging deep dives could be made possible by the productivity-enhancing aspects of AI: time freed from rule-based analysis would let practitioners exercise their capacities for judgment and mercy. Judges could, and should, engage in this activity as well, their views given more weight in light of their experiential growth on the bench. AI could then learn by ingesting this extended corpus, establishing and strengthening whatever capacity for judgment and mercy it can acquire, and might even join the discourse that expands the corpus, so long as its arguments are offered for the sake of judgment and mercy.