Proposed Federal Rule of Evidence 707
Proposed Federal Rule of Evidence 707, in Nunn on Evidence (September 15, 2025).
The Federal Rules of Evidence are not exactly known for responding quickly to evolving scientific and cultural developments, but the rapid proliferation of artificial intelligence in nearly every field, including law, has spurred a notable exception. The Advisory Committee on Evidence Rules has formally proposed a new rule, Federal Rule of Evidence 707, aimed directly at the challenges posed by machine-generated evidence. This proposal, now open for public comment until February 16, 2026, is the first major effort to codify a federal standard for the admissibility of evidence that originates from a computational process rather than a human mind or action.
Background
For years, federal courts have navigated the admissibility of evidence from complex computational systems using a patchwork of existing rules. Evidence from tools like forensic DNA analysis software, cell-site simulators, and algorithmic risk assessments has typically been filtered through Rule 702, which governs expert testimony, Rule 901 on authentication, and the general balancing test of Rule 403. This approach works reasonably well when a qualified human expert is on the stand to explain the machine’s process, defend its methodology, and undergo cross-examination. The trial process, through the judge’s gatekeeping function established in Daubert v. Merrell Dow Pharmaceuticals, can probe the reliability of the expert’s methods, including the computational tools they relied upon.
The problem, as the Advisory Committee’s extensive work has highlighted, arises when no such expert is offered. A party might, for example, introduce a report generated by a predictive algorithm that analyzes market data to reach a conclusion about the cause of a stock’s collapse. Or, in a criminal case, the prosecution might offer an analysis from an AI video enhancement tool that purports to identify a weapon in a grainy surveillance video. In these scenarios, the machine’s output isn’t merely raw data; it’s an inference, a conclusion that functions as an opinion. Without a sponsoring expert, there is no clear procedural hook for the opposing party to challenge, or for the judge to vet, the reliability of the underlying process. This allows complex, and potentially flawed, machine-generated opinions to reach the jury without the scrutiny that a human expert would have to endure.
Proposed Rule 707 confronts this issue head-on. To appreciate its streamlined design, it is helpful to see the rule in its entirety. The Committee’s proposed rule states:
Rule 707. Opinion Evidence Generated by a Machine or Process
When machine-generated evidence is offered without an expert witness and would be subject to Rule 702 if testified to by a witness, the court may admit the evidence only if it satisfies the requirements of Rule 702(a)-(d). This rule does not apply to the output of simple scientific instruments.
The rule’s architecture reveals three critical design choices. First, it targets a specific subset of machine evidence that would implicate Rule 702 if offered by a human witness. This means the rule doesn’t apply to raw data like temperature readings or GPS coordinates, but it does capture machine-generated interpretations, predictions, or analytical conclusions. The Committee carefully avoided defining what constitutes an “opinion” from a machine, likely recognizing that bright-line distinctions between data and inference become increasingly blurred as computational systems grow more sophisticated. Second, the rule creates a direct transplant of Rule 702’s four-part test, requiring that the machine’s output be helpful to the trier of fact, be based on sufficient facts or data, be the product of reliable principles and methods, and reflect a reliable application of those principles and methods to the facts of the case. This parallel structure allows courts to leverage decades of Daubert jurisprudence while acknowledging that machines don’t have “qualifications” in the traditional sense. Third, the rule’s final sentence excludes “simple scientific instruments,” though it deliberately avoids defining that term in the rule text itself.
The Advisory Committee Note reveals the deeper concerns animating this proposal. The Committee identifies the core problem as one of adversarial testing. When a human expert testifies, opposing counsel can probe the expert’s assumptions, methodology, and potential biases through cross-examination. Machine outputs presented without a sponsoring expert create what the Committee calls an accountability gap, a concern Ed Cheng and I first introduced in 2019. The note explicitly rejects treating this as a hearsay problem, reinforcing that machines cannot be “declarants” under Rule 801. Instead, the Committee frames the issue squarely within the reliability paradigm of Article VII. Significantly, the note warns against parties using machine evidence as an end-run around Rule 702’s requirements, stating that “it cannot be that a proponent can evade the reliability requirements of Rule 702 by offering machine output directly.” The Committee also addresses practical concerns about notice, suggesting that the same disclosure principles applicable to expert reports should apply to machine-generated evidence, though it stops short of proposing specific amendments to the discovery rules.
Perhaps most instructive are the specific factors the Committee Note identifies for assessing machine reliability. The note emphasizes scrutiny of training data, particularly whether it’s “sufficiently representative to render an accurate output for the population involved in the case at hand.” This language suggests courts will need to examine not just whether a system works in general, but whether it works for the specific context before them. The Committee highlights the risk of “function creep,” borrowing a term from privacy law to describe situations where a tool validated for one purpose gets deployed for another. The note also directs attention to validation studies, error rates, transparency, and interpretability. Notably absent is any safe harbor for commercially available software or widely-used systems. The Committee considered but ultimately rejected including explicit language in the rule text that would exempt “routinely relied-upon commercial software,” though the final sentence about simple scientific instruments and the mention of judicial notice in the Committee Note suggest some everyday tools won’t require full reliability hearings. Ultimately, Rule 707 seems to reflect the Committee’s attempt to balance procedural efficiency with the need for meaningful scrutiny of increasingly powerful computational tools.
Significance
The practical significance of Rule 707, if adopted, would certainly be meaningful. It formalizes a judicial gatekeeping role for a new category of evidence and creates a clear framework for litigators. For lawyers seeking to introduce machine-generated evidence, the days of simply authenticating a printout and handing it to the jury will be over. Instead, they will need to be prepared for a substantive reliability hearing, much like a standard Daubert hearing for a human expert.
One of the most critical aspects of the proposal, and a likely focus of public comments, is the scope of its carve-out. Committee discussions have long debated whether the rule should exempt the output of “basic scientific instruments or routinely relied-upon commercial software.” The logic is that no one needs a Daubert hearing to admit a temperature reading from a standard thermometer or a calculation from Microsoft Excel. The challenge, however, is defining the line between a “simple” instrument and a complex analytical tool. As published for comment, the rule text excludes only simple scientific instruments, which gives that exception binding force, while the treatment of routine commercial software is left to the more flexible guidance of the Committee Note. How courts draw that line will significantly shape the rule’s day-to-day impact.
Notably, the proposal correctly situates the central problem with machine‑generated opinions as one of reliability, not hearsay. For decades, courts have generally agreed that because a machine isn’t a person, it cannot be a “declarant” under Rule 801, and its output is therefore not hearsay. The true evidentiary danger is not that the machine is “lying,” per se, but that its process is flawed, opaque, or misapplied. By placing the new rule in Article VII alongside the other expert opinion rules, the Committee reinforces that the proper inquiry is a technical and methodological one focused on trustworthiness. This focus on reliability will inevitably create new pressures on discovery practice, as challenging or defending a machine’s output under Rule 707 will require meaningful access to information about how the system works, a topic the Committee has acknowledged will require coordination with the civil and criminal rules committees.
Nunn’s Take
The Advisory Committee’s proposal for Rule 707 is a prudent stopgap. It’s a useful Band‑Aid, but it’s not the major surgery that evidence law truly needs to adapt to an “AI Era” that is here to stay. The Committee correctly diagnoses the most immediate problem, recognizing that a party shouldn’t be able to evade the reliability standards of Rule 702 simply by offering a machine’s opinion without a sponsoring human expert. The proposed rule rightly plugs this loophole by mapping Rule 702’s familiar reliability framework onto this new category of evidence. This move is both pragmatic and doctrinally sound, reinforcing that the core issue with algorithmic outputs is reliability. By treating the output like the expert opinion it often emulates, the rule prevents litigants from laundering potentially junk science into the courtroom just because it comes from a machine. It’s a necessary, commonsense fix that will deter the worst abuses, particularly the current workaround where parties try to slip machine outputs into evidence through lay witnesses or business record affidavits.
But let’s be clear about what Rule 707 actually does and doesn’t do. The rule doesn’t raise the reliability bar. It simply ensures that the existing bar applies consistently. Right now, a prosecutor can offer an algorithmic risk score through a technician who just “pushed a button,” thereby dodging reliability scrutiny. Rule 707 would end this dangerous inconsistency by mandating that if machine output would constitute expert opinion from a human, it must satisfy Rule 702’s standards regardless of whether an expert testifies. That’s important progress, but it’s also the bare minimum.
The deeper problem is that Rule 707 attempts to solve a 21st-century challenge with a 20th-century tool. The rule doubles down on the Daubert framework, which requires lay judges to make deeply technical judgments about scientific validity. This is already a challenge with traditional science. Asking judges to assess the reliability of a complex neural network, which may be a black box even to its creators, stretches this contested paradigm to its breaking point. We’re essentially asking trial judges to become machine learning experts overnight, armed only with the same analytical tools they use for evaluating fingerprint analysis or toxicology reports.
More fundamentally, Rule 707’s greatest limitation is that it still attempts to retrofit an antiquated evidentiary regime onto a world increasingly foreign to the one imagined at evidence law’s inception. Our entire evidence framework is witness-centric, designed to assess the credibility of human perception and testimony. The rules assume that evidence flows from people who saw, heard, or did something. But modern proof increasingly flows from standardized processes and data pipelines, not from a witness’s memory. Applying our traditional framework to machine-generated evidence creates a type mismatch. We keep asking how an algorithm is analogous to a human expert, when the more fundamental question is whether the process that created the evidence is reliable on its own terms.
This is where my work with Edward Cheng becomes essential. In our 2019 Texas Law Review article, we argue for a paradigm shift from a witness perspective to a process perspective. Evidence law should evaluate the reliability of the entire computational pipeline, including its inputs, governance structures, validation protocols, and monitoring for drift over time. Consider how a credit scoring algorithm evolves: it’s not a static tool but a dynamic system that requires continuous validation as economic conditions change. Or think about facial recognition systems that may perform well on some demographics but fail catastrophically on others. These aren’t issues that fit neatly into Rule 702’s framework of “reliable principles and methods.” They require us to think about evidence production as an ongoing process, not a discrete analytical event.
Rule 707 gestures in this direction but ultimately funnels everything back through the expert witness gate. The Committee Note’s discussion of training data representativeness and function creep shows awareness of these process-based concerns, but the rule itself doesn’t provide the tools to address them systematically. We need rules that mandate transparency about data provenance, require ongoing validation rather than one-time certification, and create mechanisms for monitoring how systems perform in actual use. We need discovery rules that give parties meaningful access to audit algorithmic systems, not just their outputs. And we need to recognize that reliability in the algorithmic age isn’t binary; it’s contextual and probabilistic.
The Advisory Committee’s institutional character makes such bold reforms unlikely. It favors incremental, reactive change that codifies practices already developing in the courts. And Rule 707 follows that script perfectly. It doesn’t create a new framework for AI but simply clarifies that an existing framework should apply in a new context. This conservative approach has its virtues, providing judges and lawyers with a familiar roadmap. But it also keeps us trapped in a witness-centric framework that doesn’t map cleanly onto algorithmic systems.
So yes, Rule 707 is a worthwhile amendment that I expect will be adopted in something close to its current form. It solves a discrete problem in a sensible fashion and will stop the most egregious attempts to launder unreliable machine outputs into evidence. But in my view, its most important implication is that it will force the legal system to start grappling with what a process-based inquiry looks like in practice. When judges struggle to apply Rule 702’s framework to transformer models and reinforcement learning algorithms, when discovery battles erupt over access to training datasets and model architectures, when litigants demand to know not just what a system concluded but how it evolved to reach that conclusion, we’ll realize that Rule 707 was just the opening move in a much longer game. The real transformation of evidence law for the computational age still lies ahead.