I, Robot-Reviewer? Generative AI and the future of eDiscovery

 
For over a decade, eDiscovery practitioners and the courts have recognized that machine-learning processes offer significant improvements over traditional linear document review1. Emerging applications built on generative AI (Gen AI) technology could now radically change the accuracy, speed and cost of eDiscovery document review, leading practitioners to ask: could these tools one day replace, rather than enhance, human lawyer oversight of technology-assisted review workflows?

AI in eDiscovery: the rise of machine learning

The Sedona Canada Principles Addressing Electronic Discovery (the Sedona Canada Principles) provide an authoritative framework of best practices for the identification, collection, preservation, review and production of electronically stored information (ESI) in Canada2.

Principle 7 of the Sedona Canada Principles states that: “A party may use electronic tools and processes to satisfy its discovery obligations”. The use of technology to improve efficiencies and decrease the costs of document review in eDiscovery has been codified in most provincial rules of civil procedure or practice directions and has been recognized by Canadian courts as falling within the ambit of the “proportionality principle”3. In discovery, the determination of proportionality in the production of documents includes such considerations as (i) whether the time required would be unreasonable, (ii) whether the expense would be unjustified, and (iii) whether the volume of documents required to be produced would be excessive4.

As a practical matter, because most documents that are reviewed as part of the documentary discovery process are now digitally native, and because the volume of ESI for review has increased exponentially, the benefits of technology-assisted workflows in generating efficiencies and cost savings in document review have made the adoption of AI-related technologies a standard part of eDiscovery practices5. While new Gen AI-based workflows purport to advance these benefits by a quantum leap, it is important to understand the underlying foundation to ensure that the trajectory aligns with legal principles and best practices.

Technology Assisted Review

Technology Assisted Review (TAR)—also known as “predictive coding”—leverages machine-learning algorithms to identify ESI that is most likely to be relevant or responsive to issues in civil litigation, regulatory reviews or investigations6. TAR is used to classify documents based on relevance, privilege, and responsiveness to the specific issues in the matter. For TAR to work well, the system must be trained by lawyers with deep knowledge of both the case and the relevant eDiscovery software.

In the original TAR (TAR 1.0), a control set of documents chosen by human reviewers is used to train the AI model to identify other documents in the wider data set that would likely be relevant7. The human review team continues training the model until it is considered stable: a determination made through statistical sampling and analysis. Once the TAR model is stable, the classification/ranking algorithm is applied to the entire data set8. Documents with a high relevance score are prioritized for review, leading to significant efficiencies in the time required for eyes-on review.
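The two-step shape of a TAR 1.0 workflow (train on a human-coded set, then score and rank the wider data set) can be illustrated with a minimal Python sketch. This is a toy example under stated assumptions, not how any particular eDiscovery platform works: the simple word-weight scoring stands in for the far more sophisticated models used in practice, the function names `train`, `score` and `rank` are hypothetical, the sample documents are invented, and the statistical sampling used to decide when a model is stable is not modelled.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; real tools use far richer text features.
    return text.lower().split()


def train(coded_docs):
    """Step 1: learn a weight per word from human-coded (text, is_relevant) pairs.

    Each weight is a smoothed log-odds value: positive if the word appears
    more often in relevant documents, negative if it appears more often in
    non-relevant ones.
    """
    relevant, non_relevant = Counter(), Counter()
    for text, is_relevant in coded_docs:
        (relevant if is_relevant else non_relevant).update(tokenize(text))
    vocab = set(relevant) | set(non_relevant)
    rel_total = sum(relevant.values()) + len(vocab)
    non_total = sum(non_relevant.values()) + len(vocab)
    return {
        word: math.log((relevant[word] + 1) / rel_total)
        - math.log((non_relevant[word] + 1) / non_total)
        for word in vocab
    }


def score(weights, text):
    # Words the model has never seen contribute nothing to the score.
    return sum(weights.get(token, 0.0) for token in tokenize(text))


def rank(weights, corpus):
    """Step 2: apply the trained model to the whole data set, highest score first."""
    return sorted(corpus, key=lambda doc_id: score(weights, corpus[doc_id]), reverse=True)


# Hypothetical documents coded by the human review team.
training = [
    ("merger pricing agreement draft", True),
    ("pricing terms for the merger", True),
    ("lunch menu friday", False),
    ("office party friday", False),
]

# Hypothetical wider data set that has not yet been reviewed.
corpus = {
    "D1": "friday lunch plans",
    "D2": "revised merger pricing",
    "D3": "party menu",
}

weights = train(training)
ranked = rank(weights, corpus)
print(ranked)  # D2, the only document about the merger, is ranked first
```

In a real matter the ranked list would feed the review queue, so that reviewers see the documents most likely to be relevant first.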

Continuous Active Learning (CAL)

Continuous Active Learning (CAL), also known as TAR 2.0, refined TAR workflows by eliminating the need for a control set and reducing the amount of training and statistical analysis required. Unlike the two-step training and review process used in TAR 1.0, document reviews that leverage CAL can begin almost immediately because the CAL algorithm continuously re-ranks documents according to the decisions of the human review team and serves the highest-ranked documents to the reviewers first. In other words, the CAL model “learns” throughout the review process, moving documents with a higher relevance ranking to the front of the review queue so that the most relevant documents are coded by the review team at the outset of the eDiscovery project. As a result, CAL can reduce the time required to identify relevant documents and can reduce the number of lawyers required to complete a review.
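The continuous re-ranking loop described above can be sketched in a few lines of Python. Again, this is only an illustration under stated assumptions: the word-weight model is a deliberately simple stand-in for a real classifier, the names `cal_review` and `reviewer` are hypothetical, the documents are invented, and the human reviewer is simulated by a lookup table of coding decisions.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(coded_docs):
    # Smoothed log-odds weight per word; an empty input yields an empty model.
    relevant, non_relevant = Counter(), Counter()
    for text, is_relevant in coded_docs:
        (relevant if is_relevant else non_relevant).update(tokenize(text))
    vocab = set(relevant) | set(non_relevant)
    rel_total = sum(relevant.values()) + len(vocab)
    non_total = sum(non_relevant.values()) + len(vocab)
    return {
        word: math.log((relevant[word] + 1) / rel_total)
        - math.log((non_relevant[word] + 1) / non_total)
        for word in vocab
    }


def score(weights, text):
    return sum(weights.get(token, 0.0) for token in tokenize(text))


def cal_review(corpus, reviewer, batch_size=1):
    """Review every document, retraining and re-ranking after each batch.

    `reviewer` is a callable standing in for the human review team: it takes
    a document id and returns True (relevant) or False (not relevant).
    Returns the order in which documents were served for review.
    """
    coded = {}          # doc_id -> human coding decision
    review_order = []
    while len(coded) < len(corpus):
        # Retrain on every decision made so far (no separate control set).
        weights = train([(corpus[d], label) for d, label in coded.items()])
        unreviewed = [d for d in corpus if d not in coded]
        # Stable sort: ties keep their original order in the data set.
        unreviewed.sort(key=lambda d: score(weights, corpus[d]), reverse=True)
        for doc_id in unreviewed[:batch_size]:
            coded[doc_id] = reviewer(doc_id)
            review_order.append(doc_id)
    return review_order


# Hypothetical data set and simulated reviewer decisions.
corpus = {
    "D1": "merger pricing draft",
    "D2": "friday lunch menu",
    "D3": "office party friday",
    "D4": "merger pricing terms",
    "D5": "revised merger agreement",
}
truth = {"D1": True, "D2": False, "D3": False, "D4": True, "D5": True}

order = cal_review(corpus, reviewer=lambda doc_id: truth[doc_id])
print(order)
```

In this toy run the review starts with no trained model, so the first documents are served in their original order. Once the simulated reviewer has coded one relevant and one non-relevant document, the remaining relevant documents (D4 and D5) are served ahead of the remaining non-relevant one (D3), which is the prioritization effect CAL is designed to produce.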

Although the benefits of technology in document review are well established, some document types, such as those that are rich in numerical data or images, are not well suited to CAL projects. Further, while TAR reduces the number of reviewers required at first-level review, it does not obviate the need for lawyers to put eyes on documents at second-level review or as part of the privilege review or Quality Control (QC) process9 (for more on how AI tools can be most effectively integrated into legal practice, read “Rules for AI tools: how can legal teams source suitable tech?”).

Transformers: emerging applications for Gen AI in eDiscovery review

Unlike earlier iterations of technology-assisted review, emerging Gen AI applications and workflows will require much less initial training or intervention by document reviewers. These new tools fall into three categories10:

  1. Gen AI assisted review. The Gen AI model is trained using the review protocol and a set of documents that human lawyers have pre-reviewed. The workflow anticipates significant human validation. This approach most closely resembles existing CAL workflows.
  2. Gen AI iterative assisted review. This approach begins with a Gen AI assisted review but incorporates aspects of iterative review to validate results and provide ongoing feedback to train the Gen AI model.
  3. Gen AI autonomous review. This is the most technology-reliant, and arguably the most efficient, approach of the three. It relies almost exclusively on the review protocol to train the system before the review begins and relies heavily on the Gen AI model to conduct its own review QC. Of the three, this approach slides perilously close to the world of “robo-reviewers” and, without sufficient human oversight, creates considerable risk of inaccuracy and of the inadvertent production of privileged or confidential information.

As with earlier review technologies, leveraging Gen AI in document review promises to be more efficient, although at present it does not necessarily deliver the same cost savings11. Regardless of which approach is eventually adopted, all Gen AI eDiscovery reviews will continue to require oversight by legal professionals and skilled eDiscovery technologists to ensure accuracy, especially with respect to the protection of legally privileged documents. Ultimately, the predicted time and cost savings ascribed to future Gen AI-empowered document reviews do not eliminate the professional duty of counsel to exercise legal judgment and to adopt defensible tools and strategies in the course of documentary discovery12.

Conclusion

Emerging eDiscovery applications that leverage generative AI technology could significantly impact the accuracy, speed, and cost of document reviews. While these applications may reshape existing review workflows and processes, they will not eliminate the need for—or the professional duty of—human lawyers to put eyes on documents.


  1. “Linear” or manual review refers to the traditional approach that requires human reviewers to put eyes on all documents in an ESI data set on a document-by-document basis.
  2. Sedona Canada Principles Addressing Electronic Discovery, 3rd ed. (2022). The Sedona Canada Principles were originally drafted in 2008 by Working Group 7 of the Sedona Conference, a non-profit research and educational organization made up of lawyers, judges, academics and technical experts. In 2010, the Sedona Canada Principles were incorporated by reference in the Ontario Rules of Civil Procedure (RCP), R.R.O. 1990, Reg. 194.
  3. Sedona Canada Principles, Principle 2: “In any proceeding, steps taken in the discovery process should be proportionate…”. See RCP, R. 1.04 (1.1) (“Interpretation – Proportionality”) and R. 29.2 (“Proportionality in Discovery”). See also Hanson v. Stollery Estate, 2017 ONSC 528.
  4. RCP, R. 29.2.03(1)(a) and (b) and R.29.2.03(2).
  5. EDRM, Technology Assisted Review (TAR) Guidelines (2019). See also Sedona Canada Principles, Principle 7, Comment 7.d.iv., citing Maura R. Grossman and Gordon V. Cormack, “Technology-Assisted Review in E-Discovery Can Be More Effective and More Efficient Than Exhaustive Manual Review” (2011) 17:3 Rich JL & Tech 1.
  6. Sedona Canada Principles, Principle 7, Comment 7.d.iv.
  7. Ibid.
  8. Ibid.
  9. Rob Robinson, “eDiscovery Review in Transition: Manual Review, TAR, and the Role of AI”, EDRM Blog (August 16, 2024).
  10. Ibid. The article notes that Gen AI software applications are currently more expensive on a per-document basis than TAR, but these costs must be weighed against the time savings.

This article was published as part of the Q4 2024 Torys Quarterly, “Machine capital: mapping AI risk”.
