# Install scotustext from GitHub (requires the remotes package), then load it
# install.packages("remotes")
remotes::install_github("JakeTruscott/scotustext")
library(scotustext)
These tools aim to help with retrieving, processing, and cleaning oral argument transcripts from the United States Supreme Court.
Within the Oral Arguments suite of library(scotustext) are two primary tools:
oa_search: An automated tool for retrieving cleaned and partitioned transcripts (and relevant data) from orally argued cases at the United States Supreme Court between the 2004 and 2022 terms (Updated June 2023). The dataset will be periodically updated throughout subsequent terms.
oa_parser: An automated tool for processing, cleaning, and parsing oral argument transcripts from PDFs stored on your local machine.
NOTE: While oa_parser should work for any non-scanned PDF oral argument transcript, best use is found for transcripts between the 2004 and 2022 terms. The reason for this stems from the processing protocols in oa_parser, which rely on locating demarcations in transcripts based on unique justice and attorney-level identifiers. For example, these transcripts introduce each statement with the speaker's name, such as “CHIEF JUSTICE ROBERTS:” or “MR. STEWART:”.
It is rare to find non-scanned PDFs of earlier oral arguments that follow this pattern, even among those on supremecourt.gov. Instead, most pre-2004 transcripts mark each change in questioner with a generic “QUESTION:” label rather than naming the justice. oa_parser is still capable of accurately locating these demarcations, but it is no longer possible to assess which justice is actually engaging with counsel.
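To make the idea of label-based demarcation concrete, here is a minimal base R sketch. It is not the code oa_parser uses internally; the toy transcript, the `label` pattern, and all variable names are assumptions made for illustration only.

```r
# Toy transcript text, invented for illustration (not real package output)
raw <- paste(
  "CHIEF JUSTICE ROBERTS: We will hear argument this morning.",
  "MR. SMITH: Mr. Chief Justice, and may it please the Court.",
  "JUSTICE THOMAS: What is your best case?",
  "QUESTION: Is that in the record?"
)

# Hypothetical pattern for speaker labels: a named justice, a named attorney,
# or the generic pre-2004 "QUESTION" marker, each followed by a colon
label <- "(CHIEF JUSTICE [A-Z]+|JUSTICE [A-Z]+|M[RS]\\. [A-Z]+|QUESTION):"

# Find where each label starts, then cut the text between consecutive labels
starts <- as.integer(gregexpr(label, raw, perl = TRUE)[[1]])
ends <- c(starts[-1] - 1L, nchar(raw))
statements <- trimws(substring(raw, starts, ends))

# The speaker is everything before the first colon of each statement
speakers <- sub(":.*$", "", statements)
print(speakers)
```

In this sketch the last statement can only be attributed to "QUESTION", which mirrors the limitation described above: the boundary is found, but the justice's identity is not recoverable.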
oa_search Function Description
oa_search accepts several optional parameters to customize and curtail search queries:
term: Character (or vector of comma-separated character objects) containing relevant term(s) of interest
justice: Character (or vector of comma-separated character objects) containing relevant justice(s) of interest
attorney: Character (or vector of comma-separated character objects) containing relevant attorney(s) of interest
speaker_type: Character object indicating “Justices” or “Attorneys” (Default is BOTH).
docket_id: Character (or vector of comma-separated character objects) containing relevant docket number(s) assigned to argument(s) of interest
party: Character (or vector of comma-separated character objects) containing relevant party (or parties) of interest addressed in official case title.
NOTE: Search queries are limited to October Terms 2004 to 2022 (as of May 2023). Queries will return all matches to the provided parameters. Not providing any parameters will retrieve all transcript data from OT04 to OT22.
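The sketch below illustrates one way to think about how such parameters narrow a query. It runs on a toy data frame and uses a hypothetical helper, `filter_toy`, which is not part of scotustext. It assumes that values within a parameter are alternatives (OR), that separate parameters combine (AND), and that names are matched case-insensitively against the speaker label. Those assumptions are consistent with the examples in this document but are not confirmed by it.

```r
# Toy data, invented for illustration
toy <- data.frame(
  speaker = c("JUSTICE BREYER", "MR. DOE", "JUSTICE GINSBURG", "JUSTICE BREYER"),
  term = c("2017", "2017", "2018", "2019"),
  type = c("Justice", "Attorney", "Justice", "Justice"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (NOT a scotustext function): NULL means "no restriction"
filter_toy <- function(data, term = NULL, justice = NULL) {
  keep <- rep(TRUE, nrow(data))
  if (!is.null(term)) {
    keep <- keep & data$term %in% term
  }
  if (!is.null(justice)) {
    pattern <- paste(toupper(justice), collapse = "|")
    keep <- keep & data$type == "Justice" & grepl(pattern, data$speaker)
  }
  data[keep, , drop = FALSE]
}

res <- filter_toy(toy, term = c("2017", "2018"), justice = c("Breyer", "Ginsburg"))
print(res)
print(nrow(filter_toy(toy)))  # no parameters: every row is kept
```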
oa_parser Function Description
Implementation of the oa_parser tool only requires a single parameter, dir_path, which provides a file path (as a character object) to the folder on your local machine containing oral argument transcript(s) saved as PDFs. As noted above, the most efficient use of the tool is found with transcripts following (and including) the 2004 term, which coincides with the Court prescribing demarcations to indicate which of the justices are engaging with counsel – though it is capable of cleaning and parsing earlier pre-2004 transcripts where demarcations are indicated using “QUESTION:…”.
Note that this tool does not use Optical Character Recognition (OCR) software, so it can only process non-scanned PDF transcripts.
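Before pointing oa_parser at a folder, it can help to confirm which files in it are PDFs. The base R sketch below does that for a temporary folder. The folder name and file names are placeholders invented for this example, and the check is a suggestion rather than something the package requires.

```r
# Build a throwaway folder with placeholder files (names are hypothetical)
dir_path <- file.path(tempdir(), "oa_transcripts_demo")
dir.create(dir_path, showWarnings = FALSE)
invisible(file.create(file.path(dir_path, c("argument_a.pdf", "argument_b.pdf", "notes.txt"))))

# List only the PDF files in the folder
pdf_files <- list.files(dir_path, pattern = "\\.pdf$", ignore.case = TRUE, full.names = TRUE)
print(basename(pdf_files))
print(length(pdf_files))
```

This check only looks at file extensions. It cannot tell whether a PDF is scanned, so a scanned transcript would still pass it.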
oa_search and oa_parser Variables
## [1] "speaker" "text" "argument" "id" "term"
## [6] "type" "case_name" "word_count"
speaker: Character indicating Justice or Attorney providing statement
text: Text of statement provided by Justice or Attorney
argument: Docket number coinciding with argument of interest
id: Numeric ID number associated with statement presence in argument (i.e., the order in which the statement was offered in the argument’s transcript)
term: Term coinciding with argument of interest
type: Factor indicating whether statement was offered by a Justice or Attorney
case_name: Character indicating title of case being orally argued (e.g., DOBBS, STATE HEALTH OFFICER OF THE MISSISSIPPI DEPARTMENT OF HEALTH, ET AL. v. JACKSON WOMEN’S HEALTH ORGANIZATION, ET AL)
word_count: Raw count of words (measured using spaces present in string) provided in statement
oa_search Examples
Below is a collection of example code highlighting the functionality of oa_search.
Note: Successful retrieval and/or processing of transcripts using oa_search or oa_parser will yield a Completion Summary denoting summary statistics of the processed oral arguments.
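As a point of reference for the variables described earlier, the sketch below derives a space-based word count on a toy data frame. The package's exact counting rule is not shown in this document, so splitting on runs of spaces is an assumption, and the toy rows are invented.

```r
# Toy rows using two of the documented column names
toy <- data.frame(
  speaker = c("JUSTICE THOMAS", "MR. DOE"),
  text = c("What difference does that make?", "I welcome the Court's questions."),
  stringsAsFactors = FALSE
)

# Assumption: a "word" is any run of characters separated by one or more spaces
toy$word_count <- lengths(strsplit(trimws(toy$text), " +"))
print(toy[, c("speaker", "word_count")])
```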
Retrieve All Oral Argument Transcript Data (OT04 to OT22) (Leave Optional Parameters Empty)
oral_arguments <- oa_search()
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Terms: 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022
## Speaker Type: Both Justices and Attorneys
Number of Entries for Justice Breyer
length(oral_arguments$text[oral_arguments$speaker == "JUSTICE BREYER"])
## [1] 23078
Number of Entries for Paul Clement (Attorney and Fmr. Solicitor General)
length(oral_arguments$text[oral_arguments$speaker == "MR. CLEMENT"])
## [1] 2871
Number of Entries by Attorneys in Dataset
length(oral_arguments$text[oral_arguments$type == "Attorney"])
## [1] 152483
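The call above counts statements made by attorneys. Counting distinct attorneys is a different calculation, which the sketch below shows on invented toy data: apply unique() to the speaker labels before taking the length.

```r
# Toy data, invented for illustration
toy <- data.frame(
  speaker = c("MR. DOE", "JUSTICE BREYER", "MR. DOE", "MS. ROE"),
  type = factor(c("Attorney", "Justice", "Attorney", "Attorney")),
  stringsAsFactors = FALSE
)

# Number of statements offered by attorneys
n_statements <- sum(toy$type == "Attorney")

# Number of distinct attorney labels
n_attorneys <- length(unique(toy$speaker[toy$type == "Attorney"]))

print(c(statements = n_statements, attorneys = n_attorneys))
```

Because speaker labels carry only a title and surname, two different attorneys who share a surname would be counted once by this approach.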
Retrieve Transcripts Limited to OT 2017
oral_arguments_17 <- oa_search(term = "2017")
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Terms: 2017
## Speaker Type: Both Justices and Attorneys
#Number of Entries Returned For OT2017
length(oral_arguments_17$text)
## [1] 15743
Retrieve Transcripts Limited to Justice Breyer
oral_arguments_breyer <- oa_search(justice = "Breyer")
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Terms: 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021
## Justices: JUSTICE BREYER
## Speaker Type: Justices ONLY
#Number of Entries Returned For Breyer
length(oral_arguments_breyer$text)
## [1] 23078
Retrieve Transcripts Limited to Justices Breyer and Ginsburg in OT2017 and OT2018
oral_arguments_1718_breyer_ginsburg <- oa_search(term = c("2017", "2018"), justice = c("Breyer", "Ginsburg"))
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Terms: 2017, 2018
## Justices: JUSTICE BREYER, JUSTICE GINSBURG
## Speaker Type: Justices ONLY
#Number of Entries Returned For Breyer & Ginsburg in OT17 and OT18
length(oral_arguments_1718_breyer_ginsburg$text)
## [1] 1892
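A combined count like the one above can be broken down by term and justice with base R's table(). The sketch below uses invented toy rows with the documented `speaker` and `term` column names, so its numbers say nothing about the real data.

```r
# Toy data, invented for illustration
toy <- data.frame(
  speaker = c("JUSTICE BREYER", "JUSTICE GINSBURG", "JUSTICE BREYER", "JUSTICE BREYER"),
  term = c("2017", "2017", "2018", "2018"),
  stringsAsFactors = FALSE
)

# Cross-tabulate statements: rows are terms, columns are speakers
counts <- table(toy$term, toy$speaker)
print(counts)
```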
Retrieve Transcripts from Dobbs v. Jackson
oral_arguments_dobbs <- oa_search(party = "Dobbs")
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Terms: 2021
## Cases: Dobbs v. Jackson Women's Health Organization
## Speaker Type: Both Justices and Attorneys
#Number of Entries Returned For Dobbs
length(oral_arguments_dobbs$text)
## [1] 320
Sample Transcript Data from Dobbs v. Jackson
## [1] "CHIEF JUSTICE ROBERTS: We will hear argument this morning in Case 19-1392, Dobbs versus Jackson Women's Health Organization. General Stewart."
## [2] "MR. STEWART: Mr. Chief Justice, and may it please the Court: Roe versus Wade and Planned Parenthood versus Casey haunt our country. They have no basis in the Constitution. They have no home in our history or traditions. They've damaged the democratic process. They've poisoned the law. They've choked off compromise. For years, they've kept this Court at the center of a political battle that it can never resolve. And years on, they stand alone. Nowhere else does this Court recognize a right to end a human life. Consider this case: The Mississippi law here prohibits abortions after weeks. The law includes robust exceptions for a woman's life and health. It leaves months to obtain an abortion. Yet, the courts below struck the law down. It didn't matter that the law apply -- that the law applies when an unborn child is undeniably human, when risks to women surge, and when the common abortion procedure is brutal. The lower courts held that because the law prohibits abortions before viability, it is unconstitutional no matter what. Roe and Casey's core holding, according to those courts, is that the people can protect an unborn girl's life when she just barely can survive outside the womb but not any earlier when she needs a little more help. That is the world under Roe and Casey. That is not the world the Constitution promises. The Constitution places its trust in the people. On hard issue after hard issue, the people make this country work. Abortion is a hard issue. It demands the best from all of us, not a judgment by just a few of us. When an issue affects everyone and when the Constitution does not take sides on it, it belongs to the people. Roe and Casey have failed, but the people, if given the chance, will succeed. This Court should overrule Roe and Casey and uphold the state's law. I welcome the Court's questions. "
## [3] "JUSTICE THOMAS: General Stewart, you focus on the right to abortion, but our jurisprudence seems to -- seem to focus on, in Casey, autonomy; in Roe, privacy. Does it make a difference that we focus on privacy or autonomy or more specifically on abortion? "
## [4] "MR. STEWART: I think whichever one of those you're focusing on, Your Honor, particularly if you're focusing on -- on the right to abortion, each of those starts to become a step removed for what's provided in the Constitution. Yes, the Constitution does provide certain -- protect certain aspects of privacy, of autonomy, and the like. But, as this Court said in Glucksberg, going directly from general concepts of autonomy, of privacy, of bodily integrity, to -- to a right is not how we traditionally, this Court traditionally, does due process analysis. So I think it just confirms, whichever one of those you look at, Your Honor, a right to abortion is -- is not grounded in the text, and it's grounded on abstract concepts that this Court has rejected in -- in other contexts as supplying a substantive right. "
## [5] "JUSTICE THOMAS: You say that this is the only constitutional right that involves the taking of a life. What difference does that make in your analysis? "
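The call that produced the sample above is not shown. One way to print lines in a similar "SPEAKER: statement" layout is sketched below on toy data. It assumes the `text` column holds only the statement without the speaker label, which this document does not confirm, and all rows are invented.

```r
# Toy data, deliberately out of order to show the role of the id column
toy <- data.frame(
  id = c(2, 1, 3),
  speaker = c("MR. DOE", "CHIEF JUSTICE ROBERTS", "JUSTICE THOMAS"),
  text = c("May it please the Court.", "We will hear argument.", "What is the standard?"),
  stringsAsFactors = FALSE
)

# Restore transcript order, then join speaker and statement
toy <- toy[order(toy$id), ]
transcript_lines <- paste0(toy$speaker, ": ", toy$text)
print(head(transcript_lines, 5))
```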
Below is an example of oa_parser applied to two oral arguments from the 2022 term:
#Replace <FOLDER DIRECTORY> with Relevant File Path
oa_parse_sample <- oa_parser(dir_path = "<FOLDER DIRECTORY>")
## Total Number of Transcripts to be Processed: 3
##
## - - - - - - - - COMPLETION SUMMARY - - - - - - - -
## Total Arguments: 2
## Total Statements: 1,318
## Number of Justices: 9
## Number of Unique Attorneys: 7
Number of Unique Entries Returned For Twitter, Inc & UNC
length(unique(oa_parse_sample$text))
## [1] 1153
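Note that length(unique(...)) counts distinct strings, so statements with identical wording collapse into one. That is one plausible reason a unique count can fall below the total number of statements. The sketch below shows the effect on invented toy text.

```r
# Toy statements, invented for illustration; two phrases repeat
toy_text <- c("Thank you, counsel.", "What is the standard?",
              "Thank you, counsel.", "Yes.", "Yes.")

n_total <- length(toy_text)
n_unique <- length(unique(toy_text))

# Which statements occur more than once?
repeated <- unique(toy_text[duplicated(toy_text)])

print(c(total = n_total, unique = n_unique))
print(repeated)
```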
Sample Transcript Data from Twitter v. Taamneh
## [1] "CHIEF JUSTICE ROBERTS: We will hear argument this morning in Case 21-1496, Twitter versus Taamneh. Mr. Waxman."
## [2] "MR. WAXMAN: Mr. Chief Justice, and may it please the Court: JASTA permits any U.S. national injured by reason of an act of international terrorism to recover treble damages from a person who aids and abets by knowingly providing substantial assistance or who conspires with a person who committed such an act of international terrorism. The foundational points here are not in dispute. First, the conceded and obvious act of international terrorism is the Reina attack, and the complaint includes no allegation that the defendants provided substantial assistance, much less knowing substantial assistance, to that attack or, for that matter, to any other attack. Second, as the complaint concedes, the defendants \"had no intent to aid ISIS's terrorist activities.\" Quite to the contrary, they maintained and regularly enforced policies prohibiting content that promotes terrorist activity. The plaintiff's claim that because defendants were generally aware that among their billions of users were ISIS adherents who violated their policies and, therefore, defendants should have done more to enforce those policies does not constitute aiding and abetting an act of international terrorism under the operative terms of the text, the constitutional principles articulated in Halberstam, or any recognized understanding of what it means to abet a criminal act. If Congress had wanted to impose treble damage liability for existing -- assisting a terrorist organization, it had a ready model in the material support statute, Section 2339(b). If it had wanted to create such liability for supporting international terrorism writ large, it likewise had a model in Section 2331(1). Instead, it provided a remedy against those who conspire with terrorists or -- or who knowingly aid and abet acts of terrorism. It did not impose treble damage liability on companies whose services were exploited by terrorists in contravention of the company's enforced antiterrorism policies. I welcome the Court's questions. "
## [3] "JUSTICE THOMAS: Mr. Waxman, it seems that you tie your analysis to knowledge of the Reina attacks rather than just general knowledge of terrorism. "
## [4] "MR. WAXMAN: So we -- it's -- thank you, Justice Thomas. Let me clarify. We do not contend that there is no liability if these companies didn't know that the Reina nightclub would be attacked. What they had to have known to satisfy the operative language of the statute was that they were, in fact, providing substantial assistance to the act of international terrorism that injured the plaintiff and that they knew that their action would substantially assist an act of international terrorism. The -- the flight trainers who provide -- who taught the al-Qaeda terrorists how to fly planes so they could fly them into the World Trade Center and the Pentagon didn't need to know that those were the targets, but he needed to know that he was, in fact, providing substantial assistance to people who aimed to use that knowledge in order to commit a terrorist attack. "
## [5] "JUSTICE THOMAS: So the -- and I may have misunderstood your brief, but -- so you would -- I assume you would agree that if I had a friend who was a mugger, a murderer, and a burglar -- "
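Once transcripts are parsed, the `word_count` and `type` columns lend themselves to simple summaries. The sketch below totals words per speaker with base R's aggregate(). The rows and counts are invented for illustration and are not taken from the transcripts above.

```r
# Toy data using the documented column names
toy <- data.frame(
  speaker = c("JUSTICE THOMAS", "MR. DOE", "JUSTICE THOMAS", "MR. DOE"),
  type = c("Justice", "Attorney", "Justice", "Attorney"),
  word_count = c(12L, 85L, 30L, 140L),
  stringsAsFactors = FALSE
)

# Sum the words spoken by each speaker, keeping the speaker type alongside
words_by_speaker <- aggregate(word_count ~ speaker + type, data = toy, FUN = sum)
print(words_by_speaker)
```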