Submission information
Submission Number: 118
Submission ID: 210
Submission UUID: 2e6c97ea-7b3e-4f39-880b-6eaba73f6e0b
Submission URI: /form/project
Created: Tue, 09/21/2021 - 14:35
Completed: Tue, 09/21/2021 - 14:47
Changed: Thu, 06/16/2022 - 09:15
Remote IP address: 74.103.220.121
Submitted by: Gaurav Khanna
Language: English
Is draft: No
Webform: Project
Project Title: Statistical Analysis of criminal cases in the United States District Court of Puerto Rico
Program: CAREERS (323)
Project Image: https://support.access-ci.org/system/files/webform/project/210/iStock-1221293664-1.jpg
Tags: ai (271), data-analysis (422), machine-learning (272), python (69)
Status: Complete

Project Leader
--------------
Project Leader: Carlos Paniagua
Email: carlos.paniaguamejia@salve.edu
Mobile Phone: {Empty}
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s): Diego Alcala (1426)
Student-facilitator(s): Donnie Aikins (1431)
Mentee(s): {Empty}

Project Information
-------------------
Project Description: For the purposes of submitting an amicus brief to the US Supreme Court, the Puerto Rico Association of Criminal Defense Lawyers (PRACDL) compiled several indictments and docket sheets from the PACER system. Data from these documents were extracted and analyzed with sociodemographic data from the US Census. Nevertheless, there is still an opportunity to analyze the remaining data and present a visual representation of not only the types of cases seen in this court but also the length of time that each case is "open", the percentage of persons represented by a court-appointed attorney, the average length of sentences, the number of persons granted bail, and the number of persons with bail violations and the reasons for those violations, among others. An understanding of these data will facilitate related future social justice projects in this jurisdiction.

Project Information Subsection
------------------------------
Project Deliverables: Development of a workflow for classification of indictment and docket documents in the PACER system. Better understanding of the case data through classification using the multitude of parameters mentioned above.
Project Deliverables: {Empty}
Student Research Computing Facilitator Profile: Data science skills, Python
Mentee Research Computing Profile: {Empty}
Student Facilitator Programming Skill Level: Some hands-on experience
Mentee Programming Skill Level: {Empty}
Project Institution: Salve Regina University
Project Address: {Empty}
Anchor Institution: CR-University of Rhode Island
Preferred Start Date: 10/01/2021
Start as soon as possible.: Yes
Project Urgency: Already behind, Start date is flexible
Expected Project Duration (in months): 6
Launch Presentation: https://support.access-ci.org/system/files/webform/project/210/cyberready_presentation_detailed.pptx
Launch Presentation Date: 11/10/2021
Wrap Presentation: https://support.access-ci.org/system/files/webform/project/210/Donnie.project.wrap_.pptx
Wrap Presentation Date: 06/08/2022
Project Milestones:
- Milestone Title: Milestone #1
  Milestone Description: Data understanding, data curation, communication with subject matter expert (SME), question and hypothesis generation, Launch presentation.
  Completion Date Goal: 2021-11-15
  Actual Completion Date: 2021-11-15
- Milestone Title: Milestone #2
  Milestone Description: Exploratory data analysis, communication with SME.
  Completion Date Goal: 2021-12-15
  Actual Completion Date: 2021-12-15
- Milestone Title: Milestone #3
  Milestone Description: Model building and model selection.
  Completion Date Goal: 2022-02-15
  Actual Completion Date: 2022-02-15
- Milestone Title: Milestone #4
  Milestone Description: Model validation, communication with SME, documentation, GitHub submission, Wrap presentation.
  Completion Date Goal: 2022-03-31
  Actual Completion Date: 2022-06-08
Github Contributions: {Empty}
Planned Portal Contributions (if any): {Empty}
Planned Publications (if any): {Empty}
What will the student learn?: {Empty}
What will the mentee learn?: {Empty}
What will the Cyberteam program learn from this project?: {Empty}
HPC resources needed to complete this project?: {Empty}
Notes: {Empty}

Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?: The principal discipline this project was concerned with is Law and its practice as it relates to its impact on society at large, particularly in the US and the US Court District of Puerto Rico. The pipelines developed by the RCF will serve as a good proof of concept for processing court documents (such as dockets and indictments) into structured or semi-structured files for analysis and further processing once (we hope soon!) access to these court documents becomes open to everyone at no cost. The analysis performed has the potential to be used by our subject matter expert Diego Alcala, a practicing criminal defense attorney in the US District of PR, and his peers in the PR Association of Criminal Defense Lawyers. I believe similar analyses could prove useful to defense lawyers in other US court districts.
What is the impact on other disciplines?: Science and technology create opportunities like this for us to engage with communities and help the world. Specifically, the project highlights the central role of education and training in preparing youth for productive engagement in a rapidly changing and sophisticated world.
Is there an impact on physical resources that form infrastructure?: None.
Is there an impact on the development of human resources for research computing?: Yes; the student facilitator enjoyed his engagement with CyberTeams and is open to the possibility of computational work/facilitation as a career option.
Is there an impact on institutional resources that form infrastructure?: None.
Is there an impact on information resources that form infrastructure?: None.
Is there an impact on technology transfer?: None.
Is there an impact on society beyond science and technology?: The workflow developed has the potential to be used by criminal defense attorneys to understand the patterns and trends in court cases and eventually help improve the criminal justice system.
Lessons Learned: Better understanding of the Law; efficient coding; time management. The RCF also learned the importance of constantly adjusting and learning new things as a data enthusiast (whether analyst or scientist).
Overall results: The pipelines developed by the RCF will serve as a good proof of concept for processing court documents. The workflow developed has the potential to be used by criminal defense attorneys to understand the patterns and trends in court cases and eventually help improve the criminal justice system.