Submission Number: 193
Submission ID: 4444
Submission UUID: b608a896-cd66-436a-84f2-8cc176faef6d
Submission URI: /form/project

Created: Fri, 03/22/2024 - 11:56
Completed: Fri, 03/22/2024 - 11:56
Changed: Mon, 10/20/2025 - 15:00

Remote IP address: 71.58.230.184
Submitted by: Carrie Brown
Language: English

Is draft: No
Webform: Project
Project Title Gender Gap in Bankruptcy Filings
Program CAREERS
Project Image
Tags
Status Complete
Project Leader Nonna Sorokina
Email sorokina@psu.edu
Mobile Phone
Work Phone
Mentor(s)
Student-facilitator(s) Mark Fahim
Mentee(s)
Project Description Use PACER datasets to collect bankruptcy filings and identify cases filed for business reasons. Then design a web scraper to collect petitioner names and other information from the bankruptcy petitions, and develop a text-analysis routine that analyzes the names and classifies the filings by gender.
Project Deliverables
Student Research Computing Facilitator Profile
Mentee Research Computing Profile
Student Facilitator Programming Skill Level
Mentee Programming Skill Level
Project Institution Penn State University
Project Address
Anchor Institution CR-Penn State
Preferred Start Date
Start as soon as possible. No
Project Urgency Already behind; Start date is flexible
Expected Project Duration (in months)
Launch Presentation
Launch Presentation Date 05/17/2024
Wrap Presentation
Wrap Presentation Date 08/14/2024
Project Milestones
  • Milestone Title: Collect Bankruptcy Data
    Milestone Description: Write code to obtain bankruptcy case filings from 2008 to the present from the Federal Judicial Center website. Organize the available data elements in a SAS dataset. Produce summary statistics of the data where applicable. Encode textual categorical responses where possible for further data analysis. Validate the data and ensure proper formatting.
    Completion Date Goal: 2024-05-10
  • Milestone Title: Develop NLP code to classify petitions by gender
    Milestone Description: Obtain the list of names from the dataset produced in (1). Develop code to scrape the web for gender identifiers associated with the names. Use NLP logic to associate names with gender. Assign a gender identifier to each bankruptcy filing based on the petitioner names.
    Completion Date Goal: 2024-06-10
  • Milestone Title: Assist with statistical analysis
    Milestone Description: Add local economic data and run regressions in SAS to explore the gender-based distribution of bankruptcy filings.
    Completion Date Goal: 2024-07-10
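The name-based classification step in the second milestone could be sketched as follows. This is a minimal illustration in Python, not the project's actual code: the lookup table NAME_GENDER, the functions first_name and classify, and the 'F'/'M'/'U' labels are all hypothetical, and the small table stands in for whatever gender identifiers the web scraping would produce.

```python
import re

# Hypothetical reference table (assumption). In practice it would be built
# from the scraped gender identifiers described in the second milestone.
NAME_GENDER = {"mary": "F", "linda": "F", "james": "M", "robert": "M"}


def first_name(petitioner: str) -> str:
    """Return the lowercased first given name from 'Last, First' or 'First Last'."""
    name = petitioner.strip()
    if "," in name:
        # 'DOE, JANE Q' -> keep only the part after the comma
        name = name.split(",", 1)[1]
    tokens = re.findall(r"[A-Za-z'-]+", name)
    return tokens[0].lower() if tokens else ""


def classify(petitioner: str, table=NAME_GENDER) -> str:
    """Return 'F', 'M', or 'U' (unknown) for a petitioner name."""
    return table.get(first_name(petitioner), "U")
```

Names that are missing from the table, such as business entities, fall through to 'U' so that they can be reviewed separately instead of being assigned a guessed label.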
Github Contributions
Planned Portal Contributions (if any)
Planned Publications (if any)
What will the student learn?
What will the mentee learn?
What will the Cyberteam program learn from this project?
HPC resources needed to complete this project?
Notes
What is the impact on the development of the principal discipline(s) of the project?
What is the impact on other disciplines?
Is there an impact on physical resources that form infrastructure?
Is there an impact on the development of human resources for research computing?
Is there an impact on institutional resources that form infrastructure?
Is there an impact on information resources that form infrastructure?
Is there an impact on technology transfer?
Is there an impact on society beyond science and technology?
Lessons Learned
Overall results