Submission information
Submission Number: 27
Submission ID: 44
Submission UUID: a95c8eb3-6b76-4498-ba27-c393179db12f
Submission URI: /form/project
Created: Tue, 09/03/2019 - 14:08
Completed: Tue, 09/03/2019 - 14:10
Changed: Fri, 07/10/2020 - 17:05
Remote IP address: 130.215.55.243
Submitted by: Northeast Cyberteam
Language: English
Is draft: No
Webform: Project
Project Title | Genetics and Big Data |
---|---|
Program | Northeast |
Project Leader | Dawei Li |
Email | Dawei.Li@uvm.edu |
Mobile Phone | |
Work Phone | |
Mentor(s) | Katia Bulekova |
Student-facilitator(s) | Abigail Waters |
Mentee(s) | |
Project Description | The overarching goal of this project is to identify a predictive, quantitative framework describing individual differences in genetic, epigenetic, cognitive, and behavioral markers of emotion-cognition regulation in response to academically stressful situations. Each year, large numbers of young adults drop out of college and university because of self-sabotaging and seemingly irrational behaviors when faced with academic stressors. This proposal takes a cross-disciplinary approach to understanding neuro-biological function and the resulting behaviors across a spectrum of neuro-typical and neuro-atypical young adults, the latter being those with diagnosed learning disabilities such as dyslexia, ADHD, and college-able autism. The project partnership includes faculty and students from the University of Vermont (sequencing data analyses), Landmark College (research subject recruitment), University of New Hampshire (research subject recruitment), University of Maine (model simulation), and the Vermont Genetics Network. Dawei's group has done some trial work at MGHPCC and has been very pleased with the results. He would like to scale up. Currently, running one sample takes 2 TB of storage and 5 days of processing with 64 GB of memory and 12 cores. The planned project has 3,000 samples. Processing all of them would require 2 TB × 3,000 = 6 PB of storage and an estimated 15,000 computing days (5 days × 3,000) on a single 12-core processor with 64 GB of memory. |
Project Deliverables | |
Student Research Computing Facilitator Profile | Recommend a graduate student with expertise in dealing with large data sets -- might be more enjoyable if they have an interest in biology but not required. |
Mentee Research Computing Profile | |
Student Facilitator Programming Skill Level | |
Mentee Programming Skill Level | |
Project Institution | UVM |
Project Address | University of Vermont Burlington, Vermont. 05405 |
Anchor Institution | NE-University of Vermont |
Preferred Start Date | 07/10/2017 |
Start as soon as possible. | |
Project Urgency | |
Expected Project Duration (in months) | |
Launch Presentation | |
Launch Presentation Date | |
Wrap Presentation | |
Wrap Presentation Date | |
Project Milestones | |
Github Contributions | |
Planned Portal Contributions (if any) | |
Planned Publications (if any) | |
What will the student learn? | Stephen needs to provide |
What will the mentee learn? | |
What will the Cyberteam program learn from this project? | |
HPC resources needed to complete this project? | |
Notes | Note from Stephen: It remains unclear to me why he needs so much storage and whether he is appropriately compressing files, removing temp files, etc. If I understand his workflow correctly, he uses a series of programs he has collected from others; it is unclear whether he uses scripts to link them together. |
What is the impact on the development of the principal discipline(s) of the project? | |
What is the impact on other disciplines? | |
Is there an impact on physical resources that form infrastructure? | |
Is there an impact on the development of human resources for research computing? | |
Is there an impact on institutional resources that form infrastructure? | |
Is there an impact on information resources that form infrastructure? | |
Is there an impact on technology transfer? | |
Is there an impact on society beyond science and technology? | |
Lessons Learned | |
Overall results | |
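
The resource estimate in the Project Description can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the per-sample figures (2 TB, 5 days, 12 cores) come from the description, while the `estimate` function, the `workers` parameter, and the example worker counts are hypothetical. It assumes samples are independent, that each sample occupies one 12-core machine for the full 5 days, and that every sample's 2 TB is retained, which the note above questions.

```python
# Per-sample figures quoted in the Project Description.
SAMPLES = 3_000
STORAGE_TB_PER_SAMPLE = 2
DAYS_PER_SAMPLE = 5
CORES_PER_SAMPLE = 12


def estimate(samples, workers):
    """Return (storage_pb, compute_days, core_hours, wall_days).

    `workers` is a hypothetical count of identical 12-core, 64 GB machines
    running one sample each at a time.
    """
    # Decimal units: 1 PB = 1000 TB. Assumes nothing is deleted or compressed.
    storage_pb = samples * STORAGE_TB_PER_SAMPLE / 1000
    # Total machine time, independent of how many workers share the load.
    compute_days = samples * DAYS_PER_SAMPLE
    core_hours = compute_days * 24 * CORES_PER_SAMPLE
    # Samples run in batches of `workers`; the last batch may be partial.
    batches = -(-samples // workers)  # ceiling division
    wall_days = batches * DAYS_PER_SAMPLE
    return storage_pb, compute_days, core_hours, wall_days


if __name__ == "__main__":
    for workers in (1, 50, 100):  # hypothetical allocation sizes
        print(workers, estimate(SAMPLES, workers))
```

With one worker this gives the 6 PB and 15,000 computing days stated in the description. Total compute time does not change with the number of workers, but under these assumptions the elapsed time falls to 150 days with 100 workers.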