Submission Number: 94
Submission ID: 131
Submission UUID: c29115c5-d292-4709-b656-6a6a386ea45b
Submission URI: /form/project

Created: Wed, 03/10/2021 - 15:39
Completed: Wed, 03/10/2021 - 15:39
Changed: Wed, 07/06/2022 - 15:09

Remote IP address: 71.59.69.51
Submitted by: Galen Collier
Language: English

Is draft: No
Webform: Project
Project Title: Parallel analysis of variants in multiple bear genomes: Visualization of complexity
Program:
CAREERS (323)

Project Image: https://support.access-ci.org/system/files/webform/project/104/Bear_0.png
Tags:
bioinformatics (277), java (560)

Status: Halted
Project Leader
--------------
Project Leader:
Andrey Grigoriev

Email: grigorie@camden.rutgers.edu
Mobile Phone: {Empty}
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s):
Galen Collier (420)

Student-facilitator(s):
{Empty}

Mentee(s):
{Empty}


Project Information
-------------------
Project Description:
Introduction: This interdisciplinary proposal focuses on the application of high-performance computing (HPC) approaches and efficient visualization to the analysis of next-generation sequencing (NGS) data from the genomes of brown and polar bears. It aims to improve and speed up the detection of common variants in cohorts of related genomes in order to establish the evolutionary trajectories of the corresponding species. The work will be performed by a graduate student under the supervision of Dr. Andrey Grigoriev, Professor in the Biology Department and the Center for Computational and Integrative Biology at Rutgers-Camden. Remote work is the most likely mode of operation in this project.

Genomes of all organisms and species undergo constant change, and mutations occur at varying scales. Structural variants (SVs) typically affect much larger genome intervals than single nucleotide variants (SNVs) or short insertions/deletions (indels). Currently, comparative genomics efforts mostly focus on SNVs and indels in protein-coding regions, while the role of SVs (especially outside those regions) generally remains a mystery. There is an unmet need for, and a growing interest in, understanding the effect of SVs on evolution using NGS.

Project details: The low accuracy of current SV-finding pipelines necessitates visual inspection of the variants they report. Efficient graphical representation of these variants remains a challenge, and existing tools cannot cope with more than 3 samples (1). A flexible, data-driven pipeline connecting the output of the variant-finding algorithm to the graphical user interface is also needed. This project will include the development of such an interface and pipeline. It will also complement our work on parallel search in multiple samples: combining weaker evidence at similar locations in similar subspecies will further improve SV prediction accuracy compared to the current pipelines based on our algorithm, GROM (2).
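
The idea of combining weaker evidence at similar locations across samples can be illustrated with a minimal Python sketch. It is not GROM's method: the `SVCall` record, the `support` field (an evidence count such as supporting reads), the reciprocal-overlap rule, and both thresholds are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SVCall:
    """One candidate structural variant in one sample (hypothetical record)."""
    sample: str
    chrom: str
    start: int
    end: int
    svtype: str
    support: int  # assumed evidence count, e.g. number of supporting reads


def reciprocal_overlap(a: SVCall, b: SVCall) -> float:
    """Smaller of the two overlap fractions; 0.0 if the calls are incompatible."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def pool_calls(calls: List[SVCall], min_overlap: float = 0.5,
               min_total_support: int = 5) -> List[Dict]:
    """Greedily cluster similar calls and keep clusters with enough pooled support."""
    clusters: List[List[SVCall]] = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start)):
        for cluster in clusters:
            # Compare against the first call of the cluster (its seed).
            if reciprocal_overlap(cluster[0], call) >= min_overlap:
                cluster.append(call)
                break
        else:
            clusters.append([call])

    pooled = []
    for cluster in clusters:
        total = sum(c.support for c in cluster)
        if total >= min_total_support:
            pooled.append({
                "chrom": cluster[0].chrom,
                "start": min(c.start for c in cluster),
                "end": max(c.end for c in cluster),
                "svtype": cluster[0].svtype,
                "samples": sorted({c.sample for c in cluster}),
                "total_support": total,
            })
    return pooled


example_calls = [
    SVCall("polar1", "chr1", 1000, 2000, "DEL", 3),
    SVCall("polar2", "chr1", 1050, 1980, "DEL", 2),
    SVCall("brown1", "chr1", 1020, 2010, "DEL", 2),
    SVCall("brown1", "chr2", 500, 900, "DUP", 2),
]
for variant in pool_calls(example_calls):
    print(variant)
```

In this sketch no single call needs to pass the support threshold on its own; a variant is retained when the calls from several samples at roughly the same location together reach it.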

References

1. Robinson, J.T., Thorvaldsdóttir, H., Wenger, A.M., Zehir, A. and Mesirov, J.P. (2017) Variant review with the Integrative Genomics Viewer. Cancer Research, 77, e31-e34.
2. Smith, S., Kawash, J. and Grigoriev, A. (2017) Lightning-fast genome variant detection with GROM. GigaScience, 6(10), 1-7.


Project Information Subsection
------------------------------
Project Deliverables:
{Empty}

Student Research Computing Facilitator Profile:
Grad or undergrad
Interested in computational genomics research
Experienced Linux or Unix user
Comfortable working in a remote Linux environment (HPC cluster)
Experience with Python programming
Experience with C programming

Mentee Research Computing Profile:
{Empty}

Student Facilitator Programming Skill Level: Practical applications
Mentee Programming Skill Level: {Empty}
Project Institution: Rutgers University–Camden
Project Address:
303 Cooper St
Camden, New Jersey. 08102

Anchor Institution: CR-Rutgers
Preferred Start Date: 06/02/2021
Start as soon as possible.: Yes
Project Urgency: Already behind (3). Start date is flexible
Expected Project Duration (in months): {Empty}
Launch Presentation: {Empty}
Launch Presentation Date: {Empty}
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Project launch and initial familiarization
  Milestone Description: Become familiar with the available visualization methodologies and the GROM pipeline; CAREERS Program “Project Launch” Presentation
- Milestone Title: Develop methods for displaying data on multiple genomes
- Milestone Title: Implement in Python/Java a data parser and a GUI based on these methods
- Milestone Title: Combine the HPC SV analysis pipelines with the GUI
- Milestone Title: Project wrap-up and documentation
  Milestone Description: Provide documentation for the code and results, description of novel pipeline steps and visualization methods; CAREERS Program “Project Close” Presentation
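
The parser and multi-genome display milestones above could start from something like the following Python sketch. It assumes each sample's variants arrive as VCF-style, tab-separated lines with `SVTYPE` and `END` keys in the INFO column; the actual output format of the HPC pipeline is not specified here, and the function names and the text-based rendering are hypothetical stand-ins for a real GUI.

```python
from typing import Dict, List


def parse_sv_line(line: str) -> Dict:
    """Parse one VCF-style record into a simple interval dictionary."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos = fields[0], int(fields[1])
    # INFO column (8th field): semicolon-separated KEY=VALUE pairs.
    info = dict(item.split("=", 1) for item in fields[7].split(";") if "=" in item)
    return {
        "chrom": chrom,
        "start": pos,
        "end": int(info.get("END", pos)),
        "svtype": info.get("SVTYPE", "NA"),
    }


def build_tracks(sample_lines: Dict[str, List[str]]) -> Dict[str, List[Dict]]:
    """Turn per-sample lists of text lines into per-sample lists of records."""
    return {
        sample: [parse_sv_line(l) for l in lines
                 if l.strip() and not l.startswith("#")]
        for sample, lines in sample_lines.items()
    }


def render_rows(tracks: Dict[str, List[Dict]], chrom: str,
                region_start: int, region_end: int, width: int = 40) -> Dict[str, str]:
    """Draw one text row per sample; '#' marks positions covered by a variant."""
    span = region_end - region_start
    rows = {}
    for sample, records in tracks.items():
        cells = ["."] * width
        for rec in records:
            if (rec["chrom"] != chrom or rec["end"] < region_start
                    or rec["start"] > region_end):
                continue
            first = max(0, (rec["start"] - region_start) * width // span)
            last = min(width - 1, (rec["end"] - region_start) * width // span)
            for i in range(first, last + 1):
                cells[i] = "#"
        rows[sample] = "".join(cells)
    return rows


example = {
    "polar1": ["##fileformat=VCFv4.2",
               "chr1\t1000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000"],
    "brown1": ["chr1\t1100\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2100"],
}
for sample, row in render_rows(build_tracks(example), "chr1", 0, 4000).items():
    print(f"{sample:>8} {row}")
```

Stacking one row per sample in a shared coordinate window keeps the layout independent of the number of genomes, which is the property a multi-sample viewer needs.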

Github Contributions: {Empty}
Planned Portal Contributions (if any):
{Empty}

Planned Publications (if any):
See above. At least one paper with biological conclusions is planned, and potentially a methodological publication as well.

What will the student learn?:
The intricacies of genomics, the unexpected ways genomes change, the puzzles of genome rearrangements, and how genome sequencing helps unravel them.

What will the mentee learn?:
{Empty}

What will the Cyberteam program learn from this project?:
The effort and requirements involved in recruiting and training junior-level bioinformatics specialists in the genomics field.

HPC resources needed to complete this project?:
{Empty}

Notes:
{Empty}



Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?:
{Empty}

What is the impact on other disciplines?:
{Empty}

Is there an impact on physical resources that form infrastructure?:
{Empty}

Is there an impact on the development of human resources for research computing?:
{Empty}

Is there an impact on institutional resources that form infrastructure?:
{Empty}

Is there an impact on information resources that form infrastructure?:
{Empty}

Is there an impact on technology transfer?:
{Empty}

Is there an impact on society beyond science and technology?:
{Empty}

Lessons Learned:
{Empty}

Overall results:
{Empty}