Submission Number: 60
Submission ID: 91
Submission UUID: e63b15db-76a2-4a1a-b394-4f672ed8cf5c
Submission URI: /form/project

Created: Wed, 08/12/2020 - 15:09
Completed: Wed, 08/12/2020 - 15:09
Changed: Tue, 08/02/2022 - 15:02

Remote IP address: 67.176.36.130
Submitted by: Anita Schwartz
Language: English

Is draft: No
Webform: Project
Project Title Deep convolutional neural networks (dCNN) for image segmentation, instance labeling, and tracking.
Program CAREERS
Project Image Caplan.PNG
Tags bash (242), bioinformatics (277), computational-chemistry (81), debugging (38), machine-learning (272), programming (5), python (69), scripting (243), slurm (71), software-installation (211), tensorflow (51)
Status Complete
Project Leader Jeffrey Caplan
Email jcaplan@udel.edu
Mobile Phone
Work Phone 302-831-3403
Mentor(s) Wayne Treible, Chandra Kambhamettu
Student-facilitator(s) Huining Liang
Mentee(s)
Project Description Our research group has been using deep convolutional neural networks (CNNs) to segment biological structures from both time-lapse confocal microscopy data sets and three-dimensional electron microscopy data sets. We have developed a pipeline that covers every step from image acquisition on microscopes through deep learning-based denoising and segmentation, visualization, and image analysis. Last summer, we successfully trained an undergraduate on our pipeline to segment mitochondria from 3D electron microscopy data sets. In this project, we are seeking a student who would like to learn how to implement this pipeline and, in the process, develop new capabilities for it. The data we use comes from our Bio-Imaging Center, which serves over 100 research groups each year. The goal is to develop a flexible deep learning pipeline that can be readily deployed for a wide range of research projects. In this example project, we will examine cross sections of anthers, which produce pollen and have a distinctive radial organization of tissue layers. The same sample will be imaged by both super-resolution light microscopy and electron microscopy. The images will be overlaid and aligned, and both can be used for deep learning. Some hand-traced training data of cell outlines has already been generated, making rapid progress possible. In the first month, the student would learn how to use this limited data set to train a CNN and then predict segmentation on new images. Then, the student would manually fix errors in these new predictions to increase the size of the training data set. This process is expected to take an additional month to complete. Once cells are adequately segmented, the remainder of the time would be spent applying that knowledge to use a CNN to classify different cell types and tissue layers. All of this work will be done using the Biomix cluster at the Delaware Biotechnology Institute.
Project Deliverables The main analysis pipeline is mostly in place, and therefore, the deliverable will be to develop standardized instructions that could be provided to make the process of applying this approach to other projects faster and more efficient. The second deliverable is to extend the deep learning approach beyond image segmentation for projects that require classification of objects within images. The goal is to report these findings in a publication in the summer of 2021.
Student Research Computing Facilitator Profile - Grad or undergrad
- Interested in cell biology research
- Experienced Linux or Unix user
- Comfortable working in a remote Linux environment (HPC cluster)
- Some experience with Python programming
- Familiarity with machine learning concepts will be helpful
Mentee Research Computing Profile
Student Facilitator Programming Skill Level One programming class
Mentee Programming Skill Level
Project Institution University of Delaware
Project Address Newark, Delaware. 19716
Anchor Institution CR-University of Delaware
Preferred Start Date 02/15/2021
Start as soon as possible. No
Project Urgency Already behind; start date is flexible
Expected Project Duration (in months) 6
Launch Presentation
Launch Presentation Date 03/10/2021
Wrap Presentation
Wrap Presentation Date 08/11/2021
Project Milestones
  • Milestone Title: Beginning
    Milestone Description: The student would learn the basics of generating a training data set to train a CNN for segmentation of cell outlines from super-resolution microscopy data sets and electron micrographs. Give a Launch Presentation.
    Completion Date Goal: 2021-04-15
  • Milestone Title: Middle
    Milestone Description: The student will learn how to use initial predictions generated with a limited training set to greatly increase the size of the training set. Through this process, the student will become proficient in using CNNs for deep learning-based segmentation of microscopy images. This can be applied to a multitude of image-based segmentation problems.
    Completion Date Goal: 2021-06-15
  • Milestone Title: End
    Milestone Description: In this final stage, the student will take what he or she learned about CNNs and modify it to classify different types of cells and tissues. This part of the project will build upon the knowledge gained in the prior parts of the project. Give a Wrap-up presentation.
    Completion Date Goal: 2021-08-15
Github Contributions
Planned Portal Contributions (if any)
Planned Publications (if any) Methods paper in summer of 2021.
What will the student learn? The student will learn how to conduct image segmentation and classification using deep learning approaches.
What will the mentee learn?
What will the Cyberteam program learn from this project? Effort involved in recruiting and training
HPC resources needed to complete this project? Access to an HPC GPU node with 1-4 Tesla V100 graphics cards or equivalent, on the Biomix cluster at the Delaware Biotechnology Institute
Notes
What is the impact on the development of the principal discipline(s) of the project? The methods developed by Huining Liang will greatly assist in the detection and quantification of small RNAs in maize anthers. Currently, we are limited to about 5x multiplexing due to the complexity of data acquisition and image analysis. The deep learning-based segmentation of cell walls and classification of tissue layers will make much higher-order, high-throughput multiplexing possible. Huining is now able to focus on her PhD in deep learning, building on some of the techniques she worked on under the CAREERS project.
What is the impact on other disciplines? Dr. Caplan directs a Bio-Imaging Center that is used by 19 different departments at the University of Delaware, spanning a wide array of disciplines. The approaches developed in this project can be translated to other projects in the network.
Dr. Kambhamettu directs the Video/Image Modeling and Synthesis (VIMS) Lab, which has 12 PhD students, all working on deep learning approaches. The approaches developed in this project opened up aspects that are of interest to several other researchers in the lab.
Is there an impact on physical resources that form infrastructure? Nothing to report.
Is there an impact on the development of human resources for research computing? Yes, Huining Liang has received excellent training to assist others in research computing, thereby adding to our human resource infrastructure.
Is there an impact on institutional resources that form infrastructure? Nothing to report.
Is there an impact on information resources that form infrastructure? In VIMS Lab, there is a repository of techniques that are impacted by Huining’s work under CAREERS. We now have some new algorithms in place due to this project as a contribution to this repository.
Is there an impact on technology transfer? Nothing to report.
Is there an impact on society beyond science and technology? The project that Huining Liang was working on will assist a Plant Genome Research Program looking at maize anther development. The research on this project will further our understanding of maize anther development and male sterility, which is an important part of crop management. Thus, this project may potentially benefit crop production and food security.
Lessons Learned * Use data augmentation techniques to deal with the challenge of a limited data set when training a DCNN model.
* Both geometric and appearance-based augmentations are useful.
* Explored a mixed pipeline that uses U-Net for segmentation and U-Net with multiple channels for classification.
* Facilitate research with machine learning methods and HPC resources.
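The augmentation lessons above can be illustrated with a minimal sketch. This is not the project's actual code: the function name augment_pair, the parameter ranges, and the assumption that images are 2D arrays scaled to [0, 1] are all hypothetical. It shows a geometric augmentation (rotation and flip) applied identically to an image and its segmentation mask so the labels stay aligned, followed by an appearance-based augmentation (brightness, contrast, and noise) applied to the image only.

```python
import numpy as np


def augment_pair(image, mask, rng):
    """Return one randomly augmented (image, mask) pair.

    Hypothetical helper: assumes `image` is a 2D float array in [0, 1]
    and `mask` is a 2D integer label array of the same shape.
    """
    # Geometric augmentation: apply the same rotation and flip to both
    # arrays so every mask pixel still labels the matching image pixel.
    k = int(rng.integers(0, 4))  # number of 90-degree rotations
    image = np.rot90(image, k)
    mask = np.rot90(mask, k)
    if rng.random() < 0.5:
        image = np.fliplr(image)
        mask = np.fliplr(mask)

    # Appearance augmentation: change intensities of the image only.
    # The mask is left untouched because labels do not depend on brightness.
    gain = rng.uniform(0.8, 1.2)    # contrast-like scaling (assumed range)
    bias = rng.uniform(-0.1, 0.1)   # brightness shift (assumed range)
    noise = rng.normal(0.0, 0.02, size=image.shape)
    image = np.clip(image * gain + bias + noise, 0.0, 1.0)

    return image.astype(np.float32), np.ascontiguousarray(mask)


# Usage: generate several augmented copies of one hand-traced example.
rng = np.random.default_rng(seed=0)
example_image = rng.random((64, 64))
example_mask = np.zeros((64, 64), dtype=np.uint8)
example_mask[20:40, 10:30] = 1
augmented = [augment_pair(example_image, example_mask, rng) for _ in range(4)]
```

Restricting the geometric step to 90-degree rotations and flips avoids interpolation, so the mask labels stay exact; arbitrary-angle rotations or elastic deformations would need nearest-neighbor resampling for the mask.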
Overall results There is an overall improvement in the classification and segmentation techniques.