Submission Number: 85
Submission ID: 121
Submission UUID: a6a2c476-84ec-49ef-a27c-bf9a7c461a60
Submission URI: /form/project

Created: Tue, 01/26/2021 - 15:59
Completed: Tue, 01/26/2021 - 16:23
Changed: Thu, 05/05/2022 - 03:49

Remote IP address: 74.78.188.196
Submitted by: Bruce Segee
Language: English

Is draft: No
Webform: Project
Project Title Big Data Portal For Sharing Real-world Bioinformatics Data Sets to the Public Domain
Program Northeast
Project Image 488CA134-FDB7-45F4-8793-A0D73935DA88.jpeg
Tags big-data (4), bioinformatics (277), data-management (260), data-wrangling (6), hpc-storage (171), metadata (264), science-gateway (28), storage (47)
Status Complete
Project Leader Rocko Graziano
Email rocko.graziano@maine.edu
Mobile Phone
Work Phone
Mentor(s) Larry Whitsel
Student-facilitator(s) Joseph Neumann, Ben Burnett
Mentee(s)
Project Description This project aims to facilitate the sharing of large data sets for research and education across Maine as well as across the Open Storage Network. Mount Desert Island Biological Laboratory (MDIBL) intends to make data files and metadata publicly available in exchange for free access. This data is of interest and value to Data Science faculty at the University of Maine at Augusta, for teaching and research as part of a system-wide data science degree.

The project requires a front-end and back-end system, preferably written in Go and deployed in a container (preferably Docker). The end result will allow uploading, downloading, metadata tagging, and HPC job submissions that use the data.
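As a rough illustration of the upload and metadata-tagging features described above, the following is a minimal sketch, not the project's actual implementation. It is written in Python, which the facilitator profile names as a development language, although the project description prefers Go. It assumes an in-memory catalog; the class names, method names, file names, and sizes are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """One shared file plus its descriptive metadata (all fields hypothetical)."""
    name: str
    size_bytes: int
    tags: set[str] = field(default_factory=set)


class Catalog:
    """In-memory stand-in for the portal's metadata store (an assumption)."""

    def __init__(self) -> None:
        self._items: dict[str, DataSet] = {}

    def upload(self, name: str, size_bytes: int) -> DataSet:
        # Register a data set; a real portal would also write the bytes to storage.
        if name in self._items:
            raise ValueError(f"data set already exists: {name}")
        item = DataSet(name=name, size_bytes=size_bytes)
        self._items[name] = item
        return item

    def tag(self, name: str, *tags: str) -> None:
        # Normalize tags so that "RNA-Seq" and "rna-seq" are treated as equal.
        self._items[name].tags.update(t.strip().lower() for t in tags)

    def find_by_tag(self, tag: str) -> list[str]:
        # Return the names of all data sets carrying the tag, sorted by name.
        wanted = tag.strip().lower()
        return sorted(n for n, d in self._items.items() if wanted in d.tags)


# Made-up example entries, for illustration only.
catalog = Catalog()
catalog.upload("zebrafish_rnaseq.fastq", 2_500_000_000)
catalog.upload("mouse_proteome.csv", 40_000_000)
catalog.tag("zebrafish_rnaseq.fastq", "RNA-Seq", "zebrafish")
catalog.tag("mouse_proteome.csv", "proteomics")
print(catalog.find_by_tag("rna-seq"))
```

In a deployed portal the catalog would presumably be backed by a database and the file contents by the storage system, with these operations exposed through a web interface; the sketch only shows how tagging and tag-based lookup might fit together.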
Project Deliverables The intended deliverable is an interface that researchers at MDIBL can use day to day, yet is flexible enough to incorporate other data sets easily and to share them with other researchers, educators, and students with minimal effort on the part of the end user.
Student Research Computing Facilitator Profile The ideal student would have skills with Linux, the Go language, databases, and containers. This is a development project which relies heavily on the use of Python, HTML, and either Django or Flask to create RESTful interfaces to access and maintain the data stores. The right candidate will be comfortable with Python development and HTML/web technologies; experience working with Linux systems would also be useful. You should be willing to work independently to research and deploy technologies.
Mentee Research Computing Profile
Student Facilitator Programming Skill Level Practical applications
Mentee Programming Skill Level
Project Institution University of Maine, Augusta
Project Address Jewett Hall
Augusta, Maine. 04330
Anchor Institution NE-University of Maine
Preferred Start Date 02/01/2021
Start as soon as possible. No
Project Urgency Already behind 3 Start date is flexible
Expected Project Duration (in months)
Launch Presentation
Launch Presentation Date 01/18/2022
Wrap Presentation
Wrap Presentation Date 03/15/2022
Project Milestones
  • Milestone Title: Working Prototype
    Completion Date Goal: 2022-01-31
  • Milestone Title: Core Features Implemented
    Completion Date Goal: 2022-02-28
  • Milestone Title: Deployment
    Completion Date Goal: 2022-03-25
Github Contributions
Planned Portal Contributions (if any) It is anticipated that several questions related to Go and/or containers will be generated and answered.
Planned Publications (if any)
What will the student learn? The student will gain familiarity and experience with full-stack software development.
What will the mentee learn?
What will the Cyberteam program learn from this project? It is anticipated that at least one method for sharing big data sets will result.
HPC resources needed to complete this project? This project will use existing Ceph storage at the University of Maine as well as one or more virtual machines to run the code and act as the web interface. HPC resources will be used to process the data, so some modest use is required for development.
Notes
What is the impact on the development of the principal discipline(s) of the project?
What is the impact on other disciplines?
Is there an impact on physical resources that form infrastructure?
Is there an impact on the development of human resources for research computing? From Ben Burnett, Student Participant:
Being a research facilitator for the Northeast Cyberteam had a positive impact on my performance as a student. As a graduate student a lot of the work I do requires me to be like a horse with blinders on and deeply focus on one subject. Working on a Cyberteam project presented me the opportunity to step out of my own field of research to learn about and experience other perspectives and cultures within research computing. Learning different perspectives is invaluable as an aspiring problem solver, and I am grateful for the chance to have learned more as a Northeast Cyberteam research facilitator.
Is there an impact on institutional resources that form infrastructure?
Is there an impact on information resources that form infrastructure?
Is there an impact on technology transfer?
Is there an impact on society beyond science and technology?
Lessons Learned
Overall results