Submission Number: 79
Submission ID: 111
Submission UUID: 7bff5e72-bbc4-452a-a1c2-237a857e7115
Submission URI: /form/project

Created: Tue, 11/24/2020 - 13:48
Completed: Tue, 11/24/2020 - 14:00
Changed: Thu, 05/05/2022 - 04:32

Remote IP address: 24.61.104.85
Submitted by: Scott Valcourt
Language: English

Is draft: No
Webform: Project
Project Title: Benchmarking Locally-Developed HPC Resources
Program:
Northeast (308)

Project Image: https://support.access-ci.org/system/files/webform/project/111/HPC.jpg
Tags:
backup (36), big-data (4), data-management (260), file-systems (33), hpc-cluster-build (84), hpc-operations (43), permissions (177), provisioning (13), schedulers (164), slurm (71), unix-environment (60)

Status: Complete
Project Leader
--------------
Project Leader:
Scott Valcourt

Email: s.valcourt@northeastern.edu
Mobile Phone: 6033802860
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s):
Scott Valcourt (81)

Student-facilitator(s):
Griffin Leclerc (574), Sawyer Bergeron (575), Joseph Carrara (576)

Mentee(s):
{Empty}


Project Information
-------------------
Project Description:
This project is building a cluster environment and benchmarking it against a local laptop, locally available HPC resources, and XSEDE resources.

Project Information Subsection
------------------------------
Project Deliverables:
Paper/poster for UNH Undergraduate Research Conference and for PEARC21; a locally-developed HPC collection of hardware

Student Research Computing Facilitator Profile:
Interest in system administration and software development for large systems (Python, C, R)

Mentee Research Computing Profile:
{Empty}

Student Facilitator Programming Skill Level: Some hands-on experience
Mentee Programming Skill Level: {Empty}
Project Institution: University of New Hampshire
Project Address:
33 Academic Way
Durham, New Hampshire. 03824

Anchor Institution: NE-University of New Hampshire
Preferred Start Date: 09/15/2020
Start as soon as possible.: No
Project Urgency: Already behind; Start date is flexible
Expected Project Duration (in months): {Empty}
Launch Presentation: {Empty}
Launch Presentation Date: 12/01/2020
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Working hardware for cluster development
  Milestone Description: Converting six 1U Dell servers from "shelfware" to operating hardware and operating system software.
  Completion Date Goal: 2020-12-15
  Actual Completion Date: 2020-11-25
- Milestone Title: Working cluster configuration
  Milestone Description: Fully operational SLURM-based collection of nodes able to receive and process jobs.
  Completion Date Goal: 2021-02-01
  Actual Completion Date: 2021-03-15
- Milestone Title: Benchmarked software on local laptop
  Milestone Description: Finding a working basic example codebase for use as a benchmark exercise for non-HPC and HPC environments.
  Completion Date Goal: 2021-02-15
  Actual Completion Date: 2021-02-15
- Milestone Title: Benchmarked software on XSEDE
  Milestone Description: Run the benchmark code on Bridges-2 at XSEDE through an allocation.
  Completion Date Goal: 2021-03-01
  Actual Completion Date: 2021-04-15
- Milestone Title: Benchmarked software on local cluster
  Milestone Description: Run the benchmark code on the HPC environment built during this project.
  Completion Date Goal: 2021-03-15
  Actual Completion Date: 2021-04-20
- Milestone Title: Write final project documentation and reports
  Milestone Description: Written documentation includes a technical document with hardware and software operational details, a PEARC21 short paper, a final capstone course presentation, and a NECyberTeam Wrap Presentation.
  Completion Date Goal: 2021-05-10

Github Contributions: {Empty}
Planned Portal Contributions (if any):
How-to document on configuring a local HPC cluster using recycled materials

Planned Publications (if any):
poster for University of New Hampshire 2021 Undergraduate Research Conference (presented)
short student paper for PEARC21 (submitted, awaiting selection)

What will the student learn?:
How to configure several machines to operate as a coordinated cluster for high performance computing
Benchmarking hardware for software implementation

What will the mentee learn?:
{Empty}

What will the Cyberteam program learn from this project?:
How to effectively select resources to carry out data processing

HPC resources needed to complete this project?:
local HPC cluster, XSEDE resources

Notes:
{Empty}



Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?:
The students exercised their present skills in operating computing hardware and expanded their skills in the installation and configuration of cluster-based software to operate a multi-node HPC.  Selecting the right scheduler and configuring its various options required engaging the HPC community through Ask.CI and other online and regional support personnel.

What is the impact on other disciplines?:
The original project sought to partner with science domain experts to bring a real-time project to the table for action, but the experimental nature of this project precluded their active involvement during the project period.  That involvement remains possible in the post-project period, when science problems could be considered for incorporation into the home-built HPC environment.

Is there an impact on physical resources that form infrastructure?:
The recommissioned hardware that became the HPC used by the students is anticipated to remain operational.  A true test of the documentation and skill of the student team will be the acceptance of a new team leading the operation of this experimental HPC cluster.

Is there an impact on the development of human resources for research computing?:
The three students on the HPC team gained experiential knowledge in the design, deployment, configuration, and operation of an HPC cluster--a skill not part of the regular computer science curriculum.  As a result, these students are fully qualified to install any newly commissioned HPC cluster in the commercial or research environment.

Is there an impact on institutional resources that form infrastructure?:
This project developed a new HPC resource that could be used for light-duty computations that need more resources than a single computer is equipped to provide.  As a result, any University of New Hampshire faculty or student researcher would have access to this new resource.

Is there an impact on information resources that form infrastructure?:
The documentation that outlines the processes and steps undertaken to create this resource will assist future students and professionals in the development and installation of an HPC cluster using recommissioned hardware.

Is there an impact on technology transfer?:
There is no observable impact on technology transfer at this time.

Is there an impact on society beyond science and technology?:
There is a benefit to society in having hardware that was previously destined for recycling continue to provide service beyond its anticipated end-of-life.  This active reuse of hardware to continue offering compute cycles is a way to extend the life of operating hardware and save budget funds.

Lessons Learned:
The students on this project learned much more than they anticipated they would.  Each student had already built his own computer for personal uses--gaming, classwork, even personal environment monitoring.  The students knew that HPC clusters existed, but had not considered building one as a direction they could take in their computer science studies.  As research computing leaders, we expect our professional teams to be adept at everything that the research community might need for support, but we often neglect to consider what we need to teach the next generation of employees.  This project serves as a model to help others to advance, even when there isn't a new HPC to install for a research project.

Overall results:
This project provided students the opportunity to build on their present skills in operating computing hardware by expanding into the installation, configuration, and operation of cluster-based software to deliver a multi-node HPC.  Selecting the right scheduler and configuring its various options required engaging the HPC community through Ask.CI and other online and regional support personnel.  The documentation on how this can be done, and on the choices made to achieve success, will help other students follow the same path, achieve the same level of proficiency, and implement the next set of options on their way to becoming outstanding research computing facilitators.