Submission Number: 129
Submission ID: 226
Submission UUID: 555a8c58-506b-47fa-8113-46e20a781360
Submission URI: /form/project

Created: Tue, 11/23/2021 - 15:06
Completed: Tue, 11/23/2021 - 15:06
Changed: Mon, 06/03/2024 - 13:02

Remote IP address: 128.118.7.103
Submitted by: Rob Mathers
Language: English

Is draft: No
Webform: Project
Project Title: Calculation of Polymer Hydrophobicity
Program:
CAREERS (323)

Project Image: {Empty}
Tags:
cleanup (368), git (457), github (490), optimization (509), python (69)

Status: Complete
Project Leader
--------------
Project Leader:
Rob Mathers

Email: rtm11@psu.edu
Mobile Phone: 412-779-7738
Work Phone: {Empty}

Project Personnel
-----------------
Mentor(s):
Thomas Langford (510)

Student-facilitator(s):
Sander Cohen-Janes (1684)

Mentee(s):
{Empty}


Project Information
-------------------
Project Description:
After writing Python code in 2021 to calculate physical properties of polymer molecules, we are interested in cleaning up the code, addressing some calculation issues, and putting the code on GitHub. The code is written using an open-source cheminformatics package called RDKit. Prior to using RDKit, we used commercial software (Materials Studio, Chem3D) from 2014 to 2019.

The physical property of interest relates to hydrophobicity, or the oil-like character, of polymers. Our method is inspired by the medicinal chemistry approach of describing drug-like molecules using partition coefficients. These coefficients, which are often referred to as LogP values, can be positive or negative. Positive LogP values indicate oil solubility, while negative LogP values suggest water-soluble molecules. Since the 1980s, the pharmaceutical industry has spawned many computational methods to calculate LogP.
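
For reference, the sign convention described above follows from the usual definition of the partition coefficient. The expression below assumes the conventional 1-octanol/water system, which this description does not name explicitly:

```latex
\log P = \log_{10}\!\left( \frac{[\text{solute}]_{\text{octanol}}}{[\text{solute}]_{\text{water}}} \right)
```

Under that assumption, LogP is positive when the solute concentrates in the octanol (oil-like) phase and negative when it concentrates in the water phase.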

Our method constructs SMILES strings for a short segment of a polymer. These SMILES strings represent chemical structures using ASCII symbols. Then, we use RDKit to convert the SMILES string to a 3D molecule, optimize the conformation, and calculate the surface area (SA). Afterwards, we calculate LogP. The resulting ratio of LogP/SA has provided predictive capability in a number of collaborative projects. Since 2015, we have published 18 journal articles that use LogP and LogP/SA values.
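
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `pick_lowest_energy` and `logp_per_area` are hypothetical, and the choice of the MMFF94 force field, Crippen LogP, and FreeSASA surface area are assumptions, since the description does not say which RDKit methods the project uses.

```python
from typing import List, Tuple


def pick_lowest_energy(results: List[Tuple[int, float]]) -> int:
    """Return the index of the lowest-energy conformer.

    `results` has the shape returned by RDKit's MMFFOptimizeMoleculeConfs:
    one (not_converged, energy) pair per conformer. Converged conformers
    (flag == 0) are preferred; if none converged, all are considered.
    """
    converged = [i for i, (flag, _) in enumerate(results) if flag == 0]
    candidates = converged or list(range(len(results)))
    return min(candidates, key=lambda i: results[i][1])


def logp_per_area(smiles: str, num_confs: int = 10, seed: int = 42) -> float:
    """Hypothetical helper: LogP divided by surface area for one SMILES string."""
    # RDKit is imported here so the helper above can be used without it.
    from rdkit import Chem
    from rdkit.Chem import AllChem, Crippen, rdFreeSASA

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    mol = Chem.AddHs(mol)  # explicit hydrogens are needed for 3D embedding

    # Generate several 3D conformers and relax each one (assumed: MMFF94).
    AllChem.EmbedMultipleConfs(mol, numConfs=num_confs, randomSeed=seed)
    conf_ids = [conf.GetId() for conf in mol.GetConformers()]
    if not conf_ids:
        raise RuntimeError("conformer embedding failed")
    results = AllChem.MMFFOptimizeMoleculeConfs(mol)
    best = conf_ids[pick_lowest_energy(results)]

    # Assumption: Crippen LogP and FreeSASA stand in for whichever
    # LogP and surface-area methods the project actually uses.
    logp = Crippen.MolLogP(mol)
    radii = rdFreeSASA.classifyAtoms(mol)
    area = rdFreeSASA.CalcSASA(mol, radii, confIdx=best)
    return logp / area
```

With RDKit installed, a call such as `logp_per_area("CCO")` would return the ratio for a single small molecule; a polymer segment would be passed in the same way as a longer SMILES string.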

Project Information Subsection
------------------------------
Project Deliverables:
The goals of the project include the following:

1)	Clean up code 
2)	Optimize calculation method
        a. Reduce time needed for jobs
        b. Automatically adjust to available hardware resources
        c. Determine how many conformations are needed
3)	Data output
        a. Output graph or list values (CSV, etc.)
        b. Provide options for selecting axes for graph (x-axis could be number of monomer units (N), 1/N, log N, N^(1/3), N^(1/2), etc.)
4)	Put code on GitHub
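
The data-output goals above (list values as CSV, selectable x-axis) could be approached along the following lines. This is only a sketch using the Python standard library; the names `AXIS_TRANSFORMS`, `transform_axis`, and `to_csv` are hypothetical, and the base-10 logarithm is an assumption because the list does not specify a base.

```python
import csv
import io
import math
from typing import Callable, Dict, List, Sequence

# Candidate x-axis transforms of the number of monomer units N.
AXIS_TRANSFORMS: Dict[str, Callable[[int], float]] = {
    "N": lambda n: float(n),
    "1/N": lambda n: 1.0 / n,
    "log N": lambda n: math.log10(n),  # assumption: base-10 logarithm
    "N^(1/3)": lambda n: n ** (1.0 / 3.0),
    "N^(1/2)": lambda n: math.sqrt(n),
}


def transform_axis(n_values: Sequence[int], axis: str) -> List[float]:
    """Apply the named transform to every N value."""
    try:
        func = AXIS_TRANSFORMS[axis]
    except KeyError:
        raise ValueError(f"unknown axis {axis!r}; choose from {list(AXIS_TRANSFORMS)}")
    return [func(n) for n in n_values]


def to_csv(n_values: Sequence[int], y_values: Sequence[float],
           axis: str = "N", y_label: str = "LogP/SA") -> str:
    """Return CSV text with the transformed x values and the y values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([axis, y_label])  # header row names the chosen axis
    for x, y in zip(transform_axis(n_values, axis), y_values):
        writer.writerow([x, y])
    return buffer.getvalue()
```

Keeping the transforms in one dictionary means a command-line flag or plotting option can expose the same set of axis names, and a new transform only needs one added entry.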

Project Deliverables:
{Empty}

Student Research Computing Facilitator Profile:
- Experience with Python
- Experience with or interested in learning Git and using GitHub

Mentee Research Computing Profile:
{Empty}

Student Facilitator Programming Skill Level: Some hands-on experience
Mentee Programming Skill Level: {Empty}
Project Institution: Penn State-New Kensington
Project Address:
New Kensington, Pennsylvania

Anchor Institution: CR-Penn State
Preferred Start Date: {Empty}
Start as soon as possible.: Yes
Project Urgency: Already behind; Start date is flexible
Expected Project Duration (in months): 5
Launch Presentation: https://support.access-ci.org/system/files/webform/project/226/Sander_Cohen-Janes_Launch_Presentation.pdf
Launch Presentation Date: 07/20/2022
Wrap Presentation: {Empty}
Wrap Presentation Date: {Empty}
Project Milestones:
- Milestone Title: Clean up the code
  Milestone Description: Currently, the code needs to be cleaned up and possibly reorganized or restructured.
  Completion Date Goal: 2022-06-30
- Milestone Title: Optimize calculation method
  Milestone Description: While running the code, we have noticed that calculating LogP is relatively fast. However, calculating surface area (SA) is much slower, with most of the computational time spent finding the best (i.e., lowest-energy) conformation. If possible, we would like to reduce calculation time by automatically adjusting to available hardware resources and finding a better way to conduct the conformational search.
  Completion Date Goal: 2022-08-11
- Milestone Title: Data output
  Milestone Description: Currently, the code outputs a graph. We would like options to save a graph, list values (CSV, etc.), or select axes for graphing the data.
  Completion Date Goal: 2022-09-30
- Milestone Title: Post code on GitHub
  Milestone Description: Get code ready and make available on GitHub.
  Completion Date Goal: 2022-10-30
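
For the "Optimize calculation method" milestone above, one possible way to adjust automatically to available hardware is to size a process pool from the CPUs the job is actually allowed to use. The sketch below uses only the Python standard library; `available_workers` and `run_parallel` are hypothetical names, and it assumes the per-molecule work can be expressed as an independent, picklable function.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def available_workers(requested: Optional[int] = None) -> int:
    """Number of worker processes to use, capped by the usable CPUs."""
    try:
        # Linux only: honours CPU affinity masks set for this process.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        usable = os.cpu_count() or 1
    if requested is None:
        return usable
    return max(1, min(requested, usable))


def run_parallel(func: Callable[[T], R], items: Sequence[T],
                 workers: Optional[int] = None) -> List[R]:
    """Apply `func` to each item, in parallel when more than one CPU is usable."""
    n = available_workers(workers)
    if n == 1 or len(items) < 2:
        # Serial fallback avoids process start-up cost for tiny jobs.
        return [func(item) for item in items]
    with ProcessPoolExecutor(max_workers=n) as pool:
        return list(pool.map(func, items))  # results keep the input order
```

Splitting the work per molecule (or per batch of conformers) keeps each task independent, so the same script can run unchanged on a laptop or on a larger allocation.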

Github Contributions: {Empty}
Planned Portal Contributions (if any):
{Empty}

Planned Publications (if any):
{Empty}

What will the student learn?:
{Empty}

What will the mentee learn?:
{Empty}

What will the Cyberteam program learn from this project?:
{Empty}

HPC resources needed to complete this project?:
{Empty}

Notes:
{Empty}



Final Report
------------
What is the impact on the development of the principal discipline(s) of the project?:
{Empty}

What is the impact on other disciplines?:
{Empty}

Is there an impact on physical resources that form infrastructure?:
{Empty}

Is there an impact on the development of human resources for research computing?:
{Empty}

Is there an impact on institutional resources that form infrastructure?:
{Empty}

Is there an impact on information resources that form infrastructure?:
{Empty}

Is there an impact on technology transfer?:
{Empty}

Is there an impact on society beyond science and technology?:
{Empty}

Lessons Learned:
{Empty}

Overall results:
From the Project PI:
"So far, we have created The Hydrophobicity Project on GitHub to build a community of users. Sander's code is available on this site.

https://github.com/TheHydrophobicityProject

We have been testing and using the code since last October. A manuscript is in preparation."

From the student facilitator:
"This was the first project I worked on where I used version control. Now I can't live without it. This was also the first project in which I was creating a "product" for general consumption, rather than internal tools. It was useful to collaborate with people that weren't active coders on the project so I could get usability feedback off of which I could iterate. Those are two things I will keep with me regardless of the other technologies I use in my future projects."