The Pittsburgh Supercomputing Center (PSC) is a joint computational research center of Carnegie Mellon University and the University of Pittsburgh. Established in 1986, PSC is supported by several federal agencies, the Commonwealth of Pennsylvania, and private industry.

The PSC AI/BD team is looking for several interns to perform materials research with the large language model (LLM) RoBERTa. While there has been extensive work on string representations of molecules as inputs to such models, there has been little work on string representations of crystal structures. In this internship, we will investigate several string representations of organic molecular crystal structures, using them for both pre-training and property prediction.
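
For applicants unfamiliar with the idea, a string representation turns a crystal structure into text that a language model can tokenize. The sketch below is a hypothetical illustration only: the token layout (lattice parameters followed by element symbols and fractional coordinates), the `Site` class, and the `crystal_to_string` function are assumptions made for this example, not the representations the team will study, and the toy cell is made up.

```python
from dataclasses import dataclass


@dataclass
class Site:
    element: str        # chemical symbol, e.g. "C"
    frac_coords: tuple  # fractional coordinates (x, y, z) in the unit cell


def crystal_to_string(lattice, sites, decimals=2):
    """Serialize a unit cell into one whitespace-separated string.

    lattice: (a, b, c, alpha, beta, gamma) in angstroms and degrees.
    The layout is a hypothetical choice made for illustration.
    """
    tokens = [f"{value:.{decimals}f}" for value in lattice]
    for site in sites:
        tokens.append(site.element)
        tokens.extend(f"{x:.{decimals}f}" for x in site.frac_coords)
    return " ".join(tokens)


# Made-up toy cell, not a real crystal structure.
example = crystal_to_string(
    (5.0, 6.0, 7.0, 90.0, 100.0, 90.0),
    [Site("C", (0.0, 0.0, 0.0)), Site("O", (0.25, 0.5, 0.75))],
)
print(example)
# 5.00 6.00 7.00 90.00 100.00 90.00 C 0.00 0.00 0.00 O 0.25 0.50 0.75
```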

First, we will pre-train RoBERTa using each of the string representations. Next, we will train a regression head to predict several properties of interest to the organic materials community. Finally, we will evaluate how accurately each representation predicts these properties on previously unseen data. Of particular interest is how well these representations and models distinguish between polymorphs, which are materials in which the same molecule crystallizes into different forms. If time permits, students may also develop system-specific models using density functional theory and LLMs, run with the Pegasus workflow system on Neocortex and Bridges-2.
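
To give a flavor of the evaluation step, here is a minimal sketch of two metrics one might compute on held-out predictions. It assumes, purely for illustration, that accuracy is measured with mean absolute error and that polymorph discrimination is measured by how often the model orders two polymorphs of the same molecule correctly. The function names, metric choices, and numbers are hypothetical and are not the project's actual protocol.

```python
from collections import defaultdict
from itertools import combinations


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def polymorph_ranking_accuracy(molecule_ids, y_true, y_pred):
    """Fraction of polymorph pairs (same molecule) ordered correctly by the model."""
    groups = defaultdict(list)
    for mol, t, p in zip(molecule_ids, y_true, y_pred):
        groups[mol].append((t, p))
    correct = total = 0
    for members in groups.values():
        for (t1, p1), (t2, p2) in combinations(members, 2):
            if t1 == t2:
                continue  # skip pairs with identical reference values
            total += 1
            if (t1 - t2) * (p1 - p2) > 0:
                correct += 1
    return correct / total if total else float("nan")


# Made-up held-out data: three polymorphs of molecule "A", two of "B".
molecule_ids = ["A", "A", "A", "B", "B"]
y_true = [1.0, 2.0, 3.0, 5.0, 4.0]
y_pred = [1.5, 1.0, 3.5, 5.5, 4.5]

mae = mean_absolute_error(y_true, y_pred)
accuracy = polymorph_ranking_accuracy(molecule_ids, y_true, y_pred)
print(f"MAE = {mae:.2f}, polymorph ranking accuracy = {accuracy:.2f}")
# MAE = 0.60, polymorph ranking accuracy = 0.75
```

Computing the ranking metric within each molecule, rather than across the whole test set, keeps the focus on distinguishing polymorphs from one another.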

Responsibilities may include:

  • Troubleshooting models during training/evaluation
  • Evaluating the performance of models
  • Presenting and disseminating results in papers and at conferences

Our internships offer the opportunity to:

  • Gain valuable experience and knowledge in research computing
  • Network with leaders in academia and industry to form valuable relationships
  • Publish in peer-reviewed journals and at prominent conferences
  • Gain experience with high-performance computing (HPC)
  • Apply AI/ML to materials science and chemistry
  • Gain experience on a novel AI accelerator

Successful candidates will have the following:

  • Excellent problem-solving skills and creativity
  • Enrollment in a relevant degree program. Example majors include ML, Physics, Mathematics, ECE, and Computer Science, though any major will be considered
  • Interest in applying AI/ML to the physical sciences
  • Experience with PyTorch
  • Nice to have: experience with pymatgen and ASE
  • Excellent communication skills and ability to work in a team environment

This internship is open to anyone currently enrolled in an undergraduate or master's program. This is a fully in-person position, and all work must be conducted while in the United States. Our offices are located at 300 South Craig Street, Pittsburgh, PA.

Please send your resume to danao@psc.edu or apply on Handshake by March 31, 2025.