News & Publications

PSC Accelerates Machine Learning with GPUs

Pittsburgh Supercomputing Center Accelerates Machine Learning with GPUs

Researchers at the Pittsburgh Supercomputing Center and HP Labs achieve unprecedented speedup in a key machine-learning algorithm.

PITTSBURGH, May 23, 2011 — Computational scientists at the Pittsburgh Supercomputing Center (PSC) and HP Labs are achieving speedups of nearly 10 times with GPUs (graphics processing units) versus CPU-only code (and more than 1000 times versus an implementation in a high-level language) in k-means clustering, a critical operation for data analysis and machine learning.

A branch of artificial intelligence, machine learning enables computers to process and learn from vast amounts of empirical data through algorithms that can recognize complex patterns and make intelligent decisions based on them. For many machine-learning applications, a first step is identifying how data can be partitioned into related groups or “clustered.”

Ren Wu, principal investigator of the CUDA Research Center at HP Labs, developed advanced clustering algorithms that run on GPUs, which have advantages for many data-intensive applications. PSC scientific specialist Joel Welling recently applied Wu’s innovations to tackle a real-world machine-learning problem. Working together and using data from Google’s “Books N-gram” dataset, Wu and Welling were able to cluster all five-word sets of the one thousand most common words (“5-grams”) occurring in all books published in 2005. With this project, representative of many research efforts in natural-language processing and culturomics, the researchers demonstrated an extremely high-performance, scalable GPU implementation of k-means clustering, one of the most widely used approaches to clustering.

Wu and Welling ran their code on the latest “Fermi” generation of NVIDIA GPUs. Using MPI between nodes (three nodes, with three GPUs and two CPUs per node), they observed a speedup of 9.8 times relative to running an identical distributed k-means algorithm (written in C+MPI) on all CPU cores in the cluster, and a speedup of thousands of times relative to the purely high-level-language implementation commonly used in machine-learning research. Using their GPU implementation, the entire dataset, with more than 15 million data points and 1000 dimensions, can be clustered in less than nine seconds. This breakthrough in execution speed will enable researchers to explore new ideas and develop more complex algorithms layered atop k-means clustering.
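For readers unfamiliar with the method, k-means alternates between two steps: assign every data point to its nearest cluster center, then move each center to the mean of the points assigned to it. The following is a minimal single-threaded sketch of that textbook procedure (often called Lloyd's algorithm) in plain Python. It is an illustration only, not the GPU or MPI code described above; the function names and the choice to pass the starting centers explicitly are assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))


def kmeans(points, initial_centroids, max_iterations=100):
    """Textbook k-means; illustrative only, not the PSC/HP Labs code.

    The caller supplies the starting centroids (a hypothetical
    convention chosen here so that runs are reproducible).
    """
    centroids = [list(c) for c in initial_centroids]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: label each point with its nearest centroid.
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave a centroid in place if it has no members
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    centers, labels = kmeans(data, [(0.0, 0.0), (5.0, 5.0)])
    print(labels)
```

The assignment step dominates the cost, since every point must be compared with every center in every pass, and each comparison is independent of the others. That independence is what makes the computation a natural fit for parallel hardware.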

“K-means is one of the most frequently used clustering methods in machine learning,” says William Cohen, professor of machine learning at Carnegie Mellon University. “It is often used as a subroutine in spectral clustering and other unsupervised or semi-supervised learning methods. Because some of these applications involve many clustering passes with different numbers of means or different randomized starting points, a greatly accelerated k-means clustering method will be useful in many machine learning settings.” Cohen co-leads the Never-Ending Language Learning (NELL) and Read the Web projects. The goal of NELL is to automate inferences based on continually “reading” natural-language text from the Web.

Machine learning is just one example of the exploding field of data analytics, notes PSC scientist Nick Nystrom. Other data-analytic applications range from understanding the results of traditional high-performance computing (HPC) simulations of global climate, engineering, and protein dynamics to emerging fields that need HPC such as genomics, social network analysis, and mining extensive datasets in the humanities.

“A substantial body of major application codes is already being developed specifically for NVIDIA GPUs,” notes Nystrom, PSC director of strategic applications. “Because NVIDIA GPUs are so widespread, those codes can run well on anything from a supercomputer to a netbook.” Nystrom has been instrumental in PSC’s work with advanced technologies for scientifically important, data-intensive problems. This application of NVIDIA GPUs to k-means clustering, he notes, is one example of how a pervasive technology that leverages broad markets can benefit important algorithms in science and data analysis.

This advanced clustering algorithm, notes Wu, also has the advantage of being easy to use, which facilitated rapid implementation with Welling. “I think that the CUDA programming model is a very nice framework,” says Wu, “well balanced on abstraction and expressing power, easy to learn but with enough control for advanced algorithm designers, and supported by hardware with exceptional performance (compared to other alternatives). The key for any high-performance algorithm on modern multi/many-core architecture is to minimize the data movement and to optimize against memory hierarchy. Keeping this in mind, CUDA is as easy, if not easier, than any other alternatives.”

Nystrom concurs and sees an exciting future for software developers: “There’s a rich software ecosystem supporting NVIDIA’s GPUs, ranging from easy-to-use compiler directives to explicit memory management to powerful performance tools. Add to that integration of general-purpose processors in this successful line of architectures, and the potential for developing transformative software architectures is extraordinary.”

PSC Observes 25 Years of Service and Accomplishment

PITTSBURGH, April 15, 2011 — More than a hundred guests, including students and representatives of government and industry, joined the Pittsburgh Supercomputing Center (PSC) staff today at PSC's 25th anniversary observance and Discover 11 Open House. Also present were participants from the TeraGrid/Blue Waters Symposium in Data Intensive Analysis, Analytics, and Informatics, held in Pittsburgh, which concluded at noon on April 15.


PSC scientific co-directors Ralph Roskies (left) and Michael Levine (right) with director of special projects James Kasdorf (center)

The Open House featured demonstrations of PSC research, including 3D stereo movies of cellular interactions in a synapse and a zoom-in view of water molecules. PSC's biomedical group also highlighted the recent "wiring diagram of the brain" research featured on the cover of the March 10 issue of Nature, the prestigious international science journal. The event also included a look back at PSC supercomputers from 1986 until now and at a number of the highlight research projects those computing systems enabled. PSC's networking group demonstrated Cisco Telepresence, an advanced video conferencing system that facilitates distance communication with a more realistic sense of presence than other current systems provide.

At 1:00 pm officials from universities, industry and government convened for an event that included remarks about PSC:

  • Introduced by PSC scientific co-director Michael Levine, Dr. Jared Cohon, president of Carnegie Mellon, congratulated PSC and briefly highlighted, as examples of the range of research the center has supported, several projects that CMU researchers have carried out in collaboration with PSC, in fields ranging from Internet privacy, earthquake modeling, machine learning, particle physics and cosmology to computational chemistry.
  • PSC scientific co-director Ralph Roskies introduced University of Pittsburgh Chancellor Mark Nordenberg, who spoke about the partnerships that have been important to PSC's continuing success, in particular the linkages between the two major research universities in the Oakland neighborhood of Pittsburgh, CMU and Pitt.
  • Jim Kasdorf, PSC director of special projects, introduced Tom Moser, Manager of Infrastructure, Westinghouse Electric Company, who commented on parallels between the technological innovations of Westinghouse Corporation, now in its 125th year, and PSC, in its 25th year.
  • Roskies, Levine and Kasdorf thanked the PSC staff for their many contributions to the sustained success of PSC.
  • Mary Ann Eisenreich, Director, Governor's Southwest Office (representing the Honorable Tom Corbett, Governor, Commonwealth of Pennsylvania) commented on the importance of PSC's contribution to southwest Pennsylvania and read comments from Governor Corbett.
  • The Honorable Mike Doyle, U. S. Representative, 14th Congressional District of Pennsylvania, presented his "heartfelt congratulations" by video.
  • Dr. Irene Qualters, Program Director, Office of Cyberinfrastructure, National Science Foundation, acknowledged the many human contributions to PSC's success and brought warm congratulations from the NSF staff, including Ed Seidel, a former PSC researcher, and Irene Lombardo.
  • Pennsylvania State Representative Joe Markosek presented PSC's directors with a copy of a Pennsylvania House of Representatives resolution officially commenting upon PSC's contributions to Pennsylvania.

PSC Featured in Pittsburgh Post-Gazette

PITTSBURGH, April 4, 2011 — The Sunday, April 3 issue of the Pittsburgh Post-Gazette includes a full-page editorial article by the three people who co-authored the proposal that led the National Science Foundation to fund the Pittsburgh Supercomputing Center 25 years ago.


    PSC scientific co-directors Michael Levine and Ralph Roskies (left) and director of special projects James Kasdorf


The op-ed, by PSC's scientific directors Michael Levine and Ralph Roskies and PSC director of special projects James Kasdorf, is aimed at non-scientist readers. It outlines what is meant by “supercomputing” and discusses some of PSC's accomplishments.

PSC Receives Grant to Develop Pilot Program in Math and Science Teaching


Pittsburgh Supercomputing Center Receives Grant to Develop Pilot Program in Math and Science Teaching

PITTSBURGH, March 15, 2011 — The Pittsburgh Supercomputing Center (PSC) has received a $100,000 grant from the DSF Charitable Foundation to develop a pilot program to prepare high-school math and science teachers to effectively use computational modeling as part of K-12 learning. The grant extends Computation and Science for Teachers (CAST), PSC's successful program, launched in 2008, that has introduced many southwest Pennsylvania science and math teachers to easy-to-use modeling and simulation as powerful tools for classroom learning.

CAST Summer Workshop

The DSF grant funds a three-way effort among PSC; the Maryland Virtual High School Project (MVHS), which helped to pioneer the use of computational thinking in high-school learning; and the Math & Science Collaborative (MSC) of the Allegheny Intermediate Unit, which provides specialized educational services to Allegheny County's 42 suburban school districts and five vocational/technical schools. Educators from these three organizations will plan and design a well-defined professional development program that prepares STEM (science, technology, engineering and math) teachers in western Pennsylvania to become leaders in integrating computational modeling and simulations into classroom learning.

"CAST," says PSC director of education and outreach Cheryl Begandy, "proposes to bring to the classroom the same problem-solving, technology-rich approaches currently used in scientific research and in business. Introducing 'cool' technology into the classroom engages students," she adds, "and increases their willingness to stay with subjects they may otherwise find too complicated or just uninteresting. Ultimately the goal is to help create the cyber-savvy workforce demanded by the 21st-century marketplace."

Specific objectives of the CAST phase 2 pilot program, says Begandy, are:

  • to increase use of computational reasoning,
  • to improve the learning experience and engagement of students in math and science, and
  • to build capacity in western Pennsylvania for wider and sustained use of computational reasoning and tools.

Educators from PSC, MVHS and MSC met on January 20 for their first gathering to establish the outline of the pilot program and to set timelines and milestones.

PSC Scientists Co-Author Paper on Wiring Diagram of the Brain

Pittsburgh Supercomputing Center Scientists Co-Author Paper on Wiring Diagram of the Brain

PITTSBURGH, March 10, 2011 — Pittsburgh Supercomputing Center (PSC) scientists Art Wetzel and Greg Hood co-authored a paper on brain anatomy featured as the cover story in the March 10 issue of Nature, the international weekly journal of science.


Nature - March 10 Issue Cover

Wetzel and Hood collaborated with a team at Harvard University led by Clay Reid, professor of neurobiology at the Harvard Medical School and Center for Brain Science. The Harvard-PSC team exploited improvements in computer speed and storage capacity available at PSC that made it possible to transmit and process more than three million high-resolution images from a pinpoint-sized region of a mouse brain. Starting with these very thin-slice (40 nanometers) images, obtained at Harvard via electron microscopy (EM), Wetzel and Hood stitched together a large-scale mosaic of each section. From these section mosaics, they then reconstructed a 3D volume (encompassing millions of cubic micrometers), which made it possible for the Harvard team to painstakingly trace interconnections among selected neurons, in effect mapping a wiring diagram of a portion of the mouse visual cortex.

To get an idea of the amount of cortical information captured in each section, Reid analogizes to slicing a wedge of cheese. If each slice were a millimeter thick like a thin slice of cheese (instead of 40 nanometers), and the lateral dimensions increased by the same proportion, each slice would cover an area bigger than an NBA basketball court.

Hood and Wetzel used various software methods, including fast Fourier transform correlations and other search methods, to find features in overlapping camera frames for alignment into a single mosaic. This process matches adjacent frames both spatially and in intensity to produce a nearly seamless image (about 10 gigapixels) of each section. They then applied a non-linear registration algorithm to map each section to its neighboring sections, compensating for deformations that inevitably occur when cutting tissue so thinly. Finally, a multiscale 3D alignment stacked these local maps to construct a finished volume (10 teravoxels) for viewing and analysis.
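The idea behind Fourier-based correlation can be shown in one dimension: if one signal is a shifted copy of another, multiplying one spectrum by the complex conjugate of the other and transforming back yields a correlation curve that peaks at the shift. The sketch below is a toy illustration of that principle in plain Python, not the PSC alignment software. It works on short 1-D lists rather than 2-D images, treats the signals as circular, and substitutes a naive discrete Fourier transform for a true fast Fourier transform to stay dependency-free. The function names are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = 0j
        for t, v in enumerate(values):
            total += v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        result.append(total / n if inverse else total)
    return result


def estimate_shift(reference, shifted):
    """Estimate s such that shifted[i] is about reference[(i - s) % n].

    Uses the correlation theorem: the circular cross-correlation of two
    real signals is the inverse transform of one spectrum multiplied by
    the complex conjugate of the other.
    """
    ref_spectrum = dft(reference)
    shifted_spectrum = dft(shifted)
    cross = [g * f.conjugate()
             for f, g in zip(ref_spectrum, shifted_spectrum)]
    correlation = dft(cross, inverse=True)
    # The lag with the largest correlation is the best-matching offset.
    return max(range(len(correlation)), key=lambda i: correlation[i].real)


if __name__ == "__main__":
    reference = [0, 1, 3, 7, 2, 0, 0, 0]
    shifted = [0, 0, 0, 0, 1, 3, 7, 2]  # reference moved right by 3
    print(estimate_shift(reference, shifted))
```

With a real FFT the same computation costs on the order of n log n operations rather than n squared, which matters when the frames being compared hold millions of pixels.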

By tracing interconnections within this volume, the Harvard researchers produced new insights into how the brain functions, finding that neurons tasked with suppressing brain activity seem to be randomly wired, putting the lid on local groups of neurons all at once rather than picking and choosing. Such findings are important because many neurological conditions, such as epilepsy, are the result of neural inhibition gone awry.

"This is just the iceberg's tip," said Reid. "Within ten years I'm convinced we'll be imaging the activity of thousands of neurons in a living brain. In a visual circuit, we'll interpret the data to reconstruct what an animal actually sees. By that time, with the anatomical imaging, we'll also know how it's all wired together."

For now, Reid and his colleagues are working to scale up this platform to generate larger data sets. "How the brain works is one of the greatest mysteries in nature," Reid added, "and this research presents a new and powerful way for us to explore that mystery."

Focus, the Harvard Medical School research magazine, published an article and video about this research.

PSC Mourns the Untimely Loss of Former PSC Scientist


Phil Andrews

PITTSBURGH, February 28, 2011 — The PSC staff joins the computational science community nationally in mourning the untimely loss of our friend and colleague Phil Andrews. Among the first scientists hired at PSC at its inception in 1986, Andrews played an important role at PSC for more than 10 years, serving as coordinator of scientific visualization for several years and then as manager of data-intensive computing. From PSC he went on to hold various leadership positions at the San Diego Supercomputer Center before becoming, in 2007, founding director of the National Institute for Computational Sciences at the University of Tennessee and Oak Ridge National Laboratory.

Early on at PSC, Andrews made a major contribution to the ability of PSC and other computational sites to produce movie-like animations from the data generated by computational simulations. His versatile graphics program, GPLOT, could take computer graphics files from many applications and translate them into a format that could be used by various operating systems, including VMS, UNIX and UNICOS. At one point, in the early 1990s, more than 20 other sites used GPLOT for this purpose.

While at PSC, Andrews took an interest in the presentation of textual material online, becoming fluent with SGML, a precursor to HTML, which later became the underlying technology for the World Wide Web as it was developed at CERN (the European Organization for Nuclear Research). Well before Java caught on widely, Andrews saw its potential and gave a presentation to PSC staff showing off a page he had developed with this now popular software. "He predicted Java would change the web," says J. Ray Scott, PSC director of systems and operation. "About six months later it started to emerge."

PSC Network Exchange Partners with Drexel University for Improved Internet Connection

Pittsburgh Supercomputing Center Network Exchange Partners with Drexel University for Improved Internet Connection

PITTSBURGH, February 17, 2011 — The Three Rivers Optical Exchange (3ROX), the high-performance Internet hub operated and managed by the Pittsburgh Supercomputing Center (PSC), has partnered with Drexel University in Philadelphia to implement a five-fold upgrade to the Internet bandwidth of both 3ROX and Drexel at essentially no cost increase.

Prior to partnering, 3ROX and Drexel each had an individual one-gigabit (a billion bits per second) connection to Internet2, a high-performance research and education network that connects universities, corporations and research agencies nationally. By partnering, they are able to take advantage of a new type of Internet2 connection. Normally, the next level of service available would be 10 gigabits per second, which is cost-prohibitive, but the new connection makes it possible to have two connections, each with five gigabits per second of committed bandwidth.

"Operating in the virtual world we live in, we're able to split the connection into two five-gigabit connections at two different physical locations, Philadelphia and Pittsburgh," said John Bielec, the chief information officer at Drexel. "The newly formed connector, called 3ROX/Drexel, will benefit the many Internet partner institutions of both Drexel and 3ROX."

"This will allow significantly better end-to-end performance," said Wendy Huntoon, PSC director of networking, "as well as access to new Internet2-based services. We'll each maintain a separate physical connection to Internet2, but will now collaborate on the management and strategic direction for the connection."

The partnership consolidates Internet2 connections in Pennsylvania from the previous three — 3ROX, Drexel and MAGPI (Mid-Atlantic Gigapop in Philadelphia for Internet2) — to two: MAGPI and 3ROX/Drexel. 3ROX serves universities, research sites and K-12 schools in western Pennsylvania and West Virginia, and Drexel connects the Drexel campus and its related research sites with the 14 Pennsylvania State System of Higher Education universities. With the new connection, both 3ROX and Drexel will be able to improve the quality and quantity of services they provide.

PSC Scientific Co-Director Will Speak at Cafe Scientifique

What is Supercomputing and Why Should You Care?

Ralph Roskies, PSC scientific co-director 

PITTSBURGH, January 31, 2011 - Ralph Roskies, PSC scientific co-director, will speak at Pittsburgh's Cafe Scientifique, a series of talks about science held in an unstuffy atmosphere with food and drink at the Carnegie Science Center. Roskies' talk on Feb. 7 will discuss how supercomputers have immense power to improve our quality of life. Doors open at 6 pm. The program runs from 7 to 9 pm.

3ROX Contracts with NOAA for $2.58M

Pittsburgh Supercomputing Center Internet Exchange Contracts with NOAA for $2.58M

PITTSBURGH, January 24, 2011 — The Three Rivers Optical Exchange (3ROX), operated and managed by the Pittsburgh Supercomputing Center, has contracted with NOAA (the National Oceanic and Atmospheric Administration) to provide a high-speed optical fiber connection to NOAA's planned Environmental Security Computing Center (ESCC) in Fairmont, West Virginia.

The new center, to be located at the I-79 Technology Park Research Center in Fairmont, will house a supercomputing system, expected to be online by late 2011, to meet national weather forecasting and climate modeling goals outlined in NOAA's "High Performance Computing Strategic Plan 2011-2015." 3ROX will connect ESCC by building new network infrastructure between PSC and the Fairmont site. The connection to ESCC will have a capacity of 10 gigabits per second and will give ESCC access to national and international research networks such as Internet2 and National LambdaRail.

"We expect the infrastructure to be put in place by March 2011," said Wendy Huntoon, PSC director of networking, "and anticipate that we will be able to leverage it to upgrade existing connectivity to West Virginia University (currently 155 megabits per second)."
