Concurrent Sessions

Tuesday, November 2

5:30 pm – 8:00 pm

Wednesday, November 3

8:30 am – 9:45 am

Welcome and CI at Michigan Sampler: New initiatives and resources at Michigan and beyond

Following a welcome by Stephen R. Forrest, Vice President for Research, and CI Days co-sponsors Daniel E. Atkins, Associate VP for Research Cyberinfrastructure, and Laura Patterson, Chief Information Officer and Associate VP for ITS, this panel will offer an overview of resources for computing and data along with opportunities for further education and community building.

  • Brian Athey, Professor of Biomedical Informatics, Psychiatry, and Internal Medicine
  • Paul Courant, Professor of Public Policy, Economics, and Information and Dean of Libraries
  • Sharon Glotzer, Professor of Chemical Engineering, Materials Science & Engineering, Macromolecular Science & Engineering, Physics, and Applied Physics
  • Eric Michielssen, Professor of Electrical Engineering and Computer Science
  • Laura Patterson, Associate Vice President (Information and Technology Services) and Chief Information Officer
  • Moderator: Dan Atkins, Associate VP for Research Cyberinfrastructure and Professor of Information and Electrical Engineering and Computer Science

 

10:00 am – 10:50 am

Keynote speaker

Jimmy Lin, Associate Professor at the University of Maryland (currently on sabbatical at Twitter)

“Hadoop/MapReduce as a Platform for Data-Intensive Computing”

MapReduce, especially the Hadoop open-source implementation, has recently emerged as a popular framework for data-intensive computing. Its advantages include the ability to scale horizontally to petabytes of data on thousands of commodity servers, easy-to-understand programming semantics, and a high degree of fault tolerance. Hadoop lies at the core of an application stack that is gaining widespread adoption in both industry and academia. In this talk, I will present case studies in text processing and bioinformatics that illustrate how these technologies are enabling data-driven research and transforming science.

 

11:05 am – 12:05 pm

Concurrent Sessions

Session 1: Overview of Parallel Computing

Quentin Stout, Professor of Computer Science and Engineering, will offer an introductory guide to parallel computing, answering the following questions: What is parallel computing? Why would you want or have to use it? How do you write a parallel program? Why should you care about Amdahl’s Law? This session covers what you need to know about parallel computing before learning the nuts and bolts of doing it.

 

Session 2: Best Practices and Resources for Managing and Sharing Data

This panel is an opportunity to learn what key issues surround the use, management, and stewardship of data and who on campus can answer your questions about such issues.

    • HV Jagadish, Professor of Computer Science and Engineering, will present a perspective from U-M’s Blue Ribbon Task Force on Research Data Strategy. This Task Force is charged with providing guidance to university leadership on how U-M should approach the increasing demands around managing research data.

 

Session 3: Introduction to Tools for Data Mining: Hadoop and CLAIRlib

    • Michael Cafarella, Assistant Professor of Computer Science and Engineering, will offer an overview of recent Hadoop and MapReduce research and tools.
    • Dragomir R. Radev, Professor of Information, Electrical Engineering and Computer Science, and Linguistics, will talk about CLAIRlib (www.clairlib.org), his research group’s library of tools for information retrieval, network analysis, and natural language processing.
    • Q & A

 

Session 4: National Computational/Networking Resources (panel)

Learn about national resources available to support high-performance computing, science and engineering analysis, data and resource sharing, and high-bandwidth networking. This panel features:

    • Elizabeth Leake, External Relations Coordinator for TeraGrid, a federally funded scientific discovery infrastructure integrating high-performance computers, data resources and tools, and high-end experimental facilities at 11 sites around the country.
    • Nancy Wilkins-Diehr, Area Director for TeraGrid Science Gateways, which are community-developed sets of tools, applications, and data that are integrated via a portal or a suite of applications, usually in a graphical user interface, and further customized to meet the needs of a targeted community, allowing easier access to high-end resources.
    • Russ Hobby, Program Manager at Internet2, which provides resources and training for state-of-the-art networking and videoconferencing, as well as other software tools/kits.
    • Q & A

12:05 pm – 1:20 pm

Exemplars of Computational Discovery at U-M
A “taste” of U-M faculty research using CI. We will feature four rooms of faculty presentations. Enjoy a complimentary boxed lunch while listening to your choice of faculty speakers.

Exemplars Session 1:

Sharon Glotzer is a Professor of Chemical Engineering, Materials Science & Engineering, Macromolecular Science & Engineering, Physics, and Applied Physics. Her talk will be “Towards Assembly Engineering: The Shapes of Things to Come.” The new revolution in nano-science, engineering, and technology is being driven by the ability to manipulate matter at the molecular and supramolecular level to create “designer” structures. Her group uses computer simulation to understand the fundamental principles of how particle systems interact and self-assemble, and to discover how to control the assembly process to engineer new materials and devices. The group’s work crosses many disciplines, including Chemical Engineering, Materials Science, Physics, and Computational Science.

Charles Brooks III is a Professor of Chemistry and Biophysics. His talk will be “25 years of high performance computing in computational biophysics: From distributed computations on protein folding to building clusters at home.” Brooks’ research is focused on the application of statistical mechanics, quantum chemistry, and computational methods to chemically and physically oriented problems in biology.

Exemplars Session 2:

Dragomir R. Radev is a Professor of Information, Electrical Engineering and Computer Science, and Linguistics. He uses a suite of custom textual analysis tools (called CLAIRlib) to conduct research on text summarization, lexical models of the Web, robust question answering, sequence alignment techniques for text analysis, biomedical language processing, information extraction using weakly supervised methods, graph-based methods for classification, and natural language processing.

K P Unnikrishnan is a Research Assistant Professor at the Center for Computational Medicine and Bioinformatics (CCMB) at the University of Michigan Medical School. His research is in data mining and its applications in health care, medicine, and biology.

Exemplars Session 3:

Tamas Gombosi is Chair of the Department of Atmospheric, Oceanic and Space Sciences, Director of the Center for Space Environment Modeling, and Professor of Aerospace Engineering and Space Science. His research focuses on numerical simulations of the space environment of Earth and other planets.

Christiane Jablonowski is an Assistant Professor in the Department of Atmospheric, Oceanic & Space Sciences (AOSS). Her research focuses on the fluid dynamics component of weather and climate models. In particular, she develops Adaptive Mesh Refinement (AMR) techniques that can be used for regional climate change assessments and the tracking of tropical cyclones.

Exemplars Session 4:

August (Gus) Evrard is a first-generation computational cosmologist (Professor of Physics and Astronomy) who uses N-body and gas dynamic simulations of cosmic structure to inform analysis of astronomical sky surveys. This talk will describe his group’s modeling at the interface of astrophysics and cosmology, and provide connections to the research of a “sim-astro quartet” of assistant professors in the Astronomy department.

Daniel Forger is an Associate Professor of Mathematics and is on the faculty of the Center for Computational Medicine and Bioinformatics. His research focuses on modeling biological clocks.

1:30 pm – 2:30 pm

Concurrent Sessions

Session 1: All About Flux, U-M’s New High-Performance Computing Cluster

Andrew Caird of the CoE’s Center for Advanced Computing will talk about this new campus resource. Flux is a high-performance computing environment that offers flexibility to researchers who may want to vary the number of CPUs they use or when they need them, allowing efficient use of research funds. Andrew will describe how Flux differs from other available options, how Flux works administratively, and some of the policies associated with using this flexible resource.

Session 2: CI in the Classroom

Perry Samson, Professor of Atmospheric, Oceanic and Space Sciences, will discuss how he uses CI in the classroom, including LectureTools and demonstrations, and his research on using CI to engage students.

Session 3: Campus Resources for Learning How to Analyze and Manage Data and Research Output

This panel will describe data, analysis, and archival services on campus and how you can learn more about them.

    • Jen Green, of Spatial and Numeric Data Services (SAND), teaches data analysis & visualization techniques and provides labs & facilities.
    • Giselle Kolenic is from CSCAR’s Spatial Analysis and Visualization Laboratory (SAVi), which is open to the campus for researchers doing higher-order spatial statistics.
    • Jean Song, Research & Informatics Coordinator in the Health Sciences Libraries, provides support & training in biomedical/life sciences informatics tools & resources.
    • Jim Ottaviani is the coordinator for Deep Blue, U-M’s online institutional repository for scholarly research (including multimedia), and will speak to the broader management issues associated with research output, including why sharing and retaining ownership of your complete research output is important.

Session 4: Research Computing in the Cloud

Traci Ruthkoski, Director of External Cloud Projects for the CIRRUS Project and Medical School Information Systems HPC Team Lead, will give an overview of basic cloud structure and terminology and describe methodologies for integrating cloud technologies into research environments. Topics such as “Exactly what is the cloud?”, “Is the cloud appropriate for my data?”, and “How do I know what to look for in a cloud provider?” will be discussed. Basic “getting started” demonstrations for the Microsoft Azure cloud, Amazon EC2, and MATLAB on the TeraGrid will also be presented.


2:45 pm – 3:35 pm

Keynote speaker
Larry Smarr, Director of the California Institute for Telecommunications and Information Technology (Calit2), a UCSD/UCI partnership, and founding Director of the National Center for Supercomputing Applications (NCSA)

“Set My Data Free: High-Performance CI for Data-Intensive Research”

As the need for large datasets and high-volume transfer grows, the shared Internet is becoming a bottleneck for cutting-edge research in universities. What are needed instead are large-bandwidth “data freeways.” In this talk, I will describe some of the state-of-the-art uses of high-performance CI and how universities can evolve to support free movement of large datasets.

3:45 pm – 4:45 pm

Closing Plenary: The Future of CI at U-M. An open discussion and “town hall” style opportunity for community input.

  • What was the most valuable thing you learned?
  • What questions do you still have about using CI?
  • What other topics do you want to see/learn about? When we host future presentations or CI Days events, what topics would you find useful?
  • What barriers do you face for using CI?
  • What can the University do to better support your use of CI?