Term projects: A group of 2 people.

Timeline:
  Determining a project topic: March 24
  Midterm project progress report: April 8
  Project presentation: April 21 and April 24
  Final project report: April 28

Any of the four types:
  1. Survey the recent advances in one area related to PDS
  2. Empirically evaluating a software system
  3. Design, analyze, evaluate new algorithms for some types of parallel and distributed systems
  4. Your own topics related to PDS

- Survey the recent advances in one area related to PDS

  Fix a good topic; find all related papers by yourselves (5 minimum); summarize the problem and the proposed solutions; identify the strengths and limitations of the solutions; identify what has been done on this topic and what issues remain.

  Requirement:
  a) The area must not be outdated.
     - at least 1 paper published in 2008 or later.
     - at least 4 papers published in 2002 or later.
  b) You must be able to clearly identify the (significant) topic.
     - What is the problem?
     - Why is it a problem?
     - Why is this problem significant?
  c) The survey should give the state of the art of that area.
     - Existing techniques for the problems
     - Strengths and weaknesses of the techniques
     - What has been solved and what has not been solved.
  All of these must be reflected in your report and presentation.

  Example topics (these topics may be too broad):
  1. Transactional Memory
     - U. Drepper, "Parallel Programming with Transactional Memory," ACM Queue 6(5):38-45, Dec. 2008.
  2. New parallel programming models
     - Michael D. Linderman, J. D. Collins, H. Wang, T. H. Meng, "Merge: A Programming Model for Heterogeneous Multi-core Systems," ACM ASPLOS 2008.
     - J. Dean and S. Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," OSDI 2004.
  3. Compiler/OS support for Multithreading
     - Marek Olszewski, Jason Ansel, Saman Amarasinghe, "Kendo: Efficient Deterministic Multithreading in Software," ACM ASPLOS 2009.
     - Z. Anderson, D. Gay, R. Ennals, and E.
       Brewer, "SharC: Checking Data Sharing Strategies for Multithreaded C," ACM PLDI 2008.
  4. Thread scheduling for Multi-core
     - D. Tam, R. Azimi, M. Stumm, "Thread Clustering: Sharing Aware Scheduling on SMP-CMP-SMT Multiprocessors," ACM EuroSys 2007.
     - M. Rajagopalan, B. T. Lewis, T. A. Anderson, "Thread Scheduling for Multi-Core Platforms," USENIX Workshop on Hot Topics in OS, 2007.
  5. Routing in Interconnection networks
     - P. Geoffray, T. Hoefler, "Adaptive Routing Strategies for Modern High Performance Networks," Hot Interconnects, 2008.
     - X. Yuan, W. Nienaber, Z. Duan, and R. Melhem, "Oblivious Routing for Fat-Tree Based System Area Networks with Uncertain Traffic Demands," ACM Sigmetrics, pages 337-348, 2007.
  6. Collective communication optimization
     - A. Faraj, P. Patarasuk, and X. Yuan, "A Study of Process Arrival Patterns for MPI Collective Operations," International Journal of Parallel Programming, 36(6):543-570, December 2008.
     - Y. Qian and A. Afsahi, "Efficient shared memory and RDMA based collectives on multi-rail QsNetII SMP clusters," Cluster Computing, 11(4):341-354, Dec. 2008.
  7. Accelerator based computing
     - Michael Kistler, John Gunnels, Daniel Brokenshire, Brad Benton, "Petascale Computing with Accelerators," ACM PPoPP 2009.
     - S. Ryoo, C. I. Rodrigues, S. S. Baghsorkhi, S. S. Stone, "Optimization Principles and Application Performance Evaluation of a Multithreaded GPU Using CUDA," ACM PPoPP 2008.
  8. Data intensive computing
     - Kouzes, R.T., Anderson, G.A., Elbert, S.T., Gorton, I., Gracio, D.K., "The Changing Paradigm of Data Intensive Computing," IEEE Computer, 42(1):26-34, Jan. 2009.
     - Randal E. Bryant, "Data-Intensive Supercomputing: The Case for DISC," TR CMU-CS-07-128, School of Computer Science, CMU, 2007.
  9. Parallel I/O
     - A. Nisar, Weikeng Liao, A. Choudhary, "Scaling Parallel I/O Performance through I/O Delegate and Caching System," ACM SC'08.
     - P. H. Carns, B. W. Settlemyer, and W. B.
       Ligon III, "Using Server-to-Server Communication in Parallel File Systems to Simplify Consistency and Improve Performance," ACM SC'08.
  10. Lock free data structures
  11. Memory hierarchy design for CMP
  12. On-chip interconnects

- Empirically evaluating a software system

  Install a software system and related benchmarks (write your own if necessary). Empirically measure the performance of the system. Summarize the performance results and draw conclusions from the results.

  Requirement:
  a) The software system must not be outdated.
  b) This may involve software implementation or software installation.

  Example topics:
  1. Evaluation of MapReduce on multi-core platforms.
     C. Ranger, et al., "Evaluating MapReduce for Multi-core and Multiprocessor Systems," HPCA 2007.
  2. Evaluation of STAR-MPI on different platforms.
     http://star-mpi.sourceforge.net/
  3. Evaluate some lock free data structures on multi-core systems.
  4. Evaluate some collective communication algorithms on clusters.
  5. Evaluating and improving sparse matrix/graph message passing algorithms/applications on different platforms. One example is the LU factorization algorithm in http://www.cs.rochester.edu/~kshen/research/s+/.
  6. Evaluating software overheads in common communication libraries. You can evaluate the software overheads in some commonly used libraries (e.g. MVAPICH, Open MPI, etc.). The results will provide a baseline for how much performance gain can still be achieved in such libraries when various communication optimization techniques are applied.

- Design, analyze, evaluate new algorithms/applications for some types of parallel and distributed systems

  Anything new (new algorithm, new analysis techniques, new modeling techniques) with a reasonable justification is good.
  - New routing algorithms for fat-trees
  - New network topology
  - New lock free data structures
  - New multithreaded or message passing algorithms
  - New multithreaded or message passing applications
  - New collective communication algorithms
  - New performance modeling techniques
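
For projects that involve empirical measurement (the evaluation type, or evaluating a new algorithm), the measurement step could be organized along the lines of the sketch below. It is only an illustration under stated assumptions: the names measure, summarize, and workload are hypothetical, workload is a stand-in for whatever system or benchmark you actually run, and the warm-up and trial counts are arbitrary defaults, not course requirements.

```python
import statistics
import time


def measure(fn, *, warmup=2, trials=5):
    """Call fn() repeatedly and return the wall-clock time of each timed trial.

    The warm-up calls are not timed; they let caches and lazy
    initialization settle before measurement starts.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Reduce a list of timings to min, median, and max (in seconds)."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }


def workload():
    # Hypothetical stand-in for the system under test.
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    stats = summarize(measure(workload))
    print(f"min={stats['min']:.6f}s "
          f"median={stats['median']:.6f}s "
          f"max={stats['max']:.6f}s")
```

Reporting the spread across several trials, rather than a single run, makes it easier to tell a real performance difference from run-to-run noise when drawing conclusions in the report.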