
    COURSE SYLLABUS

    CIS 5930 - 04: GPU Programming

    Spring 2015


Prerequisites:

COP 4610: Operating Systems. You should be comfortable programming in C/C++, and have good knowledge of undergraduate level algorithms, data structures, and computer architecture. No knowledge of parallel computing is required; however, you should have some experience with threads and some exposure to locks, atomic operations, and mutual exclusion, which are typically discussed in operating systems.

Class Schedule:

Activity  Day  Time                 Location
Lecture   TR   3:35 pm - 4:50 pm    LOV 301

Contact information:

Instructor: Ashok Srinivasan
Office hours: Monday 11 am - 12 noon and Friday 1 pm - 2 pm. I am also usually available in my office, and you can feel free to meet me when I am there. Alternatively, you may schedule an appointment, either by email or by phone.
Office: 169, Love Building
Phone: 644-0559
Email: asriniva AT cs.fsu.edu

Course material:

Required Material:
Reference material:
Computer accounts:

Course rationale:

In your programming, algorithms, and data structures courses, you learned how to write efficient sequential programs, that is, programs running a single thread. However, all modern processors are multicore, and in order to make effective use of them, you need to run multiple threads. When accelerators such as GPUs are available, the number of threads needs to be even larger. In your operating systems course, you learned to write code that uses multiple threads, and how to avoid common errors such as race conditions. This course teaches you how to organize the computations of the threads so that they work together and perform the required computations efficiently, making good use of the available hardware resources. We will focus on using GPUs for general purpose computing, rather than for graphics.

Course description:

Most modern computers come with Graphics Processing Units (GPUs) that can be used for general purpose computing. GPUs provide much more computing power than CPUs because they devote more of their hardware resources to computation. CPUs, on the other hand, devote more of their hardware resources to cache, which reduces memory access latency. GPUs deal with memory access latency primarily through multi-threading; when some threads are stalled accessing data, other threads can perform computation without a significant context-switch penalty. A consequence of this approach is that massive parallelism -- a large number of threads -- is required to make effective use of GPUs. Problems encountered in making effective use of a large number of threads are similar to those encountered in making a group of people work together. It may be difficult to decompose a problem so that people can work on different parts simultaneously. For example, consider someone who wants to have dinner cooked, eat it, and then have the dishes washed. It is not easy to speed up this process by hiring one person to cook and another to wash the dishes, because the three tasks are sequential; the food needs to be cooked before it is eaten, and it needs to be eaten before the dishes are washed. Similar problems occur in parallel computations, and sequential parts of the computation can substantially reduce the effectiveness of parallelization.
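The effect of sequential parts can be made quantitative with Amdahl's law, which the description above does not name but which captures the same idea: if a fraction s of the work must run sequentially, then p workers can speed the whole job up by at most 1 / (s + (1 - s) / p). The sketch below is only an illustration; the function name and the 10% sequential fraction are assumptions chosen for the example, not figures from this course.

```python
def amdahl_speedup(serial_fraction: float, workers: int) -> float:
    """Upper bound on speedup when `serial_fraction` of the work cannot be parallelized.

    Hypothetical helper for illustration; implements Amdahl's law:
    speedup = 1 / (s + (1 - s) / p).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed example: 10% of the work is sequential.
    for workers in (1, 10, 100, 1000):
        print(workers, round(amdahl_speedup(0.10, workers), 2))
```

With a 10% sequential fraction, the bound approaches but never reaches a tenfold speedup, no matter how many threads are used. This is why reducing the sequential part matters so much on a GPU.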

This course will describe different approaches to solve such problems, in order to develop efficient parallel algorithms for a variety of problems. We will also pay close attention to practical aspects of implementing parallel code that actually yields good performance on GPUs. We will follow the material used in the pioneering GPU course by Wen-mei Hwu at UIUC. By the end of the course, you should be able to make effective use of GPUs and obtain significant improvements in speed for practical applications. You may still not be able to obtain the best possible performance; one semester is not enough time for that! However, you will learn enough that you can continue learning on your own and become an expert after a few years of experience, if you are interested in this.

Learning objectives:

At the end of this course, you should be able to accomplish the objectives given below.

Your responsibilities:

Deadlines and Instructions

Following the same professional guidelines that you will encounter at work, there are strict deadlines, and instructions that must be followed. Please read instructions carefully, and schedule your activities so that you submit assignments well ahead of the deadline. You should check your FSU email account and the class web page regularly, and note other announcements, on-line and in class. You should also subscribe to the discussion board forums on Blackboard.

Project

You will have one project, which will account for 40% of your course grade. You may work on the project either individually or in groups of two students each. If you work in a group, then you should clearly identify your contribution to the group effort. Your grade will be based on both the group's achievement and your contribution to it. Your project should be of sufficient quality and quantity for it to be accepted in at least a mediocre conference, such as Europar or workshops at good conferences. You may want to read some best papers from conferences such as IPDPS or SC, to get an idea of good quality work. While your work may differ from the best ones in quantity, the quality of your work and presentation should be high.

Reading Assignments

After each lecture, you will be given a reading assignment pertaining to that lecture. You should read these, and also practice writing code. If you learn only the material discussed during the lecture, then you will likely fail the course. You should learn material from the reading assignment, and also refer to reference material in your programming tasks.

Note that new material builds on old material. So, if you have trouble with some material, please get help through the discussion board on Blackboard, or from me, before the next class.

I expect that you will need to spend between one and two hours studying for each lecture. The programming assignments, project, and exam will consume additional time.

The following learning components are important, and you may want to verify if you do satisfactorily on these, after studying the material.

Assignments

You will have several small programming assignments in this course, and you will have around two days to work on each one. The assignments will be announced on the Blackboard course web site under the "Assignments" tab. Parallel programming assignments are substantially more difficult than sequential programming assignments, and require substantially more time and effort. Please start working on the assignments as soon as they are announced, if you wish to complete them on time! You will also be asked to answer questions based on the reading assignment. Your answers to these will also be evaluated under the category of assignments.

Course calendar:

Week  Lecture        Lecture topics                                                    Other activities
1     8 Jan          1. Introduction
2     13 Jan         2. Introduction to CUDA C
      15 Jan         3. Introduction to CUDA C
3     20 Jan         4. CUDA Parallelism Model
      22 Jan         5. CUDA Memory Model
4     27 Jan         6. CUDA Memory Model
      29 Jan         7. DRAM, GMAC
5     3 Feb          8. Convolution, Constant Memory, and Constant Cache
      5 Feb          9. Tiled Convolution
6     10 Feb         10. Tiled Convolution Analysis
      12 Feb         11. Reduction Tree
7     17 Feb         12. Reduction Tree
      19 Feb         13. Parallel Prefix
8     24 Feb         14. Parallel Prefix                                               Project groups due Feb 24
      26 Feb         15. Floating Point Considerations
9     3 Mar          16. Floating Point Considerations                                 Project proposal due Mar 3
      5 Mar          Midterm review
10    12 Mar         Spring Break
      14 Mar         Spring Break
11    17 Mar         Midterm
      19 Mar         17. Atomic Operations and Histogramming
12    24 Mar         18. Atomic Operations and Histogramming
      26 Mar         19. GPU as Part of the PC Architecture
13    31 Mar         20. Data Transfer and CUDA Streams                                Project progress report due Mar 26
      2 Apr          21. Application Case Study: Advanced MRI Reconstruction
14    7 Apr          22. Application Case Study: Molecular Visualization and Analysis
      9 Apr          23. Performance Analysis
15    14 Apr         24. Joint CUDA-MPI Programming
      16 Apr         25. Joint CUDA-MPI Programming
16    21 Apr         26. Introduction to OpenCL
      23 Apr         27. Introduction to OpenACC
17    27 Apr-1 May   Project demonstrations                                            Project code due Apr 26
                                                                                       Project reports due Apr 26

Grading criteria:

Your overall grade will be based on your performance on (i) midterm, (ii) project, and (iii) assignments, with weights as given in Table 1. Note that there is no final exam in this course.

Your midterm score should be at least 75% for you to get a course grade of B or better. If you meet this constraint, then the final grade will be determined using Table 2.

    Table 1: Course Points
    Item            Weight
    Midterm         30
    Assignments     30
    Group Project   40

    Table 2: Letter Grades
    Points          Grade
    92.0 - 100.0    A
    90.0 - 91.9     A-
    88.0 - 89.9     B+
    82.0 - 87.9     B
    80.0 - 81.9     B-
    0 - 79.9        F - C+

NOTE: You must earn at least 75% on the midterm to be awarded a course grade of B or better. For example, if you obtain a total of 89%, but a midterm grade of only 74%, then you will not get a B+. Instead, you will get a B-, because that is the highest grade for which you will be eligible without meeting the exam cutoff.
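The grading rule above can be sketched as a short calculation. This is only an illustration of how Table 1, Table 2, and the midterm cutoff fit together, not the procedure actually used to assign grades. It assumes that each component is scored on a 0-100 scale and that a total falling between two rows of Table 2 (for example, 91.95) is placed by its lower bound. The function names are hypothetical, and totals below 80 are reported only as "C+ or lower" because Table 2 does not break that range down.

```python
# Weights from Table 1, as percentages of the course grade.
MIDTERM_WEIGHT, ASSIGNMENTS_WEIGHT, PROJECT_WEIGHT = 30, 30, 40

# Lower bounds from Table 2, highest grade first.
CUTOFFS = [(92.0, "A"), (90.0, "A-"), (88.0, "B+"), (82.0, "B"), (80.0, "B-")]

# Grades that require at least 75% on the midterm ("B or better").
MIDTERM_CUTOFF = 75.0
GATED_GRADES = {"A", "A-", "B+", "B"}


def course_total(midterm: float, assignments: float, project: float) -> float:
    """Weighted total, assuming each component is a percentage in [0, 100]."""
    weighted = (MIDTERM_WEIGHT * midterm
                + ASSIGNMENTS_WEIGHT * assignments
                + PROJECT_WEIGHT * project)
    return weighted / 100


def letter_grade(total: float, midterm: float) -> str:
    """Look up Table 2, then cap the grade at B- if the midterm cutoff is missed."""
    grade = "C+ or lower"  # Table 2 lists this range only as "F - C+"
    for lower_bound, letter in CUTOFFS:
        if total >= lower_bound:
            grade = letter
            break
    if midterm < MIDTERM_CUTOFF and grade in GATED_GRADES:
        return "B-"
    return grade


if __name__ == "__main__":
    total = course_total(midterm=74, assignments=100, project=92)
    print(total, letter_grade(total, midterm=74))
```

The sample call reproduces the example in the note: a total of 89% with a midterm score of 74% yields a B- rather than a B+.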

Programming Assignment Assessment

You must understand your assignment work. If you are asked to explain your work, and you are unable to do so, you may be assigned a grade of zero.

Course policies:

Attendance Policy:

The university requires attendance in all classes, and it is also important to your learning. The attendance record may be provided to deans who request it. If your grade is just a little below the cutoff for a higher grade, your attendance will be one of the factors that we consider, in deciding whether to "bump" you up to the higher grade. Three or fewer unexcused absences in lectures and recitations will be considered good attendance.

Excused absences include documented illness, deaths in the immediate family and other documented crises, call to active military duty or jury duty, religious holy days, and official University activities. Accommodations for these excused absences will be made in a way that does not penalize students who have a valid excuse. Consideration will also be given to students whose dependent children experience serious illness.

You should let me know in advance, when possible, and submit your documentation. You should make up for any materials missed due to absences.

Missed exam Policy:

A missed exam will be recorded as a grade of zero. For the midterm, we will follow the university rules regarding missed final exams (see http://registrar.fsu.edu/dir_class/spring/exam_schedule.htm).

Late Submission Policy:

In order to enable us to provide timely solutions to assignments, we have the following policy regarding late submissions. Late submissions will not be accepted for any assignment. Late submission of project material will incur the following penalties.

Grade of 'I' Policy:

The grade of 'I' will be assigned only under the following exceptional circumstances:

Professional ethics:

You will gain confidence in your ability to design and implement algorithms only when you write the code yourself. On the other hand, one does learn a lot through discussions with one's peers. In order to balance these two goals, I give below a list of things that you may, and may not, do.

Things you may not do: You should not copy code from others. This includes directly copying the files, replacing variable names in their code with different names, altering indentation, or making other modifications to others' code, and submitting it as your own. (You may also wish to note that many of the modifications that make code look very different in a higher level language yield lower level representations that are very close, and are hence easy to detect.) Furthermore, you should take steps to ensure that others cannot copy code from you -- in particular, you should have all permissions on assignment files and directories set off for others.

Things you may do: You may discuss specific problems related to use of the computer, useful utilities, and some good programming practices, with others. For example, you may ask others about how to submit your homework, or how to use the debugger or text editor.

Honor Code: The Florida State University Academic Honor Policy outlines the University's expectations for the integrity of students' academic work, the procedures for resolving alleged violations of those expectations, and the rights and responsibilities of students and faculty members throughout the process. Students are responsible for reading the Academic Honor Policy and for living up to their pledge to be honest and truthful and [to] strive for personal and institutional integrity at Florida State University. (Florida State University Academic Honor Policy can be found at http://fda.fsu.edu/content/download/21140/136629/AHPFinal2014.pdf .)

Plagiarism:

Plagiarism is "representing another's work or any part thereof, be it published or unpublished, as one's own. For example, plagiarism includes failure to use quotation marks or other conventional markings around material quoted from any source" (Florida State University General Bulletin 1998-1999, p. 69). Failure to document material properly, that is, to indicate that the material came from another source, is also considered a form of plagiarism. Copying someone else's program, and turning it in as if it were your own work, is also considered plagiarism.

SYLLABUS CHANGE POLICY:

Except for changes that substantially affect implementation of the evaluation (grading) statement, this syllabus is a guide for the course and is subject to change with advance notice.


Last modified: 16 Jan 2015