Lecture 7
Learning objectives
After this class, you should be able to:
- Program the PS3, using the following functions: spe_image_open, spe_context_create, spe_program_load, spe_context_run, spe_out_mbox_status, spe_out_mbox_read, spe_in_mbox_status, spe_in_mbox_write, spe_ls_area_get, spe_context_destroy, spe_image_close, mfc_get, mfc_put, mfc_write_tag_mask, mfc_read_tag_status_all, spu_write_out_mbox, and spu_read_in_mbox.
- Give the DMA alignment and size restrictions in mfc_get and mfc_put calls.
- Give the peak floating point performance of the SPEs, the maximum memory bandwidth, the total EIB bandwidth, and the bandwidth to SPEs.
- Explain the need for the volatile qualifier on data variables used in DMA transfers.
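The DMA restrictions named in the objectives can be made concrete with a small checker. What follows is only a sketch, assuming the commonly documented MFC rules: a single transfer is 1, 2, 4, or 8 bytes, or a multiple of 16 bytes up to 16 KB; small transfers are naturally aligned with matching low four address bits on both sides; larger transfers are 16-byte aligned. The names dma_size_ok and dma_args_ok are hypothetical helpers, not part of the SDK, and the sketch deliberately rejects a size of zero. The assigned reading remains the authority on the exact rules.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed upper limit for one DMA transfer: 16 KB. */
#define DMA_MAX_BYTES 16384u

/* Returns 1 if `size` is an acceptable single-transfer size:
   1, 2, 4, 8, or a nonzero multiple of 16 up to DMA_MAX_BYTES. */
int dma_size_ok(size_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8)
        return 1;
    return size != 0 && size % 16 == 0 && size <= DMA_MAX_BYTES;
}

/* Returns 1 if a transfer of `size` bytes between local-store address `ls`
   and effective (main memory) address `ea` meets the assumed alignment rules. */
int dma_args_ok(uintptr_t ls, uint64_t ea, size_t size)
{
    if (!dma_size_ok(size))
        return 0;
    if (size < 16) {
        /* Small transfers: natural alignment on both sides, and the
           low 4 bits of the two addresses must agree. */
        return ls % size == 0 && ea % size == 0 && (ls & 15) == (ea & 15);
    }
    /* Larger transfers: both addresses 16-byte aligned. */
    return ls % 16 == 0 && ea % 16 == 0;
}
```

For example, dma_args_ok(0x100, 0x2000, 128) accepts a 128-byte transfer between aligned buffers, while a size of 12 is rejected. Aligning buffers to 128 bytes is generally recommended for best performance, although this sketch does not check for it.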
Reading assignment
- Read Section 4.1.2 of the Cell Redbook (on Blackboard -- course library).
- References
  - Libspe2 (on Blackboard -- course library).
  - SPE Extensions (on Blackboard -- course library).
Exercises and review questions
- Exercises and review questions on current lecture's material
  - In Example 3 of Lecture 7, each SPE reads a portion of an array from main memory and computes the square of each element. If the data size is very large, then the data will not fit in the SPE's local store. In that case, you will need to bring pieces of the data to the SPE and compute their squares. Write code that does this, with and without overlap of computation and DMA transfers. What is the difference in performance? Report your results on the discussion board, under the Lecture 7 thread.
- Preparation for the next lecture
  - Time some piece of code on an SPE using the spu_read_decrementer and the spu_write_decrementer functions. Report your results on the discussion board, under the Lecture 7 thread.
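As a starting point for the chunking exercise, the non-overlapped loop structure can be sketched in portable C. This is not SPE code: fetch_chunk and store_chunk are hypothetical stand-ins that use memcpy where an SPE program would issue mfc_get or mfc_put and then wait on the tag, and the chunk size of 4096 floats (16 KB) is an assumption chosen to match the largest single DMA transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed chunk size: 4096 floats = 16 KB. */
#define CHUNK_FLOATS 4096

/* Hypothetical stand-in for mfc_get followed by a tag-status wait. */
static void fetch_chunk(float *local, const float *remote, size_t n)
{
    memcpy(local, remote, n * sizeof *local);
}

/* Hypothetical stand-in for mfc_put followed by a tag-status wait. */
static void store_chunk(float *remote, const float *local, size_t n)
{
    memcpy(remote, local, n * sizeof *local);
}

/* Squares every element of `data` in place, one chunk at a time,
   with no overlap between transfers and computation. */
void square_in_chunks(float *data, size_t n)
{
    static float buf[CHUNK_FLOATS];   /* plays the role of a local-store buffer */
    size_t done = 0;

    assert(data != NULL || n == 0);
    while (done < n) {
        size_t len = n - done < CHUNK_FLOATS ? n - done : CHUNK_FLOATS;
        fetch_chunk(buf, data + done, len);   /* bring one piece in */
        for (size_t i = 0; i < len; i++)
            buf[i] *= buf[i];                 /* compute the squares */
        store_chunk(data + done, buf, len);   /* write the piece back */
        done += len;
    }
}
```

The overlapped version asked for in the exercise would typically alternate between two such buffers, each with its own tag ID, so that the transfer for the next chunk is in flight while the current chunk is being squared. On the SPE, the final partial chunk would also have to satisfy the DMA size and alignment restrictions.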
Last modified: 1 Feb 2010