If you have a multiprocessor system, a multi-threaded application may be a good way to leverage your computing power.
The previous section described how one can implement parallel rendering for a cluster of PCs. Applications are often easier to parallelize on a shared memory symmetric multiprocessor (SMP) system than on a distributed memory system (a PC cluster). Also, some types of visualization tasks may make better use of memory if the dataset is shared by N threads within one process, rather than replicated across N systems in a cluster.
As a concrete example, suppose we have a 4-pipe SGI Onyx system. If we create four rendering threads we can render part of the scene with each of the four pipes and display a sort-last-composited image on the user's screen.
In progs/threadtest/threadtest.c you'll find an example of a threaded Chromium application. It's actually very similar to the psubmit demo.
The threadtest program accepts the following command line arguments:
-t numThreads
    indicates how many threads to create. The default is one.
-w
    specifies that a separate window should be created for each thread. The default is for all threads to render into one window.
-s1
    specifies that only one thread should issue SwapBuffers commands. The default is for all threads to swap their windows.
-b
    specifies that barriers should be used for synchronization.
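The option list above can be sketched as a small parser. This is an illustrative sketch only, not the code in threadtest.c: the ThreadTestOptions structure, its field names and the parse_options function are hypothetical, and the error handling (rejecting unknown flags and thread counts below one) is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option record; the field names are illustrative and are
 * not taken from threadtest.c. */
typedef struct {
    int num_threads;     /* -t numThreads, defaults to 1 */
    int multi_window;    /* -w: one window per thread */
    int single_swapper;  /* -s1: only one thread issues SwapBuffers */
    int use_barriers;    /* -b: synchronize with barriers */
} ThreadTestOptions;

/* Fills *opts from the command line.
 * Returns 0 on success, -1 on an unknown or malformed argument. */
int parse_options(int argc, char *argv[], ThreadTestOptions *opts)
{
    assert(opts != NULL);

    /* Defaults as described in the option list. */
    opts->num_threads = 1;
    opts->multi_window = 0;
    opts->single_swapper = 0;
    opts->use_barriers = 0;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-t") == 0) {
            if (i + 1 >= argc)
                return -1;               /* -t needs a value */
            opts->num_threads = atoi(argv[++i]);
            if (opts->num_threads < 1)
                return -1;               /* assumed: at least one thread */
        } else if (strcmp(argv[i], "-w") == 0) {
            opts->multi_window = 1;
        } else if (strcmp(argv[i], "-s1") == 0) {
            opts->single_swapper = 1;
        } else if (strcmp(argv[i], "-b") == 0) {
            opts->use_barriers = 1;
        } else {
            return -1;                   /* assumed: unknown flags are errors */
        }
    }
    return 0;
}
```

For example, the arguments "-t 4 -b" would yield four threads rendering into one shared window, with barriers enabled and every thread swapping.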
The threadtest.c program is very similar to the psubmit.c program. The major difference is the addition of code to create the N rendering threads (using Windows threads or pthreads). As in psubmit.c, barriers are used after glClear and before SwapBuffers to provide synchronization.
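The per-frame synchronization pattern described above can be sketched with plain pthreads. This is a minimal sketch under stated assumptions, not the threadtest.c source: a POSIX pthread_barrier_t stands in for the Chromium barrier, the OpenGL calls are reduced to comments, and FrameSync, render_thread and run_frames are hypothetical names. It assumes a platform that provides POSIX barriers and does not cover the Windows threads path.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_THREADS 16

/* Shared state for one team of rendering threads (hypothetical). */
typedef struct {
    pthread_barrier_t barrier;  /* stand-in for a Chromium barrier */
    atomic_int cleared;         /* total clears issued so far */
    atomic_int drawn;           /* total draws issued so far */
    atomic_int violations;      /* ordering errors detected */
    int num_threads;
    int num_frames;
} FrameSync;

static void *render_thread(void *arg)
{
    FrameSync *s = arg;

    for (int frame = 0; frame < s->num_frames; frame++) {
        /* glClear(...) would be issued here. */
        atomic_fetch_add(&s->cleared, 1);

        /* Barrier after the clear: nobody draws until all have cleared. */
        pthread_barrier_wait(&s->barrier);
        if (atomic_load(&s->cleared) < (frame + 1) * s->num_threads)
            atomic_fetch_add(&s->violations, 1);

        /* This thread's share of the scene would be drawn here. */
        atomic_fetch_add(&s->drawn, 1);

        /* Barrier before the swap: nobody swaps until all have drawn. */
        pthread_barrier_wait(&s->barrier);
        if (atomic_load(&s->drawn) < (frame + 1) * s->num_threads)
            atomic_fetch_add(&s->violations, 1);

        /* SwapBuffers would be issued here. */
    }
    return NULL;
}

/* Runs num_frames frames on num_threads threads.
 * Returns the number of ordering violations seen (0 is expected),
 * or -1 if the barrier could not be created. */
int run_frames(int num_threads, int num_frames)
{
    FrameSync sync;
    pthread_t threads[MAX_THREADS];

    assert(num_threads >= 1 && num_threads <= MAX_THREADS);

    atomic_init(&sync.cleared, 0);
    atomic_init(&sync.drawn, 0);
    atomic_init(&sync.violations, 0);
    sync.num_threads = num_threads;
    sync.num_frames = num_frames;

    if (pthread_barrier_init(&sync.barrier, NULL, (unsigned)num_threads) != 0)
        return -1;

    for (int i = 0; i < num_threads; i++) {
        if (pthread_create(&threads[i], NULL, render_thread, &sync) != 0) {
            /* A partial team would deadlock at the barrier, so give up. */
            abort();
        }
    }
    for (int i = 0; i < num_threads; i++)
        pthread_join(threads[i], NULL);

    pthread_barrier_destroy(&sync.barrier);
    return atomic_load(&sync.violations);
}
```

The two counters exist only to check the ordering in this sketch; a real application would need just the two barrier waits around its drawing code.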
mothership/configs/threadtest.conf is a sample configuration file for running the threadtest demo. This configuration file will pass the appropriate command line arguments to the threadtest program. It has several options to demonstrate various multi-threaded configurations. Look near the top of the file for these options:
NumThreads
    indicates the number of threads to create.
Config
    can take one of four values: LOCAL_ONE_WINDOW, LOCAL_N_WINDOWS, REMOTE_ONE_WINDOW or SORT_LAST. These demonstrate different types of parallelism.
The LOCAL_ONE_WINDOW option will create one window which all N threads will render into in parallel. glClear and SwapBuffers are synchronized with barriers. There is no server node; just run the mothership and crappfaker.
The LOCAL_N_WINDOWS option will create N windows, one per thread, so that each thread renders into its own window. There is no server node; just run the mothership and crappfaker.
The REMOTE_ONE_WINDOW option uses a pack SPU to send N streams of rendering commands to a render SPU running on a server node. Run the mothership, a crserver and crappfaker.
The SORT_LAST option uses a readback SPU to render N partial images, which are sent to a render SPU on a downstream server. Run the mothership, a crserver and crappfaker. Note that we don't use the -b option since the readback SPU itself will implement barrier synchronization.
Threaded sort-first rendering with the tilesort SPU is also possible but is not implemented in the configuration script.
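To illustrate what sort-last compositing of N partial images involves, here is a minimal sketch of per-pixel depth compositing, one common sort-last scheme. It is an assumption that this matches how the readback SPU is configured in the SORT_LAST setup; the PartialImage structure and composite_sort_last function are hypothetical and are not part of Chromium.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One thread's partial rendering of the full frame (hypothetical type). */
typedef struct {
    const uint32_t *color;  /* packed RGBA, one entry per pixel */
    const float *depth;     /* one depth per pixel, smaller is nearer */
} PartialImage;

/* For every pixel, keep the color and depth of the nearest fragment
 * among all partial images. Ties go to the lowest-numbered image. */
void composite_sort_last(const PartialImage *parts, size_t num_parts,
                         size_t num_pixels,
                         uint32_t *out_color, float *out_depth)
{
    assert(num_parts > 0);

    for (size_t p = 0; p < num_pixels; p++) {
        size_t nearest = 0;
        for (size_t i = 1; i < num_parts; i++) {
            if (parts[i].depth[p] < parts[nearest].depth[p])
                nearest = i;
        }
        out_color[p] = parts[nearest].color[p];
        out_depth[p] = parts[nearest].depth[p];
    }
}
```

Because every partial image covers the whole frame, the cost of this step grows with both the image size and the number of threads, independent of how the geometry was divided.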
Chromium is not thread-safe by default. To enable thread safety, edit the options.mk file, run make clean, then run make.