CRIMP
Ticket Hash: 839d20d7c3c080f1d36aa558b873e5f7acccbfe1
Title: performance through parallelization / threading
Status: Open Type: Feature_Request
Severity: Important Priority: Medium
Subsystem: General Resolution: Open
Last Modified: 2010-12-24 11:52:53
Version Found In:
Description:
Look into building C-level thread pool support and using it to parallelize the various image operations: tile or stripe each image, then have a separate thread handle each parcel.

This likely requires refactoring each operator into a core function and an API wrapper, so that the thread management can slide in between them.
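As a rough illustration of the striping idea, here is a minimal sketch. It assumes POSIX threads rather than any Tcl-level API, and it spawns one thread per stripe on every call instead of keeping a real pool alive. The names `stripe_fn`, `stripe_job`, `run_striped`, `invert_rows` and `MAX_STRIPES` are hypothetical and not part of CRIMP.

```c
#include <pthread.h>

/* Hypothetical "core function" signature: process rows [y0, y1) of an
 * 8-bit greyscale image stored row by row. Not an actual CRIMP type. */
typedef void (*stripe_fn)(unsigned char *pixels, int width, int y0, int y1);

typedef struct {
    stripe_fn      fn;
    unsigned char *pixels;
    int            width, y0, y1;
} stripe_job;

/* Thread entry point: run the core function on one stripe. */
static void *stripe_worker(void *arg)
{
    stripe_job *job = arg;
    job->fn(job->pixels, job->width, job->y0, job->y1);
    return NULL;
}

#define MAX_STRIPES 64   /* arbitrary upper bound chosen for this sketch */

/* Split the image into horizontal stripes and process each stripe in its
 * own thread. Returns 0 on success, -1 if a thread could not be created
 * (in which case the remaining stripes are left unprocessed). */
static int run_striped(stripe_fn fn, unsigned char *pixels,
                       int width, int height, int nthreads)
{
    pthread_t  tid[MAX_STRIPES];
    stripe_job job[MAX_STRIPES];
    int i, started = 0, status = 0;

    if (nthreads < 1)           nthreads = 1;
    if (nthreads > MAX_STRIPES) nthreads = MAX_STRIPES;
    if (nthreads > height)      nthreads = height;  /* at most one per row */

    for (i = 0; i < nthreads; i++) {
        job[i].fn     = fn;
        job[i].pixels = pixels;
        job[i].width  = width;
        job[i].y0     = (int)((long)height * i / nthreads);
        job[i].y1     = (int)((long)height * (i + 1) / nthreads);
        if (pthread_create(&tid[i], NULL, stripe_worker, &job[i]) != 0) {
            status = -1;
            break;
        }
        started++;
    }
    /* Wait for every stripe that was actually started. */
    for (i = 0; i < started; i++) {
        pthread_join(tid[i], NULL);
    }
    return status;
}

/* Example core function: invert the pixels of the given rows in place. */
static void invert_rows(unsigned char *pixels, int width, int y0, int y1)
{
    int x, y;
    for (y = y0; y < y1; y++) {
        for (x = 0; x < width; x++) {
            pixels[y * width + x] = (unsigned char)(255 - pixels[y * width + x]);
        }
    }
}
```

A call such as `run_striped(invert_rows, pixels, width, height, 4)` would then invert the image using up to four stripes. The operator itself knows nothing about threads, which is the point of separating the core function from the API.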

Question: Which C-level APIs do Tcl and the Thread package provide that would be useful here?

Note that this assumes the threads live at the C level and are invisible to the user. It is not about moving image processing into a wholly separate Tcl thread to keep it out of the GUI's way. That is a separate matter CRIMP doesn't have to think about, except maybe by providing ways of transferring images between Tcl threads without converting them between representations, which would use lots of memory.

<hr><i>andreask added on 2010-09-13 18:48:10:</i><br>
Note [http://wiki.tcl.tk/25977].
One concern: Is there a portable way to determine the number of CPUs on a system?
That number seems to me to be the most suitable size for the thread pool used by the crimp internals.


<hr><i>andreask added on 2010-09-13 18:58:29:</i><br>
Google: sysconf, _SC_NPROCESSORS_{CONF,ONLN}

<hr><i>andreask added on 2010-09-13 19:10:18:</i><br>
http://www.listware.net/201003/gtk-devel-list/42075-gthread-how-many-cores-do-i-have.html

Snarfing relevant parts of the discussion ...

... SYSTEM_INFO.dwNumberOfProcessors on Windows.

... http://git.gnome.org/browse/gimp/tree/app/base/base-utils.c#n54

... See also:
  http://qt.gitorious.org/qt/qt/blobs/4.7/src/corelib/thread/qthread-unix.cpp
  http://qt.gitorious.org/qt/qt/blobs/4.7/src/corelib/thread/qthread-win.cpp
the QThread::idealThreadCount() function.

On Windows, that's:
  SYSTEM_INFO sysinfo;
  GetSystemInfo(&sysinfo);
  return sysinfo.dwNumberOfProcessors;

MacOS X:
  MPProcessorsScheduled();

HPUX:
  struct pst_dynamic psd;
  if (pstat_getdynamic(&psd, sizeof(psd), 1, 0) == -1) {
    perror("pstat_getdynamic");
    cores = -1;
  } else {
    cores = (int)psd.psd_proc_cnt;
  }

{Free,Net,Open}BSD:
  size_t len = sizeof(cores);
  int mib[2];
  mib[0] = CTL_HW;
  mib[1] = HW_NCPU;
  if (sysctl(mib, 2, &cores, &len, NULL, 0) != 0) {
    perror("sysctl");
    cores = -1;
  }

"integrity" OS, symbian: hard-coded to one core.

VxWorks: a loop that checks whether CPU #n exists, until the check fails (see link)

IRIX:
  cores = (int)sysconf(_SC_NPROC_ONLN);

all other Unix (including Linux):
  cores = (int)sysconf(_SC_NPROCESSORS_ONLN);

Cheers
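
The two most common cases from the list above can be folded into one helper. This is only a sketch: the name `cpu_count` is hypothetical, only the Windows and `sysconf` branches are covered, and it falls back to one core (an assumption of this sketch) when the count cannot be determined.

```c
#if defined(_WIN32)
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Best-effort number of online processors. Returns 1 when the platform
 * offers none of the mechanisms handled here or the query fails. */
static int cpu_count(void)
{
#if defined(_WIN32)
    SYSTEM_INFO sysinfo;
    GetSystemInfo(&sysinfo);
    return sysinfo.dwNumberOfProcessors > 0
        ? (int)sysinfo.dwNumberOfProcessors : 1;
#elif defined(_SC_NPROCESSORS_ONLN)
    long n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? (int)n : 1;
#else
    return 1;   /* unknown platform: assume a single core */
#endif
}
```

The result could serve as the default size of the thread pool; the HPUX, BSD and IRIX variants quoted above would slot in as further `#elif` branches.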

<hr><i>andreask added on 2010-09-13 19:12:03:</i><br>
And
http://stackoverflow.com/questions/150355/programmatically-find-the-number-of-cores-on-a-machine


<hr><i>andreask added on 2010-09-20 17:55:49:</i><br>
Thanks to Joe English for the link to the portable 'hardware locality' project.

[http://www.open-mpi.org/projects/hwloc/]

BSD licensed!
This looks to be all I want and much more.

<hr /><i>anonymous claiming to be Arjen Markus added on 2010-12-24 11:52:53:</i><br />
Consider the use of GPU programming - it is well-suited to this type of computation (almost embarrassingly parallel, with only small amounts of data per node).

Another thing that comes to mind is the use of OpenMP: it is a high-level set of directives and some platform-independent functions that make life much easier, if it fits the bill.
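
To give an idea of what the OpenMP route might look like, here is a minimal sketch of a row-parallel operator. The function name `invert_image` and the 8-bit greyscale layout are assumptions for illustration, not CRIMP code. When the compiler is run without OpenMP support (for GCC, without `-fopenmp`) the pragma is ignored and the loop simply runs serially.

```c
#include <stddef.h>

/* Invert an 8-bit greyscale image in place. Each row is independent, so
 * the outer loop can be distributed over threads by OpenMP. */
static void invert_image(unsigned char *pixels, int width, int height)
{
    int y;
#pragma omp parallel for
    for (y = 0; y < height; y++) {
        unsigned char *row = pixels + (size_t)y * (size_t)width;
        int x;
        for (x = 0; x < width; x++) {
            row[x] = (unsigned char)(255 - row[x]);
        }
    }
}
```

In this style the thread management is left entirely to the OpenMP runtime, so no hand-written pool or CPU counting is needed, at the price of depending on compiler support.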