MatPlus.Net (website founded by Milan Velimirović in 2006)
MatPlus.Net Forum : Internet and Computing : Hardware and solving
 
(1) Posted by Johan Beije [Monday, Jan 25, 2010 08:04]

Hardware and solving


Looking at my old AMD XP1700 (1466 MHz) with 1 GB of internal memory, it is time to replace it with something else. Vaclav has published good articles about memory handling and the various software packages. But what about the hardware? Single-core, dual-core, quad-core etc.: there is so much choice at the moment. Is, for example, a 3.0 GHz dual-core just as fast as a 3.0 GHz quad-core? Does a multi-core CPU have any influence on solving software, or is it only useful for ‘pruning’ memory? For the rest: I hardly use this computer for anything else, just some internet and Office.
(2) Posted by [Monday, Jan 25, 2010 15:41]; edited by [10-01-27]

Johan Beije asks:

>But what about the hardware? Single-core, dual-core, quad-core etc.: there is so much choice at the moment.

This depends on the software. If you are going to use the system for *nothing* but solving, check out what the software supports. Does it use multi-threading? If it doesn't, get the fastest computer you can find (or can afford), regardless of how many cores it has -- you will only use one anyway.

Don't trust clock speed alone when you estimate speed -- evaluate the entire system. For instance, I've just upgraded from a 2-core 2.66 GHz to a 4-core 2.66 GHz. Doesn't sound like a big win, does it? But since the new system is based on the relatively recent Intel i5-750 chip, which connects system RAM straight to the CPU instead of going through the northbridge as the older system does, memory access is much faster, and this makes a big difference.

Also ... 32-bit vs 64-bit may make a difference if the solving program uses bitboards or similar techniques that are based on a 64-bit internal representation of the board.

However, if the system will do other things at the same time it's running the solving software, multiple cores become important.

>Is, for example, a 3.0 GHz dual-core just as fast as a 3.0 GHz quad-core?

On the same architecture, yes, roughly, as long as you run single-thread programs. If you run single-thread on the quad-core and dual-thread on the dual-core, the latter may be faster. If you are running a program that can do four threads at the same time, the quad-core will probably be faster, all else being equal.

>Does a multi-core CPU have any influence on solving software, or is it only useful for ‘pruning’ memory?

Is there any multi-thread capable solving software? It has to be written to do solving that way. The software I know is all single-thread, so multicore is apparently of no use to solving itself. Unless you want to work with other things while the solving is going on, in which case you may be able to arrange things so that one core does the solving and the others do those other things, with only a slight performance impact. (Problem solving doesn't obviously split into entirely parallel threads.)

But start from the software you are planning to use, and check with the maker to ensure you get the best fit.

(Added: it strikes me ... Intel is changing over to the Core i7/i5/i3 lines, and will probably drop the older Core 2/Core 2 Duo and older CPUs entirely. This may mean that systems with those CPUs will become cheaper -- I just saw a notice that in Sweden Fujitsu hoped to introduce new systems this month, and to have dropped the older systems by midsummer. I would guess something similar may happen in other countries as well -- this may be an opportunity to upgrade to a past-generation system cheaply as they sell out the older stock.)

(Another addition: The UK magazine PC Pro, in its March issue, has a reasonably thorough analysis of Intel and AMD CPUs. This might be of interest to anyone looking for attempts to compare processor performance.)
(3) Posted by Dmitri Turevski [Monday, Jan 25, 2010 17:58]

Hi Anders,

You wrote:
 QUOTE 
Unless you want to work with other things while the solving is going on


Solving another problem is one more thing you may want to do meanwhile. A multicore system can come in very handy when you are preparing an update for the problems database (or testing tourney entries, for example).

Effectively solving a single problem on a multicore box requires software support, true (though you can still gain some performance boost now by doing it inefficiently). The write access to the hash tables has to be properly synchronized (at which modern playing engines have succeeded, haven't they?), but other than that, what difficulties am I missing?
(4) Posted by [Tuesday, Jan 26, 2010 21:31]

Dmitri Turevski wrote:

 QUOTE 
Solving another problem is quite the other thing you may want to do meanwhile. Multicore system can come in very handy when you are preparing an update for the problems database (or testing the tourney entries for example).


I was thinking of 32-bit systems when I wrote that -- I typically allocate 1.5 GB for hash tables, and if I do much else besides solving, like browsing, editing large documents, or working with a database on the same system, there will be lots of paging, memory being written to the page file, which makes the system even slower. (On 64-bit systems with lots of memory, or on systems with the page file on a fast solid-state drive, this is probably less of an issue.) It works, but then most things work, even if slowly. And sometimes that's the way you have to work. Any improvement in this area is more likely to come from a faster architecture, more memory, larger memory caches, and a fast page/swap area than from many cores, I think.

 QUOTE 
Effectively solving a single problem on a multicore box requires software support, true (though you can still gain some performance boost now by doing it inefficiently).


Assuming it can be done. I can't, for example, run four copies of Problemist or Popeye in parallel without a lot of work: I would need to split the file of problems I want to solve into four separate parts, then start a separate program instance on each file, and at the end perhaps merge the results into one file again for my later work. Doing this manually seems fairly likely to eat up any speed-up gained from the multi-core architecture. For best use, the solving program would need to split the work into three or four parts, each of which solves a separate problem, and then put things back together in one file again. Even so, I would still like to benchmark solving a set of problems with 1 core, 2 cores, 3 cores, etc. to be sure where the real improvement is. If too much code runs in parallel, performance drops -- a 4-core CPU may run with a speed corresponding to 3 CPUs. In such cases, it may be better to have four separate computers instead.
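
The splitting step itself is only index arithmetic. A minimal sketch (the actual Problemist/Popeye invocations and file names are not shown; this just computes which problems each of k solver copies would get):

```c
/* Divide n problems among k solver instances as evenly as possible.
 * Instance c owns the index range [chunk_start(n,k,c), chunk_start(n,k,c+1)).
 * The first n % k instances get one extra problem each. */
#include <assert.h>

static int chunk_start(int n, int k, int c)
{
    /* base share c*(n/k), plus one extra slot for each earlier "big" chunk */
    return c * (n / k) + (c < n % k ? c : n % k);
}

static int chunk_size(int n, int k, int c)
{
    return chunk_start(n, k, c + 1) - chunk_start(n, k, c);
}
```

With 10 problems and 4 instances this yields chunks of 3, 3, 2 and 2 problems; a wrapper script would then feed each range to its own solver process and concatenate the four output files afterwards.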

 QUOTE 
The write access to the hash tables has to be properly synchronized (at which modern playing engines have succeeded, haven't they?), ...


Shouldn't be a problem.

The main difficulty I see is the problem of how to split a single-thread solving program into several cooperating threads.

One approach could be to give each thread its own copy of the main solving engine, running entirely separately from the others. This could work, but it needs a lot of memory for each thread to have its own hash table. If that memory is not available, performance drops. This approach probably means a 64-bit CPU and at least a cache-full of memory per thread. This is probably the easiest to do.

Another approach could be to have one thread do the main solving, another thread do hash-table administration, and a third thread do file reading/writing or user-interface work. This would require that the hash thread keep up with the main thread, but if it did, hash table management would essentially be free. It doesn't seem entirely practical, as a good hash table should be very fast to start with.

Yet another would be to hand each thread its own possible variation to examine. In this case, it's fairly likely that one thread would reach positions that have already been evaluated by another thread, so keeping a shared hash table could make sense. At the same time, it also seems likely that one thread reaches a position that another thread hasn't finished evaluating, and so starts redoing the same job. Whether this would be a problem, I don't know. It seems likely that there would be more such double work in a multi-thread program, but whether it would be enough to make solving slower, I have no idea. Try and see is probably the only way.
(5) Posted by Dmitri Turevski [Wednesday, Jan 27, 2010 11:05]

When I had to solve large quantities of problems (tens of thousands), this approach of splitting and merging did pay off. I booted Linux with no GUI and few services and ran two copies of Popeye with 256 MB of memory each on an Intel Core 2 Duo (overnight or over a weekend). I didn't do proper benchmarking, but it *felt* just about two times faster. However, I didn't attempt to solve problems that were likely to require more than 256 MB of memory.

 QUOTE 
it also seems likely that one thread reaches a position that another thread hasn't finished evaluating, and so starts redoing the same job.


This is a very good point that I missed. On the other hand, since the search tree is unbalanced (some continuations take longer to evaluate than others), there needs to be a load-balancing mechanism anyway, to avoid processes idling. I found a paper about parallel traversal of unbalanced trees:

http://www.springerlink.com/index/f0vw5hp227181vhk.pdf

I didn't study it carefully, but an interesting conclusion there is that, depending on the tree structure, the overhead inflicted by the load balancing itself can be quite significant compared to the actual computation. For chess problem solving, it may mean that a performance gain from using multiple processors can only be observed on a certain class of problems (e.g. with lots of men).
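
The load-balancing idea can be sketched with a shared work queue: threads pull subtrees from one mutex-protected queue, so a thread that draws a small subtree simply takes the next one. The tree below is synthetic (a node of value v spawns children v-1 and v-2); a real solver would push unexplored variations instead, and the per-pop locking is exactly the overhead the paper warns about:

```c
/* Toy parallel traversal of an unbalanced tree via a shared work queue. */
#include <pthread.h>

#define MAX_QUEUE (1 << 20)

static int queue[MAX_QUEUE];
static int top;                 /* number of pending subtrees */
static long processed;          /* nodes handled so far */
static int active;              /* threads currently expanding a node */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Sequential reference count of the same synthetic tree. */
long serial_count(int v)
{
    if (v < 0) return 0;
    return 1 + serial_count(v - 1) + serial_count(v - 2);
}

static void *worker(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        if (top == 0) {
            int done = (active == 0);   /* no work and nobody producing more */
            pthread_mutex_unlock(&lock);
            if (done) return NULL;
            continue;                   /* busy-wait; fine for a sketch */
        }
        int v = queue[--top];
        active++;
        processed++;
        pthread_mutex_unlock(&lock);

        pthread_mutex_lock(&lock);      /* push this node's children */
        if (v - 1 >= 0) queue[top++] = v - 1;
        if (v - 2 >= 0) queue[top++] = v - 2;
        active--;
        pthread_mutex_unlock(&lock);
    }
}

long parallel_count(int root, int nthreads)
{
    pthread_t tid[16];
    if (nthreads > 16) nthreads = 16;
    top = 0; processed = 0; active = 0;
    queue[top++] = root;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return processed;
}
```

Both traversals visit every node exactly once, so parallel_count(v, n) equals serial_count(v) for any thread count; whether the parallel version is actually faster depends on how expensive "processing" a node is relative to the locking.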

 QUOTE 
Try and see is probably the only way.


Exactly.
(6) Posted by Johan Beije [Friday, Jan 29, 2010 20:42]

Interesting. Maybe an easier question: which benchmark should I look at? On several websites I see various tests, and yes, I understand that for some software multi-core is faster. But with what benchmark can I compare Alybadix, Popeye or Gustav?
(7) Posted by Thomas Maeder [Saturday, Jan 30, 2010 11:59]

 QUOTE 
Effectively solving a single problem on a multicore box requires software support, true (though you can still gain some performance boost now by doing it inefficiently). The write access to the hash tables has to be properly synchronized (at which modern playing engines have succeeded, haven't they?), but other than that, what difficulties am I missing?


Any suggestions as to properly (and effectively!) synchronizing writes to the hash table in Popeye are very welcome. :-)
(8) Posted by Thomas Maeder [Saturday, Jan 30, 2010 12:12]

 QUOTE 
The main difficulty I see is the problem of how to split a single-thread solving program into several cooperating threads.


Doing this seems straightforward in "intelligent mode", where the potential end positions are determined first and then attempts are made to reach them; e.g. the square where the king is (stale)mated gives a good criterion for partitioning the solving space (allowing each partition to use its own hash table).

If "intelligent mode" doesn't apply, things are more complex, as you go on to write.


Another issue to consider is that compilers tend to do far fewer optimizations if multi-threading is enabled. This may mean that one is better off spawning multiple processes of a single-threaded executable rather than multiple threads. FWIW, this is how I currently run the Popeye test cases: a bash script starts a process for each test file, taking care that there are never more than 3 simultaneous tasks (I run the tests on a machine with 4 cores).
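
The "never more than 3 simultaneous tasks" pattern can also be written directly with fork()/waitpid() instead of a bash script. A minimal sketch; the children below are stand-ins that exit immediately, where a real runner would execv() the solver on one test file (solver path and file names would be your own):

```c
/* Bounded process pool: run ntasks jobs, at most max_parallel at once.
 * Returns the number of jobs that exited with status 0. */
#include <sys/wait.h>
#include <unistd.h>

int run_pool(int ntasks, int max_parallel)
{
    int running = 0, ok = 0, status;
    for (int t = 0; t < ntasks; t++) {
        if (running == max_parallel) {        /* pool full: wait for a slot */
            if (waitpid(-1, &status, 0) > 0) {
                running--;
                if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
            }
        }
        pid_t pid = fork();
        if (pid == 0) {
            /* child: stand-in for execv("/path/to/solver", argv) */
            _exit(0);
        }
        if (pid > 0) running++;
    }
    while (running > 0 && waitpid(-1, &status, 0) > 0) {  /* drain the pool */
        running--;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) ok++;
    }
    return ok;
}
```

Because the parent only blocks when the pool is full, a slow test file never holds up the other slots, which is the same behaviour the bash script gets from polling the job count.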
(9) Posted by [Sunday, Jan 31, 2010 09:39]; edited by [10-01-31]

Thomas Maeder wrote:

>Any suggestions as to properly (and effectively!) synchronizing writes to the hash table in Popeye are very welcome. :-)

Depends a bit on how you want the exclusion to work, I suppose. I'm not sure which portability platform Popeye is written to, but if it provides some kind of semaphore, the basic functionality would be there. If you also want some assurance that the first thread that was prevented from entering the critical section is also the first to enter it when it is unlocked, you'll have to add some waiting-list scaffolding around the basic semaphore. Or perhaps there are mutex services available already that do this, or the thread scheduling ensures it.

If you're going to do multithreading, the thread API will almost certainly have these already. For pthreads, see the routines pthread_mutex_{init,destroy,trylock,unlock}, for example.

Or, at least this is how I interpreted the problem of 'synchronizing writes'.
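
As a minimal sketch of that pthread_mutex approach: several threads store entries into one shared hash table, taking a single global mutex around each write. This is not Popeye's actual table layout, just a toy linear-probing table; real engines often use finer-grained or lock-free schemes, but one global lock is the simplest correct starting point:

```c
/* Shared hash table with mutex-protected writes. */
#include <pthread.h>
#include <string.h>

#define TABLE_SIZE 4096

static struct { unsigned long key; int value; int used; } table[TABLE_SIZE];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Store key/value with linear probing; the mutex makes the
 * find-slot-then-write sequence atomic with respect to other threads. */
static void hash_store(unsigned long key, int value)
{
    pthread_mutex_lock(&table_lock);
    unsigned long i = key % TABLE_SIZE;
    while (table[i].used && table[i].key != key)
        i = (i + 1) % TABLE_SIZE;
    table[i].key = key;
    table[i].value = value;
    table[i].used = 1;
    pthread_mutex_unlock(&table_lock);
}

static void *filler(void *arg)
{
    unsigned long base = (unsigned long)(long)arg;
    for (unsigned long k = 0; k < 500; k++)
        hash_store(base * 1000 + k, (int)k);  /* disjoint keys per thread */
    return NULL;
}

int fill_with_threads(int nthreads)
{
    pthread_t tid[8];
    int n = 0;
    memset(table, 0, sizeof table);
    if (nthreads > 8) nthreads = 8;
    for (long i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, filler, (void *)i);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < TABLE_SIZE; i++)     /* count stored entries */
        n += table[i].used;
    return n;
}
```

Without the lock, two threads can probe to the same free slot and one entry silently overwrites the other; with it, every write survives, at the cost of serializing all table access, which is why the fairness and granularity questions above matter for real performance.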
(10) Posted by Olaf Jenkner [Monday, Feb 1, 2010 22:25]

Gustav only uses one core, so I usually run Gustav twice on my dual-core laptop. Hash tables should not be too big, to avoid swapping, which slows a program down terribly. With 2 GB RAM I use two times 666 MB; the rest is eaten by Vista. The computer is strong enough to let me write or surf the internet while testing two chess problems. Videos on YouTube require more computing power, so then I interrupt one calculation while watching the video.

Running the laptop day and night, I cooked more than 100 selfmates from the PDB within half a year:
http://www.softdecc.com/pdb/search.pdb?expression=usercommentuser=%27Jenkner%27and%20cooked

Olaf
(11) Posted by Johan Beije [Wednesday, May 19, 2010 07:17]; edited by Johan Beije [10-05-19]

After some time and tips from this website, I bought a new computer. What did I buy? An AMD X2 550 (3.1 GHz, 6 MB L3) and 4 GB DDR (Intel is much more expensive): a low-budget system only meant for some internet, office and solving/testing. As is known, this AMD dual-core can be unlocked to a quad-core; my X2 is indeed unlocked to an X4 and is stable, only the temperature is too high, so I have to buy a CPU cooler to replace the stock cooler which is running now. The use of more cores has no effect on solving, as expected, because the software (in this case I only tested Alybadix 2005.00) uses only one core.
After the start of this topic, I visited several PCs at my work. I used the H#4 Adrian (http://web.telecom.cz/vaclav.kotesovec/) as an example, using Alybadix on Windows XP Pro service pack 3 and 466 MB internal memory:
AMD XP 1700 (1.46 GHz) = 13:57 min (my old computer)
Intel dual-core HT (3.4 GHz) = 8:52 min (HT has no effect on solving)
Intel E6300 C2D (1.86 GHz) = 7:54 min
Intel E6550 C2D (2.33 GHz) = 5:41 min
AMD Phenom X2 550 (3.1 GHz) = 4:10 min (note: with 1285 MB the time is 3:10 min; in DOS with 1982 MB, 2:53 min)
(12) Posted by Geoff Foster [Friday, Oct 15, 2010 02:40]

For multi-core computers it is possible to confine a single-threaded program to just one core. When I run Popeye I go to Task Manager, click the "Processes" tab, right-click "pywin.exe", then click "Set Affinity". I then un-check all CPUs except the final one. This seems to improve performance.
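
The same pinning can be done programmatically. On Linux the rough equivalent of that Task Manager step is sched_setaffinity (on Windows, SetProcessAffinityMask does the corresponding job); a minimal sketch:

```c
/* Pin the calling process to a single CPU (Linux-specific API). */
#define _GNU_SOURCE
#include <sched.h>

/* Returns 0 on success, -1 on failure (see errno). */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);                      /* allow only this one CPU */
    return sched_setaffinity(0, sizeof set, &set);  /* 0 = this process */
}
```

Pinning helps mainly by keeping the process's working set in one core's cache instead of letting the scheduler bounce it between cores, which is presumably the improvement observed with the Task Manager method.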
