
1i5t5.duncan at cox
Nov 30, 2012, 3:20 PM
piledriver/trinity cpu/apu hardware tester needed, bug 445053
Bug #445053 deals with the new USE=fma flag in sci-libs/fftw-3.3.3 (~amd64). This flag enables upstream's fma instruction set optimizations, new in that version, but the problem is that there are two different fma instruction sets, fma3 and fma4. The Wikipedia article explains the difference, history, etc., in some detail.

Bug URL: https://bugs.gentoo.org/show_bug.cgi?id=445053
fma on Wikipedia: http://en.wikipedia.org/wiki/FMA_instruction_set

So when I went to do my update, I saw the new USE flag. Having an AMD bdver1 (Bulldozer) with fma4, but seeing that the USE flag is for fma (no number appended), I was confused, started looking into things, and then filed that bug.

I've now actually tested USE=fma on my bdver1 (fma4) hardware with both the ebuild's "small" tests and a manually run "make bigtest" in all three subdirs (single/double/long-double) created as part of the build process. All tests passed, so it seems USE=fma works reliably on fma4 hardware.

What we do NOT yet know for sure is whether it works reliably on fma3 hardware, so we now need someone with fma3 hardware to check there as well.

According to the Wikipedia article, Intel will support fma3 in hardware to be released in 2013, so AFAIK there's no released Intel hardware with fma support at all yet. Still, anyone with a current (definitely this year) Intel cpu/apu is welcome to check /proc/cpuinfo and see, and run the tests if they have it.

The newest AMD hardware should already have fma support, however, but it could be fma3 or fma4 depending on the CPU. Bulldozer (-march=bdver1 in gcc) chips, released in late 2011, should have fma4 listed in /proc/cpuinfo, as I do here. That's what I tested with USE=fma here, with all tests I ran passing.

The new piledriver CPUs and trinity APUs, however (I believe -march=bdver2, but am not positive on that), are supposed to support fma3. I'd guess /proc/cpuinfo should report either fma3 or simply fma for them. That's what still needs to be tested.

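For the /proc/cpuinfo check mentioned above, here is a minimal sketch of a shell function that lists the fma-related flags a CPU reports. The function name fma_flags is made up for illustration. It assumes the usual "flags : ..." line format, and it accepts fma, fma3 and fma4 as possible flag names, since I have not confirmed which spelling fma3 hardware actually uses.

```shell
# Report which fma-related flags appear in a cpuinfo-style file.
# Usage: fma_flags [FILE]   (FILE defaults to /proc/cpuinfo)
# fma_flags is a hypothetical helper name, not part of portage or fftw.
fma_flags() {
    file="${1:-/proc/cpuinfo}"
    # Look at the first "flags" line only (all cores normally match),
    # split it on whitespace, and keep exact matches for fma, fma3, fma4.
    found=$(awk -F: '/^flags/ {
        n = split($2, f, " ")
        for (i = 1; i <= n; i++)
            if (f[i] ~ /^fma[34]?$/) printf "%s ", f[i]
        exit
    }' "$file")
    if [ -n "$found" ]; then
        # Strip the trailing space left by the awk loop.
        echo "FMA flags present: ${found% }"
    else
        echo "no FMA flags found"
    fi
}
```

Running fma_flags with no argument reads /proc/cpuinfo. Including the line it prints in the bug report would make it clear which fma variant the test results apply to.
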
So, anyone with that hardware, could you at least set USE=fma and run ebuild ... test on sci-libs/fftw-3.3.3, then report the results in the bug? Based on my results, the whole build and test (the ebuild runs make smalltest for all three subdirs) should only take perhaps five minutes or so. It was about three here, including the configure and build, tho my PORTAGE_TMPDIR is on tmpfs, so it might take a bit longer for those with it on a spinning hard drive.

Ideally, once the ebuild test passes, you'd also manually cd into the work dir, source the environment file to get the portage build environment, and run emake bigtest in all three subdirs. The ebuild uses a loop thru the subdirs to run smalltest; you can do the same for bigtest, or cd into each and run the tests manually. That will take rather longer: perhaps an hour or so for the single subdir, maybe two hours for the double subdir, and the same or longer for long-double. However, the tests don't make very efficient use of the CPU, so if you have a quad-core or better, which is likely with piledriver anyway, you could probably run the tests for all three subdirs in parallel and still have CPU left to run other things.

If it passes (e)make smalltest (in the ebuild test phase) and the manual (e)make bigtest, for all three subdirs, with USE=fma, on an fma3 hardware system, it should be safe to change the USE flag description to say it can be used for either fma3 or fma4 hardware. If not, then since it does seem to work on my fma4 hardware, perhaps the flag should be renamed to fma4.

So any help testing fma3 hardware would definitely be appreciated. Please report results on the bug. Anyone with fma4 hardware can double-check my results as well, but it does seem to work here.

Thanks. =:^)

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman