<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
	<channel>
		<title><![CDATA[Approximatrix Forums — OpenMP and AMD Ryzen CPU]]></title>
		<link>https://forums.approximatrix.com/viewtopic.php?id=786</link>
		<atom:link href="https://forums.approximatrix.com/extern.php?action=feed&amp;tid=786&amp;type=rss" rel="self" type="application/rss+xml" />
		<description><![CDATA[The most recent posts in OpenMP and AMD Ryzen CPU.]]></description>
		<lastBuildDate>Sat, 30 Jan 2021 13:25:22 +0000</lastBuildDate>
		<generator>PunBB</generator>
		<item>
			<title><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link>https://forums.approximatrix.com/viewtopic.php?pid=3683#p3683</link>
			<description><![CDATA[<p>I thought I would update this thread for those who may be interested.</p><p>I built a new system with the AMD Ryzen 9 3900X chip. When testing this system, I discovered that, of the 24 possible threads, my code can only utilize 6 threads efficiently. Any more than 6 threads and the simulation slows down. I suspect that there are 4 pipelines to the memory and thus memory fetches across those pipelines are inefficient (here I am speaking without knowing the exact architecture or the appropriate terminology; I am not an expert on this).</p><p>Bottom line, this chip was not what I had hoped for in terms of performance running my simulation; perhaps the Threadripper chips are what I really need. I am still able to use this system, as I have been running 4 simultaneous simulations on it quite well. Six threads on the 3900X do 100 iterations in 95 seconds, compared to 8 threads on my i9-9900K system, which do 100 iterations in 82 seconds, so it is a little slower on comparable simulations. However, I can do 4 simultaneous simulations on the 3900X and only two on the i9. Shrug... </p><p>Rod</p>]]></description>
			<author><![CDATA[null@example.com (grogley)]]></author>
			<pubDate>Sat, 30 Jan 2021 13:25:22 +0000</pubDate>
			<guid>https://forums.approximatrix.com/viewtopic.php?pid=3683#p3683</guid>
		</item>
		<item>
			<title><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link>https://forums.approximatrix.com/viewtopic.php?pid=3637#p3637</link>
			<description><![CDATA[<p>Thanks Jeff,</p><p>Yes, I have heard all the hoopla about the new AMD offerings and I lust after the Threadripper chips, but they are not in my budget. I have looked around the web for information on these chips and am encouraged, but most reviews are aimed at gamers, whose performance needs are not necessarily the same as mine. (My son is in the game industry and he tries to explain why games require so much CPU horsepower, but I am not sure I buy into that hype.) I look at CPU benchmarks (PassMark) for comparison purposes, but I am not exactly sure their multi-threaded tests are comparable to how I will use the chip. I was hoping someone here in this forum would have used these chips objectively (non-fanboy) and had some data to share. </p><p>I am certainly not an expert on CPU architecture; all I know is what I empirically observe on the CPUs I have to use. In the example I used above, while using all 16 threads for a single simulation with my i9 chip is not efficient, I can split the CPU in two and run two simultaneous simulations, segregating the threads between the two sims using the affinity modes in Windows. That is not a bad compromise. I also do this on two other systems with two physical CPU chips (two NUMA nodes) and split sims across affinity and nodes. One of these systems runs 4 sims simultaneously. </p><p>Anyway, I will probably have to find other sources for more information. I am wary of the AMD forums as those tend to be biased. </p><p>Again, thanks,</p><p>Rod</p>]]></description>
			<author><![CDATA[null@example.com (grogley)]]></author>
			<pubDate>Wed, 28 Oct 2020 09:13:59 +0000</pubDate>
			<guid>https://forums.approximatrix.com/viewtopic.php?pid=3637#p3637</guid>
		</item>
		<item>
			<title><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link>https://forums.approximatrix.com/viewtopic.php?pid=3635#p3635</link>
			<description><![CDATA[<p>Rod,</p><p>While I don&#039;t currently have a Ryzen system here, I only hear good things about them.&nbsp; The FX series from AMD used a much older architecture that I can imagine had plenty of bottlenecks and drawbacks.&nbsp; However, Ryzen and Threadripper CPUs seem to be the darlings of the CPU world right now.&nbsp; I have seen that the Intel offerings are basically keeping pace with these AMD chips, though the AMD equivalents are substantially cheaper.</p><p>The Intel &quot;8 core and 16 threads&quot; stuff can be misleading.&nbsp; A lot of the multiple thread operations are only useful when your code is executing instructions where a portion of that core is busy for multiple clock cycles but another thread has some work that can be done in the meantime.&nbsp; With a lot of the OpenMP stuff, where every thread is running very similar code, you might lose some of the benefits of the &quot;16 threads&quot; claim.&nbsp; That behavior won&#039;t necessarily change with a Ryzen either.&nbsp; </p><p>Someone must have published some OpenMP benchmarks for the current batches of CPUs.&nbsp; I would try seeking them out online.&nbsp; They&#039;d be drastically more useful than some of the more general benchmarks.</p>]]></description>
			<author><![CDATA[null@example.com (jeff)]]></author>
			<pubDate>Mon, 26 Oct 2020 13:40:36 +0000</pubDate>
			<guid>https://forums.approximatrix.com/viewtopic.php?pid=3635#p3635</guid>
		</item>
		<item>
			<title><![CDATA[OpenMP and AMD Ryzen CPU]]></title>
			<link>https://forums.approximatrix.com/viewtopic.php?pid=3632#p3632</link>
			<description><![CDATA[<p>I am trying to evaluate whether the current AMD Ryzen offerings (Ryzen 9 or Threadripper) have limitations in their memory/NUMA architecture (not sure what the terms for this are) which will cause performance degradation when extending the number of OpenMP threads out to the maximum. So what do I mean by this? </p><p>For example, my current workstation CPU is an Intel i9-9900K processor, with 8 cores and 16 threads. When I run my simulation with all 16 threads, the performance is worse than if I use 8 threads. This suggests to me that there are memory access issues across the CPU chip that slow things down, but not if I use only half the threads. The same happens with a Xeon E5-2680 chip that has 10 cores and 20 threads. I have to segregate the CPU by setting the affinity for the simulation to use half the available number of threads.</p><p>So does anyone here have experience with the AMD Ryzen chips? Specifically, I am interested in the Ryzen 9 3900X chip using OpenMP and its threading limitations, but discussion of the other AMD chips might help. </p><p>I am leery of AMD because I have been burned by them in the past, when their 8-core CPU (I think the FX-8350) could not use OpenMP because of memory and core pipelines or something like that (LOL!). </p><p>Thanks in advance,<br />Rod</p>]]></description>
			<author><![CDATA[null@example.com (grogley)]]></author>
			<pubDate>Tue, 20 Oct 2020 13:02:18 +0000</pubDate>
			<guid>https://forums.approximatrix.com/viewtopic.php?pid=3632#p3632</guid>
		</item>
	</channel>
</rss>
