<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
	<title type="html"><![CDATA[Approximatrix Forums — OpenMP and AMD Ryzen CPU]]></title>
	<link rel="self" href="https://forums.approximatrix.com/extern.php?action=feed&amp;tid=786&amp;type=atom" />
	<updated>2021-01-30T13:25:22Z</updated>
	<generator>PunBB</generator>
	<id>https://forums.approximatrix.com/viewtopic.php?id=786</id>
		<entry>
			<title type="html"><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link rel="alternate" href="https://forums.approximatrix.com/viewtopic.php?pid=3683#p3683" />
			<content type="html"><![CDATA[<p>I thought I would update this thread for those who may be interested.</p><p>I built a new system with the AMD Ryzen 9 3900X chip. When testing this system, I discovered that of the 24 possible threads, my code can only utilize 6 threads efficiently. Any more than 6 threads and the simulation slows down. I suspect that there are 4 pipelines to the memory and thus, memory fetches across those pipelines are inefficient (here speaking without knowing exactly the architecture and appropriate terminology; not an expert on this).</p><p>Bottom line, this chip was not what I had hoped for in terms of performance running my simulation; perhaps the Threadripper chips are what I really need. I am still able to use this system as I have been running 4 simultaneous simulations on it quite well. Six threads on the 3900X will do 100 iterations in 95 seconds, whereas 8 threads on my i9-9900K system will do 100 iterations in 82 seconds, so a little slower on comparable simulations. However, I can do 4 simultaneous simulations on the 3900X and only two on the i9. Shrug... </p><p>Rod</p>]]></content>
			<author>
				<name><![CDATA[grogley]]></name>
				<uri>https://forums.approximatrix.com/profile.php?id=3372</uri>
			</author>
			<updated>2021-01-30T13:25:22Z</updated>
			<id>https://forums.approximatrix.com/viewtopic.php?pid=3683#p3683</id>
		</entry>
		<entry>
			<title type="html"><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link rel="alternate" href="https://forums.approximatrix.com/viewtopic.php?pid=3637#p3637" />
			<content type="html"><![CDATA[<p>Thanks Jeff,</p><p>Yes, I have heard all the hoopla about the new AMD offerings and I lust after the Threadripper chips but they are not in my budget. I have looked around the web for information on these chips and am encouraged, but most reviews are aimed at gamers, whose performance needs are not necessarily the same as mine. (My son is in the game industry and he tries to explain why games require so much CPU horsepower but I am not sure I buy into that hype.) I look at CPU benchmarks (PassMark) for comparison purposes but I am not exactly sure their multi-threaded tests are comparable to how I will use the chip. I was hoping someone here in this forum would have objectively (non-fanboy) used these chips and had some data to share. </p><p>I am certainly not an expert on CPU architecture; all I know is what I empirically observe on the CPUs I have to use. In the example I used above, while using all 16 threads for a single simulation with my i9 chip is not efficient, I can split the CPU into two and run two simultaneous simulations, segregating the threads between the two sims using the affinity modes in Windows. That is not a bad compromise. I do this also in two other systems with two physical CPU chips (two NUMA nodes) and split sims across affinity and nodes. One of these systems runs 4 sims simultaneously. </p><p>Anyway, I will probably have to find other sources for more information. I am wary of the AMD forums as those tend to be biased. </p><p>Again, thanks,</p><p>Rod</p>]]></content>
			<author>
				<name><![CDATA[grogley]]></name>
				<uri>https://forums.approximatrix.com/profile.php?id=3372</uri>
			</author>
			<updated>2020-10-28T09:13:59Z</updated>
			<id>https://forums.approximatrix.com/viewtopic.php?pid=3637#p3637</id>
		</entry>
		<entry>
			<title type="html"><![CDATA[Re: OpenMP and AMD Ryzen CPU]]></title>
			<link rel="alternate" href="https://forums.approximatrix.com/viewtopic.php?pid=3635#p3635" />
			<content type="html"><![CDATA[<p>Rod,</p><p>While I don&#039;t currently have a Ryzen system here, I only hear good things about them.&nbsp; The FX series from AMD used a much older architecture that I can imagine had plenty of bottlenecks and drawbacks.&nbsp; However, Ryzen and Threadripper CPUs seem to be the darlings of the CPU world right now.&nbsp; I have seen that the Intel offerings are basically keeping pace with these AMD chips, though the AMD equivalents are substantially cheaper.</p><p>The Intel &quot;8 core and 16 threads&quot; stuff can be misleading.&nbsp; The extra hardware threads are only useful when your code is executing instructions where a portion of that core is busy for multiple clock cycles but another thread has some work that can be done in the meantime.&nbsp; With a lot of the OpenMP stuff where every thread is running very similar code, you might lose some of the benefits of the &quot;16 threads&quot; claim.&nbsp; That behavior won&#039;t necessarily change with a Ryzen either.&nbsp; </p><p>Someone must have published some OpenMP benchmarks for the current batches of CPUs.&nbsp; I would try seeking them out online.&nbsp; They&#039;d be drastically more useful than some of the more general benchmarks.</p>]]></content>
			<author>
				<name><![CDATA[jeff]]></name>
				<uri>https://forums.approximatrix.com/profile.php?id=2</uri>
			</author>
			<updated>2020-10-26T13:40:36Z</updated>
			<id>https://forums.approximatrix.com/viewtopic.php?pid=3635#p3635</id>
		</entry>
		<entry>
			<title type="html"><![CDATA[OpenMP and AMD Ryzen CPU]]></title>
			<link rel="alternate" href="https://forums.approximatrix.com/viewtopic.php?pid=3632#p3632" />
			<content type="html"><![CDATA[<p>I am trying to evaluate whether the current AMD Ryzen offerings (Ryzen 9 or Threadripper) have limitations in their memory/NUMA architecture (not sure what the terms for this are) which will cause performance degradation when extending the number of OpenMP threads out to the maximum. So what do I mean by this? </p><p>For example, my current workstation CPU is an Intel i9-9900K processor, with 8 cores and 16 threads. When I run my simulation with all 16 threads the performance is worse than if I use 8 threads. This suggests to me that there are memory access issues across the CPU chip that slow things down, but not if I use only half the threads. The same happens for a Xeon E5-2680 chip that has 10 cores and 20 threads. I have to segregate the CPU by setting the affinity for the simulation to use half the available number of threads.</p><p>So does anyone here have experience with the AMD Ryzen chips? Specifically, I am interested in the Ryzen 9 3900X chip, its use with OpenMP, and any threading limitations, but discussion of the other AMD chips might help. </p><p>I am leery of AMD because I have been burned by them in the past, when their 8-core CPU (I think the FX-8350) could not use OpenMP because of memory and core pipelines or something like that (LOL!). </p><p>Thanks in advance,<br />Rod</p>]]></content>
			<author>
				<name><![CDATA[grogley]]></name>
				<uri>https://forums.approximatrix.com/profile.php?id=3372</uri>
			</author>
			<updated>2020-10-20T13:02:18Z</updated>
			<id>https://forums.approximatrix.com/viewtopic.php?pid=3632#p3632</id>
		</entry>
</feed>
