At Puget Systems, we HAVE seen the issue, but our experience has been much more muted in terms of timeline and failure rate. In order to answer why, I have to give a little bit of history.
Going all the way back to 2017, with the Intel 8700K processor, we published an article titled “Why Do Hardware Reviewers Get Different Benchmark Results?”, which helped call attention to the fact that motherboards were shipping with “Multicore Enhancement” enabled, a setting that made the CPU “All Core Turbo” frequency equal to the “Single Core Turbo” frequency. This was essentially overclocking the CPU by pushing it past official Intel specifications, and it had negative effects on stability and temperatures. At Puget Systems, we have always valued stability first, and we actively made the choice to follow Intel specifications. Behind the scenes, this meant encouraging Intel to make those specifications public on Intel ARK and pushing motherboard ODMs to follow Intel guidance as their default settings. JayzTwoCents helped drive public awareness of the issue, and for a short time it appeared that things were back on track.
Since that time, our stance at Puget Systems has been to mistrust the default settings on any motherboard. Instead, we commit internally to testing and applying BIOS settings, especially power settings, according to our own best practices, with an emphasis on following Intel and AMD guidelines. With Intel Core CPUs in particular, we pay close attention to voltage levels and to how long those levels are sustained. This has been especially challenging when those guidelines are difficult to find and when motherboard makers brand features with their own unique naming.
Nevertheless, we kept that approach with confidence because of the extensive real-world testing we do here. We’ve even developed our own suite of PugetBench Benchmarks, whose goal is to test real-world scenarios, guided by years of experience and learning through our customers and partners. Our approach has always led us to be conservative with our power settings, especially since we have shown the real-world performance impact to be small, in the 1-2% range.
Puget Systems Intel Core Failure Rates
So, with that understanding of WHY we may be seeing things differently than others in the industry, what ARE we seeing here at Puget Systems?
Even though failure rates (as a percentage) are the most consequential, I think showing the absolute number of failures illustrates our experience best. I decided to go back all the way to the launch of Intel Core 10th Gen to give some historical perspective. Starting with 10th Gen, we have only sold the top 2 SKUs (XX700K and XX900K) in volume, which gives us a nice clean set of data.
Looking at that chart, you’ll notice a few things. First, your attention undoubtedly is drawn to the recent spike of failures with Intel Core 14th Gen. Second, you can see that Intel Core 11th Gen CPUs had a failure rate at nearly the same level, even though, as far as I can recall, it didn’t get as much press at the time. Third, I’ll draw your attention to a steady and elevated failure rate on 13th Gen processors.
I can also plot this same data, but instead of coloring it by CPU generation, I’ll color it based on whether we caught the issue on our production floor (shop failure) or the issue made it out to the customer (field failure). Obviously, a field failure is a dramatically more severe problem because it directly affects our customers’ experience.
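For readers who want to reproduce this kind of view with their own data, here is a minimal Python sketch of the regrouping step: the same failure records are counted once per CPU generation and once per shop/field category. The records, the month labels, and the `count_by` helper are all hypothetical and exist only to illustrate the idea; they are not Puget Systems data or tooling.

```python
from collections import Counter

# Hypothetical failure records: (month, cpu_generation, where_caught).
# These values are made up for illustration and are NOT real failure data.
records = [
    ("2024-01", "13th Gen", "shop"),
    ("2024-01", "14th Gen", "field"),
    ("2024-02", "14th Gen", "field"),
    ("2024-02", "14th Gen", "shop"),
]


def count_by(rows, key_index):
    """Count failures per (month, key), where key is the chosen column of each row."""
    return Counter((row[0], row[key_index]) for row in rows)


# Same data, two colorings: by CPU generation and by shop vs. field.
by_generation = count_by(records, 1)
by_location = count_by(records, 2)

for (month, location), count in sorted(by_location.items()):
    print(month, location, count)
```

Both groupings always add up to the same total number of failures; only the way each monthly bar is split changes.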
The most concerning part of all of this to us here at Puget Systems is the rise in the number of failures in the field, which has not been this high since 11th Gen. We’re seeing ALL of these failures happen after 6 months, which means we do expect elevated failure rates to continue for the foreseeable future and possibly even after Intel issues the microcode patch.
Based on this information, we are definitely experiencing more CPU failures than our historical average, especially with 14th Gen. We have enough data to know that we don’t have an acute problem on the horizon with 13th Gen; it is more of a slow burn. We do expect an elevated failure rate on 14th Gen while Intel finishes finding a root cause and issuing a microcode update. While the number of failures we are experiencing is definitely higher than our historical average, it is difficult to classify 5-7 failures a month in the field as a huge issue, and it is definitely a lower rate of failure than we are hearing about from others in the industry. The recent spike in 14th Gen failure rates stands out mostly because of how incredibly low historical CPU failure rates tend to be.
We believe that our commitment to internally developed power settings is why we have been much less impacted than others by these Intel stability issues. This is shaping our approach over the coming months.
Failure Rates in Context
Everything I’ve shown you so far is our raw number of failures, but what matters most is failure rate percentages. Let’s look at total failure rates in the context of multiple generations and with comparison to AMD Ryzen CPUs.
You can see that in context, the Intel Core 13th and 14th Gen processors do have an elevated failure rate but not at a show-stopper level. The bigger issue at hand is the future reliability of those CPUs, rather than the failure rates we are seeing today. If it is true that the 14th Gen CPUs will continue to have increasing failures over time, this could end up being a much bigger problem as time goes by and is something we will, of course, be keeping a close eye on. 14th Gen isn’t as rock solid as Intel’s 10th or 12th Gen processors, but at least for us, it isn’t yet at critical levels.
Based on the failure rate data we currently have, it is interesting to see that 14th Gen is still nowhere near the failure rates of the Intel Core 11th Gen processors back in 2021 and also substantially lower than AMD Ryzen 5000 (both in terms of shop and field failures) or Ryzen 7000 (in terms of shop failures, if not field). We aren’t including AMD here to try to deflect from the issues Intel is currently experiencing but rather to put into context why we have not yet adjusted our Intel vs. AMD strategy in our workstations.
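As a side note on why percentages matter more than raw counts, here is a minimal Python sketch of the arithmetic. The unit and failure counts are invented purely for illustration and are not our actual sales or failure data; the `GenerationStats` class and `failure_rate_pct` function are hypothetical names, not part of any tool we use.

```python
from dataclasses import dataclass


@dataclass
class GenerationStats:
    units_sold: int
    shop_failures: int
    field_failures: int


def failure_rate_pct(failures: int, units_sold: int) -> float:
    """Return failures as a percentage of units sold."""
    if units_sold <= 0:
        raise ValueError("units_sold must be positive")
    return 100.0 * failures / units_sold


# Hypothetical numbers for illustration only (NOT real data).
stats = {
    "Gen A": GenerationStats(units_sold=2000, shop_failures=10, field_failures=10),
    "Gen B": GenerationStats(units_sold=500, shop_failures=5, field_failures=5),
}

for name, s in stats.items():
    total = s.shop_failures + s.field_failures
    print(f"{name}: {total} failures, {failure_rate_pct(total, s.units_sold):.1f}% of units")
```

In this made-up example, "Gen A" has twice as many raw failures as "Gen B" but half the failure rate, which is why a chart of absolute failures and a chart of failure percentages can tell different stories.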