I think for Constructed they do a good enough job of playtesting. Occasionally they let a few cards slip through, but that's an acceptable margin of error in an expansion of 130+ cards.
They got the Pirate package wrong in MSG because Buccaneer was made very late in the expansion's development and they nerfed it at the last second to get it released. They soon realized their mistake on that one. Patches, on the other hand, was playtested numerous times and went through multiple iterations, so I guess they really wanted to finally push the Pirate tribe.
DrakOP and Jade Idol are the other two cards in MSG that are above the curve, especially based on stats. Jade Idol punishes control decks, but with Pirates around Jade Druid has not proliferated, and that's probably what Blizzard predicted. Blizzard has already admitted that they made DrakOP OP on purpose to push Priest up.
So that's 4ish cards above the power level, maybe 5 if you include Kazakus. This is not bad at all, and nowhere near the level of GvG, where you had like 10 cards that were insane (Dr. Boom, Shredder, Muster for Battle, Minibot, Mechwarper, Coghammer, Implosion, Crackle, Whirling Zap-o-matic, Unstable Portal).
Before MSG, their major expansion was Old Gods, where they got CotW, Yogg and Thing from Below wrong... MAYBE 477 and Fandral. Yogg is a card that not even the community got right for the first few months, so they get a pass for that for sure. Getting 2-3 cards overtuned in a big expansion is not bad at all.
For every card they didn't balance properly, they balance like 20 other cards correctly. It's easy to focus on the stuff they miss rather than the stuff they get right.
MSG is also the first time they have gotten Arena right IMO, after years of messing it up.