In your previous example, for instance, where you had polls showing Trump +1, Clinton +1, Clinton +5, your assumption was that the Trump +1 poll was correct, so we shouldn't average these results, but take that poll. But what's your argument for doing so? It's obvious if the others were clearly bad (maybe landline only or something), but that isn't very interesting.
There are many arguments we can make to judge polls. That was my old job - trying to figure out what LV screens work best. You're right that after a certain point the differences become very subtle, although usually surmountable, but even if they weren't...
It's more accurate to average them and get Clinton +2 than it is to just guess which of the three is correct.
...this is still wrong. Accurate has a specific definition. A sample is accurate if its expected outcome is equal to the population value. If one of the samples is more accurate than the others, then averaging it with the others produces a new, less accurate estimate. Suppose we don't have any reason to suppose that any of the samples is any more accurate than the others. This implies that we *also* don't have any reason to suppose that the average would be any more accurate than the original samples. So there's *still* no reason to average.
Instead, the best thing to do, in the absence of any other information, is just to provide all three separate results, with the different assumptions attached. That's it. You could provide the average too, but it doesn't give you any additional information, unlike the average of two samples with identical methodology. This is just the Popperian paradigm at work.
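To put rough numbers on the averaging argument, here is a minimal sketch. The three margins are the ones from the example; the per-poll biases are made up purely for illustration (nobody in this thread has measured them), and the function names are hypothetical. It only looks at expected values, i.e. accuracy in the sense defined above, and says nothing about sampling noise.

```python
from statistics import mean

# Margins in points, Clinton minus Trump: Trump +1, Clinton +1, Clinton +5.
polls = [-1.0, 1.0, 5.0]


def average_margin(margins):
    """Simple unweighted average of the poll margins."""
    return mean(margins)


def bias_of_average(biases):
    """Bias of the unweighted average, given each poll's bias.

    A poll's bias is its expected margin minus the true margin.
    By linearity of expectation, the average's bias is the mean bias.
    """
    return mean(biases)


avg = average_margin(polls)  # 5/3, which rounds to Clinton +2

# ASSUMPTION (illustrative only): the first poll is unbiased and the
# other two lean towards Clinton by 2 and 6 points respectively.
hypothetical_biases = [0.0, 2.0, 6.0]
avg_bias = bias_of_average(hypothetical_biases)

print(f"average margin: Clinton {avg:+.2f}")
print(f"bias of the average under the assumed biases: {avg_bias:+.2f}")
```

Under those assumed biases the average is off in expectation while the first poll is not, which is the point being made: if one poll really is the accurate one, mixing it with the others moves the expected result away from the truth.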
This is in fact my current frustration with Silver! I don't particularly know whether his model right now is correct or not, and won't until the election, if ever, but his reasoning for why he does what he does is really poor.
Again, with respect, I think perhaps you do not have sufficient statistical understanding to be able to assess whether his reasoning is poor or not.
And this isn't me being particularly favourable to Silver. I think he's terribly smug on Twitter and a very poor pundit. I'm just saying that his assumptions are reasonable. Wang thinks that elections are determined to a high degree by a relatively small number of important inputs. Silver thinks that elections are determined to a high degree by a relatively large number of dispersed inputs. Neither of these is better or worse practice; I've seen people argue quite persuasively for both. It's unlikely you could definitively determine which is better, because presidential elections aren't actually drawn from the same distribution. You're facing a constantly moving target: presidential elections might, for example, become progressively more dependent on a larger number of inputs over time. So both models can and should co-exist. I think most of the spats between Silver and Wang are just because they're competitors.
I think the sharpest criticism you could launch at Silver is something like: the more inputs a model has, the more uncertainty it has. As uncertainty increases, linear estimators tend towards the geometric mean (this has a long explanation, but essentially in a two-person race, it tends towards a 50/50 forecast). A 50/50 forecast implies a 'dead heat'. Dead heats generate the most attention, which translates into money for ESPN. So, Silver isn't creating an inaccurate model, but he might be overly keen to include inputs because the model produces more profitable punditry as a result. But... this is fine, if you just ignore his punditry, which we all should because he can't write anyway.
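A rough way to see the "more uncertainty pulls the forecast towards 50/50" part is the sketch below. It assumes the forecast margin is normally distributed around the polling lead, which is my simplification for illustration and not a description of Silver's or Wang's actual model; `win_probability`, `lead` and `sigma` are hypothetical names.

```python
import math


def win_probability(lead, sigma):
    """P(true margin > 0) when the margin is modelled as Normal(lead, sigma).

    ASSUMPTION: a normal error model, chosen only for illustration.
    """
    # Standard normal CDF evaluated at lead / sigma, written via erf.
    return 0.5 * (1.0 + math.erf(lead / (sigma * math.sqrt(2.0))))


# Same 2-point lead, increasing uncertainty about the final margin.
for sigma in (1.0, 3.0, 10.0, 30.0):
    print(f"sigma={sigma:>4}: win probability {win_probability(2.0, sigma):.3f}")
```

With the lead held fixed, the printed probability falls towards 0.5 as `sigma` grows, so a model carrying more uncertainty reports something closer to a dead heat from the same polls.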