If you're going to consider this possibility, you can't look at the top 10k. The top 100 would be considerably more relevant. Tier lists revolve around a character's theoretical limits at the highest level of play. They're not relevant in the Platinum range or even Diamond level.
Edit: Oh wait, I didn't read above and missed this post: 
http://www.neogaf.com/forum/showpost.php?p=232030742&postcount=8402
So, basically, we can't really derive much from these about actual character strength or competitive performance at tournaments right now.
		
 
The top 100 would probably be a better estimate, though there is much less data to work with. Here are the same results, but using the top 100 matchups, now with the margin of error at the 95% confidence level added:
	
	
	
		Code:
	
	
Rank  Char       Wins    Matches Win%    Margin of error (95%)
1     balrog     1700    2976    57.1    +/-    1.8
2     mbison     1634    2982    54.8    +/-    1.8
3     necalli    830     1529    54.3    +/-    2.5
4     zangief    931     1736    53.6    +/-    2.3
5     laura      1503    2863    52.5    +/-    1.8
6     ibuki      1681    3273    51.4    +/-    1.7
7     cammy      1157    2262    51.1    +/-    2.1
8     urien      1011    1988    50.9    +/-    2.2
9     nash       453     891     50.8    +/-    3.3
10    rashid     1328    2620    50.7    +/-    1.9
11    guile      735     1466    50.1    +/-    2.6
12    karin      876     1751    50.0    +/-    2.3
13    birdie     1002    2015    49.7    +/-    2.2
14    dhalsim    884     1830    48.3    +/-    2.3
15    akuma      1260    2617    48.1    +/-    1.9
16    chunli     321     691     46.5    +/-    3.7
17    ken        576     1255    45.9    +/-    2.8
18    rmika      531     1162    45.7    +/-    2.9
19    fang       546     1195    45.7    +/-    2.8
20    juri       336     737     45.6    +/-    3.6
21    ryu        739     1678    44.0    +/-    2.4
22    alex       501     1184    42.3    +/-    2.8
23    vega       667     1607    41.5    +/-    2.4
24    kolin      55      206     26.7    +/-    6.0
	 
 
For comparison, the margin of error was about 0.2-0.4 percentage points for the previous results. The new results mostly agree with the previous ones, though with some notable changes that cannot be explained by the uncertainty alone (e.g. Dhalsim).
However, on top of the increased uncertainty, I think that these results are much more likely to be skewed by individual performance. If only a few people play any given character, then their individual play-style and their individual performance will have a much bigger effect on the overall observed results.
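For anyone who wants to check the margins of error in the table, here is a minimal sketch. It assumes the usual normal-approximation (Wald) interval for a proportion with z = 1.96, which is my reading of "95% level of confidence"; the function name is just for illustration. It only captures sampling noise from the number of matches, not the player-specific skew described above.

```python
import math


def margin_of_error(wins, matches, z=1.96):
    """Half-width of the normal-approximation interval, in percentage points.

    Assumes each match is an independent trial with the same win probability,
    which is a simplification for ranked online play.
    """
    p = wins / matches
    return 100 * z * math.sqrt(p * (1 - p) / matches)


# Three rows taken from the table: (character, wins, matches)
rows = [("balrog", 1700, 2976), ("nash", 453, 891), ("kolin", 55, 206)]

for char, wins, matches in rows:
    win_pct = 100 * wins / matches
    moe = margin_of_error(wins, matches)
    print(f"{char:<8} {win_pct:5.1f} +/- {moe:.1f}")
```

The margin shrinks with the square root of the number of matches, which is why Kolin (206 matches) is so much less certain than Balrog (2976 matches).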
	
		
	
	
		
		
			I mean, numbers don't lie. These are the actual stats for SFV online. But again, we usually talk about high level play when discussing tier lists.
		
		
	 
The problem is, I think, that we base our estimation of (high-level) tiers on the results of relatively few players. Therefore their individual strengths and weaknesses can skew our perception of the strengths of a given character, and thereby their tier. When somebody is considered the "best" player of a character, then that influences how people perceive that character.
I'm sorta interested in approximating the "true" tier of these characters (though that is of course just a hypothetical), which is why I think we need more data to wash out individual performance. But of course, as you allude to, including more data also widens the range of skill levels considered.
Your point about match-ups being three-dimensional is well taken, and I am personally happy that I don't have to balance fighting games.