
NHL Off-Season 2014 |OT2| - Dan Cleary

Razorskin

Going to have to watch every game, yes. Basically I'll have a program to enter everything into a database as quickly as possible, then overlay that information with some official NHL data that I could scrape off their website.

Idea is to load that data post game then have the game available (from hockeystreams or whatever) and then jump to the points of interest highlighted by their data to make my analysis.



Current analytics is primitive. Yes, it gives a general idea, but a lot of the time the data may be inaccurate to the point of being unusable.

Good luck, hopefully in two years you'll get some recognition for all your hard work on this.

sounds like a waste of time

Could be, or could be a fruitful endeavor.
 

Red_Man

I Was There! Official L Receiver 2/12/2016
Current analytics is primitive.
No one has said otherwise. It has to begin somewhere, and completely dismissing it when it has proved time and time again that it has merit is complete ignorance. You sound exactly like Simmons.
 

DopeyFish

Not bitter, just unsweetened
sounds like a waste of time

It is and it isn't. I can get basic data running without much analysis. (like 20-30 minutes per game)

And I still have crowdsourcing as an option... I know some people would love to do it because why not? Just want to see how much I could do alone (or at least with limited people) so that the info is as accurate as possible... at least until I can figure out ways to determine entry confidence
 

Socreges

Banned
It likely will. Maybe not in direct details but it should highlight a lot of shortcomings of the team. Problem areas, weaknesses, strengths.

The stat set, in theory, is the most powerful stat suite for hockey. But there's guaranteed to be anomalies with this type of stat and it's always evolving.

So when you see stats for this year and next year you look at the stats again, they'll be marginally different.

But the idea is to bring a general understanding to what's going on during play instead of relying on shots/attempts for and against. The QoC and QoT stats bug me the most just because they are incredibly inaccurate. I have watched some games where players will take so many crappy shots from perimeter but corsi dictates they are gods despite none of those shots having a chance to go in... which means they are overvaluing the event, and massively screwing up D ratings. Defense ratings for hockey are abysmal at the moment. Corsi, QoC and +/-? Rough stuff.

In-zone attack and containment, and plays off the rush, all need to be treated differently. With enough data, I could effectively determine whether a team is playing lucky (prone to collapse) or winning despite underachieving.

I am still fleshing out the stats... at about 32 right now and it will probably grow to around 100 by the time I'm done (some will be designed for team and player).
Are you going to be evaluating the 'quality of shot'?
 

Red_Man

I Was There! Official L Receiver 2/12/2016
The funniest part of all these discussions is people discrediting stats like Corsi while stats like +/- are still recorded and used as fact regularly throughout the NHL.
 

DopeyFish

Not bitter, just unsweetened
How will DopeyStats account for the fact that much of its data has been generated by the subjective perceptions of a biased hockey fan?


Basically breaking down the offensive ice surface into zones and then bringing in a small set of variables: goalie state (in position, in transition, or out of position), type of shot, man in front of net, pass/one-timer, whatever. Determine the shooting% for every combination. Throw the shooting% back through a series of other calculations, determine missed shots with the shot quality, see the variance. Determine the average offense by adding up all shots at their shooting%; subtracting that figure from GF determines luck. The shot quality can then be rated across the league. Shot quality can then be determined across players, and then you can see what the defense is doing. You can map the figures like Corsi to get FAR better QoC and QoT figures as well as bring better context to the entire Corsi suite as a whole. And for goaltenders? You can do some fun stuff like seeing their sv% against the upper half of quality shots to get a more accurate view of their play.
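For anyone trying to follow the arithmetic in that post, here is a minimal sketch of the bucket idea, not the actual DopeyStats code. Everything named here is an assumption made up for illustration: the `Shot` fields, the zone and shot-type labels, and the sample data. It pools shots league-wide by their combination of variables, takes the shooting% of each combination, sums those percentages over a team's shots to get "average offense", and subtracts that from goals for to get "luck".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    # All field names and labels are hypothetical, chosen for this sketch.
    team: str
    zone: str
    shot_type: str
    goalie_state: str  # "in_position", "transition", or "out_of_position"
    screened: bool
    one_timer: bool
    goal: bool

    def bucket(self):
        """The combination of variables this shot is pooled under."""
        return (self.zone, self.shot_type, self.goalie_state,
                self.screened, self.one_timer)


def league_shooting_pct(shots):
    """Shooting% per bucket, pooled across every team in the data."""
    goals = defaultdict(int)
    attempts = defaultdict(int)
    for s in shots:
        attempts[s.bucket()] += 1
        goals[s.bucket()] += int(s.goal)
    return {b: goals[b] / attempts[b] for b in attempts}


def expected_goals(shots, rates, team):
    """Sum of league shooting% over the team's shots ("average offense")."""
    return sum(rates[s.bucket()] for s in shots if s.team == team)


def luck(shots, rates, team):
    """Goals for minus expected goals; positive means running hot."""
    goals_for = sum(int(s.goal) for s in shots if s.team == team)
    return goals_for - expected_goals(shots, rates, team)


# Tiny made-up sample: four identical slot shots and one screened point shot.
SAMPLE = [
    Shot("TOR", "slot", "wrist", "in_position", False, False, True),
    Shot("TOR", "slot", "wrist", "in_position", False, False, False),
    Shot("MTL", "slot", "wrist", "in_position", False, False, False),
    Shot("MTL", "slot", "wrist", "in_position", False, False, False),
    Shot("MTL", "point", "slap", "in_position", True, False, False),
]
RATES = league_shooting_pct(SAMPLE)

for team in ("TOR", "MTL"):
    print(team, luck(SAMPLE, RATES, team))
```

In this toy sample the slot bucket converts at 25% league-wide, so both teams are "expected" to have 0.5 goals; TOR scored one and comes out at +0.5, MTL scored none and comes out at -0.5. Note that the sketch takes each yes/no tag as given, so it says nothing about how reliably those tags were entered.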
 

Marvie_3

Banned
Are you going to be evaluating the 'quality of shot'?
He has a point there. A shot on goal and a shot that hits a defender 3 feet in front of the shooter should not be treated equally.

The funniest part of all these discussions is people discrediting stats like Corsi while stats like +/- are still recorded and used as fact regularly throughout the NHL.
Christianity is used as fact throughout north america. Doesn't mean it's right either.
 

Socreges

Banned
Basically breaking down the offensive ice surface into zones and then bringing in a small set of variables (goalie in position, transition, or out of positions) type of shot, man in front of net, pass/one timer, whatever. Determine the shooting% for everything. Throw the shooting% back through a series of other calculations, determine missed shots with the shot quality, see the variance. Determine the average offense by adding all shots together, subtracting that figure from GF determines luck. The shot quality then can be rated across league. Shot quality then can be determined across players, and then you can see what the defense is doing. You can map the figures like corsi to get FAR better QoC and QoT figures as well as bring better context to the entire corsi suite as a whole. And for goaltenders? You can do some fun stuff like seeing their sv% against the upper half of quality shots to determine a more accurate view of their play.
You haven't addressed the problem. All you've done is break down one subjective decision into several.

He has a point there. A shot on goal and a shot that hits a defender 3 feet in front of the shooter should not be treated equally.
I never said otherwise?

Dat Minnesota Nice™ apology doe
Minnesota sucks. Stop trying to make it look so gosh darn unique and wonderful.

A thousand apologies if that came across harshly!
 

DopeyFish

Not bitter, just unsweetened
You haven't addressed the problem. All you've done is break down one subjective decision into several

The idea is not to determine what my opinion is. It's to determine what is and what isn't happening.

So when I fill out the data for the shot, if the goalie is ready for a shot and not moving, the data is entered as such. Where the puck is is where I place it. If there's a person in front of the net (from the shooter's team), I will enter it as such. I am not determining the shot quality from opinion, but from the shooting percentage of those variables. Data won't be perfect (some shots will be difficult to determine) but it paints a far more accurate picture than simply shots for and against.
 
The QoC and QoT stats bug me the most just because they are incredibly inaccurate. I have watched some games where players will take so many crappy shots from perimeter but corsi dictates they are gods despite none of those shots having a chance to go in... which means they are overvaluing the event, and massively screwing up D ratings.

The point of Corsi is (an attempt) to illustrate possession. The player has the puck which means the other team does not. He may take a weak shot from the point or he may keep it etc.

Sure you need to go deeper than that but it's a start and that's how far that stat goes.
 

DopeyFish

Not bitter, just unsweetened
The point of Corsi is (an attempt) to illustrate possession. The player has the puck which means the other team does not. He may take a weak shot from the point or he may keep it etc.

Sure you need to go deeper than that but it's a start and that's how far that stat goes.

Food for thought, the act of shooting is a player losing possession. Just because you generate more shots doesn't mean you have higher possession numbers.

Corsi isn't a possession stat, it's a shooting differential stat.
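For reference, a minimal sketch of how Corsi is counted: shot attempts for (goals, saved shots, misses, and blocked shots) minus shot attempts against. The event format, the `corsi` function name, and the sample events are assumptions for illustration only. The thing to notice is that time with the puck is not an input anywhere.

```python
# Event kinds that count as a shot attempt for Corsi purposes.
SHOT_ATTEMPTS = {"goal", "save", "miss", "block"}


def corsi(events, team):
    """events: iterable of (shooting_team, kind) tuples (hypothetical format)."""
    attempts = [t for t, kind in events if kind in SHOT_ATTEMPTS]
    cf = sum(1 for t in attempts if t == team)   # attempts for
    ca = len(attempts) - cf                      # attempts against
    pct = cf / (cf + ca) if attempts else None   # CF% is undefined with no attempts
    return {"CF": cf, "CA": ca, "Corsi": cf - ca, "CF%": pct}


# Made-up sequence: three TOR attempts, one MTL goal, one non-shot event.
EVENTS = [
    ("TOR", "save"),
    ("TOR", "miss"),
    ("TOR", "block"),
    ("MTL", "goal"),
    ("MTL", "faceoff_win"),  # ignored: not a shot attempt
]

print(corsi(EVENTS, "TOR"))
```

With these five events TOR has CF 3, CA 1, Corsi +2 and CF% 0.75, even though MTL scored the only goal.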
 

Socreges

Banned
The idea is not to determine what my opinion is. It's to determine what is and what isn't happening.

So when I fill out the data for the shot, if the goalie is ready for a shot and not moving, the data is entered as such. Where the puck is is where I place it. If there's a person in front of the net (from the shooter's team), I will enter it as such. I am not determining the shot quality from opinion, but from the shooting percentage of those variables. Data won't be perfect (some shots will be difficult to determine) but it paints a far more accurate picture than simply shots for and against.
You're not a computer that can objectively evaluate reality, Dopey. You're a flawed human being with biases. In determining Yes/No for various events (eg, is there an offensive player in front of the net?) you will frequently encounter uncertain gray areas where a decision has to be made. If it's Gardiner taking the shot and it's JVR kind of in front of the net but not really, am I supposed to trust that your bias won't come into play? And that when you aggregate all of these decisions and data, that the picture you've painted is actually meaningful? You would say YES but I'm inclined to believe no.
 

Smiley90

Stop shitting on my team. Start shitting on my finger.
You're not a computer that can objectively evaluate reality, Dopey. You're a flawed human being with biases. In determining Yes/No for various events (eg, is there an offensive player in front of the net?) you will frequently encounter uncertain gray areas where a decision has to be made. If it's Gardiner taking the shot and it's JVR kind of in front of the net but not really, am I supposed to trust that your bias won't come into play? And that when you aggregate all of these decisions and data, that the picture you've painted is actually meaningful? You would say YES but I'm inclined to believe no.

he obviously is going to extrapolate the area the players covered between where the shot goes off and the net and then subtract the area of which the stick is for deflections, subtract an average area for where the stick COULD go in the time the puck takes to reach the player based on average reaction time of said player (based on... stats). He's going to do all this based on a 3d computing model that simulates what is happening on the ice based on a 2d recording.
 

TUSR

Banned
dopey, not to shit on your endeavour here, it's a great idea, don't get me wrong. it's just way too much for 1 person; i honestly think it's too much for 3 people per game to do.

you are going to be spending a lot of your downtime updating statistics until you lose your mind.
 

DopeyFish

Not bitter, just unsweetened
You're not a computer that can objectively evaluate reality, Dopey. You're a flawed human being with biases. In determining Yes/No for various events (eg, is there an offensive player in front of the net?) you will frequently encounter uncertain gray areas where a decision has to be made. If it's Gardiner taking the shot and it's JVR kind of in front of the net but not really, am I supposed to trust that your bias won't come into play? And that when you aggregate all of these decisions and data, that the picture you've painted is actually meaningful?

Why would i have any bias determining if a player is in front of the net?

It's basically just a factor to determine the quality of that shot. It's balanced across all other shots with those same variables. The idea is... the player would have to be screening the goalie. Not to the left. Not to the right. But actively trying to block the goaltender's line of sight.

The only issue i have run across is zone selection, but it should have negligible impact across the data set.

The data that I'm forming is not very subjective. It's yes or no questions. No grey area except zoning.
 

Socreges

Banned
Why would i have any bias determining if a player is in front of the net?

It's basically just a factor to determine the quality of that shot. It's balanced across all other shots with those same variables. The idea is... the player would have to be screening the goalie. Not to the left. Not to the right. But actively trying to block the goaltender's line of sight.

The only issue i have run across is zone selection, but it should have negligible impact across the data set.

The data that I'm forming is not very subjective. It's yes or no questions. No grey area except zoning.
I gave you an example of how you could be biased. ie, Gardiner quality of shot. You'll be the generator of some of the data and could, for example, make Leafs players look better than they actually are.

And of course there would be gray area. Many instances will be clear cut but many could also go either way and will require a subjective judgment call of Yes or No.
 

DopeyFish

Not bitter, just unsweetened
I gave you an example of how you could be biased. ie, Gardiner quality of shot. You'll be the generator of some of the data and could, for example, make Leafs players look better than they actually are.

And of course there would be gray area. Many instances will be clear cut but many could also go either way and will require a subjective judgment call of Yes or No.

It wouldn't make Leafs players look better unless they took that one specific shot and no one else did

Basically when i'm detailing the shot, it's being added to all shots of the exact same type from the same area for all teams.

So if that one shot had a 5% shooting percentage, it's a 5% shooting percentage for all. I don't rate on a team-by-team basis except with the league-wide shot quality, hence why the numbers change a year down the line: the data set is denser and more accurate.

So the team's actual sh% with those variables will be different from the league-wide data, but considering the number of variables used, using a single team's data just wouldn't be feasible even for comparison's sake.
 
Food for thought, the act of shooting is a player losing possession. Just because you generate more shots doesn't mean you have higher possession numbers.

Corsi isn't a possession stat, it's a shooting differential stat.

The whole point of possession is to generate scoring chances (shot attempts). They're not playing keep-away. You don't win if you don't generate scoring chances. Teams with good possession metrics generate more shot attempts because they have the puck more than their opponent. Even if they lose possession taking shots (and not always, since you can get your own rebound, etc.) they are better at getting it back.

You're right, by definition it's a shooting differential stat but it's used to illustrate puck possession. If you have a positive Corsi rating it means you are generating more shot attempts than giving them up. Which means you have the puck more often than your opponent. Teams that have the puck tend to win more than they lose (over a large sample size -- ie. an 82 game season)

http://www.extraskater.com/glossary
 

DopeyFish

Not bitter, just unsweetened
The whole point of possession is to generate scoring chances (shot attempts). They're not playing keep-away. You don't win if you don't generate scoring chances. Teams with good possession metrics generate more shot attempts because they have the puck more than their opponent. Even if they lose possession taking shots (and not always, since you can get your own rebound, etc.) they are better at getting it back.

It's technically a shooting differential stat but it's used to illustrate puck possession. If you have a positive Corsi rating it means you are generating more shot attempts than giving them up. Which means you have the puck more often than your opponent.

http://www.extraskater.com/glossary

Completely false. Again. Just because you are shooting more doesn't mean you have the puck more than your opponent. It also doesn't mean they are better at retrieving the puck either. It just means they are generating more shots. That's all. Even shooting more doesn't mean you're likely going to win, either. Shooting more than your opponent also doesn't mean you're more likely to score. Allowing more shots against doesn't mean you're playing worse defensively.

It's not demonstrating possession because it's a shot differential stat. If you want possession stats... why not look at possession stats?

Just because you can see that the two (possession and shot differential) agree more often than not doesn't mean it's a stat that accurately illustrates possession. All it illustrates is that there's a higher chance that the scenarios above are true but by definition it doesn't mean that it is.

Corsi and +/- are of the same ilk.
 

Socreges

Banned
It wouldn't make Leafs players look better unless they took that one specific shot and no one else did

Basically when i'm detailing the shot, it's being added to all shots of the exact same type from the same area for all teams.

So if that one shot had a 5% shooting percentage, it's a 5% shooting percentage for all. I don't rate on a team-by-team basis except with the league-wide shot quality, hence why the numbers change a year down the line: the data set is denser and more accurate.

So the team's actual sh% with those variables will be different from the league-wide data, but considering the number of variables used, using a single team's data just wouldn't be feasible even for comparison's sake.
You're still missing the point. I won't repeat myself again, though. No need. Let's just say I personally won't see any value in DopeyStats for the reasons I've mentioned. And so, I think it'll be a colossal waste of time for you.

On the bright side you'll get to watch a shit load of hockey. Hopefully you enjoy the process.
 

Quick

Banned
Spotify has a playlist for everything.

Just need to rename this playlist I'm listening to as "Songs To Simp To."
 

DopeyFish

Not bitter, just unsweetened
Why couldn't you have put this at the top so I knew to stop reading then and there


yes
yes
yes

huh

Designed for a purpose, doesn't work for purpose. Ends up being a stat which kinda says something but really doesn't mean shit.

You're still missing the point. I won't repeat myself again, though. No need. Let's just say I personally won't see any value in DopeyStats for the reasons I've mentioned. And so, I think it'll be a colossal waste of time for you.

On the bright side you'll get to watch a shit load of hockey. Hopefully you enjoy the process.

Despite the fact I already know it works and despite the fact that it's COMPLETELY IMPOSSIBLE TO RIG THE STATS...like holy shit, what part of league shared data set and both offensive and defensive stats derived from the same datasets(meaning there's a balance) do you not understand? I'm not going to be spitting out random stats... there's going to be game logs and breakdowns to the shots so you can sit there and whine about how accurate I keep being or something until you get bored and decide to troll someone else.
 

Smiley90

Stop shitting on my team. Start shitting on my finger.
Designed for a purpose, doesn't work for purpose. Ends up being a stat which kinda says something but really doesn't mean shit.



Despite the fact I already know it works and despite the fact that it's COMPLETELY IMPOSSIBLE TO RIG THE STATS...like holy shit, what part of league shared data set and both offensive and defensive stats derived from the same datasets(meaning there's a balance) do you not understand? I'm not going to be spitting out random stats... there's going to be game logs and breakdowns to the shots so you can sit there and whine about how accurate I keep being or something until you get bored and decide to troll someone else.

All he's saying is that things like "did he screen the goalie or not" are VERY subjective, calm your tits
 

Smiley90

Stop shitting on my team. Start shitting on my finger.
is there a guy standing directly in front of the goalie? Y/N

not difficult

is the goalie looking to the side of him? is the goalie looking over him? is he standing halfway covering the goalie? 3/4? Does the goalie's handedness matter? Is the goalie in butterfly? Is the goalie positioned wrongly because of the screen, or correctly in spite of it, or is it unrelated?
 

DopeyFish

Not bitter, just unsweetened
is the goalie looking to the side of him? is the goalie looking over him? is he standing halfway covering the goalie? 3/4? Does the goalie's handedness matter? Is the goalie in butterfly? Is the goalie positioned wrongly because of the screen, or correctly in spite of it, or is it unrelated?

you do realize that the following parts:

Zone (split into 5, 10 if not using mirrored), shot type, goalie in position (3 choices to limit "subjectivity"), goalie obstructed, one timer is effectively 120 (or 240) different possible shots?

the concept behind the obstruction was to have two sides. Goalie with a clear lane, Goalie without. you get into very minuscule details like that and you're either entirely screwing the data set (limiting usable data) or you're not understanding the purpose.
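The 120 (or 240) figure can be checked by enumerating the combinations. This is a sketch under stated assumptions: the post never lists the shot types, and two of them is simply what the arithmetic implies (5 zones x 2 shot types x 3 goalie states x 2 obstruction values x 2 one-timer values = 120). The placeholder labels and the `shot_buckets` function are made up for illustration.

```python
from itertools import product


def shot_buckets(mirrored=True):
    """Enumerate every combination of shot variables named in the post."""
    zones = range(1, 6) if mirrored else range(1, 11)   # 5 zones, or 10 unmirrored
    shot_types = ("shot_type_1", "shot_type_2")          # count of 2 is inferred, names are placeholders
    goalie_states = ("in_position", "transition", "out_of_position")
    obstructed = (False, True)                            # clear lane vs. not
    one_timer = (False, True)
    return list(product(zones, shot_types, goalie_states, obstructed, one_timer))


print(len(shot_buckets(mirrored=True)))
print(len(shot_buckets(mirrored=False)))
```

Every extra yes/no question (say, splitting a screen into partial and full) multiplies the bucket count again, which thins out the number of shots available per bucket.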
 

Smiley90

Stop shitting on my team. Start shitting on my finger.
you do realize that the following parts:

Zone (split into 5, 10 if not using mirrored), shot type, goalie in position (3 choices to limit "subjectivity"), goalie obstructed, one timer is effectively 120 (or 240) different possible shots?

the concept behind the obstruction was to have two sides. Goalie with a clear lane, Goalie without. you get into very minuscule details like that and you're either entirely screwing the data set (limiting usable data) or you're not understanding the purpose.

that's Soc's and my entire argument.

How are you going to determine if the goalie's lane was clear or not? And don't fucking say "if a player is obstructing his view" because that also depends on a lot of things.

But I'll do what Soc did and just leave you to yourself now too.
 

TUSR

Banned
You're like one of those types that never admit they're wrong.
You should take up knitting, infinitely better use of time.
 

DopeyFish

Not bitter, just unsweetened
Wow, It's like I'm arguing with a bunch of preschoolers.

IS THERE A FUCKING PLAYER DIRECTLY IN FRONT OF GOALIE?

It doesn't matter if he's fucking dancing

It doesn't matter if the goalie is moving his head around like an owl

It doesn't. Fucking. Matter.

The whole purpose of that variable is 1) to make sure that those plays don't inflate clean shots so we get a clearer picture 2) to demonstrate the value (if any) of putting someone in front of the net

It's the event that matters, not the minor details. That's why I am saying there is no subjectivity. Is there a guy in front of the goalie? Yes or no.
 

DopeyFish

Not bitter, just unsweetened
I give up.

 

Socreges

Banned
Personally I found Dopey to be more persuasive as he increasingly resorted to cursing, caps and name-calling, despite still being depressingly obtuse.
 