
Amazon releases new free game engine Lumberyard (based on CryEngine)

Not when they already have it. EA will have all of these things with a team that does ops 24/7, or at least pass that work onto a 3rd party. You'd still need these things if you were running on AWS. AWS just gives you hardware on tap; it doesn't manage it for you. They'd still need a team of engineers to manage and monitor it, exactly like what I'm currently doing as I'm typing this. They'd still need to create their own rules for elasticity, define their own firewall rules, templates and builds, and have it all monitored round the clock. Amazon doesn't do that for you. For EA, they'll have the hardware and the people. To pay a premium for that, when they already have it, is stupid.

So there are two things to address here: time and money.

1) Time. The amount of time it would take to set up the hardware and also build tools that even come close to touching services like Lambda, RDS (specifically auto-failover and HA, not just "we have a DB cluster"), containers, autoscaling of instances, variety in machine type with automation, reactionary tasks based on metrics monitoring, load balancing, smart & versatile traffic and security rules (application or user based), smart messaging and queueing services, etc., that all talk to each other natively would be *crazy* when those tools already exist and have a whole hell of a lot of proven history behind them.

2) Cost. These tools exist. This feature-rich, robust ecosystem exists. The hardware required to build something even remotely as capable would be so much more expensive than using something like GCE or AWS.

Not to mention the additional time and cost of maintaining the hardware, fixing the services you created when you make changes and they break, and the cost of staffing the hardware and software engineers that would have to build this ecosystem for your company. Speaking from experience, the hardware cost and manpower to build and maintain something like that is many, many times the cost of using GCE or AWS where these tools already exist and you don't have to worry about hardware malfunction or maintaining the ecosystem and tools yourself.

For the sake of simplicity, let's say you need to spin up infrastructure for an application (that already exists in a repo somewhere) for some testing.
Let's say your checklist is as follows:
  • 2 or 3 private subnets in different zones/locations for HA for the instances themselves
  • 2 or 3 publicly accessible subnets to use for routing to the outside world
  • NAT, via separate box or whatever, for outbound traffic routing in the above
  • a VPC w/ IPsec from any corporate networks
  • route tables for all of the above and any routing to DBs or external services
  • application and user based security rules
  • load balancer that can balance to the separate zones/locations (& traffic rules)
  • an instance group with autoscaling capability
  • the rules that define those scaling parameters
  • monitoring
  • reactionary rules that trigger the scaling
  • a database with slaves, including at least one in another zone for HA
  • automated DB backups that only go back X amount of days
  • an image of the completely configured server for use when scaling up

Obviously that list is missing bells and whistles that I'd consider 100% necessary from things like CDN capability, Terraform creation management, memcached type functionality, Lambda-style code consumption, a logging cluster, deploy pipeline, etc, but like I said... let's keep this simple for the sake of discussion.
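
To make a couple of those checklist items concrete, here's roughly what the VPC/subnet/security group pieces look like in boto3. This is just a sketch for illustration, not something to copy-paste: the region, CIDRs and names are all made up, and the IPsec tunnel, NAT and route tables are left out.

```python
import boto3

# Hypothetical region, CIDRs and names
ec2 = boto3.client("ec2", region_name="us-east-1")

# VPC for the test environment
vpc_id = ec2.create_vpc(CidrBlock="10.0.0.0/16")["Vpc"]["VpcId"]

# Two private subnets in different AZs for the instances themselves
for i, az in enumerate(["us-east-1a", "us-east-1b"]):
    ec2.create_subnet(VpcId=vpc_id,
                      CidrBlock="10.0.{}.0/24".format(i),
                      AvailabilityZone=az)

# Security group that only lets the application port in
sg_id = ec2.create_security_group(GroupName="test-app-sg",
                                  Description="app traffic only",
                                  VpcId=vpc_id)["GroupId"]
ec2.authorize_security_group_ingress(
    GroupId=sg_id,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}])
```

That's a few items off the list in a handful of API calls; the in-house equivalent starts with racking, cabling and VLAN config before you ever touch a config management tool.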

How long would that above list take to build with an in-house server farm & the mirrored build at 2 other locations?
How long would it take to build that using your own in-house hardware + something like Xen or VMware?
How long would it take to build all the tools that allow for in-house auto-scaling of VMs?
How long would it take to build the network infrastructure (or rules & shaping if infra is already in place) that would allow you to have HA to the mirrored application and hardware/VM servers?
How long would it take to set up a monitoring system/server that has the ability to trigger meaningful actions/reactions if metrics are in the red?
How many people would you need to create this infrastructure?
How many people to monitor it?
How many to deal with hardware failures?
How long would it take to address hardware failures?
How expensive would it be to create this infrastructure for this one application?

How long would it take you to build the aforementioned list in AWS to the point where it was live and useable for testing? Give me a few hours. Alone. Done. And I'm not even cocky enough to say I'm some amazing wizard at this job; I know people who are, though, and can do this shit twice as fast as I can.

Obviously this doesn't take application development into account, as I mentioned, but I just wanted to make sure no one thought I was claiming that was included in the time estimate.

How about a mirrored staging environment? Double the time for both processes and see which comes out ahead.

How about the resiliency comparison of these two options when using tools like Chaos Monkey, Chaos Gorilla, Chaos Kong or Bees w/ Machineguns? The fallout, failover and recovery, in my experience, hasn't even been worth comparing. Not even in the same ballpark. Obviously most of the functionality of the Chaos tools is geared toward AWS, but there are resiliency tests you can still run in a local environment to see how your infra holds up. Not that AWS is perfect or anything, but it's soooooo much easier to architect for this sort of HA than it is to coordinate it across multiple self-run physical clusters.
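
For anyone wondering what the chaos-style testing actually boils down to, the core idea fits in a few lines of boto3. This is not Netflix's actual Chaos Monkey, just a crude sketch of the concept with a made-up group name: pick a random instance out of an autoscaling group, kill it, and watch whether anything user-facing breaks while the group heals itself.

```python
import random
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")
ec2 = boto3.client("ec2", region_name="us-east-1")

# Hypothetical autoscaling group to abuse
group = autoscaling.describe_auto_scaling_groups(
    AutoScalingGroupNames=["game-frontend-asg"])["AutoScalingGroups"][0]

# Pick a random instance and terminate it
victim = random.choice(group["Instances"])["InstanceId"]
ec2.terminate_instances(InstanceIds=[victim])

# Desired capacity hasn't changed, so the group will replace the instance
# on its own; the test is whether users notice while that happens.
```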

Sure, you can absolutely use things like Terraform, Puppet, Chef, Docker, whatever internally. In fact, those are tools that I hope to christ places like EA with their own server infra are using. But those are tools which are limited by the same bottlenecks of your infrastructure and can only speed things up as much as the infrastructure allows them to. They'd be faster with beefier in-house hardware acting as a home to your VMs, but then we get back to cost.

Cost? The above infra in AWS would be laughable when compared to doing it in-house with purchased hardware and staff requirements.

Staffing the monitoring and maintenance will come nowhere NEAR the amount of people you'd need to maintain something like that in-house, especially if we're talking much, much larger apps that handle many millions of connections and users. The staff to manage that infra in-house would be and is ridiculous, both in cost and headcount, by comparison. Putting a team together to manage it somewhere like AWS? Negligible by comparison.

You mention the amount of time and people it takes to write/implement things like scaling rules and build templates, and you are 100% correct, but that shit is absurdly fast somewhere like AWS. Oh, I need to change the floor and ceiling values for scaling for a particular app? Make the value change, apply it to the ruleset, save the new rules to the group. Done. Less than a few minutes. That's hardly an important consideration when weighing in-house against something like AWS or GCE. Monitoring is the same thing: have people on call for the late-night hours, and the staffing still comes nowhere close to what doing this locally requires.
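
For a sense of scale on "less than a few minutes": the floor/ceiling change is literally one API call. A sketch, with a hypothetical group name and numbers:

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

# Raise the floor and ceiling for one app's group; that's the whole change
autoscaling.update_auto_scaling_group(
    AutoScalingGroupName="my-app-asg",
    MinSize=4,
    MaxSize=20,
    DesiredCapacity=6,
)
```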

Also, when you eliminate the need to create, configure and maintain your own hardware infra, you cut reaction times way down. Say you have people on call during off-hours anyway, so you're not actually saving salary costs during the late-night hours by using AWS. How close are those on-call engineers to the office? How long does it take them to get out of bed at 3am and get to work? When you have no hardware to care about, that doesn't matter. Log in to your AWS account and manage your infra in your underwear when you get that call at 3am. Log in to your company's VPN if you need to ssh into something and go under the hood to investigate something that metrics are pointing to but not identifying.

This became another rant about 7 paragraphs ago, so I apologize. As much as it may seem like I'm some super argumentative asshole, I'm really not. At least I hope I'm not? Haha. I'm just extremely passionate about this because it's what I do for a living and I see this massive hole in the gaming industry that no one seems ready or willing to fill even when comparable industries are starting to adopt it... albeit just as painfully slow. That's unfair... *few* people are willing to. I just... I've been tasked with making this switch (and evaluating costs of doing so) many, many, many times over the last few years for anything ranging from massive corporate infra with insane connections to small scale volunteer-based organizations. It's what I do. There quite literally hasn't been a single time where in-house hardware (hosting VMs or not) was cheaper *or* more efficient in the long run compared to the current and future state of alternatives, and mostly it's because of tools and not simply "we use VMs!" (which seems to be a very common misconception of a solution). Not in performance, not in cost, not in employee headcount, not in availability, not in scalability... they just don't compare.

Are there times when in-house hardware *DOES* trump something like AWS, Azure or GCE? Of course. Not everything makes sense to have offsite, but typically it tends to be much, much smaller scale stuff and depends on the needs of the service/application. Even then it can be tough to find in-house solutions that make more sense. The $2/mo you'd pay for hosting a fairly heavily trafficked static site on S3 (which is essentially (D)DoS proof) is going to be much cheaper than all the equipment and electrical cost you'd need to accomplish the same thing locally. But I digress... The larger the scale, the less sense it has made to avoid making the jump, in every situation I've had a hand in evaluating/architecting/implementing/being part of.
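
And the S3 static site example really is about this much work, for what it's worth. A rough sketch with a made-up bucket name (the public-read bucket policy and DNS are skipped for brevity):

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
bucket = "my-static-site-example"  # hypothetical bucket name

# Create the bucket and turn on static website hosting
s3.create_bucket(Bucket=bucket)
s3.put_bucket_website(
    Bucket=bucket,
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
    },
)

# Upload the site content
s3.upload_file("index.html", bucket, "index.html",
               ExtraArgs={"ContentType": "text/html"})
```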


They use it plain and simple for dedicated servers. Dedicated servers which spawn and are destroyed on demand, the exact same thing AWS does.

I'm too lazy to look back, but I think this was in response to things about Titanfall/Respawn. I'm not sure if this is meant to imply it's all they use it for, or if this is in addition to the other things I'd read up on about their use of Azure. If they did in fact do this, I'd be curious to hear about how it was utilized and what average and peak load metrics were for both instances and application given my constant issues connecting to multiplayer sessions during the first month of launch. Not that I'll ever see that information, but the engineer in me is curious as to why they still suffered those issues. I'm immediately left wondering if it was lack of software optimization with the coupled infra, or if things had a low scale ceiling or what.

Regardless, thanks for the additional info. You don't by any chance have links to them talking about autoscaling for game world instances and connection handling, do you? I'd like to shove my face in that. =P
 
You're asking several different questions rolled into one, so you'd need to be a lot more specific.
For instance, "AAA" companies have used scalable infrastructure for a good long time (e.g. Ubisoft has been using it in most of their games since at least 2009). Big companies value operating internally because one of the big parts of the added value from online is ownership of your user data, something that drives large parts of Amazon's business model, for that matter.
What you seem to be ignoring is that infrastructure as such is only a "relatively" small part of the problem, and writing arbitrarily scalable software is NOT a solved problem (if it were, GPU manufacturers would be owning the world right about now), even when you're using sane compute abstractions for your infrastructure. These things remain hard work, and they get harder as you add more complexity to your online compute. There's a good reason why embarrassingly parallel problems are the usual showcase for moving local->cloud compute à la Crackdown.

And there's other parts to this - lumping all manner of game-bugs together and blaming "online" for it is convenient, but ignores the fact that most of these games are fundamentally not well written due to the circumstances of their development, so the infrastructure is often the least of their problems (although it does occasionally compound them).

Bah, yeah. My initial post was suuuuper long winded, so I apologize. I do that a lot. I'll probably do it now. Excitement, frustration and coffee–especially combined–do that to me. Like today. ;[

Scalable ≠ autoscalable. Any infrastructure can be scalable. You can always add more hardware or VMs to offset load or connections, but that doesn't do anything without context. Or automation. There's no way to automatically scale hardware and most places seemingly don't have/use the internal tools in place to automatically scale VMs in a meaningful way, either.
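
To put "meaningful" into concrete terms: on AWS the automation half of that is just a scaling policy wired to a metric alarm. A hedged boto3 sketch (group name, threshold and adjustment are all hypothetical) that adds two instances whenever average CPU across the group sits above 70% for ten minutes:

```python
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Scaling policy: add two instances when triggered
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="game-frontend-asg",
    PolicyName="scale-out-on-cpu",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=2,
    Cooldown=300,
)

# Alarm: average CPU across the group above 70% for two 5-minute periods
# fires the policy above
cloudwatch.put_metric_alarm(
    AlarmName="game-frontend-cpu-high",
    MetricName="CPUUtilization",
    Namespace="AWS/EC2",
    Statistic="Average",
    Dimensions=[{"Name": "AutoScalingGroupName",
                 "Value": "game-frontend-asg"}],
    Period=300,
    EvaluationPeriods=2,
    Threshold=70.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```

Scale-in is just the mirror image: a negative adjustment tied to a low-CPU alarm.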

Again, I think this is precisely the gap that AWS is attempting to address w/ multiplayer gaming infrastructure (with GameLift).

I'm by no means lumping, or at least not attempting to lump, game bugs together and blame "online" as the cause of everything. What I'm saying is that a lot of the blame (from the industry) surrounding large launches of online-specific games tends to fall on hammered servers that can't handle the load, be it a bottleneck in a service, an application or the instance (hardware or VM) itself. We see literally days and sometimes weeks of session and connection issues before things smooth out. That shouldn't be a difficult problem to solve unless:
1) you're unwilling to take the rather large upfront migration steps to solve it
2) the problem lies within constraints of rushed and constrained development cycles where you see a ton of systemic siloing and the like
3) a combination of the above

Obviously a faster, more robust, automated and versatile systems infrastructure won't solve #2, but it sure as shit can solve #1.

My guess, and it's only a guess, is that it boils down to #3: a software dev cycle issue where things are rushed out the door to meet arbitrary release dates force-set by the money holders because they don't understand development (or don't care), plus the suffering that infrastructure has to endure due to the same lack of oversight and the increased negligence and/or micromanagement that comes with it.

I am in no way blaming developers or systems engineers (in most cases) when I rant about this shit, so much as the decision makers. I've been in situations where no matter how much logic is brought to the table, if a for-the-better long term investment costs some short term manpower diversion and upfront cost, the answer is no. I left that world for a lot of reasons.

This can be done in the gaming industry. This should be done in the gaming industry. If it's being done, it needs to be done better. I know that's extremely idealistic, but I wouldn't be making such a claim if I didn't have some context with which to make it. The one fact-filled context I'm lacking is why the game making industry seems so far behind in this adoption.

I'm curious as to why you think that arbitrarily scalable software is not an easy thing to build. Do you mean in the game industry specifically, or in general? Because it's not hard, especially when you already know the type of infrastructure to develop for/around. I also think I'm missing an obvious implication here... why would GPU hardware manufacturers be "owning the world" because of scalable software? Automatically scalable software is very much a thing. Unless we're talking about client-side, which I'm assuming was the intended specificity I was missing in the above sentence(s). Client side software scaling would, in fact, be a much more absurd beast to tackle. At least in the current state of things, especially in the gaming world. But I'm not talking about games that can scale across 16 GPUs and 4 home PCs Frankenstein'd together. I'm talking about the online server infrastructure that handles the world people connect to for multiplayer and co-op gameplay and the load that connections to that world brings with them. That's solvable. That's doable. It's being done all the time across myriad industries to mitigate the problem of surprise and planned peak volume and load.

Hey, you want to get lunch? Can everyone entertaining my coffee-fueled discussion on this topic just meet somewhere this weekend and have lunch, some wine and talk about this for like 6 hours? We all live in the same city, right?
 
Not to correct my original statement, but as an addendum, I am quite a lay person regarding this and I imagine your concerns are probably their concerns as well. The persistent universe is a long term love project, and I imagine they are considering a lot of finer details to make it work in a mostly bottleneck-free way.

This whole thread has made me want to reach out to any of my video game contacts (sadly none are systems engineers) and see if they know people in their organizations who would be willing to discuss this stuff with me if I sign NDAs. Haha. I JUST NEED TO KNOW.

...which is exactly why it won't happen. ;[
 

Durante

Member
Any discussion about managed versus unmanaged is going to have proponents for both sides, because there absolutely are valid pros and cons for each, and how pro a pro is or how big a con is is entirely dependent on the project, especially in production environments where you can have multiple different programmers of multiple different skill levels and multiple language familiarity all working on interlinking systems that often don't have the luxury of peer review or even commenting code.

I'm absolutely not saying Unity is better than Unreal (or that managed is always better than unmanaged), but I will say it serves a specific niche that UE doesn't (nor does Lumberyard, from all appearances), and there are absolutely valid reasons for choosing one over another.
I don't completely disagree with everything you said, but I strongly believe that for games -- what are essentially high-performance real-time applications -- automatic memory management is one of those design choices which can easily be deceiving at the start of a project. As in, as you start out prototyping it seems (and probably is) easier, saving you a few minutes (or even hours) here and there, but the larger and the closer to an actual game your software project becomes the more its drawbacks might hit you. It's a bit like dynamic typing in that regard. Of course, at the point where the drawbacks do manifest - e.g. when the intermittent Unity stutter happens -- it's generally far too late to do anything fundamental about it.
 
This whole thread has made me want to reach out to any of my video game contacts (sadly none are systems engineers) and see if they know people in their organizations who would be willing to discuss this stuff with me if I sign NDAs. Haha. I JUST NEED TO KNOW.

...which is exactly why it won't happen. ;[

Your posts made me do some research into Azure and the cloud for the last 2 hours at work.
Made an Azure free trial account yesterday to do some hacking; now I just need to find something to do with Azure storage and worker jobs. I'm a junior software developer so this is all really new to me.

Don't tell my boss :p
 

fred

Member
So has anyone done anything with this yet..? Haven't had a chance to look at it yet but considering the trees and vegetation, Diner thingy and the legacy stuff I would imagine that it's quite possible to develop something quite simple that looks pretty good.

Might have a look on YouTube to see if anyone has put stuff out there. :D
 
Your posts made me do some research into Azure and the cloud for the last 2 hours at work.
Made an Azure free trial account yesterday to do some hacking; now I just need to find something to do with Azure storage and worker jobs. I'm a junior software developer so this is all really new to me.

Don't tell my boss :p

I'm much less familiar w/ Azure and GCE than I am with AWS. AWS is part of my lifeblood at this point, but lately GCE has gotten much better with containerization than AWS, and GCE charges by the instance-minute instead of the instance-hour.

I will say from what experience I *do* have w/ Azure that it works much better w/ Windows-ready environments (which is why it's probably so peachy for games that are Windows and XB1 only) as opposed to, say, Linux. But it definitely depends on what you're doing. Obviously a native MS SQL sync is hopefully not something the gaming industry is very concerned with. Haha.

If you have any questions or just want to nerd out about this stuff, PM me and we can draw pictures of pizza together and stuff.
 

Lister

Banned
I don't completely disagree with everything you said, but I strongly believe that for games -- what are essentially high-performance real-time applications -- automatic memory management is one of those design choices which can easily be deceiving at the start of a project. As in, as you start out prototyping it seems (and probably is) easier, saving you a few minutes (or even hours) here and there, but the larger and the closer to an actual game your software project becomes the more its drawbacks might hit you. It's a bit like dynamic typing in that regard. Of course, at the point where the drawbacks do manifest - e.g. when the intermittent Unity stutter happens -- it's generally far too late to do anything fundamental about it.

There's a lot that can be done to optimize garbage collection in managed languages, though, and usually performance issues with GC are due to not doing things the way they should be done in order for the engine to remain performant.

I think this is mostly an inexperienced-programmer thing (at least in the case of the typical kinds of games that are attached to engines like Unity) rather than a can't-manage-the-memory-heap thing.

Or at least that's my suspicion.
 

fred

Member

Cheers for that, sounds like they're fully aware of how pants the CryEngine documentation and support is and are going to make a decent effort of improving them. Funnily enough the Unreal Engine a good few years ago was in a VERY similar state. If Amazon put plenty of effort and expense into this they could have the documentation and support on the same level as Unity and Unreal are at now in a year or two.
 
Cheers for that, sounds like they're fully aware of how pants the CryEngine documentation and support is and are going to make a decent effort of improving them. Funnily enough the Unreal Engine a good few years ago was in a VERY similar state. If Amazon put plenty of effort and expense into this they could have the documentation and support on the same level as Unity and Unreal are at now in a year or two.


I'm prolly gonna go even though our next project isn't going to use Lumberyard.
 
http://www.gdconf.com/gamenetwork/2017_jan.html#2

Q: What are some of the major changes to Lumberyard we can expect at 2017?

Our team has grown a lot over the last year, and we continue to grow. In 2017, you’ll see a steady stream of new features and refinements, including a new component entity system so you can build complex gameplay faster than ever, a new asset pipeline that lets you import and do live updates of game assets across target platforms in seconds, a new multi-threaded rendering architecture that takes full advantage of the latest technologies, an improved editor UX, new cloud integrations to help you dynamically change game data on the fly to better engage your players, and, of course, new integrations with Twitch to help you reach and engage that audience of 100+ million hardcore gamers.

They have several talks lined up for GDC: http://schedule.gdconf.com/search-sessions/lumberyard
 

M3d10n

Member
My qualm with using Lumberyard and spending time building a skillset around it is future support. Its role as a trojan horse for Amazon services makes me wary about long-term commitment.

I also heard Amazon has a deadline (a few years) by which they need to have replaced all CryEngine code with their own original code, which makes me even more cautious.
 

trugc

Member
I also heard Amazon has a deadline (a few years) by which they need to have replaced all CryEngine code with their own original code, which makes me even more cautious.

I heard it was 10 years. Should be enough for them to build a new engine.
 

Panajev2001a

GAF's Pleasant Genius
My qualm with using Lumberyard and spending time building a skillset around it is future support. Its role as a trojan horse for Amazon services makes me wary about long-term commitment.

I also heard Amazon has a deadline (a few years) by which they need to have replaced all CryEngine code with their own original code, which makes me even more cautious.

I think Amazon is building up in this area and hiring pretty awesome people; I think it will be ok :).
 
Ah. I see. Aren't they supposed to be about a year from release?
I hope switching engines isn't too much of a hassle.

Lumberyard and StarEngine are both forks from exactly the SAME build of CryEngine.

We stopped taking new builds from Crytek towards the end of 2015. So did Amazon. Because of this the core of the engine that we use is the same one that Amazon use and the switch was painless (I think it took us a day or so of two engineers on the engine team). What runs Star Citizen and Squadron 42 is our heavily modified version of the engine which we have dubbed StarEngine, just now our foundation is Lumberyard not CryEngine. None of our work was thrown away or modified. We switched the like for like parts of the engine from CryEngine to Lumberyard. All of our bespoke work from 64 bit precision, new rendering and planet tech, Item / Entity 2.0, Local Physics Grids, Zone System, Object Containers and so on were unaffected and remain unique to Star Citizen.

https://www.extremetech.com/extreme...mises-move-amazons-lumberyard-wont-delay-game
 
Curious if their visual scripting is as in-depth as Unreal's Blueprints or if it's just there to do some basic scripting

Amazon seems to be putting a lot of work into it which is awesome
 