Zuelatak

Rift lag discussion


A lot of the rifts on NFI have been quite laggy, and I haven't seen much discussion from the devs or community about addressing it. The most recent Cadence one was probably the worst, with around 5-10 minutes of server lag. I'd like to propose some ideas; some require a lot of work and some hopefully require little. If anyone else has any ideas, feel free to drop them as a reply and I'll potentially edit the post with them. I think this issue should be addressed if there's ever to be hope for more people to play the game. The answer "It'll sort itself out when players quit and/or stop coming" is not a solution and will never allow this game to grow beyond where it is now.

 

  • Switch from Java to something else.
    • Definitely the hardest, but probably the most rewarding.
  • Port game to Vulkan.
    • I've heard that this should be really simple using Steam and should have a decent impact.
  • Have the rift portal be a gateway for players to hop to a server dedicated to just the rift, instead of the enemies coming through it into the main server.
    • Even if the dedicated server lagged, at least it wouldn't lag all of Cadence and other servers whilst the rift was going.
  • Tone down the enemy spawns.
    • Potentially make them harder and more rewarding to make up for the loss of enemies. Have enemies scale in health/dmg instead of numbers.
  • Make summoners melee, or at least tone down their AI calculations.
    • I've heard that they are constantly calculating how to stay the maximum distance away from the closest player.
  • Make the AI do fewer calculations. Make them stupid.
    • Essentially the Summoner suggestion, but applied to all of the AI. For most of the rift half of the rift creatures are just dancing back and forth on a tile.
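To put rough numbers on the AI suggestions, here is a minimal sketch of how the cost of "find the closest player" checks would scale. It assumes every creature rescans every player each time it updates, which is a guess and not confirmed Wurm behaviour. The mob count, player count, tick rate and the `recompute_every` throttle are all hypothetical.

```python
import math


def nearest_player_distance(mob, players):
    """Distance from one mob to its closest player (one pass over all players)."""
    return min(math.dist(mob, p) for p in players)


def distance_checks(num_mobs, num_players, ticks, recompute_every=1):
    """Count distance calculations if every mob rescans all players
    once every `recompute_every` ticks (hypothetical throttle)."""
    scans = math.ceil(ticks / recompute_every)
    return num_mobs * num_players * scans


# Hypothetical rift: 200 mobs, 100 players, 600 ticks (one minute at 10 ticks/s).
every_tick = distance_checks(200, 100, ticks=600)
throttled = distance_checks(200, 100, ticks=600, recompute_every=10)
print(every_tick, throttled)
```

Under these assumptions, rescanning only every tenth tick cuts the distance work by a factor of ten. Whether that work is actually the bottleneck is exactly what is unknown.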

I like the idea of portal folks to a dedicated server just for the rift. Although, that will only help people who aren't doing it. I don't think the rifters will have a better experience. But then again if the dedicated server has nothing to do but handle the rift it might be better.

 

It would be nice if someone could clarify what the real problem is.

1. A hardware issue where Wurm servers aren't computing powerhouses so they struggle to keep up.

2. A network issue where the more people there are in local the more often the server tries to push position update data to the player's client (I think this is the real problem).
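As a rough illustration of option 2, the sketch below counts position updates under the assumption that the server sends each player one update per other player in local on every tick. That model is a guess for illustration only, not something the devs have confirmed, and `updates_per_tick` is a hypothetical helper.

```python
def updates_per_tick(players_in_local):
    """Each player receives one position update for every other player in local."""
    return players_in_local * (players_in_local - 1)


for n in (10, 50, 100):
    print(n, updates_per_tick(n))
```

If the model holds, traffic grows roughly with the square of the local population: ten times the players means about a hundred times the updates.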

1 minute ago, Ogare said:

2. A network issue where the more people there are in local the more often the server tries to push position update data to the player's client (I think this is the real problem).

I imagine it's more this. Pretty sure their hardware is fine. I think most of the initial suggestions should still help with this, because there'd be less data being sent around to clients and enemies. However, it would be awesome to know from the devs where the issue lies so we could better help.


On Cadence, rifts effectively lag the whole server while they are happening, no matter where you are on the map. On Harmony the other day, where the server population is small, there was a major difference in lag between the two. A dedicated server just for the rifts on all servers would fix a lot of the issues, both for people at the rifts and for those not at them.


Seriously, there needs to be some transparency from the devs on this as this is fairly game breaking. I have never enjoyed how quiet the management is when the game is broken.

 

The first rift on NFI was by far the worst for me lag wise. The second rift it seemed the lag was not as severe but it still was there. Obviously I had a better experience than others at the second rift. 

 

I love this game and want it to continue to grow and get better.


The first two options have no bearing on the issue. 

 

As for the lag, we are aware of it and will be investigating the root cause of the issue, and see if there's anywhere we can directly improve before jumping to major changes to the system. 


Is there a chance you could participate in the next one on Cadence, Retro?


@Retrograde If the first two options won't affect it, could you explain why, so we have a better understanding of how things work? Also, could you address this in the next Valrei International and possibly give us an update on how the investigation is going? That way the general public who don't look at these posts know it's being looked at and potentially get an idea of how long it'll take to solve. It feels to me like a lot of issues are addressed only through replies, and the response takes the form of "it's being investigated", which I imagine is why people often call the team quiet or non-transparent.

12 minutes ago, shuego said:

I have never enjoyed how quiet the management is when the game is broken.

 

Just now, Zuelatak said:

@Retrograde If the first two options won't affect it, could you explain why, so we have a better understanding of how things work? Also, could you address this in the next Valrei International and possibly give us an update on how the investigation is going? That way the general public who don't look at these posts know it's being looked at and potentially get an idea of how long it'll take to solve. It feels to me like a lot of issues are addressed only through replies, and the response takes the form of "it's being investigated", which I imagine is why people often call the team quiet or non-transparent.

 

The first two are tasks that would take an extremely significant amount of time, in which the entire code base is rewritten and a new engine completely rebuilt, and they would offer little to address the issues regarding server lag.

 

We will update this when necessary, or when we have information, but quite often it just winds up as bug fixes or lag fixes. 


I think it might be a good idea to have it investigated and then a post made, much like a post-mortem report on an outage, explaining what the issue was and how it was fixed or could be fixed. Or just come out and say so if it's going to stay the way it is. I don't personally play on the new cluster, but from the streams I watched of the events it was pretty bad. Even on the old cluster, if there is some event or slaying, we get similar issues with smaller numbers. I think a concern should be that even those not doing the events suffer. When, as a player, I have to take a couple of hours' break from the game because of lag from an event that I am not even taking part in, that is not a good thing.

 

The replies in here make it seem like the issues/causes are known, but that we as players just aren't going to get that kind of information. Honestly, I would rather be told up front that the information is private than get the 'we will look into it and update when necessary'.

 

After all communication is a priority that the staff wants to work on, right?


Issue is known, cause is not. 

 

We do share information when we encounter things that are more than just "oh yeah too much going on at once", and we always like to share what we can.


@Retrograde Is it possible to discuss a hotfix with everyone? Something temporary to at least fix the problem until a better solution comes to mind, like reducing spawn quantity and improving the health/dmg/reward of each enemy. It's unclear how important this issue is to the team and how quickly it'll be resolved.

2 minutes ago, Zuelatak said:

@Retrograde Is it possible to discuss a hotfix with everyone? Something temporary to at least fix the problem until a better solution comes to mind, like reducing spawn quantity and improving the health/dmg/reward of each enemy. It's unclear how important this issue is to the team and how quickly it'll be resolved.

No, this won't be a hotfix unless it's to disable rifts at this point. Without knowing the underlying cause there's nothing TO hotfix. 


@Retrograde Do you guys have tools to simulate and test whether the solution I just mentioned would fix the problem? When I say hotfix, I mean throwing something out there that could potentially fix it until the underlying cause is discovered.


Not in a short timeframe, first step would be identifying the cause by using our polling tools next rift. 

3 minutes ago, Retrograde said:

Not in a short timeframe, first step would be identifying the cause by using our polling tools next rift. 

sounds like you will be at the next rift then uh 🤑


Some short remarks:

1. Abandoning Java: That would be a total rewrite of the game and the engine. It would be a project for a not-so-small crew, taking a year or two at minimum, with a budget in the millions of € or $. In short: no way.

2. Vulkan API: To my knowledge, it does not support all older boxes. Apart from that, it is useless for this particular problem, which did not seem to be a graphics issue; otherwise those (probably not few) with downscaled graphics settings would have reported less lag. To my knowledge, that was not the case.

3. Regarding changing rift mob AI: As long as there is no evidence that too large a number of mobs or overly complicated AI was the root cause, it would not be sensible to change mob AI.

On the last rift on Xanadu, we suffered one lag of several minutes with only 9 combatants participating, and consequently modest mob numbers. It was the first time in 120+ rifts that I experienced something like that. An evaluation should not concentrate exclusively on NFI. What puzzled me about this particular lag was that I lost a good 30% of my health during the outage. Normally, mobs suffer from lag just like us. I therefore tend to believe that this particular lag was network based rather than engine based.


 

4 minutes ago, Ekcin said:

Some short remarks:

1. Abandoning Java: That would be a total rewrite of the game and the engine. It would be a project for a not-so-small crew, taking a year or two at minimum, with a budget in the millions of € or $. In short: no way.

2. Vulkan API: To my knowledge, it does not support all older boxes. Apart from that, it is useless for this particular problem, which did not seem to be a graphics issue; otherwise those (probably not few) with downscaled graphics settings would have reported less lag. To my knowledge, that was not the case.

3. Regarding changing rift mob AI: As long as there is no evidence that too large a number of mobs or overly complicated AI was the root cause, it would not be sensible to change mob AI.

On the last rift on Xanadu, we suffered one lag of several minutes with only 9 combatants participating, and consequently modest mob numbers. It was the first time in 120+ rifts that I experienced something like that. An evaluation should not concentrate exclusively on NFI. What puzzled me about this particular lag was that I lost a good 30% of my health during the outage. Normally, mobs suffer from lag just like us. I therefore tend to believe that this particular lag was network based rather than engine based.

You should come to a Cadence rift and see how it is.

2 hours ago, Ekcin said:

What puzzled me about this particular lag was that I lost a good 30% of my health during the outage. Normally, mobs suffer from lag just like us. I therefore tend to believe that this particular lag was network based rather than engine based.

The most recent Cadence rift resulted in me walking away from the fight and then, 5 minutes later, suddenly taking dmg and dying. When I respawned, my corpse wasn't where I had been but in the middle of the horde of rift creatures. The issue looks like a network problem, and I imagine lowering the amount of data being transferred would help.

 

2 hours ago, Ekcin said:

As long as there is no evidence that too large a number of mobs or overly complicated AI was the root cause, it would not be sensible to change mob AI.

How long are you willing to wait for "evidence"? If tweaking the AI caused the problem to no longer appear then I'd call that a good hotfix. The longer the lag sticks around the more players will hate rifts. 

 

2 hours ago, Ekcin said:

On the last rift on Xanadu, we suffered one lag of several minutes with only 9 combatants participating, and consequently modest mob numbers. It was the first time in 120+ rifts that I experienced something like that. An evaluation should not concentrate exclusively on NFI.

I agree that SFI should be looked at too, but I would prioritize fixing NFI because you yourself said that this is a first in 120+ rifts. This has affected at least a third of the rifts on NFI, which is where the overwhelming majority of the playerbase is.

Edited by Zuelatak

2 hours ago, Retrograde said:

first step would be identifying the cause by using our polling tools next rift.

That's some transparent information. Thank you for sharing that and I hope we get to hear what results you guys get from the next rift.

6 hours ago, Ogare said:

I like the idea of portal folks to a dedicated server just for the rift. Although, that will only help people who aren't doing it. I don't think the rifters will have a better experience. But then again if the dedicated server has nothing to do but handle the rift it might be better.

 

It would be nice if someone could clarify what the real problem is.

1. A hardware issue where Wurm servers aren't computing powerhouses so they struggle to keep up.

2. A network issue where the more people there are in local the more often the server tries to push position update data to the player's client (I think this is the real problem).

game code problem :)


It's more about how the data is sent to people in the same area. 

 

This has been problematic for years and is often seen in PvP and public slayings. When you have 100 people (and alts) in local, the game lags a lot, even on beastly PCs.

Need to figure out a way to send those packets of information more efficiently (compression issue?) between players. 

2 hours ago, Zuelatak said:

The most recent Cadence rift resulted in me walking away from the fight and then, 5 minutes later, suddenly taking dmg and dying. When I respawned, my corpse wasn't where I had been but in the middle of the horde of rift creatures. The issue looks like a network problem, and I imagine lowering the amount of data being transferred would help.

 

How long are you willing to wait for "evidence"? If tweaking the AI caused the problem to no longer appear then I'd call that a good hotfix. The longer the lag sticks around the more players will hate rifts. 

 

I agree that SFI should be looked at too, but I would prioritize fixing NFI because you yourself said that this is a first in 120+ rifts. This has affected at least a third of the rifts on NFI, which is where the overwhelming majority of the playerbase is.

I want to wait for the evidence the devs can gain from closer observation. What you describe does sound very similar to what we experienced during the last Xanadu rift, and also during the Pristine black dragon hatchling slaying (OK, the hatchling did not harm me, but it did harm others to some extent during the outage), which had good but not outstanding participation. Last year's Xanadu red dragon slaying had about half again as many participants, and nothing even remotely similar happened.

 

It is not a matter of how many lags occurred during how many rifts; this is a fairly recent problem (that was what I wanted to express when citing 120+ previous rift combats). It is therefore as urgent to investigate on SFI as on NFI, because it is unlikely that the root causes, and thus the solutions, are much different. It is also obviously independent of graphics subsystems, and affects all users equally no matter how high or low their individual graphics settings and GC performance are.

 

The typical server-based lag (in the old days we had a daily outage of about 3 minutes due to backups at 4am CET) affects mobs and players equally: neither is able to harm the other until frame exchange resumes. The recent cases resemble the situation when your router goes into some undefined state. The network connection (socket) from your provider to the Wurm server will still be open, and the server cannot determine whether the client dropped, you just went to the bathroom, or your box crashed. All in-game combat events will thus continue (otherwise a player in danger could flee from combat by switching off the computer).
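One common way for a server to notice a silent client is an application-level timeout on the last packet received. The following is only a sketch of that general technique; the `Connection` class, its method names and the 30 second timeout are hypothetical, and nothing here describes how the Wurm server actually works.

```python
class Connection:
    """Hypothetical server-side record of one client connection."""

    def __init__(self, now):
        # Time (in seconds) at which the last packet arrived from this client.
        self.last_heard = now

    def on_packet(self, now):
        """Call whenever any packet arrives from the client."""
        self.last_heard = now

    def is_stale(self, now, timeout=30.0):
        """True if the client has been silent for longer than `timeout` seconds."""
        return now - self.last_heard > timeout


conn = Connection(now=0.0)
conn.on_packet(now=5.0)
print(conn.is_stale(now=20.0))  # silent for 15 s, still within the timeout
print(conn.is_stale(now=40.0))  # silent for 35 s, past the timeout
```

Until such a timeout fires, the server has no reason to treat the character differently, which fits the description of combat continuing while the player sees nothing.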

 

In our case, such a situation seems to have occurred between the server and all clients simultaneously, so it is most likely a condition in the server provider's cloud. It also seems not to be confined to rifts, and is not very likely connected to the number of participants, mobs spawned, or mob AI.

 

Disclaimer: I am still speculating, and may be wrong. It will be Keenan's job to determine what's going on, and I am confident that he will find out.  

Edited by Ekcin
addendum
9 hours ago, Ekcin said:

Some short remarks:

1. Abandoning Java: That would be a total rewrite of the game and the engine. It would be a project for a not-so-small crew, taking a year or two at minimum, with a budget in the millions of € or $. In short: no way.

Well, not really. The most time-consuming part of coding is figuring things out. Rewriting something you have already done is usually quite fast. The biggest problem would be a new engine. The closest option would probably be to build it with Unity, but I am not sure it would bring much improvement on the network side (it would probably at least improve the client's RAM use).

Java will always have a big problem with a high volume of data changes in a short time, so making the jump to something else is maybe not the wrong idea: a Wurm 2, if you like, that is basically Wurm but built in a much more efficient way, on another engine that makes implementation and bug hunting easier. But this will remain a dream, I guess.

Overall, though, it sounds like a problem with the throughput and/or handling of the amount of data sent to the players and received by the server. We have to wait until they can look into it when it happens again.

 


Lag seems slightly better to me, not like the old days when it was really bad. I assume they're playing around with it: the WM doesn't dance around that much any more, so some tinkering is going on.
