testserver/Spigot-Server-Patches/0457-Optimize-PlayerChunkMa...

276 lines
15 KiB
Diff
Raw Normal View History

Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 03:47:29 +00:00
From cbbb713b9977ebf9c0df1bbc3b8f23112867da14 Mon Sep 17 00:00:00 2001
From: Aikar <aikar@aikar.co>
Date: Wed, 8 Apr 2020 03:06:30 -0400
Subject: [PATCH] Optimize PlayerChunkMap memory use for visibleChunks
No longer clones visible chunks which is causing massive memory
allocation issues, likely the source of Humongous Objects on large servers.
Instead we just synchronize, clear and rebuild, reusing the same object buffers
as before with only 2 small objects created (FastIterator/MapEntry)
This should result in siginificant memory use reduction and improved GC behavior.
diff --git a/src/main/java/com/destroystokyo/paper/util/map/Long2ObjectLinkedOpenHashMapFastCopy.java b/src/main/java/com/destroystokyo/paper/util/map/Long2ObjectLinkedOpenHashMapFastCopy.java
new file mode 100644
index 0000000000..e0ad725b2e
--- /dev/null
+++ b/src/main/java/com/destroystokyo/paper/util/map/Long2ObjectLinkedOpenHashMapFastCopy.java
@@ -0,0 +1,32 @@
+package com.destroystokyo.paper.util.map;
+
+import it.unimi.dsi.fastutil.longs.Long2ObjectLinkedOpenHashMap;
+
+public class Long2ObjectLinkedOpenHashMapFastCopy<V> extends Long2ObjectLinkedOpenHashMap<V> {
+
+ public void copyFrom(Long2ObjectLinkedOpenHashMapFastCopy<V> map) {
+ if (key.length < map.key.length) {
+ key = null;
+ key = new long[map.key.length];
+ }
+ if (value.length < map.value.length) {
+ value = null;
+ //noinspection unchecked
+ value = (V[]) new Object[map.value.length];
+ }
+ if (link.length < map.link.length) {
+ link = null;
+ link = new long[map.link.length];
+ }
+ System.arraycopy(map.key, 0, this.key, 0, map.key.length);
+ System.arraycopy(map.value, 0, this.value, 0, map.value.length);
+ System.arraycopy(map.link, 0, this.link, 0, map.link.length);
+ this.size = map.size;
+ this.mask = map.mask;
+ this.first = map.first;
+ this.last = map.last;
+ this.n = map.n;
+ this.maxFill = map.maxFill;
+ this.containsNullKey = map.containsNullKey;
+ }
+}
diff --git a/src/main/java/net/minecraft/server/ChunkProviderServer.java b/src/main/java/net/minecraft/server/ChunkProviderServer.java
index 2e63f64783..3d79e756d9 100644
--- a/src/main/java/net/minecraft/server/ChunkProviderServer.java
+++ b/src/main/java/net/minecraft/server/ChunkProviderServer.java
@@ -750,7 +750,7 @@ public class ChunkProviderServer extends IChunkProvider {
entityPlayer.playerNaturallySpawnedEvent.callEvent();
};
// Paper end
- this.playerChunkMap.f().forEach((playerchunk) -> {
+ this.playerChunkMap.forEachVisibleChunk((playerchunk) -> { // Paper - safe iterator incase chunk loads, also no wrapping
Optional<Chunk> optional = ((Either) playerchunk.b().getNow(PlayerChunk.UNLOADED_CHUNK)).left();
if (optional.isPresent()) {
diff --git a/src/main/java/net/minecraft/server/MCUtil.java b/src/main/java/net/minecraft/server/MCUtil.java
index d9941b38ca..71ab65e00f 100644
--- a/src/main/java/net/minecraft/server/MCUtil.java
+++ b/src/main/java/net/minecraft/server/MCUtil.java
@@ -529,7 +529,7 @@ public final class MCUtil {
WorldServer world = ((org.bukkit.craftbukkit.CraftWorld)bukkitWorld).getHandle();
PlayerChunkMap chunkMap = world.getChunkProvider().playerChunkMap;
- Long2ObjectLinkedOpenHashMap<PlayerChunk> visibleChunks = chunkMap.visibleChunks;
+ Long2ObjectLinkedOpenHashMap<PlayerChunk> visibleChunks = chunkMap.getVisibleChunks();
ChunkMapDistance chunkMapDistance = chunkMap.getChunkMapDistanceManager();
List<PlayerChunk> allChunks = new ArrayList<>(visibleChunks.values());
List<EntityPlayer> players = world.players;
diff --git a/src/main/java/net/minecraft/server/PlayerChunkMap.java b/src/main/java/net/minecraft/server/PlayerChunkMap.java
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 03:47:29 +00:00
index be75b9dd14..90e4811157 100644
--- a/src/main/java/net/minecraft/server/PlayerChunkMap.java
+++ b/src/main/java/net/minecraft/server/PlayerChunkMap.java
@@ -55,8 +55,10 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
private static final Logger LOGGER = LogManager.getLogger();
public static final int GOLDEN_TICKET = 33 + ChunkStatus.b();
- public final Long2ObjectLinkedOpenHashMap<PlayerChunk> updatingChunks = new Long2ObjectLinkedOpenHashMap();
- public volatile Long2ObjectLinkedOpenHashMap<PlayerChunk> visibleChunks;
+ public final com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<PlayerChunk> updatingChunks = new com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<>(); // Paper - faster copying
+ public final com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<PlayerChunk> visibleChunks = new com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<>(); // Paper - faster copying
+ public final com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<PlayerChunk> pendingVisibleChunks = new com.destroystokyo.paper.util.map.Long2ObjectLinkedOpenHashMapFastCopy<>(); // Paper - this is used if the visible chunks is updated while iterating only
+ public transient Long2ObjectLinkedOpenHashMap<PlayerChunk> visibleChunksClone; // Paper - used for async access of visible chunks, clone and cache only when needed
private final Long2ObjectLinkedOpenHashMap<PlayerChunk> pendingUnload;
final LongSet loadedChunks; // Paper - private -> package
public final WorldServer world;
@@ -170,7 +172,7 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
public PlayerChunkMap(WorldServer worldserver, File file, DataFixer datafixer, DefinedStructureManager definedstructuremanager, Executor executor, IAsyncTaskHandler<Runnable> iasynctaskhandler, ILightAccess ilightaccess, ChunkGenerator<?> chunkgenerator, WorldLoadListener worldloadlistener, Supplier<WorldPersistentData> supplier, int i) {
super(new File(worldserver.getWorldProvider().getDimensionManager().a(file), "region"), datafixer);
- this.visibleChunks = this.updatingChunks.clone();
+ //this.visibleChunks = this.updatingChunks.clone(); // Paper - no more cloning
this.pendingUnload = new Long2ObjectLinkedOpenHashMap();
this.loadedChunks = new LongOpenHashSet();
this.unloadQueue = new LongOpenHashSet();
@@ -262,9 +264,52 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
return (PlayerChunk) this.updatingChunks.get(i);
}
+ // Paper start - remove cloning of visible chunks unless accessed as a collection async
+ private static final boolean DEBUG_ASYNC_VISIBLE_CHUNKS = Boolean.getBoolean("paper.debug-async-visible-chunks");
+ private boolean isIterating = false;
+ private boolean hasPendingVisibleUpdate = false;
+ public void forEachVisibleChunk(java.util.function.Consumer<PlayerChunk> consumer) {
+ org.spigotmc.AsyncCatcher.catchOp("forEachVisibleChunk");
+ boolean prev = isIterating;
+ isIterating = true;
+ try {
+ for (PlayerChunk value : this.visibleChunks.values()) {
+ consumer.accept(value);
+ }
+ } finally {
+ this.isIterating = prev;
+ if (!this.isIterating && this.hasPendingVisibleUpdate) {
+ this.visibleChunks.copyFrom(this.pendingVisibleChunks);
+ this.pendingVisibleChunks.clear();
+ this.hasPendingVisibleUpdate = false;
+ }
+ }
+ }
+ public Long2ObjectLinkedOpenHashMap<PlayerChunk> getVisibleChunks() {
+ if (Thread.currentThread() == this.world.serverThread) {
+ return this.visibleChunks;
+ } else {
+ synchronized (this.visibleChunks) {
+ if (DEBUG_ASYNC_VISIBLE_CHUNKS) new Throwable("Async getVisibleChunks").printStackTrace();
+ if (this.visibleChunksClone == null) {
+ this.visibleChunksClone = this.hasPendingVisibleUpdate ? this.pendingVisibleChunks.clone() : this.visibleChunks.clone();
+ }
+ return this.visibleChunksClone;
+ }
+ }
+ }
+ // Paper end
+
@Nullable
public PlayerChunk getVisibleChunk(long i) { // Paper - protected -> public
- return (PlayerChunk) this.visibleChunks.get(i);
+ // Paper start - mt safe get
+ if (Thread.currentThread() != this.world.serverThread) {
+ synchronized (this.visibleChunks) {
+ return (PlayerChunk) (this.hasPendingVisibleUpdate ? this.pendingVisibleChunks.get(i) : this.visibleChunks.get(i));
+ }
+ }
+ return (PlayerChunk) (this.hasPendingVisibleUpdate ? this.pendingVisibleChunks.get(i) : this.visibleChunks.get(i));
+ // Paper end
}
protected IntSupplier c(long i) {
@@ -444,8 +489,9 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
// Paper end
protected void save(boolean flag) {
+ Long2ObjectLinkedOpenHashMap<PlayerChunk> visibleChunks = this.getVisibleChunks(); // Paper remove clone of visible Chunks unless saving off main thread (watchdog kill)
if (flag) {
- List<PlayerChunk> list = (List) this.visibleChunks.values().stream().filter(PlayerChunk::hasBeenLoaded).peek(PlayerChunk::m).collect(Collectors.toList());
+ List<PlayerChunk> list = (List) visibleChunks.values().stream().filter(PlayerChunk::hasBeenLoaded).peek(PlayerChunk::m).collect(Collectors.toList()); // Paper - remove cloning of visible chunks
MutableBoolean mutableboolean = new MutableBoolean();
do {
@@ -473,7 +519,7 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
// this.i(); // Paper - nuke IOWorker
PlayerChunkMap.LOGGER.info("ThreadedAnvilChunkStorage ({}): All chunks are saved", this.w.getName());
} else {
- this.visibleChunks.values().stream().filter(PlayerChunk::hasBeenLoaded).forEach((playerchunk) -> {
+ visibleChunks.values().stream().filter(PlayerChunk::hasBeenLoaded).forEach((playerchunk) -> {
IChunkAccess ichunkaccess = (IChunkAccess) playerchunk.getChunkSave().getNow(null); // CraftBukkit - decompile error
if (ichunkaccess instanceof ProtoChunkExtension || ichunkaccess instanceof Chunk) {
@@ -643,7 +689,20 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
if (!this.updatingChunksModified) {
return false;
} else {
- this.visibleChunks = this.updatingChunks.clone();
+ // Paper start - stop cloning visibleChunks
+ synchronized (this.visibleChunks) {
+ if (isIterating) {
+ hasPendingVisibleUpdate = true;
+ this.pendingVisibleChunks.copyFrom(this.updatingChunks);
+ } else {
+ hasPendingVisibleUpdate = false;
+ this.pendingVisibleChunks.clear();
+ this.visibleChunks.copyFrom(this.updatingChunks);
+ this.visibleChunksClone = null;
+ }
+ }
+ // Paper end
+
this.updatingChunksModified = false;
return true;
}
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 03:47:29 +00:00
@@ -1109,12 +1168,12 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
}
protected Iterable<PlayerChunk> f() {
- return Iterables.unmodifiableIterable(this.visibleChunks.values());
+ return Iterables.unmodifiableIterable(this.getVisibleChunks().values()); // Paper
}
void a(Writer writer) throws IOException {
CSVWriter csvwriter = CSVWriter.a().a("x").a("z").a("level").a("in_memory").a("status").a("full_status").a("accessible_ready").a("ticking_ready").a("entity_ticking_ready").a("ticket").a("spawning").a("entity_count").a("block_entity_count").a(writer);
- ObjectBidirectionalIterator objectbidirectionaliterator = this.visibleChunks.long2ObjectEntrySet().iterator();
+ ObjectBidirectionalIterator objectbidirectionaliterator = this.getVisibleChunks().long2ObjectEntrySet().iterator(); // Paper
while (objectbidirectionaliterator.hasNext()) {
Entry<PlayerChunk> entry = (Entry) objectbidirectionaliterator.next();
diff --git a/src/main/java/org/bukkit/craftbukkit/CraftWorld.java b/src/main/java/org/bukkit/craftbukkit/CraftWorld.java
index 051506fce8..630d6470a4 100644
--- a/src/main/java/org/bukkit/craftbukkit/CraftWorld.java
+++ b/src/main/java/org/bukkit/craftbukkit/CraftWorld.java
@@ -74,6 +74,7 @@ import net.minecraft.server.GameRules;
import net.minecraft.server.GroupDataEntity;
import net.minecraft.server.IBlockData;
import net.minecraft.server.IChunkAccess;
+import net.minecraft.server.MCUtil;
import net.minecraft.server.MinecraftKey;
import net.minecraft.server.MovingObjectPosition;
import net.minecraft.server.PacketPlayOutCustomSoundEffect;
@@ -290,6 +291,7 @@ public class CraftWorld implements World {
return ret;
}
public int getTileEntityCount() {
+ return MCUtil.ensureMain(() -> {
// We don't use the full world tile entity list, so we must iterate chunks
Long2ObjectLinkedOpenHashMap<PlayerChunk> chunks = world.getChunkProvider().playerChunkMap.visibleChunks;
int size = 0;
@@ -301,11 +303,13 @@ public class CraftWorld implements World {
size += chunk.tileEntities.size();
}
return size;
+ });
}
public int getTickableTileEntityCount() {
return world.tileEntityListTick.size();
}
public int getChunkCount() {
+ return MCUtil.ensureMain(() -> {
int ret = 0;
for (PlayerChunk chunkHolder : world.getChunkProvider().playerChunkMap.visibleChunks.values()) {
@@ -314,7 +318,7 @@ public class CraftWorld implements World {
}
}
- return ret;
+ return ret; });
}
public int getPlayerCount() {
return world.players.size();
@@ -432,6 +436,14 @@ public class CraftWorld implements World {
@Override
public Chunk[] getLoadedChunks() {
+ // Paper start
+ if (Thread.currentThread() != world.getMinecraftWorld().serverThread) {
+ synchronized (world.getChunkProvider().playerChunkMap.visibleChunks) {
+ Long2ObjectLinkedOpenHashMap<PlayerChunk> chunks = world.getChunkProvider().playerChunkMap.visibleChunks;
+ return chunks.values().stream().map(PlayerChunk::getFullChunk).filter(Objects::nonNull).map(net.minecraft.server.Chunk::getBukkitChunk).toArray(Chunk[]::new);
+ }
+ }
+ // Paper end
Long2ObjectLinkedOpenHashMap<PlayerChunk> chunks = world.getChunkProvider().playerChunkMap.visibleChunks;
return chunks.values().stream().map(PlayerChunk::getFullChunk).filter(Objects::nonNull).map(net.minecraft.server.Chunk::getBukkitChunk).toArray(Chunk[]::new);
}
--
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 03:47:29 +00:00
2.26.2