How to recover Zookeeper from java.io.EOFException after a server crash?

user3395041 picture user3395041 · May 27, 2017 · Viewed 12.5k times · Source

How to recover from the following error that started happening after a server crash? Zookeeper won’t start and the following message is showing repeatedly on the log.

2017-05-27 01:02:08,072 [myid:] - INFO [main:Environment@100] - Server environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib 
2017-05-27 01:02:08,072 [myid:] - INFO [main:Environment@100] - Server environment:java.io.tmpdir=/tmp 
2017-05-27 01:02:08,072 [myid:] - INFO [main:Environment@100] - Server environment:java.compiler=<NA> 
2017-05-27 01:02:08,072 [myid:] - INFO [main:Environment@100] - Server environment:os.name=Linux 
2017-05-27 01:02:08,072 [myid:] - INFO [main:Environment@100] - Server environment:os.arch=amd64 
2017-05-27 01:02:08,073 [myid:] - INFO [main:Environment@100] - Server environment:os.version=3.10.0-514.16.1.el7.x86_64 
2017-05-27 01:02:08,073 [myid:] - INFO [main:Environment@100] - Server environment:user.name=zookeeper 
2017-05-27 01:02:08,073 [myid:] - INFO [main:Environment@100] - Server environment:user.home=/opt/zookeeper 
2017-05-27 01:02:08,073 [myid:] - INFO [main:Environment@100] - Server environment:user.dir=/ 
2017-05-27 01:02:08,074 [myid:] - INFO [main:ZooKeeperServer@829] - tickTime set to 2000 
2017-05-27 01:02:08,074 [myid:] - INFO [main:ZooKeeperServer@838] - minSessionTimeout set to -1 
2017-05-27 01:02:08,074 [myid:] - INFO [main:ZooKeeperServer@847] - maxSessionTimeout set to -1 
2017-05-27 01:02:08,080 [myid:] - INFO [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:2181 
2017-05-27 01:02:08,385 [myid:] - ERROR [main:Util@239] - Last transaction was partial. 
2017-05-27 01:02:08,400 [myid:] - ERROR [main:Util@239] - Last transaction was partial. 
2017-05-27 01:02:08,403 [myid:] - ERROR [main:Util@239] - Last transaction was partial. 
2017-05-27 01:02:08,403 [myid:] - ERROR [main:Util@239] - Last transaction was partial. 
2017-05-27 01:02:08,404 [myid:] - ERROR [main:Util@239] - Last transaction was partial. 
2017-05-27 01:02:08,404 [myid:] - ERROR [main:ZooKeeperServerMain@64] - Unexpected exception, exiting abnormally 
java.io.EOFException 
at java.io.DataInputStream.readInt(DataInputStream.java:392) 
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) 
at org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64) 
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:585) 
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:604) 
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:570) 
at org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:652) 
at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:166) 
at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223) 
at org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:283) 
at org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:410) 
at org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:118) 
at org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:119) 
at org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:87) 
at org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:53) 
at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116) 
at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)

thanks IPVP

Answer

Roshan Summun picture Roshan Summun · Oct 12, 2017

The solution for me was to find the 0 length log file in /hadoop/zookeeper/version-2 (or whichever place your dataDir is) and delete it. Start ZooKeeper afterwards.