Understanding the Spring Shutdown Hook Deadlock
Author: Epoch-Elysian
The Spring Shutdown Hook Deadlock
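We once ran into a case where a project hit an error during startup but Spring did not stop properly. The cause turned out to be a deadlock involving Spring's shutdown hook.
A framework we use had a piece of code embedded in it that looked roughly like this: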
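import org.springframework.context.ApplicationListener;
import org.springframework.context.event.ContextRefreshedEvent;
import org.springframework.stereotype.Component;

@Component
public class ShutDownHookTest implements ApplicationListener<ContextRefreshedEvent> {

    // Placeholder for whatever failure condition the framework actually checks.
    private boolean onException;

    @Override
    public void onApplicationEvent(ContextRefreshedEvent event) {
        if (onException) {
            System.out.println("test shutdown hook deadlock");
            // Called synchronously, on the thread that is still inside refresh().
            System.exit(0);
        }
    }
}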
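Its intent is to guarantee that the application exits, via System.exit, once an exception has occurred.
The listener is not asynchronous, so System.exit ran on the main thread. Even so, the Spring Boot server kept running.
The application also looked healthy: ours is a Dubbo-based service system, and the service was still working normally in the test environment.
This clearly defies expectations. Normally Spring notices a System.exit call through a JVM shutdown hook and runs its shutdown handling. Let's first look at how Spring registers that hook: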
public abstract class AbstractApplicationContext extends DefaultResourceLoader
        implements ConfigurableApplicationContext {

    @Override
    public void registerShutdownHook() {
        if (this.shutdownHook == null) {
            // No shutdown hook registered yet.
            this.shutdownHook = new Thread(SHUTDOWN_HOOK_THREAD_NAME) {
                @Override
                public void run() {
                    // Key point: the hook acquires the startupShutdownMonitor monitor lock here
                    synchronized (startupShutdownMonitor) {
                        doClose();
                    }
                }
            };
            Runtime.getRuntime().addShutdownHook(this.shutdownHook);
        }
    }

    protected void doClose() {
        // Check whether an actual close attempt is necessary...
        if (this.active.get() && this.closed.compareAndSet(false, true)) {
            if (logger.isDebugEnabled()) {
                logger.debug("Closing " + this);
            }

            if (!NativeDetector.inNativeImage()) {
                LiveBeansView.unregisterApplicationContext(this);
            }

            try {
                // Publish shutdown event.
                publishEvent(new ContextClosedEvent(this));
            }
            catch (Throwable ex) {
                logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
            }

            // Stop all Lifecycle beans, to avoid delays during individual destruction.
            if (this.lifecycleProcessor != null) {
                try {
                    this.lifecycleProcessor.onClose();
                }
                catch (Throwable ex) {
                    logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
                }
            }

            // Destroy all cached singletons in the context's BeanFactory.
            destroyBeans();

            // Close the state of this context itself.
            closeBeanFactory();

            // Let subclasses do some final clean-up if they wish...
            onClose();

            // Reset local application listeners to pre-refresh state.
            if (this.earlyApplicationListeners != null) {
                this.applicationListeners.clear();
                this.applicationListeners.addAll(this.earlyApplicationListeners);
            }

            // Switch to inactive.
            this.active.set(false);
        }
    }
}
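In other words, Spring creates a new thread and registers it as a JVM shutdown hook.
The key point is that the hook must acquire the monitor lock on startupShutdownMonitor before it closes the context. That lock looks familiar: Spring acquires the same lock in refresh().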
public abstract class AbstractApplicationContext extends DefaultResourceLoader
        implements ConfigurableApplicationContext {

    @Override
    public void refresh() throws BeansException, IllegalStateException {
        synchronized (this.startupShutdownMonitor) {
            StartupStep contextRefresh = this.applicationStartup.start("spring.context.refresh");

            // Prepare this context for refreshing.
            prepareRefresh();

            // Tell the subclass to refresh the internal bean factory.
            ConfigurableListableBeanFactory beanFactory = obtainFreshBeanFactory();

            // Prepare the bean factory for use in this context.
            prepareBeanFactory(beanFactory);

            ......
        }
    }
}
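Stripped of everything else, the pattern in the two Spring excerpts is "one monitor guards both startup and shutdown, and the shutdown path runs on a JVM hook thread". The following is a minimal plain-Java sketch of that pattern only. MiniContext and all of its members are hypothetical names made up for illustration, not Spring classes, and unregisterShutdownHook exists here only so the sketch can clean up after itself.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, heavily simplified stand-in for an application context.
public class MiniContext {

    // One monitor shared by startup and shutdown, as in the Spring excerpts.
    private final Object startupShutdownMonitor = new Object();
    private final AtomicBoolean active = new AtomicBoolean(false);
    private Thread shutdownHook;

    public void refresh() {
        synchronized (startupShutdownMonitor) {
            // Real startup work (and listener callbacks) would run here,
            // with the monitor held the whole time.
            active.set(true);
        }
    }

    public void close() {
        synchronized (startupShutdownMonitor) {
            active.set(false);
        }
    }

    public void registerShutdownHook() {
        if (shutdownHook == null) {
            // The hook thread has to win the same monitor before it can close.
            shutdownHook = new Thread(this::close, "MiniContextShutdownHook");
            Runtime.getRuntime().addShutdownHook(shutdownHook);
        }
    }

    // Illustration-only helper: removes the hook again.
    public boolean unregisterShutdownHook() {
        if (shutdownHook == null) {
            return false;
        }
        boolean removed = Runtime.getRuntime().removeShutdownHook(shutdownHook);
        shutdownHook = null;
        return removed;
    }

    public boolean isActive() {
        return active.get();
    }

    public static void main(String[] args) {
        MiniContext ctx = new MiniContext();
        ctx.registerShutdownHook();
        ctx.refresh();
        System.out.println("active = " + ctx.isActive());
        // On normal JVM exit the hook thread runs close().
    }
}
```

Any code that runs inside refresh() in this sketch holds startupShutdownMonitor, so the hook thread cannot make progress until refresh() returns.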
At this point our guess was that the threads had deadlocked on startupShutdownMonitor.
Let's take a thread dump with jstack:
"SpringContextShutdownHook" #18 prio=5 os_prio=0 tid=0x0000000024e00800 nid=0x407c waiting for monitor entry [0x000000002921f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.springframework.context.support.AbstractApplicationContext$1.run(AbstractApplicationContext.java:991)
- waiting to lock <0x00000006c494f430> (a java.lang.Object)
"main" #1 prio=5 os_prio=0 tid=0x0000000002de4000 nid=0x1ff4 in Object.wait() [0x0000000002dde000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1252)
- locked <0x00000006c4a43118> (a org.springframework.context.support.AbstractApplicationContext$1)
at java.lang.Thread.join(Thread.java:1326)
at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
at java.lang.Shutdown.runHooks(Shutdown.java:123)
at java.lang.Shutdown.sequence(Shutdown.java:167)
at java.lang.Shutdown.exit(Shutdown.java:212)
- locked <0x00000006c4845128> (a java.lang.Class for java.lang.Shutdown)
at java.lang.Runtime.exit(Runtime.java:109)
at java.lang.System.exit(System.java:971)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:12)
at io.seata.server.ShutDownHookTest.onApplicationEvent(ShutDownHookTest.java:7)
at org.springframework.context.event.SimpleApplicationEventMulticaster.doInvokeListener(SimpleApplicationEventMulticaster.java:176)
at org.springframework.context.event.SimpleApplicationEventMulticaster.invokeListener(SimpleApplicationEventMulticaster.java:169)
at org.springframework.context.event.SimpleApplicationEventMulticaster.multicastEvent(SimpleApplicationEventMulticaster.java:143)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:421)
at org.springframework.context.support.AbstractApplicationContext.publishEvent(AbstractApplicationContext.java:378)
at org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:938)
at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
- locked <0x00000006c494f430> (a java.lang.Object)
at org.springframework.boot.web.servlet.context.ServletWebServerApplicationContext.refresh(ServletWebServerApplicationContext.java:144)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:771)
at org.springframework.boot.SpringApplication.refresh(SpringApplication.java:763)
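At first glance jstack does not report a deadlock (and neither do tools such as jvisualvm or jconsole), presumably because the main thread is parked in Object.wait() inside Thread.join rather than blocked on a monitor, so there is no lock-ownership cycle for the detector to find. The thread dump still shows what happened:
- The main thread acquired the startupShutdownMonitor lock <0x00000006c494f430> first
- The SpringContextShutdownHook thread is waiting for the startupShutdownMonitor lock
- The main thread called Thread.join and is waiting on the hook thread object <0x00000006c4a43118>, that is, waiting for the SpringContextShutdownHook thread to terminate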
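The blind spot of the detection tools can be checked with the JDK's own detector, ThreadMXBean.findDeadlockedThreads(). The sketch below rebuilds the same shape with two ordinary threads: one holds a monitor and joins the other, which needs that monitor. The class name, thread names and the 5-second polling deadline are assumptions made for this illustration; the interrupt at the end exists only so the demo can terminate, whereas the real runHooks() ignores interrupts.

```java
import java.lang.management.ManagementFactory;

public class DeadlockDetectionGap {

    /** Returns true if the JVM's deadlock detector reports the simulated hang. */
    public static boolean detectorSeesDeadlock() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor

        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here
            }
        }, "simulated-hook");

        Thread owner = new Thread(() -> {
            synchronized (monitor) {          // like refresh() holding the lock
                hook.start();
                try {
                    hook.join();              // parks in a wait, like runHooks()
                } catch (InterruptedException e) {
                    // The demo interrupts this thread to break the hang.
                }
            }
        }, "simulated-main");

        owner.start();
        try {
            // Poll until both threads are stuck (or give up after 5 seconds).
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (System.nanoTime() < deadline
                    && !(hook.getState() == Thread.State.BLOCKED
                         && owner.getState() == Thread.State.WAITING)) {
                Thread.sleep(10);
            }
            long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
            return ids != null;               // null means "no deadlock found"
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            owner.interrupt();                // break the cycle so the demo ends
            joinQuietly(owner);
            joinQuietly(hook);
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("detector reports deadlock: " + detectorSeesDeadlock());
    }
}
```

The two threads really are stuck on each other, yet the detector only follows chains of threads waiting to acquire monitors or ownable synchronizers, and a thread waiting in join() is not part of such a chain.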
The root cause is that the main thread is stuck inside System.exit. Tracing further down, you find that it is blocked here in ApplicationShutdownHooks:
class ApplicationShutdownHooks {

    /* Iterates over all application hooks creating a new thread for each
     * to run in. Hooks are run concurrently and this method waits for
     * them to finish.
     */
    static void runHooks() {
        Collection<Thread> threads;
        synchronized(ApplicationShutdownHooks.class) {
            threads = hooks.keySet();
            hooks = null;
        }

        for (Thread hook : threads) {
            hook.start();
        }
        for (Thread hook : threads) {
            while (true) {
                try {
                    // Wait for the shutdown hook thread to finish
                    hook.join();
                    break;
                } catch (InterruptedException ignored) {
                }
            }
        }
    }
}
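Summary
The full deadlock sequence:
- main thread: when Spring's refresh starts, it acquires the startupShutdownMonitor monitor lock
- main thread: before refresh has finished, System.exit is called
- SpringContextShutdownHook thread: the hook thread starts and waits to acquire the startupShutdownMonitor monitor lock
- main thread: calls Thread.join and waits for the SpringContextShutdownHook thread to finish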
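The sequence above can be reproduced without Spring or System.exit. The following is a minimal sketch under stated assumptions: the class and thread names are invented, an ordinary thread plays the shutdown hook, and join(500) is bounded so the demo terminates, whereas the real runHooks() calls join() with no timeout and therefore never returns.

```java
public class JoinUnderLockDemo {

    /**
     * Returns the state of the simulated hook thread, observed while the
     * calling thread still holds the monitor and has been waiting in join().
     */
    public static Thread.State hookStateWhileOwnerJoins() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor

        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                // doClose() would run here
            }
        }, "simulated-shutdown-hook");

        Thread.State observed;
        try {
            synchronized (monitor) {   // step 1: "refresh()" holds the lock
                hook.start();          // steps 2-3: exit starts the hook, which needs the lock
                hook.join(500);        // step 4: wait for the hook (bounded in this demo only)
                observed = hook.getState();
            }
            hook.join();               // lock released, so the hook can finish now
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return observed;
    }

    public static void main(String[] args) {
        System.out.println("hook state while owner joins: " + hookStateWhileOwnerJoins());
    }
}
```

While the caller holds the monitor, the hook thread should be observed as BLOCKED, matching the "waiting for monitor entry" line in the thread dump. It only completes once the monitor is released, which in the real case never happens.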
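So System.exit must not be called on the refreshing thread while Spring is still inside refresh(). Note that ContextRefreshedEvent listeners are invoked from finishRefresh(), which, as the stack trace shows, still runs inside the synchronized block.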
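The earlier observation that the listener was not asynchronous hints at one possible way out: hand the exit call to a separate thread, so that the refreshing thread can return from the listener and release the lock. The sketch below simulates this with plain threads. It is an assumption-laden illustration, not the fix applied in the original project: DeferredExit, exitAsync and simulate are hypothetical names, and the Runnable passed to exitAsync stands in for a real `() -> System.exit(1)`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class DeferredExit {

    /** Runs exitAction on a fresh thread so the caller can return and release its locks. */
    public static Thread exitAsync(Runnable exitAction) {
        Thread t = new Thread(exitAction, "deferred-exit");
        t.start();
        return t;
    }

    /** Simulates the deferred flow; returns true if the hook got to run. */
    public static boolean simulate() {
        final Object monitor = new Object(); // stands in for startupShutdownMonitor
        final AtomicBoolean hookRan = new AtomicBoolean(false);

        Thread hook = new Thread(() -> {
            synchronized (monitor) {
                hookRan.set(true);           // doClose() would run here
            }
        }, "simulated-hook");

        Thread exiter;
        synchronized (monitor) {             // "refresh()" in progress
            // Stand-in for System.exit: start the hook and wait for it,
            // but on another thread instead of the one holding the monitor.
            exiter = exitAsync(() -> {
                hook.start();
                joinQuietly(hook);
            });
        }                                    // "refresh()" returns, monitor released
        joinQuietly(exiter);
        return hookRan.get();
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("hook ran: " + simulate());
    }
}
```

With this arrangement the thread that waits for the hook no longer owns the monitor the hook needs. One trade-off to keep in mind: the refreshing thread keeps running for a short while after the listener returns, until the JVM actually halts.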