Locating and Resolving Deadlocks in Java Programs

2024-11-27 09:37:23 Author: 徐俊生

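A deadlock is a program state in which participants wait on one another in a cycle, so none of them can ever make progress. Deadlocks are not limited to threads: processes that hold resources exclusively can deadlock in the same way. This article explains how to locate and resolve deadlocks in Java programs.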

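1. Deadlock Overview

1.1 What Is a Deadlock

A deadlock can only occur under concurrency.
Neither side yields: two (or more) threads (or processes) each hold a resource that the other needs, and none of them releases it voluntarily, so the program blocks forever. That is a deadlock.

1.2 Necessary Conditions for Deadlock

There are four conditions, and a deadlock can arise only when all four hold at the same time.

Mutual exclusion: some resources can be used by only one thread at a time; other threads must wait while the resource is occupied.
Hold and wait: a thread that blocks while requesting a resource keeps the resources it has already acquired.
No preemption: resources a thread has acquired cannot be taken away by force before the thread is done with them.
Circular wait: several threads form a closed chain in which each waits for a resource held by the next.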

2. Deadlock Case Study

public class Resource {
    private String name;
    private int count;

    public Resource(String name) {
        this.name = name;
    }

    public void staticResource() {
        synchronized (this) {
            System.out.println("static resource");
            count++;
        }
    }

    public void saveResource(Resource resource) {
        synchronized (this) {
            System.out.println("save resource：" + Thread.currentThread().getName());
            resource.staticResource();
        }
    }
}

public class DeadLock {
    public static void main(String[] args) {
        Resource resource1 = new Resource("resource1");
        Resource resource2 = new Resource("resource2");
        Thread threadA = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                resource1.saveResource(resource2);
            }
        });

        Thread threadB = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                resource2.saveResource(resource1);
            }
        });

        threadA.start();
        threadB.start();
    }
}

Output (when the deadlock occurs):

save resource：Thread-0
save resource：Thread-1

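Analysis of the deadlock:

Thread A:

When threadA calls resource1.saveResource(resource2), it:
- first locks the resource1 object;
- then tries to lock the resource2 object in order to enter its staticResource method.

Thread B:

When threadB calls resource2.saveResource(resource1), it:
- first locks the resource2 object;
- then tries to lock the resource1 object in order to enter its staticResource method.

Why the deadlock occurs:

If threadA has locked resource1 and is waiting for resource2 while threadB has locked resource2 and is waiting for resource1, a circular wait forms.
Each thread waits for the other to release its lock, so both are stuck in a deadlock.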

3. Troubleshooting a Deadlock

First, use the jps command to find the pid of the Java process.

C:\Users\shawn>jps
22568
24488 Launcher
10060 DeadLock
28076 Jps

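Then run jstack <pid> to print a thread dump. If a deadlock exists, the dump contains a "Found one Java-level deadlock:" section that shows which lock each thread is waiting for and which thread holds it; the stack traces below it point to the exact source lines involved.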

C:\Users\shawn>jstack 23128
2024-11-23 15:38:34
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.321-b07 mixed mode):
=============================
Found one Java-level deadlock:
=============================
"Thread-1":
  waiting to lock monitor 0x0000022b6c713f08 (object 0x000000076bdaa990, a com.atu.deadlock.Resource),
  which is held by "Thread-0"
"Thread-0":
  waiting to lock monitor 0x0000022b6c7169a8 (object 0x000000076bdaa9e8, a com.atu.deadlock.Resource),
  which is held by "Thread-1"

Java stack information for the threads listed above:
===================================================
"Thread-1":
        at com.atu.deadlock.Resource.staticResource(Resource.java:13)
        - waiting to lock <0x000000076bdaa990> (a com.atu.deadlock.Resource)
        at com.atu.deadlock.Resource.saveResource(Resource.java:21)
        - locked <0x000000076bdaa9e8> (a com.atu.deadlock.Resource)
        at com.atu.deadlock.DeadLock.lambda$main$1(DeadLock.java:18)
        at com.atu.deadlock.DeadLock$$Lambda$2/1096979270.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:750)
"Thread-0":
        at com.atu.deadlock.Resource.staticResource(Resource.java:13)
        - waiting to lock <0x000000076bdaa9e8> (a com.atu.deadlock.Resource)
        at com.atu.deadlock.Resource.saveResource(Resource.java:21)
        - locked <0x000000076bdaa990> (a com.atu.deadlock.Resource)
        at com.atu.deadlock.DeadLock.lambda$main$0(DeadLock.java:12)
        at com.atu.deadlock.DeadLock$$Lambda$1/1324119927.run(Unknown Source)
        at java.lang.Thread.run(Thread.java:750)

Found 1 deadlock.
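
The same check that jstack performs can also be run from inside the JVM through the standard java.lang.management API. The following is a minimal sketch, not part of the original example: the class name DeadlockDetector and its helper methods are made up for illustration, and it assumes a JVM (such as HotSpot) that supports ThreadMXBean.findDeadlockedThreads().

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper class, for illustration only.
class DeadlockDetector {

    /** Returns the ids of deadlocked threads, or an empty array if there are none. */
    static long[] findDeadlockedThreadIds() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        long[] ids = bean.findDeadlockedThreads(); // null when no deadlock is found
        return ids == null ? new long[0] : ids;
    }

    /** Starts two daemon threads that take the same two monitors in opposite order. */
    static void startDeadlockedPair() {
        Object lockA = new Object();
        Object lockB = new Object();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        startDaemon("demo-1", lockA, lockB, bothHoldFirstLock);
        startDaemon("demo-2", lockB, lockA, bothHoldFirstLock);
    }

    private static void startDaemon(String name, Object first, Object second, CountDownLatch latch) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                latch.countDown();
                try {
                    latch.await(); // wait until the other thread holds its first lock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                synchronized (second) {
                    System.out.println(name + " acquired both locks"); // never reached
                }
            }
        }, name);
        t.setDaemon(true); // let the JVM exit even though these threads stay blocked
        t.start();
    }

    public static void main(String[] args) throws InterruptedException {
        startDeadlockedPair();
        long[] ids = new long[0];
        // Poll for up to about five seconds until the deadlock is visible.
        for (int i = 0; i < 50 && ids.length == 0; i++) {
            Thread.sleep(100);
            ids = findDeadlockedThreadIds();
        }
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        for (ThreadInfo info : bean.getThreadInfo(ids)) {
            System.out.println(info.getThreadName() + " is waiting for " + info.getLockName()
                    + " held by " + info.getLockOwnerName());
        }
    }
}
```

A check like this could be scheduled periodically in a monitoring thread, but it only reports the deadlock; it does not resolve it.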

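4. What to Do When a Deadlock Occurs in Production

First preserve the scene: use the JDK tools (for example, jstack) to save the complete thread stack information. Then restart the service immediately so that users are not affected any further.
Once the online service is safe for the time being, use the saved information to track down the deadlock, fix the code, and release a new version.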

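5. Common Strategies for Fixing Deadlocks

Of the four necessary conditions described earlier, breaking any one is enough to prevent a deadlock. Mutual exclusion cannot be broken, because it is the basic contract of a mutex; the other three can.

Break hold and wait: a thread requests all the resources it needs at once, before it starts.
Break no preemption: a thread that holds some resources and fails to obtain another one voluntarily releases the resources it already holds.
Break circular wait: request resources in a fixed global order and release them in the reverse order.

5.1 Breaking the Hold-and-Wait Condition

To eliminate waiting while holding resources, request all resources in one step, make that combined request a critical section, and let a single dedicated role manage the critical section.

This role has two key functions, acquiring the resources together and releasing them together, and it must be a singleton.

First define an ApplyLock class that centralizes lock requests. It has two methods:

applyLock(), which requests the locks;
free(), which releases them together.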

import java.util.ArrayList;
import java.util.List;

public class ApplyLock {
    private List<Object> list = new ArrayList<>();

    public synchronized boolean applyLock(Resource resource1, Resource resource2) {
        if (list.contains(resource1) || list.contains(resource2)) {
            return false;
        } else {
            list.add(resource1);
            list.add(resource2);
            return true;
        }
    }

    public synchronized void free(Resource resource1, Resource resource2) {
        list.remove(resource1);
        list.remove(resource2);
    }
}

Modify the Resource class: define a single, globally shared ApplyLock instance, then call applyLock() and free() in saveResource to acquire and release the lock resources together.

public class Resource {
    private String name;
    private int count;

    static ApplyLock applyLock = new ApplyLock();

    public Resource(String name) {
        this.name = name;
    }

    public void staticResource() {
        synchronized (this) {
            System.out.println("static resource");
            count++;
        }
    }

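    public void saveResource(Resource resource) {
        // applyLock() returns false while either resource is in use,
        // so retry until both have been granted together.
        while (!applyLock.applyLock(this, resource)) {
            Thread.yield();
        }
        try {
            System.out.println("save resource：" + Thread.currentThread().getName());
            resource.staticResource();
        } finally {
            applyLock.free(this, resource);
        }
    }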
}

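Because all the resources involved are acquired and released together through one coordinator, no thread holds one resource while waiting for another, and the hold-and-wait condition is broken.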

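5.2 Breaking the No-Preemption Condition

The key to breaking the no-preemption condition is that a thread can voluntarily give up the resources it already holds when it cannot obtain the next one. synchronized cannot do this.

The reason is that a thread blocks as soon as synchronized fails to obtain the lock, and a blocked thread has no chance to release what it already holds.
The Lock interface in the java.util.concurrent.locks package solves this. Its tryLock() method attempts to acquire the lock and returns true on success or false on failure, without blocking the current thread.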

import java.util.concurrent.locks.ReentrantLock;

public class Resource {
    private String name;
    private int count;

    ReentrantLock lock = new ReentrantLock();

    public Resource(String name) {
        this.name = name;
    }

    public void staticResource() {
        if (lock.tryLock()) {
            try {
                System.out.println("static resource");
                count++;
            } finally {
                lock.unlock();
            }
        } else {
            System.out.println("Failed to acquire lock");
        }
    }

    public void saveResource(Resource resource) {
        if (lock.tryLock()) {
            try {
                System.out.println("save resource：" + Thread.currentThread().getName());
                resource.staticResource();
            } finally {
                lock.unlock();
            }
        } else {
            System.out.println("Failed to acquire lock");
        }
    }
}
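
When an operation must hold two locks at once, the same idea can be taken one step further: if the second tryLock() fails, release the first lock and try again later. The sketch below is an illustration under assumptions, not the article's code; the class TryLockBoth, the method withBothLocks and the 1 to 4 ms random pause are all made up for the example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, for illustration only.
class TryLockBoth {
    static int counter = 0; // only modified while holding both locks

    /** Runs the action while holding both locks; never blocks while holding only one. */
    static void withBothLocks(ReentrantLock first, ReentrantLock second, Runnable action)
            throws InterruptedException {
        while (true) {
            if (first.tryLock()) {
                try {
                    if (second.tryLock()) {
                        try {
                            action.run();
                            return;
                        } finally {
                            second.unlock();
                        }
                    }
                } finally {
                    first.unlock(); // also gives up the first lock when the second was busy
                }
            }
            // Random pause so that two threads do not keep retrying in lockstep.
            Thread.sleep(ThreadLocalRandom.current().nextInt(1, 5));
        }
    }

    private static void repeat(ReentrantLock first, ReentrantLock second) {
        try {
            for (int i = 0; i < 100; i++) {
                withBothLocks(first, second, () -> counter++);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    static int runDemo() throws InterruptedException {
        counter = 0;
        ReentrantLock lock1 = new ReentrantLock();
        ReentrantLock lock2 = new ReentrantLock();
        Thread a = new Thread(() -> repeat(lock1, lock2));
        Thread b = new Thread(() -> repeat(lock2, lock1)); // opposite order on purpose
        a.start();
        b.start();
        a.join();
        b.join();
        return counter;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("counter = " + runDemo()); // prints "counter = 200"
    }
}
```

The two threads request the locks in opposite order, which would deadlock with nested synchronized blocks; here each failed attempt releases everything, so both threads finish.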

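5.3 Breaking the Circular-Wait Condition

The basic idea is to number the resources in some order and have every thread request locks in that order. For example, hashCode can determine the locking order: the code below compares the two hashCode values and always locks the object with the larger one first. Note that hashCode values are not guaranteed to be unique, so production code needs a tie-breaker for the case where two objects hash the same.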

public class Resource {
    private String name;
    private int count;

    public Resource(String name) {
        this.name = name;
    }

    public void staticResource() {
        synchronized (this) {
            System.out.println("static resource");
            count++;
        }
    }

    public void saveResource(Resource resource) {
        Resource lock = this.hashCode() > resource.hashCode() ? this : resource;
        synchronized (lock) {
            System.out.println("save resource：" + Thread.currentThread().getName());
            resource.staticResource();
        }
    }
}
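
One way to avoid relying on hashCode is to give every resource a unique sequence number and always lock the smaller number first. The following is a minimal sketch of that variant; OrderedResource and its id field are assumptions introduced for illustration and do not appear in the original code.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical variant of Resource, for illustration only.
class OrderedResource {
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Unique per instance, so two resources never compare as equal (unlike hashCode).
    private final long id = NEXT_ID.getAndIncrement();
    private int count;

    int getCount() {
        synchronized (this) {
            return count;
        }
    }

    /** Locks both resources, always the one with the smaller id first, then updates both. */
    void saveResource(OrderedResource other) {
        OrderedResource first = this.id < other.id ? this : other;
        OrderedResource second = (first == this) ? other : this;
        synchronized (first) {
            synchronized (second) {
                this.count++;
                other.count++;
            }
        }
    }

    static int[] runDemo() throws InterruptedException {
        OrderedResource r1 = new OrderedResource();
        OrderedResource r2 = new OrderedResource();
        Thread a = new Thread(() -> {
            for (int i = 0; i < 1000; i++) r1.saveResource(r2);
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 1000; i++) r2.saveResource(r1); // opposite call direction
        });
        a.start();
        b.start();
        a.join();
        b.join();
        return new int[] { r1.getCount(), r2.getCount() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] counts = runDemo();
        System.out.println(counts[0] + " " + counts[1]); // prints "2000 2000"
    }
}
```

Both threads end up locking r1 before r2 regardless of which object the call is made on, so no cycle can form.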

5.4 The Classic Dining Philosophers Problem

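Five philosophers sit around a round table.
Each philosopher alternates between two states: eating and thinking.
There are five chopsticks on the table (the same number as philosophers).
A philosopher must pick up two chopsticks (one in the left hand and one in the right) to eat, and puts them down after eating to go back to thinking.

The problem: if every philosopher picks up the left chopstick and then waits for the right one, everyone waits for someone else and the whole table deadlocks.

Philosopher 0 holds chopstick 0 and waits for chopstick 1.
Philosopher 1 holds chopstick 1 and waits for chopstick 2.
Philosopher 2 holds chopstick 2 and waits for chopstick 3.
Philosopher 3 holds chopstick 3 and waits for chopstick 4.
Philosopher 4 holds chopstick 4 and waits for chopstick 0.

Dining philosophers (deadlock version):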

public class DiningPhilosophers {
    public static class Philosopher implements Runnable {

        private Object leftChopstick;
        private Object rightChopstick;

        public Philosopher(Object leftChopstick, Object rightChopstick) {
            this.leftChopstick = leftChopstick;
            this.rightChopstick = rightChopstick;
        }

        @Override
        public void run() {
            while (true) {
                // Think
                try {
                    doAction("Thinking");
                    // Eat: pick up the left chopstick, pick up the right chopstick,
                    // put down the right chopstick, put down the left chopstick

                    synchronized (leftChopstick) {
                        doAction("Picked up left chopstick");
                        synchronized (rightChopstick) {
                            doAction("Picked up right chopstick -eating");

                            doAction("Put down right chopstick");
                        }
                        doAction("Put down left chopstick");
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }

        private void doAction(String action) throws InterruptedException {
            System.out.println(Thread.currentThread().getName() + " " + action);
            // Cast the product: (long) Math.random() on its own is always 0
            Thread.sleep((long) (Math.random() * 10));

        }
    }

    public static void main(String[] args) {
        Philosopher[] philosophers = new Philosopher[5];
        Object[] chopsticks = new Object[philosophers.length];
        for (int i = 0; i < chopsticks.length; i++) {
            chopsticks[i] = new Object();
        }
        for (int i = 0; i < philosophers.length; i++) {
            Object leftChopstick = chopsticks[i];
            Object rightChopstick = chopsticks[(i + 1) % chopsticks.length];

            philosophers[i] = new Philosopher(leftChopstick, rightChopstick);

            new Thread(philosophers[i], "Philosopher " + (i + 1)).start();
        }
    }
}

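There are many ways to solve this. Here we resolve the deadlock by reversing the order in which one philosopher picks up the chopsticks.

Dining philosophers with one philosopher switching hands: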

public class DiningPhilosophersFix {
    public static class Philosopher implements Runnable {

        private Object leftChopstick;
        private Object rightChopstick;

        public Philosopher(Object leftChopstick, Object rightChopstick) {
            this.leftChopstick = leftChopstick;
            this.rightChopstick = rightChopstick;
        }

        @Override
        public void run() {
            while (true) {
                // Think
                try {
                    doAction("Thinking");
                    // Eat: pick up the first chopstick, pick up the second chopstick,
                    // put down the second chopstick, put down the first chopstick

                    synchronized (leftChopstick) {
                        doAction("Picked up left chopstick");
                        synchronized (rightChopstick) {
                            doAction("Picked up right chopstick -eating");

                            doAction("Put down right chopstick");
                        }
                        doAction("Put down left chopstick");
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }

        private void doAction(String action) throws InterruptedException {
            System.out.println(Thread.currentThread().getName() + " " + action);
            // Cast the product: (long) Math.random() on its own is always 0
            Thread.sleep((long) (Math.random() * 10));

        }
    }

    public static void main(String[] args) {
        Philosopher[] philosophers = new Philosopher[5];
        Object[] chopsticks = new Object[philosophers.length];
        for (int i = 0; i < chopsticks.length; i++) {
            chopsticks[i] = new Object();
        }
        for (int i = 0; i < philosophers.length; i++) {
            Object leftChopstick = chopsticks[i];
            Object rightChopstick = chopsticks[(i + 1) % chopsticks.length];

            if (i == philosophers.length - 1) {
                // The last philosopher picks up the right chopstick first, which breaks the cycle
                philosophers[i] = new Philosopher(rightChopstick, leftChopstick);
            } else {
                philosophers[i] = new Philosopher(leftChopstick, rightChopstick);

            }
            new Thread(philosophers[i], "Philosopher " + (i + 1)).start();
        }
    }
}

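6. Avoiding Deadlocks Effectively in Real Projects

Set a timeout:
- Lock offers tryLock(long timeout, TimeUnit unit);
- synchronized has no way to attempt a lock or give up waiting.
Minimize locking: fewer locks and smaller critical sections significantly reduce the chance of a deadlock.
Avoid nested locks: try not to acquire another lock while already holding one.
Use higher-level concurrency utilities: Semaphore, CountDownLatch, ReadWriteLock.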
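
The timeout item above can be sketched as follows. This is an illustrative example only: TimedLockDemo, runWithTimeout and demoContended are hypothetical names, and the 50 ms timeout is an arbitrary choice.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, for illustration only.
class TimedLockDemo {

    /** Runs the task under the lock; returns false if the lock was not obtained in time. */
    static boolean runWithTimeout(Lock lock, long timeout, TimeUnit unit, Runnable task)
            throws InterruptedException {
        if (!lock.tryLock(timeout, unit)) {
            return false; // the caller decides whether to retry, degrade, or report an error
        }
        try {
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }

    /** Tries to take a lock that another thread is holding; gives up after 50 ms. */
    static boolean demoContended() throws InterruptedException {
        ReentrantLock lock = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            lock.lock();
            try {
                held.countDown();
                release.await(); // keep the lock until the main thread has given up
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.unlock();
            }
        });
        holder.start();
        held.await();
        boolean acquired = runWithTimeout(lock, 50, TimeUnit.MILLISECONDS, () -> { });
        release.countDown();
        holder.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        ReentrantLock free = new ReentrantLock();
        System.out.println(runWithTimeout(free, 50, TimeUnit.MILLISECONDS, () -> { })); // prints "true"
        System.out.println(demoContended()); // prints "false"
    }
}
```

Because the waiting thread returns instead of blocking forever, a lock-ordering mistake degrades into a failed operation rather than a hung service.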

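7. Other Liveness Failures

Deadlock is the most common liveness problem. Several similar problems also prevent a program from making progress, and together they are called liveness problems.

7.1 Livelock

What is a livelock: threads are busy but achieve nothing, and the task never completes (colloquially, spinning their wheels).

Characteristics:

The program keeps running, but all the work it does is pointless.

Livelock code example: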

public class LiveLock {
    static class Spoon {
        private Diner owner; // the diner who currently holds the spoon

        public synchronized void use() {
            System.out.printf("%s has eaten!", owner.name);
        }

        public Spoon(Diner owner) {
            this.owner = owner;
        }

        public Diner getOwner() {
            return owner;
        }

        public void setOwner(Diner owner) {
            this.owner = owner;
        }

    }

    static class Diner {
        private String name;
        private boolean isHungry;

        public Diner(String name) {
            this.name = name;
            isHungry = true;
        }

        public void eatWith(Spoon spoon, Diner spouse) {
            while (isHungry) { // loop only while this diner is still hungry

                // The problem: both diners keep yielding to each other
                if (spouse.isHungry) {
                    System.out.println(name + ": Dear " + spouse.name + ", you eat first");
                    spoon.setOwner(spouse);
                    continue;
                }

                spoon.use();
                isHungry = false;
                System.out.println(name + ": I have finished eating");
                spoon.setOwner(spouse);
            }
        }
    }

    public static void main(String[] args) {
        Diner husband = new Diner("牛郎");
        Diner wife = new Diner("织女");

        Spoon spoon = new Spoon(husband);

        new Thread(new Runnable() {
            @Override
            public void run() {
                husband.eatWith(spoon, wife);
            }
        }).start();
        new Thread(new Runnable() {
            @Override
            public void run() {
                wife.eatWith(spoon, husband);
            }
        }).start();
    }
}

Output:

牛郎: Dear 织女, you eat first
织女: Dear 牛郎, you eat first
牛郎: Dear 织女, you eat first
织女: Dear 牛郎, you eat first
牛郎: Dear 织女, you eat first
織女: Dear 牛郎, you eat first
牛郎: Dear 织女, you eat first
织女: Dear 牛郎, you eat first
牛郎: Dear 织女, you eat first
织女: Dear 牛郎, you eat first
...

Solution: add a random factor, the same idea that underlies Ethernet's exponential backoff algorithm.

import java.util.Random;

public class LiveLockFix {
    static class Spoon {
        private Diner owner; // the diner who currently holds the spoon

        public synchronized void use() {
            System.out.printf("%s has eaten!", owner.name);
        }

        public Spoon(Diner owner) {
            this.owner = owner;
        }

        public Diner getOwner() {
            return owner;
        }

        public void setOwner(Diner owner) {
            this.owner = owner;
        }

    }

    static class Diner {
        private String name;
        private boolean isHungry;

        public Diner(String name) {
            this.name = name;
            isHungry = true;
        }

        public void eatWith(Spoon spoon, Diner spouse) {
            while (isHungry) { // loop only while this diner is still hungry
                Random random = new Random();
                // The fix: yield only about 90% of the time, so one diner eventually eats
                if (spouse.isHungry && random.nextInt(10) < 9) {
                    System.out.println(name + ": Dear " + spouse.name + ", you eat first");
                    spoon.setOwner(spouse);
                    continue;
                }

                spoon.use();
                isHungry = false;
                System.out.println(name + ": I have finished eating");
                spoon.setOwner(spouse);
            }
        }
    }

    public static void main(String[] args) {
        Diner husband = new Diner("牛郎");
        Diner wife = new Diner("织女");

        Spoon spoon = new Spoon(husband);

        new Thread(new Runnable() {
            @Override
            public void run() {
                husband.eatWith(spoon, wife);
            }
        }).start();
        new Thread(new Runnable() {
            @Override
            public void run() {
                wife.eatWith(spoon, husband);
            }
        }).start();
    }
}

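Ways to resolve a livelock:

Add randomness: introduce random waiting times (for example, a random backoff algorithm) so that threads or processes do not keep repeating the same pattern in step with each other.
Limit retries or set a timeout: put a threshold on the number of attempts or the total time. Once the limit is exceeded, switch to another strategy such as forced exit or degraded handling.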
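
The two measures above can be combined into one small helper. The sketch below is an assumption-laden illustration rather than a standard API: BackoffRetry, retry, and the 1000 ms delay cap are invented for the example.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.BooleanSupplier;

// Hypothetical helper class, for illustration only.
class BackoffRetry {
    private static final long MAX_DELAY_MILLIS = 1000; // arbitrary cap on the backoff window

    /**
     * Calls attempt until it returns true or maxAttempts is reached.
     * Between attempts it sleeps for a random time in [0, window], and the
     * window doubles after every failure (randomized exponential backoff).
     */
    static boolean retry(BooleanSupplier attempt, int maxAttempts, long baseDelayMillis)
            throws InterruptedException {
        long window = Math.max(1, baseDelayMillis);
        for (int i = 0; i < maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return true;
            }
            if (i < maxAttempts - 1) { // no need to sleep after the final failure
                Thread.sleep(ThreadLocalRandom.current().nextLong(window + 1));
                window = Math.min(window * 2, MAX_DELAY_MILLIS);
            }
        }
        return false; // retry budget exhausted: the caller can exit or degrade
    }

    public static void main(String[] args) throws InterruptedException {
        int[] calls = {0};
        boolean ok = retry(() -> ++calls[0] == 3, 5, 1); // succeeds on the third call
        System.out.println(ok + " after " + calls[0] + " attempts"); // prints "true after 3 attempts"
    }
}
```

Because each thread draws its own random delay, two threads that collide once are unlikely to collide again on every subsequent attempt.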

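7.2 Starvation

Thread starvation is essentially a fairness problem: a thread can never obtain the resources it needs, so it cannot run and stays in a waiting state indefinitely.

Starvation code example: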

public class StarvationExample {
    private static final Object lock = new Object();

    public static void main(String[] args) {
        Thread highPriorityThread = new Thread(() -> {
            synchronized (lock) {
                while (true) {
                    System.out.println("High priority thread is running...");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        });

        Thread lowPriorityThread = new Thread(() -> {
            synchronized (lock) {
                while (true) {
                    System.out.println("Low priority thread is running...");
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }
            }
        });

        highPriorityThread.setPriority(Thread.MAX_PRIORITY);
        lowPriorityThread.setPriority(Thread.MIN_PRIORITY);

        highPriorityThread.start();
        lowPriorityThread.start();
    }
}

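Analysis:

In the code above, whichever thread acquires lock first never releases it, because its infinite loop runs inside the synchronized block. When the high-priority thread (highPriorityThread) gets the lock, the low-priority thread (lowPriorityThread) can never run, which is thread starvation.

Causes of thread starvation:

Unfair resource allocation: if a thread always has a low priority and the scheduling policy always favors higher-priority threads, the low-priority thread may never get a chance to run and starves.
A thread blocks indefinitely: when the thread holding a lock performs an operation that never ends (such as long IO or an infinite loop), every thread waiting behind it is blocked indefinitely and starves.

Ways to resolve starvation:

Set appropriate thread priorities.
Use a fair scheduling policy.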
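
For locks, a fair policy is available through the ReentrantLock(boolean fair) constructor: with true, the lock favors the thread that has been waiting longest. The sketch below is illustrative only; FairLockDemo and runDemo are made-up names. Unlike the starvation example, each thread also releases the lock after every round instead of holding it forever.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical example class, for illustration only.
class FairLockDemo {
    // true = fair ordering: the longest-waiting thread is favored
    private static final ReentrantLock LOCK = new ReentrantLock(true);

    /** Each worker takes the lock `rounds` times; returns how many turns each one got. */
    static int[] runDemo(int threads, int rounds) throws InterruptedException {
        int[] turns = new int[threads];
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int r = 0; r < rounds; r++) {
                    LOCK.lock();
                    try {
                        turns[id]++; // short critical section
                    } finally {
                        LOCK.unlock(); // released every round so other threads can proceed
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();
        }
        return turns;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] turns = runDemo(3, 100);
        System.out.println(turns[0] + " " + turns[1] + " " + turns[2]); // prints "100 100 100"
    }
}
```

Fair locks usually have lower throughput than the default non-fair mode, so they are best reserved for cases where starvation is a real risk.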

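8. Summary

Deadlock

Characteristics: two or more threads (or processes) wait for each other to release resources, so none of them can continue.
Solution: avoid having one thread hold several resources at once, or use a timeout so that a thread gives up waiting if it cannot obtain a lock within a set time.

Livelock

Characteristics: the threads are still running, but they keep reacting to each other and never make real progress.
Solution: set a timeout or retry limit, or use a coordination mechanism (such as random backoff) to keep threads from over-reacting to each other.

Starvation

Characteristics: some threads never get a chance to run and never obtain the resources they need, even though other threads keep running.
Solution: use fair locks or a sensible priority policy so that every thread gets a chance to run and none is ignored for long.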
