Downloading a PDF document from a URL to local disk in Java

2025-11-12 09:55:38 Author: 小二爱编程

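This article walks through downloading a PDF document from a URL to the local disk in Java, with complete sample code for reference.

The approach: open a connection with HttpURLConnection, set a connect timeout and a User-Agent header, read the response input stream, and write the bytes to the local file system. The complete code follows: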

package com.cellstrain.icell.util;


import java.io.*;
import java.net.*;

public class DownloadPdf {

    /**
     * Downloads a file from a network URL.
     * @param urlStr   URL of the file to download
     * @param fileName name of the local file to create
     * @param savePath directory in which to save the file
     * @throws IOException if the connection or the file write fails
     */
    public static void downLoadByUrl(String urlStr, String fileName, String savePath) throws IOException {
        URL url = new URL(urlStr);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        // Connect timeout: 5 seconds
        conn.setConnectTimeout(5 * 1000);
        // Send a browser-like User-Agent so servers that block scripted clients do not answer 403
        conn.setRequestProperty("User-Agent", "Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt)");
        // Create the target directory, including any missing parent directories
        File saveDir = new File(savePath);
        if (!saveDir.exists() && !saveDir.mkdirs()) {
            throw new IOException("Cannot create directory: " + saveDir);
        }
        File file = new File(saveDir, fileName);
        // try-with-resources closes both streams even if reading or writing fails
        try (InputStream inputStream = conn.getInputStream();
             FileOutputStream fos = new FileOutputStream(file)) {
            // Read the whole response into a byte array, then write it to the file
            byte[] getData = readInputStream(inputStream);
            fos.write(getData);
        }
        System.out.println("info:" + url + " download success");
    }


    /**
     * Reads an input stream fully into a byte array.
     * @param inputStream the stream to read
     * @return all bytes read from the stream
     * @throws IOException if reading fails
     */
    public static byte[] readInputStream(InputStream inputStream) throws IOException {
        byte[] buffer = new byte[1024];
        int len;
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        while ((len = inputStream.read(buffer)) != -1) {
            bos.write(buffer, 0, len);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        try {
            downLoadByUrl("https://www.mybiosource.com/images/tds/protocol_samples/MBS700_Antibody_Set_Sandwich_ELISA_Protocol.pdf",
                    "ELISA.pdf", "E:/upload/protocol");
        } catch (Exception e) {
            // Report the failure instead of silently swallowing it
            e.printStackTrace();
        }
    }
}
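The class above buffers the entire response in a byte array before writing it, so memory use grows with the file size. As a minimal sketch of an alternative, the response stream can be copied straight to disk with Files.copy. The class name StreamingPdfDownload, the helper saveStream, and the 30-second read timeout are assumptions made for illustration and are not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class, not part of the original article.
class StreamingPdfDownload {

    /** Copies the stream to target, creating parent directories; returns the bytes written. */
    static long saveStream(InputStream in, Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Files.copy streams the data to disk without holding the whole file in memory
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Downloads urlStr to target and returns the number of bytes written. */
    static long download(String urlStr, Path target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(urlStr).openConnection();
        conn.setConnectTimeout(5_000);  // same 5-second connect timeout as the code above
        conn.setReadTimeout(30_000);    // assumed value: give up if the server stalls mid-transfer
        try (InputStream in = conn.getInputStream()) {
            return saveStream(in, target);
        } finally {
            conn.disconnect();
        }
    }
}
```

A call such as StreamingPdfDownload.download(urlStr, Paths.get(savePath, fileName)) would take the place of downLoadByUrl. Whether the extra read timeout is appropriate depends on the server and the expected file size.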

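Additional methods

1. Downloading a PDF document from a URL to local disk in Java

Before starting, make sure a Java development environment (JDK) is installed on your machine; Java 8 or later is recommended. Any development tool will do, such as Eclipse, IntelliJ IDEA, or a plain text editor.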

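Steps for downloading a PDF document

The steps for downloading a PDF document are:

Determine the URL of the PDF file.
Open an HTTP connection and send a request to that URL.
Read the response body.
Write the content to a local file.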

Code example

Below is a simple Java example that shows how to download a PDF document from a URL to the local disk.

import java.io.BufferedInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class PdfDownloader {
    
    public static void downloadPdf(String pdfUrl, String destinationFile) {
        HttpURLConnection httpURLConnection = null;
        try {
            URL url = new URL(pdfUrl);
            httpURLConnection = (HttpURLConnection) url.openConnection();
            // A plain GET sends no request body, so setDoOutput(true) is not needed
            httpURLConnection.setRequestMethod("GET");

            int responseCode = httpURLConnection.getResponseCode();
            if (responseCode == HttpURLConnection.HTTP_OK) { // 200 status code
                // try-with-resources closes both streams even if the copy fails
                try (InputStream inputStream = new BufferedInputStream(httpURLConnection.getInputStream());
                     FileOutputStream fileOutputStream = new FileOutputStream(destinationFile)) {
                    byte[] buffer = new byte[1024];
                    int bytesRead;
                    while ((bytesRead = inputStream.read(buffer)) != -1) {
                        fileOutputStream.write(buffer, 0, bytesRead);
                    }
                }
                System.out.println("PDF downloaded successfully: " + destinationFile);
            } else {
                System.out.println("Failed to download PDF, response code: " + responseCode);
            }
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Release the connection whether or not the download succeeded
            if (httpURLConnection != null) {
                httpURLConnection.disconnect();
            }
        }
    }
    
    public static void main(String[] args) {
        String pdfUrl = ""; // replace with the actual PDF link
        String destinationFile = "downloaded.pdf"; // target file name
        downloadPdf(pdfUrl, destinationFile);
    }
}

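Code details

Imports: HttpURLConnection, InputStream, and related classes handle the network connection and the reading of data.
Download function: downloadPdf takes the PDF URL and the local file path as parameters and uses HttpURLConnection to create the HTTP connection.
Reading and writing: once the connection succeeds, the PDF data is read through the input stream and written to the local file through the output stream.
Exception handling: a try-catch statement handles any IOException that may occur.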
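A 200 response does not guarantee that the body is a PDF; a server may return an HTML error or login page instead. As a hedged sketch, the downloaded file can be checked for the "%PDF-" marker that PDF files normally start with. The class PdfSignatureCheck and its methods are hypothetical names introduced here, and the check is only a quick sanity test, not full validation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper class, not part of the original article.
class PdfSignatureCheck {

    // PDF files normally begin with the ASCII bytes "%PDF-"
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the given bytes start with the PDF marker. */
    static boolean startsWithPdfMagic(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, PDF_MAGIC.length), PDF_MAGIC);
    }

    /** Reads the first bytes of the file and checks them against the PDF marker. */
    static boolean looksLikePdf(Path file) throws IOException {
        byte[] head = new byte[PDF_MAGIC.length];
        int off = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (off < head.length) {
                int n = in.read(head, off, head.length - off);
                if (n == -1) {
                    break;
                }
                off += n;
            }
        }
        return off == head.length && startsWithPdfMagic(head);
    }
}
```

After downloadPdf returns, calling PdfSignatureCheck.looksLikePdf(Paths.get(destinationFile)) gives a cheap signal that the saved file is at least shaped like a PDF.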

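2. Two ways to download a file from a URL to local disk in Java (large and small files)

a. Recommended for small files

Code walkthrough

First, a URL object named website is created to represent the address of the remote file.
Next, a ReadableByteChannel named rbc and a FileOutputStream named fos are declared. The ReadableByteChannel reads the byte stream of the remote file, and the FileOutputStream writes what is read to the local file.
Inside the try block, the URL object opens a connection and obtains its byte stream, and transferFrom then moves the remote content directly into the local file. This is an efficient NIO transfer mechanism.
If an exception occurs along the way, it is caught and its stack trace is printed.
Whether or not an exception occurs, the finally block performs cleanup at the end, closing the file output stream and the remote byte channel to release resources.

Implementation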

import java.io.FileOutputStream;
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;

public static void downloadFile(String remoteFilePath, String localFilePath) {
        URL website = null;
        ReadableByteChannel rbc = null;
        FileOutputStream fos = null;
        try {
            website = new URL(remoteFilePath);
            rbc = Channels.newChannel(website.openStream());
            fos = new FileOutputStream(localFilePath); // local path to store the file, e.g. test.txt
            fos.getChannel().transferFrom(rbc, 0, Long.MAX_VALUE);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (fos != null) {
                try {
                    fos.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
            if (rbc != null) {
                try {
                    rbc.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
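The FileChannel documentation notes that a single transferFrom call may or may not transfer all of the requested bytes. The sketch below is one defensive variant that repeats the call until no more bytes arrive. The class ChannelCopy, the method copyToFile, and the 1 MiB chunk size are assumptions chosen for illustration; the method takes any InputStream so that it can be tried without a network connection.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class, not part of the original article.
class ChannelCopy {

    // Assumed chunk size per transferFrom call: 1 MiB
    private static final long CHUNK = 1024L * 1024L;

    /** Copies source into target through NIO channels and returns the bytes written. */
    static long copyToFile(InputStream source, Path target) throws IOException {
        try (ReadableByteChannel in = Channels.newChannel(source);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // Keep transferring from the current position until a call moves no bytes (end of stream)
            while ((transferred = out.transferFrom(in, position, CHUNK)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

For a remote file, the source would be new URL(remoteFilePath).openStream(), as in downloadFile above.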

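b. Recommended for large files

Code walkthrough

First, a URL object named url is created to represent the address of the file to download.
The URL object opens a connection, which is cast to HttpURLConnection. HttpURLConnection is the class Java provides for sending HTTP requests and receiving HTTP responses.
The input stream inputStream is obtained from the connection and wrapped in a BufferedInputStream. Together with the chunked read loop described below, this avoids loading a large file into memory all at once, which could cause an out-of-memory error.
A File object named file represents the local file to save. If the file already exists, it is deleted.
An output stream outputStream is created with that file as its target.
A byte array buffer of 5 MB (1024 * 1024 * 5) holds the data read from the input stream on each pass.
A while loop repeatedly reads data from the input stream into the buffer and writes the buffer contents to the output stream, continuing until the end of the input stream.
The connection is disconnected, and the input and output streams are closed in the finally block. IOUtils.closeQuietly (presumably from Apache Commons IO; the original does not show the import) closes a stream safely and swallows any exception raised while closing.
Overall, this code downloads a file from the given URL to the local disk. Because it reads through a buffered stream in chunks, memory use is bounded by the buffer size rather than the file size, and resources are closed and released whether the download completes or fails.

Implementation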

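import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.commons.io.IOUtils; // assumption: closeQuietly comes from Apache Commons IO

public static void downloadFile1(String downloadUrl, String path) {
        InputStream inputStream = null;
        OutputStream outputStream = null;
        try {
            URL url = new URL(downloadUrl);
            // A wrapped ResponseEntity is not used here: fetching the whole body at once
            // may not fit in memory and could cause an out-of-memory error
            HttpURLConnection connection = (HttpURLConnection) url.openConnection();

            // Read the download through a BufferedInputStream and copy it in chunks,
            // so a large file is never held in memory as a whole
            inputStream = new BufferedInputStream(connection.getInputStream());
            File file = new File(path);
            if (file.exists()) {
                file.delete();
            }
            outputStream = new FileOutputStream(file);
            // Read at most 5 MB per pass instead of the whole file at once
            byte[] buffer = new byte[1024 * 1024 * 5]; // 5 MB
            int len = 0;
            while ((len = inputStream.read(buffer)) != -1) {
                outputStream.write(buffer, 0, len);
            }
            connection.disconnect();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            IOUtils.closeQuietly(outputStream);
            IOUtils.closeQuietly(inputStream);
        }
    }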
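The chunked copy loop is what keeps memory bounded: only one buffer's worth of data is held at a time, regardless of file size. The sketch below isolates that loop with a configurable buffer size and returns the byte count, which can be compared against the Content-Length header when the server sends one. The class ChunkedCopy and the method copy are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper class, not part of the original article.
class ChunkedCopy {

    /**
     * Copies everything from in to out using a buffer of bufferSize bytes
     * and returns the total number of bytes copied. Does not close either stream.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int len;
        // Each pass holds at most bufferSize bytes in memory
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
            total += len;
        }
        out.flush();
        return total;
    }
}
```

A buffer far smaller than 5 MB, for example 8 KB, also works with this loop; the best size is a tuning choice that depends on the network and disk, not a fixed rule.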
