Creating a PDF Document with Structure Tags in C#

Author: 2501_93070778

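A tagged PDF is a PDF document that contains a structured tag tree. Tagging is the basis of the PDF/UA accessibility standard (ISO 14289), although a tagged file is not automatically PDF/UA compliant. The tag tree resembles HTML markup: it defines the document's hierarchy and how its content is organized. With these tags, assistive tools such as screen readers can identify the document structure accurately, which makes the content accessible and helps ensure that no information is lost.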
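
To make the HTML analogy concrete, here is a minimal sketch that models a tag tree in plain C#. It does not use any PDF library: `TagNode`, `Append`, `Render`, and `BuildSample` are hypothetical names invented for this illustration. The tag names (`Document`, `H1`, `P`, `Figure`, `Table`) are standard PDF structure types.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a structure element; not part of any PDF library.
class TagNode
{
    public string Type { get; }
    public List<TagNode> Children { get; } = new List<TagNode>();

    public TagNode(string type)
    {
        Type = type;
    }

    // Adds a child tag and returns it so that deeper levels can be built.
    public TagNode Append(string type)
    {
        TagNode child = new TagNode(type);
        Children.Add(child);
        return child;
    }

    // Renders this node and its descendants, indenting two spaces per level.
    public string Render(int depth = 0)
    {
        StringBuilder sb = new StringBuilder();
        sb.Append(new string(' ', depth * 2)).Append('<').Append(Type).Append(">\n");
        foreach (TagNode child in Children)
        {
            sb.Append(child.Render(depth + 1));
        }
        return sb.ToString();
    }
}

class TagTreeDemo
{
    // Builds a small tree: one Document root with four children.
    public static TagNode BuildSample()
    {
        TagNode document = new TagNode("Document");
        document.Append("H1");
        document.Append("P");
        document.Append("Figure");
        document.Append("Table");
        return document;
    }

    static void Main()
    {
        Console.Write(BuildSample().Render());
    }
}
```

Running the sketch prints the `Document` root followed by its four children, each indented by two spaces. A screen reader walks a real tag tree in much the same order.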

This article shows how to create a tagged PDF document from scratch in C# and VB.NET using Spire.PDF for .NET.

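Installing Spire.PDF for .NET

Before you start, add the DLL files from the Spire.PDF for .NET package as references in your .NET project. You can get the DLL files from the official download link, or install the package directly through NuGet.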

PM> Install-Package Spire.PDF

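Creating a tagged PDF with a rich set of structure elements

To add structure elements to a tagged PDF document, first create a PdfTaggedContent object. Then call PdfTaggedContent.StructureTreeRoot.AppendChildElement() to add an element to the root of the structure tree.

Taking a heading element as an example, the steps for creating a tagged PDF with Spire.PDF for .NET are:

1. Create a PdfDocument object and add a page with PdfDocument.Pages.Add().
2. Set the page's tab order to follow the structure with PdfPageBase.SetTabOrder().
3. Create a PdfTaggedContent object for the document, and set its language and title with SetLanguage() and SetTitle().
4. Call PdfTaggedContent.SetPdfUA1Identification() to add the PDF/UA-1 identification.
5. Add a document element to the root with PdfTaggedContent.StructureTreeRoot.AppendChildElement().
6. Add a heading element under the document element with PdfStructureElement.AppendChildElement().
7. Call BeginMarkedContent() on the heading element, draw the heading text with page.Canvas.DrawString(), then call EndMarkedContent().
8. Save the document with PdfDocument.SaveToFile().

The following C# code creates a tagged PDF document that contains several kinds of structure elements: document, heading, paragraph, figure, and table.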

using Spire.Pdf;
using Spire.Pdf.Graphics;
using Spire.Pdf.Interchange.TaggedPdf;
using Spire.Pdf.Tables;
using System.Data;
using System.Drawing;

namespace CreatePDFUA
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument doc = new PdfDocument();

            // Add a page
            PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(20));

            // Set the tab order to follow the structure tree
            page.SetTabOrder(TabOrder.Structure);

            // Create a PdfTaggedContent object
            PdfTaggedContent taggedContent = new PdfTaggedContent(doc);

            // Set the document language and title
            taggedContent.SetLanguage("en-US");
            taggedContent.SetTitle("test");

            // Add the PDF/UA-1 identification (declares conformance with the accessibility standard)
            taggedContent.SetPdfUA1Identification();

            // Create the font and brush
            PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Times New Roman", 14), true);
            PdfSolidBrush brush = new PdfSolidBrush(Color.Black);

            // Add the "document" structure element
            PdfStructureElement document = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document);

            // Add the "heading" (level 1) element
            PdfStructureElement heading1 = document.AppendChildElement(PdfStandardStructTypes.HeadingLevel1);
            heading1.BeginMarkedContent(page);
            string headingText = "What Is a Tagged PDF?";
            page.Canvas.DrawString(headingText, font, brush, new PointF(0, 0));
            heading1.EndMarkedContent(page);

            // Add the "paragraph" element
            PdfStructureElement paragraph = document.AppendChildElement(PdfStandardStructTypes.Paragraph);
            paragraph.BeginMarkedContent(page);
            string paragraphText = "“Tagged PDF” doesn't seem like a life-changing term. But for some, it is. For people who are " +
                "blind or have low vision and use assistive technology (such as screen readers and connected Braille displays) to " +
                "access information, an untagged PDF means they are missing out on information contained in the document because assistive " +
                "technology cannot “read” untagged PDFs. Digital accessibility has opened up so many avenues to information that were once " +
                "closed to people with visual disabilities, but PDFs often get left out of the equation.";
            RectangleF rect = new RectangleF(0, 30, page.Canvas.ClientSize.Width, page.Canvas.ClientSize.Height);
            page.Canvas.DrawString(paragraphText, font, brush, rect);
            paragraph.EndMarkedContent(page);

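            // Add the "figure" element
            PdfStructureElement figure = document.AppendChildElement(PdfStandardStructTypes.Figure);
            // Alternate text for assistive technology (placeholder; describe your own image)
            figure.Alt = "PDF/UA illustration";
            figure.BeginMarkedContent(page);
            // Replace this path with the location of your own image file
            PdfImage image = PdfImage.FromFile(@"C:\Users\Administrator\Desktop\pdfua.png");
            page.Canvas.DrawImage(image, new PointF(0, 150));
            figure.EndMarkedContent(page);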

            // Add the "table" element
            PdfStructureElement table = document.AppendChildElement(PdfStandardStructTypes.Table);
            table.BeginMarkedContent(page);
            PdfTable pdfTable = new PdfTable();
            pdfTable.Style.DefaultStyle.Font = font;

            DataTable dataTable = new DataTable();
            dataTable.Columns.Add("Name");
            dataTable.Columns.Add("Age");
            dataTable.Columns.Add("Sex");
            dataTable.Rows.Add(new string[] { "John", "22", "Male" });
            dataTable.Rows.Add(new string[] { "Katty", "25", "Female" });

            pdfTable.DataSource = dataTable;
            pdfTable.Style.ShowHeader = true;
            pdfTable.Draw(page.Canvas, new PointF(0, 280), 300f);
            table.EndMarkedContent(page);

            // Save the document to a file
            doc.SaveToFile("CreatePDFUA.pdf");
        }
    }
}
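
The sample above repeats the same pattern for every element: BeginMarkedContent(), draw, EndMarkedContent(). One way to keep the calls paired is a small wrapper. The sketch below is only an illustration: `TaggedDrawing.DrawTagged` is a hypothetical helper, not part of Spire.PDF, and every Spire.PDF call it makes is taken from the sample above. It assumes the Spire.PDF NuGet package is installed.

```csharp
using System;
using System.Drawing;
using Spire.Pdf;
using Spire.Pdf.Graphics;
using Spire.Pdf.Interchange.TaggedPdf;

namespace TaggedDrawingSketch
{
    static class TaggedDrawing
    {
        // Hypothetical helper (not part of Spire.PDF): runs the drawing code
        // between BeginMarkedContent and EndMarkedContent, so the end call is
        // made even if the drawing code throws.
        public static void DrawTagged(PdfStructureElement element, PdfPageBase page, Action draw)
        {
            element.BeginMarkedContent(page);
            try
            {
                draw();
            }
            finally
            {
                element.EndMarkedContent(page);
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Same setup as in the sample above
            PdfDocument doc = new PdfDocument();
            PdfPageBase page = doc.Pages.Add(PdfPageSize.A4, new PdfMargins(20));
            page.SetTabOrder(TabOrder.Structure);

            PdfTaggedContent taggedContent = new PdfTaggedContent(doc);
            taggedContent.SetLanguage("en-US");
            taggedContent.SetTitle("test");

            PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Times New Roman", 14), true);
            PdfSolidBrush brush = new PdfSolidBrush(Color.Black);

            PdfStructureElement document = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document);
            PdfStructureElement heading1 = document.AppendChildElement(PdfStandardStructTypes.HeadingLevel1);

            // The heading text is drawn inside the marked-content pair
            TaggedDrawing.DrawTagged(heading1, page, () =>
                page.Canvas.DrawString("What Is a Tagged PDF?", font, brush, new PointF(0, 0)));

            // Output file name chosen for this sketch
            doc.SaveToFile("TaggedDrawingSketch.pdf");
        }
    }
}
```

The try/finally block is a design choice for this sketch: it keeps each begin call matched by an end call when drawing fails part-way through.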

Additional method

Here is another way to create a tagged PDF file in C#, followed by its VB.NET equivalent.

C# implementation:

using Spire.Pdf;
using Spire.Pdf.Graphics;
using Spire.Pdf.Interchange.TaggedPdf;
using System.Drawing;
 
namespace CreateTaggedPDF
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create a PdfDocument object
            PdfDocument pdf = new PdfDocument();

            // Add a page
            pdf.Pages.Add(PdfPageSize.A4);

            // Set the tab order to follow the structure tree
            pdf.Pages[0].SetTabOrder(TabOrder.Structure);

            // Create a PdfTaggedContent object
            PdfTaggedContent taggedContent = new PdfTaggedContent(pdf);
            taggedContent.SetLanguage("en-US");
            taggedContent.SetTitle("test");
 
            // Create the font, brush, and string format
            PdfTrueTypeFont font = new PdfTrueTypeFont(new Font("Times New Roman", 10), true);
            PdfSolidBrush brush = new PdfSolidBrush(Color.Black);
            PdfStringFormat format = new PdfStringFormat(PdfTextAlignment.Left);
 
            // Add the structure elements
            PdfStructureElement article = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document);
            PdfStructureElement paragraph1 = article.AppendChildElement(PdfStandardStructTypes.Paragraph);
            PdfStructureElement span1 = paragraph1.AppendChildElement(PdfStandardStructTypes.Span);
            span1.BeginMarkedContent(pdf.Pages[0]);
            // Draw the content on the page
            pdf.Pages[0].Canvas.DrawString("A PDF tag is the key to accessing the contents of PDF documents with supporting technologies such as screen readers. ", font, brush, new Rectangle(40, 0, 480, 80), format);
            span1.EndMarkedContent(pdf.Pages[0]);
 
            PdfStructureElement paragraph2 = article.AppendChildElement(PdfStandardStructTypes.Paragraph);
            paragraph2.BeginMarkedContent(pdf.Pages[0]);
            pdf.Pages[0].Canvas.DrawString("A PDF tag arranges the PDF content in a hierarchical architecture or tag tree.", font, brush, new Rectangle(40, 80, 480, 80), format);
            paragraph2.EndMarkedContent(pdf.Pages[0]);
 
            PdfStructureElement figure1 = article.AppendChildElement(PdfStandardStructTypes.Figure);
            //Set Alternate text 
            figure1.Alt = "replacement text1";
            figure1.BeginMarkedContent(pdf.Pages[0], null);
            PdfImage image = PdfImage.FromFile(@"logo.png");
            // Draw the image on the page
            pdf.Pages[0].Canvas.DrawImage(image, new PointF(40, 200), new SizeF(100, 100));
            figure1.EndMarkedContent(pdf.Pages[0]);
 
            PdfStructureElement figure2 = article.AppendChildElement(PdfStandardStructTypes.Figure);
            //Set Alternate text
            figure2.Alt = "replacement text2";
            figure2.BeginMarkedContent(pdf.Pages[0], null);
            pdf.Pages[0].Canvas.DrawRectangle(PdfPens.Black, new Rectangle(300, 200, 100, 100));
            figure2.EndMarkedContent(pdf.Pages[0]);
 
            // Save the document
            pdf.SaveToFile("CreateTaggedFile_result.pdf");
        }
    }
}

VB.NET implementation:

Imports Spire.Pdf
Imports Spire.Pdf.Graphics
Imports Spire.Pdf.Interchange.TaggedPdf
Imports System.Drawing
 
Namespace CreateTaggedPDF
	Class Program
		Private Shared Sub Main(args As String())
			'Create a PdfDocument object
			Dim pdf As New PdfDocument()

			'Add a page
			pdf.Pages.Add(PdfPageSize.A4)

			'Set the tab order to follow the structure tree
			pdf.Pages(0).SetTabOrder(TabOrder.[Structure])

			'Create a PdfTaggedContent object
			Dim taggedContent As New PdfTaggedContent(pdf)
			taggedContent.SetLanguage("en-US")
			taggedContent.SetTitle("test")
 
			'Create the font, brush, and string format
			Dim font As New PdfTrueTypeFont(New Font("Times New Roman", 10), True)
			Dim brush As New PdfSolidBrush(Color.Black)
			Dim format As New PdfStringFormat(PdfTextAlignment.Left)
 
			'Add the structure elements
			Dim article As PdfStructureElement = taggedContent.StructureTreeRoot.AppendChildElement(PdfStandardStructTypes.Document)
			Dim paragraph1 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Paragraph)
			Dim span1 As PdfStructureElement = paragraph1.AppendChildElement(PdfStandardStructTypes.Span)
			span1.BeginMarkedContent(pdf.Pages(0))
			'Draw the content on the page
			pdf.Pages(0).Canvas.DrawString("A PDF tag is the key to accessing the contents of PDF documents with supporting technologies such as screen readers. ", font, brush, New Rectangle(40, 0, 480, 80), format)
			span1.EndMarkedContent(pdf.Pages(0))
 
			Dim paragraph2 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Paragraph)
			paragraph2.BeginMarkedContent(pdf.Pages(0))
			pdf.Pages(0).Canvas.DrawString("A PDF tag arranges the PDF content in a hierarchical architecture or tag tree.", font, brush, New Rectangle(40, 80, 480, 80), format)
			paragraph2.EndMarkedContent(pdf.Pages(0))
 
			Dim figure1 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Figure)
			'Set Alternate text 
			figure1.Alt = "replacement text1"
			figure1.BeginMarkedContent(pdf.Pages(0), Nothing)
			Dim image As PdfImage = PdfImage.FromFile("logo.png")
			'Draw the image on the page
			pdf.Pages(0).Canvas.DrawImage(image, New PointF(40, 200), New SizeF(100, 100))
			figure1.EndMarkedContent(pdf.Pages(0))
 
			Dim figure2 As PdfStructureElement = article.AppendChildElement(PdfStandardStructTypes.Figure)
			'Set Alternate text
			figure2.Alt = "replacement text2"
			figure2.BeginMarkedContent(pdf.Pages(0), Nothing)
			pdf.Pages(0).Canvas.DrawRectangle(PdfPens.Black, New Rectangle(300, 200, 100, 100))
			figure2.EndMarkedContent(pdf.Pages(0))
 
			'Save the document
			pdf.SaveToFile("CreateTaggedFile_result.pdf")
			System.Diagnostics.Process.Start("CreateTaggedFile_result.pdf")
		End Sub
	End Class
End Namespace
