Building a Polished Two-Column HTML Text Diff Tool with Python
Author: 幸福清风
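In day-to-day development, documentation work, and code review, we often need to compare two text files. The system diff tool or an IDE plugin covers the basics, but the output is rarely intuitive, especially when it has to be shared with non-technical readers.
In this article we use Python, difflib, HTML, and CSS to build a clean two-column comparison tool by hand. It highlights character-level differences inside each line and generates an HTML report that opens in any browser.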
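Feature highlights
- Side-by-side columns: the two files are shown left and right, so differences are visible at a glance
- Line numbers: each column shows the line number to the left of its content; when a row exists on only one side (a deleted or added line), the other side shows - instead
- Inline difference highlighting: deleted text is red with a strikethrough (<span class="highlight-delete">), and added text is bold green (<span class="highlight-insert">)
- Context grouping: difflib.SequenceMatcher identifies the changed blocks and keeps the surrounding context lines
- Automatic wrapping and responsive layout: long lines wrap, and the table adapts to different screen widths
- Modern UI: flat design with shadows, rounded corners, and hover effects
- One-step preview: once the report is generated, it opens in the local default browser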
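Core technology stack
Technology | Purpose |
---|---|
difflib | Computing text differences (line level and character level) |
os, urllib.parse | File path handling and URL construction |
webbrowser | Opening the browser automatically |
HTML/CSS | Rendering the visual interface |
pre-wrap + word-break | Automatic wrapping inside code blocks |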
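Implementation walkthrough
1. Reading the file contents

    def read_file(filepath):
        # Return the file as a list of lines without their trailing newline.
        with open(filepath, 'r', encoding='utf-8') as f:
            return [line.rstrip('\n') for line in f]

Note: the trailing \n of every line is stripped so that it does not interfere with the character-level diff. The line structure itself is preserved, because each line becomes one element of the list.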
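To see why stripping matters, here is a minimal sketch. The helper name split_lines and the sample strings are illustrative assumptions, not part of the tool. It compares a file that ends with a final newline against one that does not.

```python
import io


def split_lines(text, strip_newline):
    """Split text the way readlines() would, optionally dropping trailing newlines."""
    lines = io.StringIO(text).readlines()
    if strip_newline:
        return [line.rstrip('\n') for line in lines]
    return lines


with_final_newline = "alpha\nbeta\n"
without_final_newline = "alpha\nbeta"

# Raw lines differ only because of the invisible newline on the last line.
raw_equal = split_lines(with_final_newline, False) == split_lines(without_final_newline, False)
# After stripping, both inputs give the same list, so no spurious diff appears.
stripped_equal = split_lines(with_final_newline, True) == split_lines(without_final_newline, True)

print(raw_equal, stripped_equal)
```

Without stripping, the last line of the two files would be reported as modified even though the visible text is identical.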
2. Character-level highlighting inside a line
difflib.SequenceMatcher.get_opcodes() describes how one string turns into the other, and each opcode is mapped to a highlight span:

    import difflib
    import html

    def highlight_line_diff(text1, text2):
        sm = difflib.SequenceMatcher(None, text1, text2)
        result1 = []
        result2 = []
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            # Escape &, < and > so file content cannot break the generated HTML.
            part1 = html.escape(text1[i1:i2])
            part2 = html.escape(text2[j1:j2])
            if tag == 'equal':
                result1.append(part1)
                result2.append(part2)
            elif tag == 'delete':
                result1.append(f'<span class="highlight-delete">{part1}</span>')
            elif tag == 'insert':
                result2.append(f'<span class="highlight-insert">{part2}</span>')
            elif tag == 'replace':
                result1.append(f'<span class="highlight-delete">{part1}</span>')
                result2.append(f'<span class="highlight-insert">{part2}</span>')
        # One HTML string for the left column, one for the right column.
        return ''.join(result1), ''.join(result2)
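Example:
- Original: Hello world
- Modified: Hello Python
- Left output: Hello <span class="highlight-delete">w</span>o<span class="highlight-delete">rld</span>
- Right output: Hello <span class="highlight-insert">Pyth</span>o<span class="highlight-insert">n</span>
The comparison works character by character, so the o shared by world and Python counts as unchanged and stays unhighlighted.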
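To inspect what the matcher actually reports for this pair of strings, the opcodes can be printed directly. This is only a small exploratory sketch; the variable names are chosen for illustration.

```python
import difflib

old_line = "Hello world"
new_line = "Hello Python"

matcher = difflib.SequenceMatcher(None, old_line, new_line)
opcodes = matcher.get_opcodes()

for tag, i1, i2, j1, j2 in opcodes:
    # Each opcode says how old_line[i1:i2] turns into new_line[j1:j2].
    print(f"{tag:8} {old_line[i1:i2]!r:10} -> {new_line[j1:j2]!r}")
```

Each tuple is (tag, i1, i2, j1, j2), where tag is one of equal, replace, delete, or insert, and the indices are slice bounds into the two strings.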
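3. Showing context with get_grouped_opcodes(n=3)

    matcher = difflib.SequenceMatcher(None, lines1, lines2)
    # Keep up to 3 identical lines before and after each block of changes.
    blocks = matcher.get_grouped_opcodes(n=3)

Instead of showing only the bare differences, the report includes the lines just before and after each change, so readers can follow the logic around the edit. This makes the result much easier to read.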
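A small sketch of the grouping behaviour, using made-up sample lines (ten lines, with one of them modified), shows how the unchanged ranges are trimmed to three context lines on each side:

```python
import difflib

lines1 = [f"line {n}" for n in range(10)]
lines2 = list(lines1)
lines2[5] = "line five"  # a single modified line in the middle

matcher = difflib.SequenceMatcher(None, lines1, lines2)
groups = list(matcher.get_grouped_opcodes(n=3))

for group in groups:
    for tag, i1, i2, j1, j2 in group:
        # Index pairs are slice bounds into lines1 and lines2.
        print(tag, (i1, i2), (j1, j2))
```

Lines 0 to 1 and line 9 fall outside the three-line context window, so they do not appear in the group at all.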
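4. Building the HTML table structure
Each row is rendered with the following structure (a modified line is shown here):

    <tr>
        <td class="line-num replaced">42</td>
        <td class="content left replaced"><pre>This line was modified: <span class="highlight-delete">old content</span></pre></td>
        <td class="line-num replaced">42</td>
        <td class="content right replaced"><pre>This line was modified: <span class="highlight-insert">new content</span></pre></td>
    </tr>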
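CSS then controls the styling:
- Deleted rows: pink background with a red border
- Added rows: pale green background with a green border
- Modified rows: pale yellow background with an orange border
- Blank placeholders: gray italic text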
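The row structure and class scheme above can be sketched as a small helper. ROW_CLASSES and render_row are hypothetical names introduced for illustration; this simplified version only escapes the text and leaves out the inline highlighting.

```python
import html

# Hypothetical mapping from a difflib opcode tag to the CSS classes of the
# left and right content cells; class names follow the styles listed above.
ROW_CLASSES = {
    "equal": ("", ""),
    "delete": ("deleted", "empty"),
    "insert": ("empty", "added"),
    "replace": ("replaced", "replaced"),
}


def render_row(tag, lineno1, text1, lineno2, text2):
    """Build one <tr> of the diff table; text is escaped but not highlighted."""
    cls1, cls2 = ROW_CLASSES[tag]
    # An empty cell gets a non-breaking space so the row keeps its height.
    cell1 = html.escape(text1) or "&nbsp;"
    cell2 = html.escape(text2) or "&nbsp;"
    return (
        "<tr>"
        f'<td class="line-num">{lineno1}</td>'
        f'<td class="content left {cls1}"><pre>{cell1}</pre></td>'
        f'<td class="line-num">{lineno2}</td>'
        f'<td class="content right {cls2}"><pre>{cell2}</pre></td>'
        "</tr>"
    )


row = render_row("delete", 7, "x < y", "-", "-")
print(row)
```

Keeping the tag-to-class mapping in one dictionary means a new row style only needs a new entry and a matching CSS rule.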
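5. Polishing the CSS (key techniques)

    td.content pre {
        white-space: pre-wrap;   /* keep spaces and line breaks, but still wrap */
        word-wrap: break-word;   /* let very long words break as well */
    }
    table.diff {
        table-layout: fixed;     /* fixed column widths, so columns stay aligned */
        width: 100%;
    }

Tip: without table-layout: fixed, long text stretches the table beyond its container!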
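Usage
A single function call is all it takes:

    generate_beautiful_side_by_side_diff(
        file1_path="file1.txt",
        file2_path="file2.txt",
        open_in_browser=True,
        output_html="beautified_side_by_side_diff.html"
    )

When it runs, it will:
- Generate the HTML file
- Print its path
- Open the preview in the default browser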
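Opening the report relies on turning a local path into a file:// URL. As an alternative sketch, assuming you prefer the standard-library pathlib over manual string handling, Path.as_uri() takes care of the leading slashes and percent-encoding. The helper name to_file_url is hypothetical.

```python
from pathlib import Path


def to_file_url(path):
    """Convert a local path to an absolute, percent-encoded file:// URL."""
    # resolve() makes the path absolute, which as_uri() requires.
    return Path(path).resolve().as_uri()


url = to_file_url("diff result.html")
print(url)
```

The resulting string can be passed to webbrowser.open(url). Spaces and non-ASCII characters in the file name are encoded, so the URL stays valid.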
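Preview (described in words)
Picture the result:
- At the top, a dark blue title bar reading "Side-by-Side Document Comparison"
- Below it, a tidy two-column table with file1.txt on the left and file2.txt on the right
- Modified rows have a pale yellow background and an orange bar on the left edge
- Deleted rows are struck through and shown in red
- Added rows are highlighted in green, with a green border
- Within each row, the changed characters are marked precisely
- At the bottom, a legend: red = deleted | green = added | orange = modified
Much clearer than Git's command-line diff, isn't it?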
Complete code (ready to copy and run)

    import difflib
    import html
    import os
    import webbrowser
    from urllib.parse import quote


    def read_file(filepath):
        """Read a file and return its lines as a list, trailing newlines stripped."""
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                return [line.rstrip('\n') for line in f]
        except Exception as e:
            print(f"Cannot read file: {filepath}, error: {e}")
            return []


    def highlight_line_diff(text1, text2):
        """Highlight character-level differences between two strings."""
        sm = difflib.SequenceMatcher(None, text1, text2)
        result1 = []
        result2 = []
        for tag, i1, i2, j1, j2 in sm.get_opcodes():
            # Escape &, < and > so file content cannot break the generated HTML.
            part1 = html.escape(text1[i1:i2])
            part2 = html.escape(text2[j1:j2])
            if tag == 'equal':
                result1.append(part1)
                result2.append(part2)
            elif tag == 'delete':
                result1.append(f'<span class="highlight-delete">{part1}</span>')
            elif tag == 'insert':
                result2.append(f'<span class="highlight-insert">{part2}</span>')
            elif tag == 'replace':
                result1.append(f'<span class="highlight-delete">{part1}</span>')
                result2.append(f'<span class="highlight-insert">{part2}</span>')
        return ''.join(result1), ''.join(result2)


    def generate_beautiful_side_by_side_diff(file1_path, file2_path, open_in_browser,
                                             output_html="diff_comparison.html"):
        """Generate a styled two-column HTML comparison of two text files."""
        lines1 = read_file(file1_path)
        lines2 = read_file(file2_path)
        filename1 = html.escape(os.path.basename(file1_path))
        filename2 = html.escape(os.path.basename(file2_path))

        matcher = difflib.SequenceMatcher(None, lines1, lines2)
        blocks = matcher.get_grouped_opcodes(n=3)  # 3 context lines around each change

        table_rows = []
        for group in blocks:
            for tag, i1, i2, j1, j2 in group:
                if tag == 'equal':
                    for k, (line1, line2) in enumerate(zip(lines1[i1:i2], lines2[j1:j2])):
                        hl_line1, hl_line2 = highlight_line_diff(line1, line2)
                        table_rows.append(f"""
            <tr>
                <td class="line-num">{i1 + k + 1}</td>
                <td class="content left"><pre>{hl_line1 or '&nbsp;'}</pre></td>
                <td class="line-num">{j1 + k + 1}</td>
                <td class="content right"><pre>{hl_line2 or '&nbsp;'}</pre></td>
            </tr>""")

                elif tag == 'delete':
                    for k, line in enumerate(lines1[i1:i2]):
                        hl_line, _ = highlight_line_diff(line, "")
                        table_rows.append(f"""
            <tr>
                <td class="line-num del">{i1 + k + 1}</td>
                <td class="content left deleted"><pre>{hl_line or '&nbsp;'}</pre></td>
                <td class="line-num">-</td>
                <td class="content right empty"><pre>-</pre></td>
            </tr>""")

                elif tag == 'insert':
                    for k, line in enumerate(lines2[j1:j2]):
                        _, hl_line = highlight_line_diff("", line)
                        table_rows.append(f"""
            <tr>
                <td class="line-num">-</td>
                <td class="content left empty"><pre>-</pre></td>
                <td class="line-num ins">{j1 + k + 1}</td>
                <td class="content right added"><pre>{hl_line or '&nbsp;'}</pre></td>
            </tr>""")

                elif tag == 'replace':
                    # Pair up the lines of both sides; the shorter side is padded.
                    for k in range(max(i2 - i1, j2 - j1)):
                        has1 = i1 + k < i2
                        has2 = j1 + k < j2
                        line1 = lines1[i1 + k] if has1 else ""
                        line2 = lines2[j1 + k] if has2 else ""
                        hl_line1, hl_line2 = highlight_line_diff(line1, line2)
                        lineno1 = i1 + k + 1 if has1 else "-"
                        lineno2 = j1 + k + 1 if has2 else "-"
                        # Decide by existence, so a real but empty line is not shown as padding.
                        cls1 = "replaced" if has1 else "empty"
                        cls2 = "replaced" if has2 else "empty"
                        table_rows.append(f"""
            <tr>
                <td class="line-num {cls1}">{lineno1}</td>
                <td class="content left {cls1}"><pre>{hl_line1 or '&nbsp;'}</pre></td>
                <td class="line-num {cls2}">{lineno2}</td>
                <td class="content right {cls2}"><pre>{hl_line2 or '&nbsp;'}</pre></td>
            </tr>""")

        if not table_rows:
            table_rows.append("""
            <tr>
                <td colspan="2" style="text-align:center; color:green;">✅ The two files are identical</td>
                <td colspan="2" style="text-align:center; color:green;">✅ No differences found</td>
            </tr>
            """)

        custom_css = """
        <style>
            body {
                font-family: "Microsoft YaHei", Arial, sans-serif;
                background-color: #f8f9fa;
                color: #333;
                padding: 20px;
                line-height: 1.6;
            }
            .container {
                max-width: 1400px;
                margin: 0 auto;
                background: white;
                border-radius: 8px;
                box-shadow: 0 4px 8px rgba(0,0,0,0.1);
                overflow: hidden;
            }
            header {
                background: #2c3e50;
                color: white;
                padding: 15px 20px;
                text-align: center;
            }
            header h1 {
                margin: 0;
                font-size: 1.5em;
            }
            .diff-table-container {
                overflow-x: auto;
            }
            table.diff {
                width: 100%;
                border-collapse: collapse;
                table-layout: fixed;
                font-size: 14px;
            }
            table.diff th {
                background: #34495e;
                color: white;
                padding: 10px 8px;
                text-align: center;
                position: sticky;
                top: 0;
                z-index: 10;
            }
            td.line-num {
                width: 40px;
                text-align: right;
                font-weight: bold;
                color: #777;
                background: #f8f8f8;
                padding: 6px 4px;
                user-select: none;
                white-space: nowrap;
                font-family: monospace;
                font-size: 13px;
            }
            td.line-num.del { color: #e74c3c; }
            td.line-num.ins { color: #2ecc71; }
            td.content {
                width: 45%;
                white-space: normal;
                word-wrap: break-word;
                word-break: break-word;
                padding: 6px 8px;
                vertical-align: top;
            }
            td.content pre {
                margin: 0;
                font-family: "Consolas", "Menlo", "Monaco", monospace;
                font-size: 13px;
                line-height: 1.5;
                white-space: pre-wrap;
                word-wrap: break-word;
                word-break: break-word;
                background: none;
                border: none;
                padding: 0;
            }
            table.diff tr:hover {
                background-color: #f1f9ff;
            }
            td.deleted {
                background-color: #fff8f8;
                border-left: 3px solid #e74c3c;
            }
            td.added {
                background-color: #f8fff8;
                border-left: 3px solid #2ecc71;
            }
            td.replaced {
                background-color: #fffff0;
                border-left: 3px solid #f39c12;
            }
            td.empty {
                color: #ccc;
                font-style: italic;
            }
            .highlight-delete {
                background-color: #ffebee;
                text-decoration: line-through;
                color: #c62828;
                padding: 0 2px;
                border-radius: 3px;
            }
            .highlight-insert {
                background-color: #e8f5e8;
                color: #2e7d32;
                font-weight: bold;
                padding: 0 2px;
                border-radius: 3px;
            }
        </style>
        """

        full_html = f"""<!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <title>Document Comparison Result</title>
        {custom_css}
    </head>
    <body>
        <div class="container">
            <header>
                <h1>📄 Side-by-Side Document Comparison</h1>
                <p>{filename1} <strong>vs</strong> {filename2}</p>
            </header>
            <div class="diff-table-container">
                <table class="diff">
                    <thead>
                        <tr>
                            <th style="width:40px">Line</th>
                            <th style="width:45%">{filename1}</th>
                            <th style="width:40px">Line</th>
                            <th style="width:45%">{filename2}</th>
                        </tr>
                    </thead>
                    <tbody>
                        {''.join(table_rows)}
                    </tbody>
                </table>
            </div>
            <div style="padding: 15px; font-size: 13px; color: #666; text-align: center; background: #f5f5f5;">
                <strong>Legend:</strong>
                <span style="color: #e74c3c;">Deleted</span> |
                <span style="color: #2ecc71;">Added</span> |
                <span style="color: #f39c12;">Modified</span>
                • Inline changes are highlighted
            </div>
        </div>
    </body>
    </html>"""

        abs_path = os.path.abspath(output_html)
        with open(abs_path, 'w', encoding='utf-8') as f:
            f.write(full_html)
        print(f"✅ Side-by-side comparison generated: {abs_path}")
        print("👉 Open it in a browser to view the result.")

        if open_in_browser:
            try:
                # Build a file:// URL; Windows drive paths (C:/...) need a leading slash.
                url_path = abs_path.replace(os.sep, '/')
                if not url_path.startswith('/'):
                    url_path = '/' + url_path
                file_url = 'file://' + quote(url_path, safe='/:')
                webbrowser.open(file_url)
                print(f"👉 Opened the preview in the browser: {file_url}")
            except Exception:
                print(f"⚠️ Could not open the browser, please open the file manually:\n   {abs_path}")


    # === Usage example ===
    if __name__ == "__main__":
        file1 = r"C:\Users\Administrator\Desktop\file1.txt"
        file2 = r"C:\Users\Administrator\Desktop\file2.txt"
        generate_beautiful_side_by_side_diff(file1, file2, True, "beautified_side_by_side_diff.html")
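Conclusion
This small tool is only about 300 lines of code, yet it brings together text processing, sequence matching, front-end rendering, and user-experience design. It is practical in its own right, and it also works well as a worked example for learning difflib and HTML/CSS layout.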