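Deploying Elasticsearch in a Docker Container and Configuring the IK Analyzer

Author: 冷炫風刃

This article shows how to deploy Elasticsearch in a Docker container and configure the IK analyzer for Chinese text, with example commands for each step.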

1. Environment Setup

1.1 Pull the Elasticsearch image

# Pull a specific version (a stable 7.x or 8.x release is recommended)
docker pull elasticsearch:7.17.20
# Confirm the image is present
docker images | grep elasticsearch

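1.2 Create the mount directories

# Create the data, config, plugins, and logs directories
mkdir -p /opt/es/{data,config,plugins,logs}
# Set ownership (the ES container runs as the user with uid 1000)
chown -R 1000:1000 /opt/es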

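1.3 Basic configuration file

Create /opt/es/config/elasticsearch.yml with the following content. Disabling security and allowing CORS from any origin is convenient for local testing; do not expose such a node on a public network.

cluster.name: "docker-cluster"
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
xpack.security.enabled: false
http.cors.enabled: true
http.cors.allow-origin: "*"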
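
Steps 1.2 and 1.3 can be combined into one repeatable helper. The following is a minimal sketch, not part of the original article: the function name prepare_es_root is hypothetical, and the root directory is passed as an argument so the helper can be tried in a scratch directory before using /opt/es.

```shell
# prepare_es_root ROOT
# Creates ROOT/{data,config,plugins,logs} and writes a minimal elasticsearch.yml.
# prepare_es_root is a hypothetical helper name; the settings are the ones from 1.3.
prepare_es_root() (
  root=$1
  mkdir -p "$root/data" "$root/config" "$root/plugins" "$root/logs" || exit 1

  # Quoted heredoc delimiter: the content is written literally, with no expansion.
  cat > "$root/config/elasticsearch.yml" <<'EOF'
cluster.name: "docker-cluster"
network.host: 0.0.0.0
http.port: 9200
discovery.type: single-node
xpack.security.enabled: false
http.cors.enabled: true
http.cors.allow-origin: "*"
EOF

  # chown requires root; when run as an ordinary user, this step is skipped.
  if [ "$(id -u)" -eq 0 ]; then
    chown -R 1000:1000 "$root"
  fi
)

# Usage (as root): prepare_es_root /opt/es
```

Passing the root as a parameter keeps the helper safe to rerun: mkdir -p ignores existing directories and the configuration file is simply rewritten.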

2. Start the Elasticsearch Container

2.1 Run the container

docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  -v /opt/es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /opt/es/data:/usr/share/elasticsearch/data \
  -v /opt/es/plugins:/usr/share/elasticsearch/plugins \
  -v /opt/es/logs:/usr/share/elasticsearch/logs \
  --restart=unless-stopped \
  elasticsearch:7.17.20

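2.2 Verify startup

# Check the container status
docker ps | grep elasticsearch
# Test the ES service
curl http://localhost:9200

Expected response (abridged; the real response has more fields, and "name" defaults to the container hostname unless node.name is set):

{
  "name" : "<container hostname>",
  "cluster_name" : "docker-cluster",
  "version" : { "number" : "7.17.20" }
}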
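
Elasticsearch needs some time to start, so a curl issued right after docker run may be refused. A rough sketch of a polling helper follows; the name wait_for_es and its three parameters (URL, number of attempts, delay in seconds) are assumptions made for illustration.

```shell
# wait_for_es [URL] [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers successfully or the attempts are used up.
# wait_for_es is a hypothetical helper, not something shipped with Elasticsearch.
wait_for_es() (
  url=${1:-http://localhost:9200}
  attempts=${2:-30}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -s: silent, -o /dev/null: discard the body
    if curl -fs -o /dev/null "$url"; then
      echo "Elasticsearch is reachable at $url (attempt $i)"
      exit 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Elasticsearch did not respond at $url after $attempts attempts" >&2
  exit 1
)

# Usage: wait_for_es http://localhost:9200 30 2 && curl http://localhost:9200
```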

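3. Install the IK Analyzer

3.1 Method 1: online install (inside the container)

# Enter the container
docker exec -it elasticsearch /bin/bash
# Install IK online (the plugin version must match the ES version exactly)
# If this release asset cannot be found, check the IK project README for the current download location
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.20/elasticsearch-analysis-ik-7.17.20.zip
# Exit and restart
exit
docker restart elasticsearch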

3.2 Method 2: offline install (recommended)

# 1. Download the IK plugin package that matches your ES version
cd /opt/es/plugins
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.17.20/elasticsearch-analysis-ik-7.17.20.zip

# 2. Unzip into the ik directory, then remove the archive so that only plugin directories remain
unzip elasticsearch-analysis-ik-7.17.20.zip -d ik
rm -f elasticsearch-analysis-ik-7.17.20.zip

# 3. Set ownership
chown -R 1000:1000 /opt/es/plugins/ik

# 4. Restart the container
docker restart elasticsearch
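
Because the plugin version must match the ES version exactly, it can help to check the unpacked plugin before restarting. Elasticsearch plugins declare their target version in plugin-descriptor.properties under the key elasticsearch.version. The sketch below reads that key; the helper names plugin_es_version and check_plugin_version are hypothetical.

```shell
# plugin_es_version PLUGIN_DIR
# Prints the Elasticsearch version the unpacked plugin was built for.
plugin_es_version() {
  sed -n 's/^elasticsearch\.version=//p' "$1/plugin-descriptor.properties" | tr -d '\r'
}

# check_plugin_version PLUGIN_DIR ES_VERSION
# Succeeds only when the plugin targets exactly ES_VERSION.
check_plugin_version() (
  found=$(plugin_es_version "$1")
  if [ "$found" = "$2" ]; then
    echo "OK: plugin targets Elasticsearch $found"
  else
    echo "Mismatch: plugin targets '$found', Elasticsearch is '$2'" >&2
    exit 1
  fi
)

# Usage: check_plugin_version /opt/es/plugins/ik 7.17.20
```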

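3.3 Method 3: build with a Dockerfile (recommended for production)

Place the plugin zip next to a Dockerfile with the following content:

FROM elasticsearch:7.17.20
COPY elasticsearch-analysis-ik-7.17.20.zip /tmp/
RUN ./bin/elasticsearch-plugin install -b file:///tmp/elasticsearch-analysis-ik-7.17.20.zip

Then build the image and start a container from it:

docker build -t elasticsearch-with-ik:7.17.20 .
# Remove any earlier container named elasticsearch first (docker rm -f elasticsearch).
# No elasticsearch.yml is mounted here, so single-node discovery is set through the environment.
docker run -d --name elasticsearch -p 9200:9200 -e "discovery.type=single-node" elasticsearch-with-ik:7.17.20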

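4. Verify the IK Analyzer

4.1 List installed plugins

docker exec -it elasticsearch ./bin/elasticsearch-plugin list

Expected output: analysis-ik (or ik, the directory name, if the plugin was unpacked manually as in Method 2)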

4.2 Test tokenization

curl -X POST "http://localhost:9200/_analyze" -H "Content-Type: application/json" -d'
{
  "analyzer": "ik_max_word",
  "text": "中华人民共和国国徽"
}'

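Sample response for ik_max_word (finest granularity), abridged; the real response also includes offsets and a type for each token: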

{
  "tokens": [
    {"token": "中华人民共和国", "position": 0},
    {"token": "中华人民", "position": 1},
    {"token": "中华", "position": 2},
    {"token": "华人", "position": 3},
    {"token": "人民共和国", "position": 4},
    {"token": "人民", "position": 5},
    {"token": "共和国", "position": 6},
    {"token": "共和", "position": 7},
    {"token": "国", "position": 8},
    {"token": "国徽", "position": 9}
  ]
}

# Test ik_smart (coarsest segmentation)
curl -X POST "http://localhost:9200/_analyze" -H "Content-Type: application/json" -d'
{
  "analyzer": "ik_smart",
  "text": "中华人民共和国国徽"
}'
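
The raw _analyze response is verbose, which makes the two analyzers hard to compare by eye. As a rough sketch, assuming only grep and sed are available (jq may not be installed), a small filter can reduce the response to one token per line. The name extract_tokens is hypothetical.

```shell
# extract_tokens: reads an _analyze JSON response on stdin, prints one token per line.
# Assumes tokens contain no escaped double quotes; a real JSON parser is more robust.
extract_tokens() {
  grep -o '"token"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"token"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Usage with a running container:
# curl -s -X POST "http://localhost:9200/_analyze" -H "Content-Type: application/json" \
#   -d '{"analyzer":"ik_smart","text":"中华人民共和国国徽"}' | extract_tokens
```

Running the same text through ik_max_word and ik_smart and comparing the two lists shows the difference in granularity directly.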

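5. Custom Dictionary Configuration

5.1 Locate the IK configuration files

# Show where the container's directories are mounted
docker inspect elasticsearch | grep -A 10 "Mounts"

When IK is installed with elasticsearch-plugin (Methods 1 and 3), its configuration files are in /usr/share/elasticsearch/config/analysis-ik/. When the zip is unpacked manually (Method 2), they stay in the plugin directory, /usr/share/elasticsearch/plugins/ik/config/ (that is, /opt/es/plugins/ik/config/ on the host). The steps below use the first path; adjust it if you used Method 2.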

5.2 Edit IKAnalyzer.cfg.xml

# Enter the container
docker exec -it elasticsearch /bin/bash
# Edit the configuration file (if the image has no vi, edit a copy on the host and transfer it with docker cp)
vi /usr/share/elasticsearch/config/analysis-ik/IKAnalyzer.cfg.xml

Add the extension dictionary entries:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Custom extension dictionary -->
    <entry key="ext_dict">ext_dict.dic</entry>
    <!-- Custom stopword dictionary -->
    <entry key="ext_stopwords">stopword.dic</entry>
</properties>

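5.3 Create the custom dictionary files

# Inside the container, go to the configuration directory
cd /usr/share/elasticsearch/config/analysis-ik/
# Create the extension dictionary (one word per line)
echo "传智播客" >> ext_dict.dic
echo "奥力给" >> ext_dict.dic
echo "人工智能" >> ext_dict.dic
# Add stopwords to the stopword dictionary
echo "的" >> stopword.dic
echo "了" >> stopword.dic
echo "呢" >> stopword.dic
# Leave the container
exit
# Restart ES
# Note: these files live inside the container and are lost when it is removed; section 5.4 avoids this
docker restart elasticsearch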
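
Repeating the echo commands above appends duplicate entries to the dictionary. A minimal sketch of an idempotent alternative follows; the helper name add_dict_word is hypothetical, and it works on any dictionary file path, inside the container or on the host.

```shell
# add_dict_word FILE WORD
# Appends WORD as its own line unless FILE already contains exactly that line.
add_dict_word() (
  file=$1
  word=$2
  # Already listed? -x matches the whole line, -F treats the word literally.
  if [ -f "$file" ] && grep -qxF -- "$word" "$file"; then
    exit 0
  fi
  # Make sure existing content ends with a newline before appending.
  if [ -s "$file" ] && [ $(tail -c 1 "$file" | wc -l) -eq 0 ]; then
    printf '\n' >> "$file"
  fi
  printf '%s\n' "$word" >> "$file"
)

# Usage: add_dict_word ext_dict.dic "人工智能"
```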

5.4 Manage dictionaries through a mount (recommended)

# 1. Copy the default configuration from the container to the host
docker cp elasticsearch:/usr/share/elasticsearch/config/analysis-ik ./analysis-ik-config
# 2. Edit the configuration on the host
vi ./analysis-ik-config/IKAnalyzer.cfg.xml
echo "自定义词汇" >> ./analysis-ik-config/ext_dict.dic
chown -R 1000:1000 ./analysis-ik-config
# 3. Remove the old container (the name would otherwise conflict), then start a new one with the config directory mounted
docker rm -f elasticsearch
docker run -d \
  --name elasticsearch \
  -p 9200:9200 \
  -v /opt/es/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
  -v /opt/es/data:/usr/share/elasticsearch/data \
  -v /opt/es/plugins:/usr/share/elasticsearch/plugins \
  -v /opt/es/logs:/usr/share/elasticsearch/logs \
  -v "$(pwd)/analysis-ik-config":/usr/share/elasticsearch/config/analysis-ik \
  elasticsearch:7.17.20

6. Verify the Custom Dictionary

curl -X POST "http://localhost:9200/_analyze" -H "Content-Type: application/json" -d'
{
  "analyzer": "ik_max_word",
  "text": "传智播客Java就业率超过95%,奥力给!"
}'

If "传智播客" and "奥力给" each come back as a single token, the custom dictionary is in effect.
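
This check can also be scripted rather than read by eye. The sketch below assumes the _analyze response is piped in on stdin and that tokens contain no whitespace; the helper name has_token is hypothetical.

```shell
# has_token WORD
# Reads an _analyze JSON response on stdin; succeeds if WORD appears as a whole token.
# Whitespace is stripped first so that compact and pretty-printed responses both match.
has_token() {
  tr -d '[:space:]' | grep -qF "\"token\":\"$1\""
}

# Usage with a running container:
# curl -s -X POST "http://localhost:9200/_analyze" -H "Content-Type: application/json" \
#   -d '{"analyzer":"ik_max_word","text":"传智播客Java就业率超过95%,奥力给!"}' \
#   | has_token "传智播客" && echo "custom dictionary is active"
```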

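7. Troubleshooting

7.1 Container exits right after starting

# Inspect the error log
docker logs elasticsearch
# Common causes:
# 1. IK version does not match the ES version → download the matching version
# 2. Insufficient permissions on the plugins directory → chown -R 1000:1000 /opt/es/plugins
# 3. Not enough memory → lower the heap, e.g. ES_JAVA_OPTS="-Xms256m -Xmx256m"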
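7.2 Analyzer has no effect

# Check that the plugin is installed
docker exec -it elasticsearch ./bin/elasticsearch-plugin list

# Check the dictionary encoding (must be UTF-8 without a BOM).
# Plain `file` reports a BOM when present; `file -bi` only shows the charset.
# This path applies to Method 2; otherwise point it at your analysis-ik config directory.
file /opt/es/plugins/ik/config/*.dic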
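7.3 Custom dictionary not loaded

# Search the ES log for dictionary loading messages
docker logs elasticsearch 2>&1 | grep -i "dic"
# Look for lines that mention ext_dict.dic and stopword.dic;
# the exact wording of these log lines depends on the IK version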

8. Quick Reference

Operation            Command
Start ES             docker start elasticsearch
Stop ES              docker stop elasticsearch
Restart ES           docker restart elasticsearch
View logs            docker logs -f elasticsearch
Enter the container  docker exec -it elasticsearch /bin/bash
Test tokenization    curl -X POST "localhost:9200/_analyze" -H "Content-Type: application/json" -d'{...}'
