Learning Elasticsearch: Document API Operations
Author: 程序员皮卡秋
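Preface
This section covers the document API. Before starting, it is worth reviewing the previous sections to get the overall picture and avoid mixing up concepts. The previous sections were all about indices, which are tied to documents. A document can be thought of as a piece of data, and data can be created, deleted, updated, and queried, so this section walks through those operations on documents.
This article is hands-on, so let's get straight to it.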
Creating documents
We reuse the earlier example, the class_1 index. First, look at its structure:
GET /class_1
Response:
{ "class_1" : { "aliases" : { "class" : { } }, "mappings" : { "properties" : { "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "num" : { "type" : "long" } } }, "settings" : { "index" : { "refresh_interval" : "3s", "number_of_shards" : "3", "provided_name" : "class_1", "creation_date" : "1670812583980", "number_of_replicas" : "1", "uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "version" : { "created" : "7060299" } } } } }
As the structure shows, the index has two main fields, name and num. So how do we add data to it?
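Creating documents falls into the following cases:
- Create a single document with a specified ID: _doc endpoint + PUT request + id parameter
- Create a single document without a specified ID: _doc endpoint + POST request
- Create a single document with a specified ID and enforce ID uniqueness: _doc endpoint + PUT request + id parameter + op_type=create parameter
- Create documents in bulk with specified IDs: _bulk endpoint + PUT or POST request + create action + _id attribute
- Create documents in bulk without specified IDs: _bulk endpoint + PUT or POST request + create action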
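As a quick summary of the single-document cases listed above, here is a minimal Python sketch that maps each case to its HTTP method and path. The function name build_create_request and its parameters are made up for illustration; the sketch only builds strings and does not talk to a cluster.

```python
from typing import Optional, Tuple


def build_create_request(index: str, doc_id: Optional[str] = None,
                         unique: bool = False) -> Tuple[str, str]:
    """Return the (HTTP method, path) pair for a single-document create.

    Hypothetical helper: it mirrors the three single-document cases only.
    """
    if doc_id is None:
        if unique:
            raise ValueError("uniqueness control needs an explicit ID")
        # No ID given: Elasticsearch generates one, and POST must be used.
        return "POST", f"/{index}/_doc/"
    path = f"/{index}/_doc/{doc_id}"
    if unique:
        # op_type=create makes the request fail if the ID already exists.
        path += "?op_type=create"
    return "PUT", path


print(build_create_request("class_1", "1"))
print(build_create_request("class_1"))
print(build_create_request("class_1", "1", unique=True))
```

The three calls correspond, in order, to "with a specified ID", "without a specified ID", and "with ID uniqueness control".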
Let's go through them one by one.
Single document
With a specified ID
PUT /class_1/_doc/1
{ "name":"a", "num": 5 }
A successful creation returns:
{ "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 1, "_primary_term" : 3 }
Without a specified ID
PUT /class_1/_doc/
{ "name":"b", "num": 6 }
{ "error" : "Incorrect HTTP method for uri [/class_1/_doc/?pretty=true] and method [PUT], allowed: [POST]", "status" : 405 }
The request fails, and the error tells us to use POST here:
POST /class_1/_doc/
{ "name":"b", "num": 6 }
{ "_index" : "class_1", "_type" : "_doc", "_id" : "h2Fg-4UBECmbBdQA6VLg", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 3 }
As you can see, when no ID is specified, the _id of the created document is generated automatically.
ID uniqueness control
PUT /class_1/_doc/1?op_type=create
{ "name":"c", "num": 7 }
{ "error" : { "root_cause" : [ { "type" : "version_conflict_engine_exception", "reason" : "[1]: version conflict, document already exists (current version [1])", "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "shard" : "2", "index" : "class_1" } ], "type" : "version_conflict_engine_exception", "reason" : "[1]: version conflict, document already exists (current version [1])", "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "shard" : "2", "index" : "class_1" }, "status" : 409 }
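As you can see, the creation fails with "document already exists", because a document with ID 1 was already created above.
Bulk documents
With specified IDs
PUT class_1/_bulk
{ "create":{ "_id": 2 } }
{"name":"d","num": 8}
{ "create":{ "_id": 3 } }
{ "name":"e","num": 9}
{ "create":{ "_id": 4 } }
{"name":"f","num": 10}
Tip: the request body must not contain blank lines, and each JSON object {} must sit on a single line.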
{ "took" : 25, "errors" : false, "items" : [ { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "2", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 4, "status" : 201 } }, { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "3", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 1, "_primary_term" : 4, "status" : 201 } }, { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "4", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 2, "_primary_term" : 4, "status" : 201 } } ] }
Without specified IDs
This is simple: just drop the _id attribute.
PUT class_1/_bulk
{ "create":{ } }
{"name":"g","num": 8}
{ "create":{ } }
{ "name":"h","num": 9}
{ "create":{ } }
{"name":"i","num": 10}
{ "took" : 30, "errors" : false, "items" : [ { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "iGFt-4UBECmbBdQAnVJe", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 3, "_primary_term" : 4, "status" : 201 } }, { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "iWFt-4UBECmbBdQAnVJg", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 4, "_primary_term" : 4, "status" : 201 } }, { "create" : { "_index" : "class_1", "_type" : "_doc", "_id" : "imFt-4UBECmbBdQAnVJg", "_version" : 1, "result" : "created", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 5, "_primary_term" : 4, "status" : 201 } } ] }
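Since the bulk body is newline-delimited JSON (one object per line, no blank lines), it is easy to get wrong by hand. The following Python sketch shows one way such a body could be assembled with the standard json module. The helper name build_bulk_create_body is an assumption for illustration, and the sketch only builds the text, it does not send it.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple


def build_bulk_create_body(
    docs: Iterable[Tuple[Optional[str], Dict[str, Any]]]
) -> str:
    """Build a _bulk body of create actions (hypothetical helper).

    Each item is (doc_id, source); pass None as doc_id to let
    Elasticsearch generate the ID.
    """
    lines = []
    for doc_id, source in docs:
        meta = {} if doc_id is None else {"_id": doc_id}
        # json.dumps without indent keeps every object on a single line.
        lines.append(json.dumps({"create": meta}))
        lines.append(json.dumps(source))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


body = build_bulk_create_body([
    ("2", {"name": "d", "num": 8}),
    (None, {"name": "g", "num": 8}),
])
print(body, end="")
```

Using json.dumps for every line avoids accidental line breaks inside an object, which is exactly the mistake the tip above warns about.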
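Modifying documents
Modifying documents falls into the following cases:
- Full update of a single document by ID: _doc endpoint + PUT request + id parameter
- Full update of a single document by ID with optimistic locking: _doc endpoint + PUT request + if_seq_no & if_primary_term parameters + id parameter
- Partial update of a single document by ID (including adding fields): _update endpoint + POST request + id parameter
- Full update of documents in bulk by ID: _bulk endpoint + PUT or POST request + index action + _id attribute
- Partial update of documents in bulk by ID (including adding fields): _bulk endpoint + PUT or POST request + update action + _id attribute
- Modify data by condition: _update_by_query endpoint + POST request + ctx._source[field name] = field value
- Add a field to data by condition: _update_by_query endpoint + POST request + ctx._source[field name] = field value
- Remove a field from data by condition: _update_by_query endpoint + POST request + ctx._source.remove(field name)
Again, let's go through them one by one.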
By ID, single document
Full update
PUT /class_1/_doc/1
{ "name":"k", "num": 5 }
{ "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_version" : 2, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 2, "_primary_term" : 3 }
Modify it again:
PUT /class_1/_doc/1
{ "name":"k", "num": 6 }
{ "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_version" : 3, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 3, "_primary_term" : 3 }
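Watch the _version field: the version number increases with every modification.
Full update with optimistic locking
Add the if_seq_no and if_primary_term conditions, using the _seq_no and _primary_term values returned by the previous response:
PUT /class_1/_doc/1?if_seq_no=3&if_primary_term=3
{ "name":"l", "num": 6 }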
{ "_index" : "class_1", "_type" : "_doc", "_id" : "1", "_version" : 4, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 4, "_primary_term" : 3 }
Now run the same request again:
{ "error" : { "root_cause" : [ { "type" : "version_conflict_engine_exception", "reason" : "[1]: version conflict, required seqNo [3], primary term [3]. current document has seqNo [4] and primary term [3]", "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "shard" : "2", "index" : "class_1" } ], "type" : "version_conflict_engine_exception", "reason" : "[1]: version conflict, required seqNo [3], primary term [3]. current document has seqNo [4] and primary term [3]", "index_uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "shard" : "2", "index" : "class_1" }, "status" : 409 }
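The operation fails because the condition is no longer met: the request requires seqNo [3] and primary term [3], but after the previous step the document's seqNo is [4] (its primary term is still [3]).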
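The check behind if_seq_no and if_primary_term can be pictured with a toy in-memory model. The Python sketch below is not how Elasticsearch is implemented; ToyDoc, VersionConflict, and conditional_replace are hypothetical names, and bumping seq_no by exactly one is a simplification (real sequence numbers are assigned per shard).

```python
from dataclasses import dataclass
from typing import Any, Dict


class VersionConflict(Exception):
    """Raised when the caller's expected seq_no/primary_term is stale."""


@dataclass
class ToyDoc:
    source: Dict[str, Any]
    seq_no: int
    primary_term: int


def conditional_replace(doc: ToyDoc, new_source: Dict[str, Any],
                        if_seq_no: int, if_primary_term: int) -> ToyDoc:
    """Replace the document only if nobody changed it in the meantime."""
    if (doc.seq_no, doc.primary_term) != (if_seq_no, if_primary_term):
        raise VersionConflict(
            f"required seqNo [{if_seq_no}], primary term [{if_primary_term}]; "
            f"current seqNo [{doc.seq_no}], primary term [{doc.primary_term}]"
        )
    # Toy assumption: every successful write bumps seq_no by one.
    return ToyDoc(new_source, doc.seq_no + 1, doc.primary_term)


doc = ToyDoc({"name": "k", "num": 6}, seq_no=3, primary_term=3)
doc = conditional_replace(doc, {"name": "l", "num": 6}, 3, 3)  # succeeds
try:
    conditional_replace(doc, {"name": "l", "num": 6}, 3, 3)  # stale condition
except VersionConflict as exc:
    print("conflict:", exc)
```

The second call fails for the same reason as the request above: the caller still expects the old sequence number, so the write is rejected instead of silently overwriting a newer version.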
Partial update
POST /class_1/_update/1
{ "doc":{ "name":"m", "num": 1 } }
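The difference between a full update and a partial update is easiest to see on plain dictionaries. The sketch below is a simplified model, assuming that a partial update merges the doc object into the existing source (nested objects merged key by key, other values overwritten); full_replace and partial_update are hypothetical names, not Elasticsearch functions.

```python
from typing import Any, Dict


def full_replace(existing: Dict[str, Any], new_doc: Dict[str, Any]) -> Dict[str, Any]:
    """Full update: the stored source becomes exactly the new document."""
    return dict(new_doc)


def partial_update(existing: Dict[str, Any], partial: Dict[str, Any]) -> Dict[str, Any]:
    """Partial update: merge the given fields into the existing source."""
    merged = dict(existing)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged key by key.
            merged[key] = partial_update(merged[key], value)
        else:
            # Scalars and lists simply overwrite the old value.
            merged[key] = value
    return merged


existing = {"name": "l", "num": 6}
print(full_replace(existing, {"name": "m"}))
print(partial_update(existing, {"name": "m"}))
print(partial_update(existing, {"desc": "new field"}))
```

With a full update, any field missing from the new body is lost (num disappears in the first call), while a partial update keeps untouched fields and can also add new ones.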
By ID, in bulk
Full update
PUT class_1/_bulk
{ "index":{ "_id": 2 } }
{"name":"d","num": 8}
{ "index":{ "_id": 3 } }
{ "name":"e","num": 9}
{ "index":{ "_id": 4 } }
{"name":"f","num": 10}
This one should be easy to follow: the index action replaces the whole document with the given ID.
Partial update
Change the action to update and wrap the fields in a doc attribute:
PUT class_1/_bulk
{ "update":{ "_id": 2 } }
{ "doc":{"name":"d","num": 8}}
{ "update":{ "_id": 3 } }
{ "doc":{ "name":"e","num": 9}}
{ "update":{ "_id": 4 } }
{ "doc":{"name":"f","num": 10}}
Response:
{ "took" : 32, "errors" : false, "items" : [ { "update" : { "_index" : "class_1", "_type" : "_doc", "_id" : "2", "_version" : 2, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 6, "_primary_term" : 4, "status" : 200 } }, { "update" : { "_index" : "class_1", "_type" : "_doc", "_id" : "3", "_version" : 2, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 7, "_primary_term" : 4, "status" : 200 } }, { "update" : { "_index" : "class_1", "_type" : "_doc", "_id" : "4", "_version" : 2, "result" : "updated", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 8, "_primary_term" : 4, "status" : 200 } } ] }
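A bulk request can succeed as a whole while individual items fail, so it is worth checking the errors flag and the per-item results. Here is a minimal Python sketch of such a check, based on the response shape shown above; failed_bulk_items is a hypothetical helper, and the failing item in the sample is hand-written for illustration rather than captured from a cluster.

```python
from typing import Any, Dict, List


def failed_bulk_items(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the items of a _bulk response that carry an error."""
    if not response.get("errors"):
        return []
    failures = []
    for item in response.get("items", []):
        # Each item has a single key: the action name (create, update, ...).
        for action, result in item.items():
            if "error" in result:
                failures.append({
                    "action": action,
                    "_id": result.get("_id"),
                    "status": result.get("status"),
                    "error": result["error"],
                })
    return failures


# Illustrative response: one successful update, one made-up conflict.
sample = {
    "errors": True,
    "items": [
        {"update": {"_id": "2", "status": 200, "result": "updated"}},
        {"create": {"_id": "3", "status": 409,
                    "error": {"type": "version_conflict_engine_exception"}}},
    ],
}
for failure in failed_bulk_items(sample):
    print(failure["action"], failure["_id"], failure["status"])
```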
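Modifying by condition
Modifying fields
Find the documents whose name matches d and set num=10 and name=e (num is written as a number so that it stays consistent with the long mapping):
POST class_1/_update_by_query
{ "query": { "match": { "name": "d" } }, "script": { "source": "ctx._source['num']=10;ctx._source['name']='e'", "lang": "painless" } }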
Response:
{ "took" : 108, "timed_out" : false, "total" : 1, "updated" : 1, "deleted" : 0, "batches" : 1, "version_conflicts" : 0, "noops" : 0, "retries" : { "bulk" : 0, "search" : 0 }, "throttled_millis" : 0, "requests_per_second" : -1.0, "throttled_until_millis" : 0, "failures" : [ ] }
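When the new values come from application code, hard-coding them into the script string gets awkward. One option is to pass them through the script's params object instead. The Python sketch below builds such a request body; build_update_by_query is a hypothetical helper that only produces the JSON structure and does not send it.

```python
import json
from typing import Any, Dict


def build_update_by_query(match: Dict[str, Any],
                          new_values: Dict[str, Any]) -> Dict[str, Any]:
    """Build an _update_by_query body that sets the given fields."""
    statements = []
    for field in new_values:
        # Keep field names simple so they are safe to embed in the script.
        if not field.isidentifier():
            raise ValueError(f"unsupported field name: {field!r}")
        statements.append(f"ctx._source['{field}'] = params['{field}']")
    return {
        "query": {"match": match},
        "script": {
            "source": "; ".join(statements),
            "lang": "painless",
            # Values travel as params instead of being pasted into the source.
            "params": dict(new_values),
        },
    }


body = build_update_by_query({"name": "d"}, {"num": 10, "name": "e"})
print(json.dumps(body, indent=2))
```

Passing values as params keeps their JSON types intact (10 stays a number) and avoids quoting problems in the script text.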
Adding a field
POST class_1/_update_by_query
{ "query": { "match": { "name": "e" } }, "script": { "source": "ctx._source['desc']=['hhhh']", "lang": "painless" } }
{ "took" : 344, "timed_out" : false, "total" : 2, "updated" : 2, "deleted" : 0, "batches" : 1, "version_conflicts" : 0, "noops" : 0, "retries" : { "bulk" : 0, "search" : 0 }, "throttled_millis" : 0, "requests_per_second" : -1.0, "throttled_until_millis" : 0, "failures" : [ ] }
Next, check the structure of the class_1 index:
{ "class_1" : { "aliases" : { "class" : { } }, "mappings" : { "properties" : { "age" : { "type" : "long" }, "desc" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "name" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } }, "num" : { "type" : "long" } } }, "settings" : { "index" : { "refresh_interval" : "3s", "number_of_shards" : "3", "provided_name" : "class_1", "creation_date" : "1670812583980", "number_of_replicas" : "1", "uuid" : "CTD3dM-fQm-KFEVl4nAgRQ", "version" : { "created" : "7060299" } } } } }
As you can see, a new field has appeared:
{ "desc" : { "type" : "text", "fields" : { "keyword" : { "type" : "keyword", "ignore_above" : 256 } } } }
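Removing a field
POST class_1/_update_by_query
{ "query": { "match": { "name": "e" } }, "script": { "source": "ctx._source.remove('desc')", "lang": "painless" } }
Try running it, then check the index again. Note that this removes desc from the matching documents only; the field stays in the index mapping.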
Deleting documents
By ID, single delete
DELETE /class_1/_doc/2
{ "_index" : "class_1", "_type" : "_doc", "_id" : "2", "_version" : 5, "result" : "deleted", "_shards" : { "total" : 2, "successful" : 2, "failed" : 0 }, "_seq_no" : 12, "_primary_term" : 4 }
By ID, bulk delete
PUT class_1/_bulk
{ "delete":{"_id":"2" } }
{ "delete":{"_id":"3" } }
Delete by condition
POST class_1/_delete_by_query
{ "query":{ "match":{ "name": "e" } } }
Closing remarks
This section covered the document API operations in ES. One operation remains, querying; it has a lot of content, so it is left for a later section.