I have one index which is quite large (about 100 GB), so I had to extend the disk space on my DigitalOcean server by adding another volume (I run everything on a single node). I told Elasticsearch that it now has to consider two data locations with
/usr/share/elasticsearch/bin/elasticsearch -Epath.data=/var/lib/elasticsearch,/mnt/volume-sfo2-01/es_data
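The same two paths can also be set persistently instead of the one-off -E flag; a minimal sketch, assuming a package install with the config file at /etc/elasticsearch/elasticsearch.yml:

path.data:
  - /var/lib/elasticsearch
  - /mnt/volume-sfo2-01/es_data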
Elasticsearch does seem to have taken notice of this, since it wrote some files to the new location:
/mnt/volume-sfo2-01/es_data# cd nodes/
/mnt/volume-sfo2-01/es_data/nodes# ls
0
/mnt/volume-sfo2-01/es_data/nodes# cd 0/
/mnt/volume-sfo2-01/es_data/nodes/0# ls
indices  node.lock  _state
/mnt/volume-sfo2-01/es_data/nodes/0# cd indices
/mnt/volume-sfo2-01/es_data/nodes/0/indices# ls
DixLGLrJRXm1gSYcFzkzzw  nmZbce8wTayJC2s_eMC0-g  Qd-9ZnFIRoSM2z7AohKm-w  Sm_tyYTJTty0ImvDamFaQw
/mnt/volume-sfo2-01/es_data/nodes/0/indices# cd DixLGLrJRXm1gSYcFzkzzw/
/mnt/volume-sfo2-01/es_data/nodes/0/indices/DixLGLrJRXm1gSYcFzkzzw# ls
_state
which is identical to what I find in /var/lib/elasticsearch/data, except for the actual index data at the lowest level.
Reading the Elasticsearch documentation, I got the impression that Elasticsearch will spread a new index across the two data locations, but will not split a single shard between them. So I created the index with 5 shards so that it can distribute the data between the volumes.
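For reference, an index with 5 primary shards can be created roughly like this (a sketch only; the index name pubmed_paper is taken from the error message further down, and the actual mappings are omitted):

# create the index with 5 primary shards
curl -XPUT 'localhost:9200/pubmed_paper?pretty' -d '
{
  "settings": {
    "index": {
      "number_of_shards": 5
    }
  }
}'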
The server does seem to have detected the two data paths, since the log file shows
[2017-06-17T19:16:57,079][INFO ][o.e.e.NodeEnvironment ] [WU6cQ-o] using [2] data paths, mounts [[/ (/dev/vda1), /mnt/volume-sfo2-01 (/dev/sda)]], net usable_space [29.6gb], net total_space [98.1gb], spins? [possibly], types [ext4]
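For what it is worth, the per-path filesystem stats and the allocator's view of disk usage can be queried like this (commands only, output omitted):

# per-data-path filesystem statistics reported by the node
curl -XGET 'localhost:9200/_nodes/stats/fs?pretty'
# disk usage and shard counts per node, as seen by the allocator
curl -XGET 'localhost:9200/_cat/allocation?v'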
However, when I index documents into the new index, it keeps consuming disk space on the original disk only and eventually runs out of space with the error
raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
elasticsearch.exceptions.TransportError: TransportError(500, u'index_failed_engine_exception', u'Index failed for [pubmed_paper#25949809]')
It never places any of the shards on the second volume. Am I missing something? Can I manually control how the disk space is used?
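In case it is relevant, I understand the allocation explain API (available since 5.0) can report why a shard is allocated where it is, although as far as I know it works at the node level rather than the data-path level; a sketch for one of the shards:

# ask the cluster to explain the current allocation of primary shard 0
curl -XGET 'localhost:9200/_cluster/allocation/explain?pretty' -d '
{
  "index": "pubmed_paper",
  "shard": 0,
  "primary": true
}'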
Here are the Elasticsearch version details:
# curl -XGET 'localhost:9200'
{
  "name" : "WU6cQ-o",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "hKc147QfQqCefLliStLNtw",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}
and here is the file structure under the default path, where Elasticsearch stores all the information (instead of sharing it with the second path):
/var/lib/elasticsearch/elasticsearch/nodes/0/indices/DixLGLrJRXm1gSYcFzkzzw# ls
0  1  2  3  4  _state
A related question is probably whether I can just take one of these shards and move it to the other location manually?
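To see how the shards are currently laid out and how large each one is before attempting such a move, something like the following should work (a sketch):

# shard state, document count, on-disk size and node for each shard
curl -XGET 'localhost:9200/_cat/shards/pubmed_paper?v'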