multi-fields config #28

Open
jmwiersma opened this issue Oct 16, 2016 · 6 comments

@jmwiersma

Hi,

I'd like to push string input from CSV columns to Elasticsearch both as an analyzed field for full-text search and as a not_analyzed field. Normally that can be done with the fields mapping parameter:
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
How do I specify that using Embulk?

@sakama
Contributor

sakama commented Oct 18, 2016

Hi.
Currently, embulk-output-elasticsearch doesn't support index_template.

But I think it would be a good idea to support this feature, as fluent-plugin-elasticsearch does.

@muga
Contributor

muga commented Oct 20, 2016

@jmwiersma
Author

Sorry, I feel a technical reply is beyond my abilities on this one.

For now, as a workaround, I have pushed a template to ES that creates not_analyzed fields for any index named embulk-*.
This mirrors how Logstash handles it.
Example:

curl -XPUT http://localhost:9200/_template/embulk-* -d '{
  "template" : "embulk-*",
  "settings" : { "index.refresh_interval" : "5s" },
  "mappings" : {
    "_default_" : {
      "_all" : { "enabled" : true, "omit_norms" : true },
      "dynamic_templates" : [
        {
          "message_field" : {
            "match" : "message",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "string",
              "index" : "analyzed",
              "omit_norms" : true,
              "fielddata" : { "format" : "disabled" }
            }
          }
        },
        {
          "string_fields" : {
            "match" : "*",
            "match_mapping_type" : "string",
            "mapping" : {
              "type" : "string",
              "index" : "analyzed",
              "omit_norms" : true,
              "fielddata" : { "format" : "disabled" },
              "fields" : {
                "raw" : { "type" : "string", "index" : "not_analyzed", "doc_values" : true, "ignore_above" : 256 }
              }
            }
          }
        }
      ]
    }
  }
}'
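
For anyone who would rather generate the template body than hand-edit the JSON, here is a minimal sketch in Python that builds the same structure as the curl example above. The function name `build_template` and its parameters are hypothetical, chosen only for illustration; they are not part of Embulk or of any Elasticsearch client. The field settings are copied from the template above, which uses the older `string` / `not_analyzed` mapping syntax, so this may need adjusting for other Elasticsearch versions.

```python
import json


def build_template(index_pattern="embulk-*", raw_suffix="raw", ignore_above=256):
    """Return an index template body (as a dict) mirroring the curl example.

    All names and defaults here are illustrative assumptions.
    """
    # Mapping for the analyzed (full-text) version of a string field.
    analyzed = {
        "type": "string",
        "index": "analyzed",
        "omit_norms": True,
        "fielddata": {"format": "disabled"},
    }
    # Sub-field that keeps the exact, unanalyzed value (e.g. "city.raw").
    raw_subfield = {
        "type": "string",
        "index": "not_analyzed",
        "doc_values": True,
        "ignore_above": ignore_above,
    }
    # Same as `analyzed`, plus the multi-field definition.
    string_fields = dict(analyzed, fields={raw_suffix: raw_subfield})

    return {
        "template": index_pattern,
        "settings": {"index.refresh_interval": "5s"},
        "mappings": {
            "_default_": {
                "_all": {"enabled": True, "omit_norms": True},
                "dynamic_templates": [
                    {
                        "message_field": {
                            "match": "message",
                            "match_mapping_type": "string",
                            "mapping": analyzed,
                        }
                    },
                    {
                        "string_fields": {
                            "match": "*",
                            "match_mapping_type": "string",
                            "mapping": string_fields,
                        }
                    },
                ],
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON so it can be saved to a file or passed to curl -d @file.
    print(json.dumps(build_template(), indent=2))
```

The printed JSON can be redirected to a file and sent with the same `curl -XPUT` call as above. Only the catch-all `string_fields` rule carries the `raw` sub-field; the `message` rule stays analyzed-only, as in the original template.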

@sakama
Contributor

sakama commented Oct 21, 2016

@muga @jmwiersma

How can we specify this mapping as plugin config options? Any thoughts?

I think specifying it as a file is better.
The index template syntax is too complex to express as an Embulk column option.

For example, fluent-plugin-elasticsearch has 2 options.

@muga
Contributor

muga commented Oct 21, 2016

Thank you. A name and a file path (or JSON content) would be good config options. Will take it.

@niwatolli3

👍


4 participants