Friday, 1 July 2016

MultiTermVectors in Elasticsearch Java


I am using the following function to get the term vectors for a set of document IDs.

public static void builtTermVectorRequest(Client client, String index, Map<String, String> postIDs) {
    TermVectorsRequest termVectorsRequest = new TermVectorsRequest();
    termVectorsRequest.index(index).type("post");
    for (Map.Entry<String, String> entry : postIDs.entrySet()) {
      String currentPostId = entry.getKey();
      String currentParentID = entry.getValue();
      termVectorsRequest
              .id(currentPostId)
              .parent(currentParentID)
              .termStatistics(true)
              .selectedFields("content");
    }

    MultiTermVectorsRequestBuilder mtbuilder = client.prepareMultiTermVectors();
    mtbuilder.add(termVectorsRequest);

    MultiTermVectorsResponse response = mtbuilder.execute().actionGet();
    XContentBuilder builder;
    try {
      builder = XContentFactory.jsonBuilder().startObject();
      response.toXContent(builder, ToXContent.EMPTY_PARAMS);
      builder.endObject();
      System.out.println(builder.prettyPrint().string());
    } catch (IOException e) {
      e.printStackTrace(); // report serialization errors instead of swallowing them
    }
  }

Here postIDs maps each document ID to its parent ID, because the documents are child documents.

The response reports that the documents were not found, even though they exist.

To confirm, I tried the same thing in Python:

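# result holds the hits of an earlier search (not shown here).
# A list comprehension keeps the body JSON-serializable on Python 2 and 3.
body = dict(docs=[
    {
        "fields": ["content"],
        "_id": x["_id"],
        "_routing": x["_routing"],
        "term_statistics": "true"
    }
    for x in result["hits"]["hits"]
])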

es_client = elasticsearch.Elasticsearch([{'host': '192.168.111.12', 'port': 9200}])

all_term_vectors = es_client.mtermvectors(
    index="prf_test",
    doc_type="post",
    body=body
)

and I get results back.

What is wrong with the Java code?
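
One difference between the two versions stands out. The Python code builds one entry per document, while the Java code creates a single TermVectorsRequest before the loop, overwrites its ID and parent on every iteration, and adds it to the multi request only once, after the loop. The sketch below reproduces that pattern without Elasticsearch: FakeRequest and both helper methods are hypothetical stand-ins, not library classes. Whether this alone explains the "not found" response is an assumption; it only shows that the shared object ends up describing a single document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TermVectorReuseSketch {

    // Hypothetical stand-in for a mutable, fluent request object.
    // This is NOT an Elasticsearch class.
    static final class FakeRequest {
        String id;
        String parent;

        FakeRequest id(String id) { this.id = id; return this; }
        FakeRequest parent(String parent) { this.parent = parent; return this; }
    }

    // Mirrors the structure in the question: one request is mutated
    // in the loop and added to the "multi" list once, after the loop.
    static List<FakeRequest> sharedRequest(Map<String, String> postIDs) {
        List<FakeRequest> multi = new ArrayList<>();
        FakeRequest request = new FakeRequest();
        for (Map.Entry<String, String> entry : postIDs.entrySet()) {
            request.id(entry.getKey()).parent(entry.getValue());
        }
        multi.add(request);
        return multi;
    }

    // Alternative: a fresh request per document, added inside the loop.
    static List<FakeRequest> requestPerDocument(Map<String, String> postIDs) {
        List<FakeRequest> multi = new ArrayList<>();
        for (Map.Entry<String, String> entry : postIDs.entrySet()) {
            multi.add(new FakeRequest().id(entry.getKey()).parent(entry.getValue()));
        }
        return multi;
    }

    public static void main(String[] args) {
        Map<String, String> postIDs = new LinkedHashMap<>();
        postIDs.put("post-1", "parent-a");
        postIDs.put("post-2", "parent-b");

        System.out.println(sharedRequest(postIDs).size());      // 1
        System.out.println(requestPerDocument(postIDs).size()); // 2
    }
}
```

Applied to the original function, this would mean constructing a new TermVectorsRequest inside the for loop and calling mtbuilder.add(...) for each one. It may also be worth checking that each child document is routed by its own parent ID, since the Python version passes _routing explicitly for every document.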

