Full-text search (FTS) is a core capability of content management systems: it lets users search both the content itself and the metadata associated with it. In a previous post, I discussed a new, fully scalable architecture for content management using Apache Chemistry with a Couchbase repository for metadata (and possibly blobs). Today, I would like to discuss how to integrate FTS into this architecture in a scalable way, without the need for yet another tier (Elasticsearch, Solr, LucidWorks).
In 2015, Couchbase announced the development of CBFT, which stands for Couchbase Full Text search and is currently in developer preview. CBFT is a simple, integrated, distributed full-text server that covers 80% of the features most applications need. You can find more information on CBFT here: http://connect15.couchbase.com/agenda/sneak-peek-cbft-full-text-search-couchbase/
In this article, I will start investigating how to integrate CBFT into CMIS Apache Chemistry for full-text search on metadata.
- Setup
To install Couchbase, follow the documentation here.
Create a bucket called cmismeta. This bucket contains the metadata of each content item (folder, file).
To install Apache Chemistry using Couchbase repository, follow the documentation here.
To install CBFT, follow the documentation here.
- Create a CBFT index
Start CBFT on a local node: cbft -s http://localhost:8091
On the Indexes listing page, click on the New Index button.
To test your index, you need to add content to the cmismeta bucket. You can either use the Apache Chemistry workbench to create content (folders, files), whose metadata will be stored in the cmismeta bucket, or add a few simple test documents directly (and remove them afterwards).
In this example, I already have a bunch of files added to the Content Management Couchbase repository.
Open the query tab and enter a query using the Bleve query syntax.
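As a quick reference, here are a few query strings in Bleve's query string syntax, written as Java constants so they are easy to copy into the query tab or reuse from code. This is only a sketch: the class name and the field name `name` are hypothetical, so substitute whatever fields your cmismeta documents actually contain.

```java
public class BleveQueryExamples {
    // A single term, searched in the index's default field.
    static final String TERM = "report";
    // A phrase: the words must appear together, in this order.
    static final String PHRASE = "\"annual report\"";
    // A term scoped to one field ("name" is a hypothetical field).
    static final String FIELD = "name:report";
    // "+" marks a term as required, "-" excludes documents containing it.
    static final String REQUIRED_AND_EXCLUDED = "+report -draft";
    // "^" boosts the score contribution of a clause.
    static final String BOOSTED = "name:report^5 budget";

    public static void main(String[] args) {
        String[] all = {TERM, PHRASE, FIELD, REQUIRED_AND_EXCLUDED, BOOSTED};
        for (String q : all) {
            System.out.println(q);
        }
    }
}
```

Note that the phrase query contains double quotes, which matters as soon as a query string is embedded in a JSON request body.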
- CMIS Apache Chemistry project
First, you need to enable the full-text query capability in the CMIS Couchbase repository class.
public class CouchbaseRepository {
    private RepositoryInfo createRepositoryInfo(CmisVersion cmisVersion) {
        // set repo infos
        RepositoryInfoImpl repositoryInfo = new RepositoryInfoImpl();
        repositoryInfo.setCmisVersionSupported(cmisVersion.value());
        ...
        // set repo capabilities
        RepositoryCapabilitiesImpl capabilities = new RepositoryCapabilitiesImpl();
        capabilities.setCapabilityQuery(CapabilityQuery.FULLTEXTONLY);
        ...
        repositoryInfo.setCapabilities(capabilities);
        return repositoryInfo;
    }
}
To query the CBFT index, we use its REST API through a Jersey client.
First, add the dependency to the Maven pom file.
<dependency>
<groupId>com.sun.jersey</groupId>
<artifactId>jersey-client</artifactId>
<version>1.8</version>
</dependency>
Then create a new CBFT service class. This service needs the CBFT location and the index name. It provides a simple query method that returns a list of keys referring to documents in the cmismeta bucket in Couchbase.
package org.apache.chemistry.opencmis.couchbase;
import java.util.ArrayList;
import java.util.List;
import com.couchbase.client.java.document.json.JsonArray;
import com.couchbase.client.java.document.json.JsonObject;
import com.sun.jersey.api.client.Client;
import com.sun.jersey.api.client.ClientResponse;
import com.sun.jersey.api.client.WebResource;
public class CBFTService {

    private String cbftLocation = null;
    private Client client = null;
    private String indexid = null;

    public CBFTService(String location, String indexid) {
        this.cbftLocation = location;
        this.indexid = indexid;
        client = Client.create();
    }

    /**
     * Search cbft index.
     * @param query the query to search
     * @return list of keys matching the query
     */
    public List<String> query(String query) {
        List<String> results = new ArrayList<String>();
        WebResource webResource = client
                .resource("http://" + this.cbftLocation + ":8095/api/index/" + indexid + "/query");
        String input = "{" +
                "\"q\": \"" + query + "\"," +
                "\"indexName\": \"" + indexid + "\"," +
                "\"size\": 10," +
                "\"from\": 0," +
                "\"explain\": true," +
                "\"highlight\": {}," +
                "\"query\": {" +
                "\"boost\": 1," +
                "\"query\": \"" + query + "\"" +
                "}," +
                "\"fields\": [" +
                "\"*\"" +
                "]," +
                "\"ctl\": {" +
                "\"consistency\": {" +
                "\"level\": \"\"," +
                "\"vectors\": {}" +
                "}," +
                "\"timeout\": 0" +
                "}" +
                "}";
        ClientResponse response = webResource.type("application/json")
                .post(ClientResponse.class, input);
        if (response.getStatus() != 200) {
            throw new RuntimeException("Failed : HTTP error code : "
                    + response.getStatus());
        }
        String output = response.getEntity(String.class);
        JsonObject content = JsonObject.fromJson(output);
        JsonArray hits = content.getArray("hits");
        if (hits != null) {
            String id;
            for (int i = 0; i < hits.size(); i++) {
                id = hits.getObject(i).getString("id");
                results.add(id);
            }
        }
        return results;
    }
}
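One caveat with the query method: it concatenates the user's query directly into the JSON request body, so a query containing a double quote or a backslash (a Bleve phrase query, for instance) would produce invalid JSON. Here is a minimal sketch of an escaping helper that uses only the JDK. The class and method names (`JsonStrings`, `quote`) are hypothetical and not part of the CMIS Couchbase repository; a JSON library would do the same job.

```java
public final class JsonStrings {
    private JsonStrings() {}

    /** Returns s as a quoted JSON string literal, escaping special characters. */
    public static String quote(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 2);
        sb.append('"');
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters must be written as \\uXXXX in JSON.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        sb.append('"');
        return sb.toString();
    }

    public static void main(String[] args) {
        // Embeds a phrase query safely in a JSON fragment.
        System.out.println("{\"query\": " + quote("\"annual report\"") + "}");
    }
}
```

In the service, this would mean replacing each `"\"" + query + "\""` fragment with a call such as `JsonStrings.quote(query)`.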