Configure Remote Search using REST APIs | HCL Digital Experience
This section shows how to use REST APIs to configure Remote Search for HCL Digital Experience 9.5 environments deployed on a traditional IBM WebSphere Application Server Network Deployment-based cluster, in a Docker container, or on supported Red Hat OpenShift and Kubernetes container platforms.
Introduction
REST APIs are available to allow listing, deleting, modifying, and defining Portal Search Services (and their parameters). REST APIs may also be used to list, delete, and define Portal Search collections and Portal Search Content Providers (and their parameters). Reference the HCL DX 9.5 Help Center topic REST APIs for Search for additional information.
Configuring Remote Search for DX Portal requires changes to the WebSphere configuration of both the Remote Search server and the DX Portal server. In addition, Portal changes need to be made on the Portal server itself.
Traditionally, these changes were made through the Search Admin GUI on DX Portal. Starting with HCL Digital Experience 9.5 CF199, a new set of REST services also enables users to configure Remote Search. These services can perform many of the same Remote Search configuration tasks in a selected environment. The environment can be a traditional IBM WebSphere Application Server Network Deployment-based cluster, a set of Docker images, or a set of DX Kubernetes pods.
Prerequisites
- In general, at least one (1) DX Portal Server instance and exactly one (1) DX Remote Search Server instance must be running. These can run in Docker, in Kubernetes, or in a cluster.
- The Portal Servers must be able to reach the Remote Search Server by address, and vice versa. Optimally, this is handled through a DNS server so that each of the servers has a statically assigned IP address that is resolvable via DNS.
Note: Some services typically assign IP addresses dynamically, and those addresses are NOT available in DNS. This is true (by default) for Docker. To resolve this issue in Docker, add parameters to the docker run command. Docker images can be started as follows, both to statically assign an IP address to each container and to place that address in the /etc/hosts file on the servers:

    #!/bin/bash
    # Static IP addresses on the private Docker network (172.18.0.0/16)
    PORTALIP="172.18.0.10"
    REMOTESEARCHIP="172.18.0.11"
    # Address of the Docker host on the default network (172.19.0.0/16)
    DOCKERHOST="172.19.0.1"
    PORTAL_DOCKER_IMAGE="quintana-docker.artifactory.cwp.pnp-hcl.com/dx-build-output/core/dxen:v95_CF192_20210206-022427_rohan_DXQ-14209_on_develop_601eaed4"
    REMOTE_SEARCH_DOCKER_IMAGE="quintana-docker.artifactory.cwp.pnp-hcl.com/dxrs:v95_CF192_20210208-055522_rohan_develop_60215986"

    echo "Starting portal docker image with tag $PORTAL_DOCKER_IMAGE"
    echo "Starting remote search docker image with tag $REMOTE_SEARCH_DOCKER_IMAGE"

    # Start the two docker images
    # Portal: fixed IP, with /etc/hosts entries for the Remote Search server and the Docker host
    docker run -d --name portaldocker --net aDockerNetwork --ip="$PORTALIP" \
        -p 10039:10039 -p 10041:10041 -p 10042:10042 -p 10200-10205:10200-10205 -p 7777:7777 \
        --add-host="remotesearch:$REMOTESEARCHIP" --add-host="dockerHost:$DOCKERHOST" \
        "$PORTAL_DOCKER_IMAGE"

    # Remote Search: fixed IP, with /etc/hosts entries for the Portal server and the Docker host
    docker run -d --name remotesearch --net aDockerNetwork --ip="$REMOTESEARCHIP" \
        -p 8880:8880 -p 2809:2809 -p 9043:9043 -p 9060:9060 -p 9080:9080 -p 9403:9403 \
        --add-host="portaldocker:$PORTALIP" --add-host="dockerHost:$DOCKERHOST" \
        "$REMOTE_SEARCH_DOCKER_IMAGE"
To use statically assigned IP addresses as in the example above, create a private Docker subnet with the following command:

    docker network create --internal --subnet 172.18.0.0/16 aDockerNetwork
The default network used in the example is 172.19.0.0/16. This is the address range used by the Docker host.
A new set of ConfigEngine tasks exists for the WebSphere configuration portion. A new set of REST APIs is also now in place to support the command-line configuration of Portal Search.
Note: The configuration commands used in the example configure Remote Search in a DX environment. However, the collections are empty even though they are defined. To populate the collections, the crawlers must be started. This can be achieved by starting them manually, by putting them on a schedule, or by a combination of both.
Access Rights
For any attempted operation, the user making the request must first log in to the Portal. Before any subsequent Remote Search REST API request is executed, the logged-in user is checked for sufficient privileges.
If the logged-in user does not have sufficient privileges, the Remote Search REST API request is rejected, and an appropriate response is returned.
New REST APIs
As with all REST services, the type of HTTP command (GET, PUT, POST, DELETE) dictates the type of operation.
The format of the URL is very similar for each type. However, some of the types (e.g. POST) require JSON input to define the resource being added.
GET -> list
POST -> add
DELETE -> delete
For example, the following URL addresses a content provider within a collection:
/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1/provider/JCR+Content
- The initial portion (/wps/mycontenthandler/!ut/p/searchadmin/service) is invariant and is present in all REST commands for Remote Search configuration.
- Remote+PSE+service+EJB represents the name of the service on which to perform an operation. Note that the space character is NOT allowed in a URL. You can replace each space character with either the "+" character or "%20". Both forms are equivalent.
- The collection character sequence is only required when operating on a collection or on providers within a collection. In the example, the collection name is JCRCollection1. This happens to be the required collection name for searches of artifacts by the WCM Authoring GUI.
- If the URL is malformed for whatever reason, an error will be returned in response to the request.
- Lastly, and only when operating on a content provider for a particular service and collection, add the character sequence provider, followed by the name of the provider in question. In our example, the provider appears in the URL as JCR+Content. Note that a "+" replaces the space character in the URL. Thus, the actual provider name is JCR Content.
For all commands, the HTTP response code is useful. For example, if the HTTP response code is 401, it is likely that the REST login was NOT performed before the REST configuration command.
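The URL structure described in the list above can be sketched as a small shell helper. The function names and the SEARCHADMIN_ROOT variable are hypothetical, introduced only for illustration. The sketch encodes spaces only (as "+"); any other reserved character in a name would need proper percent-encoding.

```shell
# Hypothetical helper names; only the URL layout comes from the text above.
SEARCHADMIN_ROOT='/wps/mycontenthandler/!ut/p/searchadmin/service'

# encode_name NAME: print NAME with each space replaced by "+"
encode_name() {
    printf '%s' "$1" | tr ' ' '+'
}

# searchadmin_path SERVICE [COLLECTION [PROVIDER]]
# Runs in a subshell (parentheses) so "path" does not leak to the caller.
searchadmin_path() (
    path="$SEARCHADMIN_ROOT/$(encode_name "$1")"
    if [ -n "${2:-}" ]; then
        path="$path/collection/$(encode_name "$2")"
    fi
    if [ -n "${3:-}" ]; then
        path="$path/provider/$(encode_name "$3")"
    fi
    printf '%s\n' "$path"
)

# Reproduces the example URL used in this section
searchadmin_path "Remote PSE service EJB" "JCRCollection1" "JCR Content"
```

Passing only a service name, or a service and a collection, yields the shorter forms of the URL.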
All these commands require an "Authenticated" status. The POST and DELETE commands require administrator access rights on the search configuration objects.
In all cases, the HTTP response code, together with any error message in the response payload, indicates what went wrong. Possible issues include a lack of access rights for the intended operation and a resource that already exists (for example, trying to create/POST a service name that already exists). A successful request returns an HTTP response code of 20x.
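A script that calls these APIs might translate the response code into a first diagnosis along the following lines. This is a minimal sketch: explain_status is a hypothetical name, and only the 401 and 20x cases come from the text above. Every other code is simply referred to the response payload.

```shell
# explain_status CODE: print a one-line hint for an HTTP response code.
explain_status() {
    case "$1" in
        20[0-9]) echo "success" ;;
        401)     echo "not authenticated: log in to the Portal first" ;;
        *)       echo "request failed: check the error message in the response payload" ;;
    esac
}

explain_status 401
```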
List
The following commands list details of various Remote Search resources. No JSON body is required on the request. The HTTP response is JSON that matches the type of the request.
If the resource requested by a list command doesn't exist, the returned JSON will be empty (e.g. "{}").
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/services
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}/provider/{content provider name}
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/services
{
"services": [
{
"name": "Remote PSE service EJB",
"link": "/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB"
}
]
}
Note that each service name is followed by a relative link, which can be used to get more details of the service.
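To follow those links from a script, the "link" values can be pulled out of the response. The sketch below uses sed and assumes the response is pretty-printed with one "link" entry per line, as in the sample above. That is an assumption about the formatting, so a real JSON parser would be the more robust choice. The extract_links name is hypothetical.

```shell
# extract_links: read JSON on stdin, print each "link" value on its own line.
# Assumes at most one "link" key per input line.
extract_links() {
    sed -n 's/.*"link"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample input copied from the services listing above
cat <<'EOF' | extract_links
{
"services": [
{
"name": "Remote PSE service EJB",
"link": "/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB"
}
]
}
EOF
```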
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/
{
"RESOURCE_ENVIRONMENT_PROVIDER_NAME": "SearchPropertiesService",
"facetedFields": "null",
"WORK_MANAGER_DEPLOY": "wps/searchIndexWM",
"EJB_Example": "ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome",
"DefaultCollectionsDirectory": "null",
"CONTENT_SOURCE_TYPE_FEATURE_NAME": "ContentSourceType",
"EJB": "ejb/com/ibm/hrl/portlets/WsPse/WebScannerLiteEJBHome",
"MAX_BUILD_BATCH_SIZE": "10000",
"fieldTypes": "null",
"WORK_MANAGER_NATIVE": "force.hrl.work.manager.use.native.threads",
"WORK_MANAGER": "wps/searchIndexWM",
"PSE_TYPE_option_3": "soap",
"PSE_TYPE_option_2": "ejb",
"PSE_TYPE_option_1": "localhost",
"IIOP_URL": "iiop://remotesearch:2809",
"VALIDATE_COOKIE": "123",
"PortalCollectionSourceName": "Remote PSE service EJB",
"WORK_MANAGER_NAME": "wps/searchIndexWM",
"PSE_TYPE": "ejb",
"CONTENT_SOURCE_TYPE_FEATURE_VAL_PORTAL": "Portal",
"HTTP_MAX_BODY_SIZE_MB": "20",
"MAX_BUILD_INTERVAL_TIME_SECONDS": "300",
"SetProperties": "on",
"PortalCollectionName": "TestGood",
"IIOP_URL_Example": "iiop://localhost:2811",
"CLEAN_UP_TIME_OF_DAY_HOURS": "0",
"SOAP_URL_Example": "http://localhost:10000/WebScannerSOAP/servlet/rpcrouter",
"mappedFields": "null",
"OPEN_WCM_WINDOW": "/wps/myportal/wcmContent?WCM_GLOBAL_CONTEXT=",
"SOAP_URL": "null",
"DEFAULT_acls_FIELDINFO": "contentSearchable=false, fieldSearchable=true, returnable=true, sortable=false, supportsExactMatch=true, parametric=false, typeAhead=false",
"SecurityResolverId": "com.ibm.lotus.search.plugins.provider.core.PortalSecurityResolverFactory",
"CONTENT_SOURCE_TYPE_FEATURE_VAL_UPLOAD": "Upload",
"CONTENT_SOURCE_TYPE_FEATURE_VAL_WEB": "Web",
"OpenResultMode": "new",
"SEARCH_SECURITY_MODE": "SECURITY_MODE_PRE_POST_FILTER",
"collections": [
{
"name": "JCRCollection1",
"link": "/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1"
},
{
"name": "Portal Search Collection",
"link": "/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/Portal+Search+Collection"
}
]
}
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1
{
"location": "/opt/HCL/AppServer/profiles/prs_profile/SearchCollections/JCRCollection1",
"IndexTitleKey": "JCRCollection1",
"IndexNameKey": "JCRCollection1",
"IndexLanguageKey": "en_US",
"location": "/opt/HCL/AppServer/profiles/prs_profile/SearchCollections/JCRCollection1",
"CollectionStatus": "true",
"IndexDescriptionKey": "JCRCollection1",
"DictionaryAnalysis": "true",
"providers": [
{
"name": "JCR Content",
"link": "/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1/provider/JCR+Content"
}
]
}
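From the command line, a list request could be issued roughly as follows. This is a sketch under stated assumptions: curl is installed, and cookies.txt is a cookie file saved by an earlier, successful Portal login (the login step itself is not shown here). The list_resource name and the three variables are hypothetical.

```shell
# Assumptions (not from the original text): curl is available, and
# COOKIE_JAR holds the session cookies of an authenticated Portal login.
BASE_URL="http://localhost:10039"
COOKIE_JAR="cookies.txt"
CURL="curl"

# list_resource PATH: print the JSON body, then the HTTP status on its own line.
# -s silences progress output, -L follows redirects, -b sends the saved cookies.
list_resource() {
    "$CURL" -s -L -b "$COOKIE_JAR" -w '\n%{http_code}\n' "$BASE_URL$1"
}

# Example calls (require a running, reachable Portal server):
# list_resource '/wps/mycontenthandler/!ut/p/searchadmin/services'
# list_resource '/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1'
```

The path is single-quoted in the example calls so that an interactive shell does not treat the "!" as history expansion.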
Delete
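Delete commands use the same URL format as list commands. The JSON body of the request can be empty (e.g. "{}"), which is the same as the returned JSON if the request is successful.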
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}/provider/{content provider name}
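A script can confirm a delete by checking both signals this section describes: an HTTP 200 status and an empty JSON object in the response. The sketch below shows only that check. The delete_succeeded name is hypothetical, and the commented curl call assumes an authenticated cookie file named cookies.txt.

```shell
# delete_succeeded STATUS BODY: exit status 0 when the delete looks successful.
# Runs in a subshell (parentheses) so "body" does not leak to the caller.
delete_succeeded() (
    # Strip whitespace so "{ }" and "{}" followed by a newline both match "{}".
    body=$(printf '%s' "$2" | tr -d ' \t\r\n')
    [ "$1" = "200" ] && [ "$body" = "{}" ]
)

# Example wiring (assumes curl and a reachable server; URL is a placeholder):
# status=$(curl -s -b cookies.txt -X DELETE -o delete.json -w '%{http_code}' "$URL")
# delete_succeeded "$status" "$(cat delete.json)" && echo "delete confirmed"

if delete_succeeded 200 '{}'; then
    echo "delete confirmed"
fi
```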
In general, after a successful delete operation (HTTP 200), expect the response JSON payload to be empty (e.g. "{}").
Add
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/{service name}/collection/{collection name}/provider/{content provider name}
If the resource named in the request URL already exists, the following error is returned:
Error 400: {resource} Already Exists
where {resource} is one of "service", "collection", or "content provider", as appropriate for the invalid request URL.
The JSON returned by an add REST call is exactly what a GET call on the same resource returns. Effectively, the returned JSON echoes the JSON of the add request.
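An add request could be sent with curl along these lines. This is a sketch, not a definitive invocation: it assumes curl, an authenticated cookie file named cookies.txt, and that the server accepts the body with a Content-Type of application/json. The add_resource name and the file names are hypothetical. The JSON file is expected to describe the resource in the same shape that a GET on that resource returns.

```shell
# Assumptions (not from the original text): curl is available, cookies.txt
# holds an authenticated session, and the API accepts application/json.
BASE_URL="http://localhost:10039"
COOKIE_JAR="cookies.txt"
CURL="curl"

# add_resource PATH JSON_FILE
# Sends JSON_FILE as the POST body, saves the response body to
# add-response.json, and prints only the HTTP status code.
add_resource() {
    "$CURL" -s -b "$COOKIE_JAR" -X POST \
        -H 'Content-Type: application/json' \
        --data-binary "@$2" \
        -o add-response.json -w '%{http_code}' \
        "$BASE_URL$1"
}

# Example call (requires a running Portal server and a prepared JSON file):
# add_resource '/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/JCRCollection1' collection.json
```

A printed status of 400 would then point to the "Already Exists" case described above, with the details in add-response.json.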
Starting a Crawler
Once the Service/Collection/Content Provider is configured, the crawlers will still not populate the indexes. To populate the indexes, the crawlers must be started. There are two ways to do this:
- The first is via a scheduler, which automatically runs the crawler on a set schedule. Currently this schedule can only be configured in the search GUI.
- The second method is to immediately start the crawler either from the GUI or via a REST service.
The REST service to start a crawler looks as follows:
http://localhost:10039/wps/mycontenthandler/!ut/p/searchadmin/service/Remote+PSE+service+EJB/collection/Portal+Search+Collection/provider/WCMContentSource/crawl
This URL looks very much like a POST command to add a content provider. The only difference is that the crawl command is located at the end of the URL. This command starts an immediate crawl on the content provider named in the preceding portion of the URL.
The output of the command is an HTTP 201 return code, along with a JSON body that is exactly like this:
{ "crawl": "started" }
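The crawl request and its expected result can be sketched in shell as follows. The function names are hypothetical, and the use of the POST method in the commented curl call is an assumption drawn from the comparison with the add command above. The URL layout, the 201 status, and the response body come from this section.

```shell
BASE_URL="http://localhost:10039"

# encode_name NAME: print NAME with each space replaced by "+"
encode_name() {
    printf '%s' "$1" | tr ' ' '+'
}

# crawl_url SERVICE COLLECTION PROVIDER: print the URL that starts a crawl
crawl_url() {
    printf '%s/wps/mycontenthandler/!ut/p/searchadmin/service/%s/collection/%s/provider/%s/crawl\n' \
        "$BASE_URL" "$(encode_name "$1")" "$(encode_name "$2")" "$(encode_name "$3")"
}

# crawl_started STATUS BODY: exit status 0 when the crawl was accepted.
# Whitespace is stripped before the body is compared.
crawl_started() (
    body=$(printf '%s' "$2" | tr -d ' \t\r\n')
    [ "$1" = "201" ] && [ "$body" = '{"crawl":"started"}' ]
)

# Example wiring (assumes curl and an authenticated cookie file, cookies.txt):
# url=$(crawl_url "Remote PSE service EJB" "Portal Search Collection" "WCMContentSource")
# status=$(curl -s -b cookies.txt -X POST -o crawl.json -w '%{http_code}' "$url")
# crawl_started "$status" "$(cat crawl.json)" && echo "crawl started"

crawl_url "Remote PSE service EJB" "Portal Search Collection" "WCMContentSource"
```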
Use of API on Main Virtual Portal versus all other Virtual Portals
On a Virtual Portal, the “!ut/p/digest” portion of the URL must be included, because the contenthandler cannot issue the redirect when the URL format without this portion is used.
The example URLs above do NOT include the “!ut/p/digest” portion. This implies that these URLs are issued in the "main" VP of the DX Portal. A 302 redirect takes place, and “!ut/p/digest” is inserted in the final URL. This portion of the URL can also be used for the VP URL request.