Comparing Stream-Based, Page.listChildren, and Query Builder Methods for Listing AEM Children Pages

Problem Statement:

What is the best way to list all the child pages of a page in AEM?

Stream-based VS page.listChildren VS Query Builder

Introduction:

Sling Query is a resource traversal tool recommended for content traversal in AEM. Traversal using listChildren(), getChildren(), or the Resource API is preferable to writing JCR queries, because querying can be more costly than traversal. Sling Query is not a replacement for JCR queries. When a traversal has to check multiple levels down, Sling Query is recommended because it evaluates its results lazily.
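
To illustrate what lazy evaluation buys during a traversal, here is a minimal sketch in plain Java. It does not use the Sling Query API: LazyNode, LazyTraversalSketch, and the sample tree are hypothetical stand-ins. The iterator only touches a node when the caller asks for the next one, so a search that finds its match early never visits the rest of the tree.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a repository node; not a Sling or Sling Query type. */
class LazyNode {
    final String path;
    final List<LazyNode> children = new ArrayList<>();

    LazyNode(String path, LazyNode... kids) {
        this.path = path;
        children.addAll(Arrays.asList(kids));
    }
}

class LazyTraversalSketch {
    /** Depth-first iterator that only touches a node when next() asks for it. */
    static Iterator<LazyNode> descendants(LazyNode root, AtomicInteger visited) {
        Deque<LazyNode> stack = new ArrayDeque<>();
        stack.push(root);
        return new Iterator<LazyNode>() {
            @Override
            public boolean hasNext() {
                return !stack.isEmpty();
            }

            @Override
            public LazyNode next() {
                if (stack.isEmpty()) {
                    throw new NoSuchElementException();
                }
                LazyNode node = stack.pop();
                visited.incrementAndGet();
                // Push children in reverse so they come off the stack in document order.
                for (int i = node.children.size() - 1; i >= 0; i--) {
                    stack.push(node.children.get(i));
                }
                return node;
            }
        };
    }

    /** Returns the first node whose path ends with the suffix, or null if none matches. */
    static LazyNode findFirst(LazyNode root, String suffix, AtomicInteger visited) {
        Iterator<LazyNode> it = descendants(root, visited);
        while (it.hasNext()) {
            LazyNode node = it.next();
            if (node.path.endsWith(suffix)) {
                return node; // stop here; the rest of the tree is never visited
            }
        }
        return null;
    }

    /** Sample tree with 7 nodes in total. */
    static LazyNode sampleTree() {
        return new LazyNode("/content/site",
                new LazyNode("/content/site/a",
                        new LazyNode("/content/site/a/a1"),
                        new LazyNode("/content/site/a/a2")),
                new LazyNode("/content/site/b",
                        new LazyNode("/content/site/b/b1")),
                new LazyNode("/content/site/c"));
    }

    public static void main(String[] args) {
        AtomicInteger visited = new AtomicInteger();
        LazyNode hit = findFirst(sampleTree(), "/a1", visited);
        System.out.println(hit.path + " found after visiting " + visited.get() + " of 7 nodes");
    }
}
```

A query, by contrast, typically has to produce its result set before the caller can decide to stop.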

AEM development best practices recommend using JCR queries sparingly in production environments because of performance concerns. JCR queries are suitable for end-user searches and structured content retrieval, but should not be used for rendering requests such as navigation or content counts.

How can I get all the child pages in AEM using a Query Builder (JCR) query?

List<String> queryList = new ArrayList<>();
Map<String, String> map = new HashMap<>();
map.put("path", resource.getPath());
// Match the page nodes themselves; cq:PageContent would return their jcr:content children
map.put("type", "cq:Page");
// -1 disables paging so that all hits are returned
map.put("p.limit", "-1");

Session session = resolver.adaptTo(Session.class);
Query query = queryBuilder.createQuery(PredicateGroup.create(map), session);
SearchResult result = query.getResult();
// hit.getResource() can open a ResourceResolver internally; keep a reference so it can be closed
ResourceResolver leakingResourceResolverReference = null;
try {
    for (final Hit hit : result.getHits()) {
        if (leakingResourceResolverReference == null) {
            leakingResourceResolverReference = hit.getResource().getResourceResolver();
        }
        queryList.add(hit.getPath());
    }
} catch (RepositoryException e) {
    log.error("Error collecting child page search results", e);
} finally {
    if (leakingResourceResolverReference != null) {
        leakingResourceResolverReference.close();
    }
}

However, a JCR query consumes more resources than a plain traversal.

AEM recommends using Page.listChildren because it is less complex:

List<String> pageList = new ArrayList<>();
Page page = resource.adaptTo(Page.class);
// adaptTo returns null when the resource is not a page
if (page != null) {
    // deep = true walks all descendant pages, not only the direct children
    Iterator<Page> childIterator = page.listChildren(new PageFilter(), true);
    childIterator.forEachRemaining(child -> pageList.add(child.getPath()));
}

However, it can return fewer pages than expected (the default PageFilter excludes hidden and invalid pages), and it is slower than the Java streams based traversal.

How about Java streams?

Java streams can iterate and execute faster while consuming very few resources:

// Collect the path of every page at or below the given resource
List<String> streamList = new ArrayList<>();
for (Resource descendant : (Iterable<? extends Resource>) traverse(resource)::iterator) {
    streamList.add(descendant.getPath());
}

// Depth-first stream: emit the root if it is a page, then recurse into its children
private Stream<Resource> traverse(@NotNull Resource resourceRoot) {
    Stream<Resource> children = StreamSupport.stream(resourceRoot.getChildren().spliterator(), false)
            .filter(this::shouldFollow);
    return Stream.concat(
            shouldInclude(resourceRoot) ? Stream.of(resourceRoot) : Stream.empty(),
            children.flatMap(this::traverse)
    );
}

// Do not descend into jcr:content, since no pages live below it
protected boolean shouldFollow(@NotNull Resource resource) {
    return !JcrConstants.JCR_CONTENT.equals(resource.getName());
}

// A resource counts as a page when it has a jcr:content child
protected boolean shouldInclude(@NotNull Resource resource) {
    return resource.getChild(JcrConstants.JCR_CONTENT) != null;
}
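
To see which resources the traversal above emits without an AEM instance, here is a minimal sketch that mirrors the same logic on a plain Java tree. SketchNode, TraversalSketch, and the sample paths are hypothetical stand-ins for Sling's Resource and a real content tree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a Sling Resource: a path, a name, and child nodes. */
class SketchNode {
    final String path;
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String path, SketchNode... kids) {
        this.path = path;
        this.name = path.substring(path.lastIndexOf('/') + 1);
        children.addAll(Arrays.asList(kids));
    }

    boolean hasChild(String childName) {
        return children.stream().anyMatch(c -> c.name.equals(childName));
    }
}

class TraversalSketch {
    static final String JCR_CONTENT = "jcr:content";

    /** Same shape as traverse(): emit the root if it qualifies, then recurse into followed children. */
    static Stream<SketchNode> traverse(SketchNode root) {
        Stream<SketchNode> children = root.children.stream()
                .filter(child -> !JCR_CONTENT.equals(child.name)); // shouldFollow
        return Stream.concat(
                root.hasChild(JCR_CONTENT) ? Stream.of(root) : Stream.<SketchNode>empty(), // shouldInclude
                children.flatMap(TraversalSketch::traverse));
    }

    static SketchNode sampleTree() {
        return new SketchNode("/content/site",
                new SketchNode("/content/site/jcr:content"),
                new SketchNode("/content/site/en",
                        new SketchNode("/content/site/en/jcr:content"),
                        new SketchNode("/content/site/en/about",
                                new SketchNode("/content/site/en/about/jcr:content"))),
                new SketchNode("/content/site/folder")); // no jcr:content, so not a page
    }

    static List<String> pagePaths(SketchNode root) {
        return traverse(root).map(n -> n.path).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        pagePaths(sampleTree()).forEach(System.out::println);
    }
}
```

In this sketch the folder node is followed but not emitted, because it has no jcr:content child, and the jcr:content nodes are never descended into.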

I recently came across this logic while debugging the out-of-the-box Apache Sling Sitemap generator: https://github.com/apache/sling-org-apache-sling-sitemap

Results comparison

In my test, the stream-based traversal took just 3 milliseconds, less than either page.listChildren or the query.
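
If you want to reproduce such a comparison yourself, a small timing helper is enough for a first impression. The sketch below is plain Java; TimingSketch and Timed are hypothetical names, and the summing loop in main is only a placeholder for one of the three listing approaches. A single run is noisy, so repeat the measurement and discard warm-up runs before drawing conclusions.

```java
import java.util.function.Supplier;

class TimingSketch {
    /** Result of a timed call: the returned value plus the elapsed time. */
    static final class Timed<T> {
        final T value;
        final long nanos;

        Timed(T value, long nanos) {
            this.value = value;
            this.nanos = nanos;
        }

        double millis() {
            return nanos / 1_000_000.0;
        }
    }

    /** Runs the task once and records how long it took. */
    static <T> Timed<T> time(Supplier<T> task) {
        long start = System.nanoTime();
        T value = task.get();
        return new Timed<>(value, System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // Placeholder workload; in AEM this would be one of the listing approaches.
        Timed<Long> run = time(() -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            return sum;
        });
        System.out.printf("result=%d took %.3f ms%n", run.value, run.millis());
    }
}
```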

AEM Query Builder Optimization using Java Streams and Resource Filter

Problem statement:

Can Java Streams and Resource Filter be used as an alternative to Query Builder queries in AEM for filtering pages and resources based on specific criteria?

Requirement:

Query for the pages whose sling:resourceType is "wknd/components/page", find their child resources that use the Image component ("wknd/components/image"), and collect the fileReference properties into a list.

The Query Builder query would look like this:

@PostConstruct
private void initModel() {
  // Find the jcr:content nodes rendered by the WKND page component
  Map<String, String> map = new HashMap<>();
  map.put("path", resource.getPath());
  // The component is stored in sling:resourceType, not in jcr:primaryType
  map.put("property", "sling:resourceType");
  map.put("property.value", "wknd/components/page");

  PredicateGroup predicateGroup = PredicateGroup.create(map);
  QueryBuilder queryBuilder = resourceResolver.adaptTo(QueryBuilder.class);

  Query query = queryBuilder.createQuery(predicateGroup, resourceResolver.adaptTo(Session.class));
  SearchResult result = query.getResult();

  List<String> imagePath = new ArrayList<>();

  try {
    for (final Hit hit : result.getHits()) {
      Resource resultResource = hit.getResource();
      Iterator<Resource> children = resultResource.listChildren();
      while (children.hasNext()) {
        final Resource child = children.next();
        // Keep only the Image components and read their fileReference
        if (StringUtils.equalsIgnoreCase(child.getResourceType(), "wknd/components/image")) {
          Image image = modelFactory.getModelFromWrappedRequest(request, child, Image.class);
          if (image != null) {
            imagePath.add(image.getFileReference());
          }
        }
      }
    }
  } catch (RepositoryException e) {
    LOGGER.error("Error occurred while getting result resource", e);
  }
}

Introduction

This article discusses the use of Java Streams and Resource Filter in optimizing AEM Query Builder queries. The article provides code examples for using Resource Filter Streams to filter pages and resources and using Java Streams to filter and map child resources based on specific criteria. The article also provides optimization strategies for AEM tree traversal to reduce memory consumption and improve performance.

The Sling Resource Filter bundle provides a number of services and utilities to identify and filter resources in a resource tree.

Resource Filter Stream:

ResourceFilterStream combines the ResourceStream functionality with the ResourcePredicates service so that you can define a Stream<Resource> that follows specific child pages and looks for specific resources, as defined by the resource filter script. The ResourceFilterStream is accessed by adaptation.

ResourceFilterStream rfs = resource.adaptTo(ResourceFilterStream.class);
rfs
  .setBranchSelector("[jcr:primaryType] == 'cq:Page'")
  .setChildSelector("[jcr:content/sling:resourceType] != 'apps/components/page/folder'")
  .stream()
  .collect(Collectors.toList());

Parameters

The ResourceFilter and ResourceFilterStream can have key-value pairs added so that the values may be used as part of the script resolution. Parameters are accessed with the dollar sign '$'.

rfs.setBranchSelector("[jcr:content/sling:resourceType] != $type").addParam("type","apps/components/page/folder");

Using ResourceFilterStream, the example code would look like this:

@PostConstruct
private void initModel() {
  ResourceFilterStream rfs = resource.adaptTo(ResourceFilterStream.class);
  List<String> imagePaths = rfs.setBranchSelector("[jcr:primaryType] == 'cq:Page'")
    .setChildSelector("[jcr:content/sling:resourceType] == 'wknd/components/page'")
    .stream()
    .filter(r -> StringUtils.equalsIgnoreCase(r.getResourceType(), "wknd/components/image"))
    .map(r -> modelFactory.getModelFromWrappedRequest(request, r, Image.class).getFileReference())
    .collect(Collectors.toList());
}

Optimizing Traversals

Just as indexing speeds up a query, there are strategies you can apply within a tree traversal so that it stays efficient across a large number of resources. The following strategies help with traversal optimization.

Limit traversal paths

In a naive implementation of a tree traversal, the traversal visits every node in the tree, whether or not that part of the tree can contain the nodes being looked for. An example of this is a tree of Page resources where each page has a jcr:content child node containing a subtree of data that defines the page structure. If the jcr:content node cannot have a child resource of type Page, and the goal of the traversal is to identify Page resources that match specific criteria, then traversing the jcr:content node cannot lead to additional matches. Using this knowledge of the resource structure, you can improve performance by adding a branch selector that prevents the traversal from proceeding down a nonproductive path.
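
The effect of a branch selector can be shown with a minimal plain Java sketch. TypedNode, BranchSelectorSketch, and the sample tree are hypothetical, and the Predicate merely plays the role that the branch selector plays in ResourceFilterStream. The counter records how many nodes each traversal touches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical stand-in for a resource with a jcr:primaryType. */
class TypedNode {
    final String path;
    final String primaryType;
    final List<TypedNode> children = new ArrayList<>();

    TypedNode(String path, String primaryType, TypedNode... kids) {
        this.path = path;
        this.primaryType = primaryType;
        children.addAll(Arrays.asList(kids));
    }
}

class BranchSelectorSketch {
    static final String PAGE = "cq:Page";
    static final String CONTENT = "cq:PageContent";
    static final String COMPONENT = "nt:unstructured";

    /** Streams root and descendants, descending only into children the branch selector accepts. */
    static Stream<TypedNode> stream(TypedNode root, Predicate<TypedNode> branchSelector,
                                    AtomicInteger visited) {
        visited.incrementAndGet();
        return Stream.concat(
                Stream.of(root),
                root.children.stream()
                        .filter(branchSelector)
                        .flatMap(child -> stream(child, branchSelector, visited)));
    }

    /** Collects the paths of all pages reached by the traversal. */
    static List<String> findPages(TypedNode root, Predicate<TypedNode> branchSelector,
                                  AtomicInteger visited) {
        return stream(root, branchSelector, visited)
                .filter(n -> PAGE.equals(n.primaryType))
                .map(n -> n.path)
                .collect(Collectors.toList());
    }

    /** Two pages, each with a jcr:content subtree: 9 nodes in total. */
    static TypedNode sampleTree() {
        return new TypedNode("/content/site", PAGE,
                new TypedNode("/content/site/jcr:content", CONTENT,
                        new TypedNode("/content/site/jcr:content/title", COMPONENT),
                        new TypedNode("/content/site/jcr:content/text", COMPONENT),
                        new TypedNode("/content/site/jcr:content/image", COMPONENT)),
                new TypedNode("/content/site/en", PAGE,
                        new TypedNode("/content/site/en/jcr:content", CONTENT,
                                new TypedNode("/content/site/en/jcr:content/title", COMPONENT),
                                new TypedNode("/content/site/en/jcr:content/image", COMPONENT))));
    }

    public static void main(String[] args) {
        AtomicInteger naive = new AtomicInteger();
        AtomicInteger pruned = new AtomicInteger();
        findPages(sampleTree(), n -> true, naive);
        findPages(sampleTree(), n -> PAGE.equals(n.primaryType), pruned);
        System.out.println("naive visited " + naive.get() + " nodes, pruned visited " + pruned.get());
    }
}
```

Both traversals find the same two pages in this sample tree, but the pruned one skips every jcr:content subtree.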

Limit memory consumption

Instantiating a Resource object from the underlying ResourceResolver consumes a nontrivial amount of memory. When the focus of a tree traversal is obtaining information from thousands of resources, an effective method is to extract the information as part of the stream processing, or to use the forEach method of the ResourceStream object, which allows each resource to be garbage collected efficiently.
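
The difference can be sketched in plain Java. HeavyResource and MemorySketch are hypothetical: the byte array only simulates the memory a real Resource might hold. Both methods pull the needed value out of each object inside the pipeline, so no list of full objects is ever built and each object becomes unreachable as soon as the stream moves on.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class MemorySketch {
    /** Hypothetical stand-in for a Resource that is expensive to keep in memory. */
    static final class HeavyResource {
        final String path;
        final byte[] payload = new byte[16 * 1024]; // simulated per-resource overhead

        HeavyResource(String path) {
            this.path = path;
        }
    }

    /** Lazily creates resources one at a time, as a traversal would. */
    static Stream<HeavyResource> resources(int count) {
        return IntStream.range(0, count)
                .mapToObj(i -> new HeavyResource("/content/site/page-" + i));
    }

    /** Extracts only the path in the stream; the heavy objects are never collected. */
    static List<String> pathsOnly(int count) {
        return resources(count)
                .map(r -> r.path)
                .collect(Collectors.toList());
    }

    /** Consumes each resource with forEach and keeps only an aggregate value. */
    static long totalPathLength(int count) {
        AtomicLong total = new AtomicLong();
        resources(count).forEach(r -> total.addAndGet(r.path.length()));
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(pathsOnly(1000).size() + " paths collected");
        System.out.println("total path length: " + totalPathLength(1000));
    }
}
```

Collecting the HeavyResource objects themselves into a list would instead keep every payload alive until the list is released.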

References:

https://sling.apache.org/documentation/bundles/resource-filter.html