Comparing Stream-Based, Page.listChildren, and Query Builder Methods for Listing AEM Children Pages

Problem Statement:

What is the best way to list all the children in AEM?

Stream-based VS page.listChildren VS Query Builder

Introduction:

Sling Query is a resource traversal tool recommended for content traversal in AEM. Traversal using listChildren(), getChildren(), or the Resource API is preferable to writing JCR queries, because querying can be more costly than traversal. Sling Query is not a replacement for JCR queries. When traversal involves checking multiple levels down, Sling Query is recommended because it evaluates query results lazily.

JCR queries should be used sparingly in production environments because of performance concerns. They are suitable for end-user searches and structured content retrieval, but should not be used for rendering requests such as navigation or content counts.

How can I get all the child pages in AEM using JCR Query?

List<String> queryList = new ArrayList<>();
Map<String, String> map = new HashMap<>();
map.put("path", resource.getPath());
map.put("type", "cq:Page"); // match the page nodes themselves, not their jcr:content children
map.put("p.limit", "-1");

Session session = resolver.adaptTo(Session.class);
Query query = queryBuilder.createQuery(PredicateGroup.create(map), session);
SearchResult result = query.getResult();
ResourceResolver leakingResourceResolverReference = null;
try {
    for (final Hit hit : result.getHits()) {
        if (leakingResourceResolverReference == null) {
            leakingResourceResolverReference = hit.getResource().getResourceResolver();
        }
        queryList.add(hit.getPath());
    }
} catch (RepositoryException e) {
    log.error("Error collecting inherited section search results", e);
} finally {
    if (leakingResourceResolverReference != null) {
        leakingResourceResolverReference.close();
    }
}

However, a JCR query consumes more resources than a traversal.

AEM recommends using Page.listChildren because it is less complex

List<String> pageList = new ArrayList<>();
// adaptTo returns null when the resource is not a page
Page page = resource.adaptTo(Page.class);
if (page != null) {
    // deep = true walks all descendant pages; the default PageFilter skips hidden and invalid pages
    Iterator<Page> childIterator = page.listChildren(new PageFilter(), true);
    childIterator.forEachRemaining(child -> pageList.add(child.getPath()));
}

However, in my tests it sometimes missed results, and it was slower than the stream-based approach.

How about Java streams?

In my tests, Java streams iterated and executed faster and consumed very few resources.

List<String> streamList = new ArrayList<>();
for (Resource descendant : (Iterable<? extends Resource>) traverse(resource)::iterator) {
    streamList.add(descendant.getPath());
}

private Stream<Resource> traverse(@NotNull Resource resourceRoot) {
    Stream<Resource> children = StreamSupport.stream(resourceRoot.getChildren().spliterator(), false)
            .filter(this::shouldFollow);
    return Stream.concat(
            shouldInclude(resourceRoot) ? Stream.of(resourceRoot) : Stream.empty(),
            children.flatMap(this::traverse)
    );
}

protected boolean shouldFollow(@NotNull Resource resource) {
    return !JcrConstants.JCR_CONTENT.equals(resource.getName());
}

protected boolean shouldInclude(@NotNull Resource resource) {
    return resource.getChild(JcrConstants.JCR_CONTENT) != null;
}

I recently came across this logic while debugging the OOTB sling sitemap generator: https://github.com/apache/sling-org-apache-sling-sitemap
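
To experiment with the traversal pattern above outside AEM, here is a minimal sketch that applies the same Stream.concat/flatMap recursion to a plain in-memory tree. The Node class, the sampleTree() content, and the pagePaths() helper are hypothetical stand-ins for Sling's Resource API and are not part of the sitemap generator; only the shouldFollow/shouldInclude logic mirrors the code shown earlier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class TreeTraversalSketch {

    static final String JCR_CONTENT = "jcr:content";

    /** Hypothetical stand-in for a Sling Resource: a name, a path, and ordered children. */
    public static final class Node {
        final String name;
        final String path;
        final List<Node> children = new ArrayList<>();

        Node(String name, String path) {
            this.name = name;
            this.path = path;
        }

        /** Creates a child below this node and returns it. */
        Node add(String childName) {
            Node child = new Node(childName, path + "/" + childName);
            children.add(child);
            return child;
        }

        boolean hasChild(String childName) {
            return children.stream().anyMatch(c -> c.name.equals(childName));
        }
    }

    /** Emits the root if it looks like a page, then recurses into every child except jcr:content. */
    public static Stream<Node> traverse(Node root) {
        Stream<Node> children = root.children.stream()
                .filter(TreeTraversalSketch::shouldFollow);
        return Stream.concat(
                shouldInclude(root) ? Stream.of(root) : Stream.<Node>empty(),
                children.flatMap(TreeTraversalSketch::traverse));
    }

    /** Do not descend into the content node of a page. */
    static boolean shouldFollow(Node node) {
        return !JCR_CONTENT.equals(node.name);
    }

    /** A node counts as a page when it has a jcr:content child. */
    static boolean shouldInclude(Node node) {
        return node.hasChild(JCR_CONTENT);
    }

    public static List<String> pagePaths(Node root) {
        return traverse(root).map(n -> n.path).collect(Collectors.toList());
    }

    /** Sample tree (assumed content): three nested pages, an ACL node, and a folder holding one page. */
    public static Node sampleTree() {
        Node site = new Node("site", "/content/site");
        site.add(JCR_CONTENT);
        Node en = site.add("en");
        en.add(JCR_CONTENT);
        en.add("about").add(JCR_CONTENT);
        site.add("rep:policy");                 // no jcr:content, so it is skipped
        site.add("folder").add("deep").add(JCR_CONTENT); // folder is followed but not listed
        return site;
    }

    public static void main(String[] args) {
        pagePaths(sampleTree()).forEach(System.out::println);
    }
}
```

Note that this traversal also returns the root itself when it has a jcr:content child, and it follows non-page nodes such as folders, so its result set is not guaranteed to match what page.listChildren returns for the same tree.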

The stream-based traversal took just 3 milliseconds, less than either page.listChildren or the query.

7 thoughts on “Comparing Stream-Based, Page.listChildren, and Query Builder Methods for Listing AEM Children Pages”

Not entirely correct.
page.listChildren lists only pages as children, while resource.listChildren lists all child resources, including the mandatory jcr:content child node, any ACL nodes/resources, Sling folders, etc. So their semantics are quite different.

(And the performance difference just comes from these additional checks.)


Kiran Sg says:

18th Apr 2023 at 4:13 pm

Hi Jorg,
I had a similar requirement in my current project. As per the Page API documentation, setting deep to true returns all the descendants, but if you have more than 3000 or 4000 child pages the results are not consistent.
But if you refer to the sling sitemap generator they are using streams to generate a sitemap: https://github.com/apache/sling-org-apache-sling-sitemap/blob/master/src/main/java/org/apache/sling/sitemap/spi/generator/ResourceTreeSitemapGenerator.java

1. Kiran Sg says:
  
  18th Apr 2023 at 4:35 pm
  
  Even with resource.listChildren I have added a check for whether jcr:content exists, to make sure the resource is a page 🙂
  
2. Jörg says:
  
  13th May 2023 at 10:48 am
  
  What do you mean with “results are not consistent”?
  
  In the end, all this code uses node.getNodes() to iterate through all child nodes. Some implementations add more features (like hiding implementation details such as rep:acl nodes), but the majority of the code executed is identical.
  
3. Kiran Sg says:
  
  13th May 2023 at 11:32 am
  
  Let me check the rep:acl nodes, but I see that the result set differs from environment to environment.
  
4. Jörg says:
  
  13th May 2023 at 12:05 pm
  
  But that is caused by the fact that the content is different per environment. And order matters.
  I would call it “non-consistent behavior” only if you run the same code on the same environment (with the same unchanged content) and get different results.
  

java.util.Iterator<Page> listChildren(Filter<Page> filter, boolean deep)
Returns an iterator over descendant resources that adapt to a page and are included in the given filter. other child resources are skipped.
Parameters:
filter – for iteration. may be null
deep – false traverses only children; true traverses all descendants
Returns:
iterator of child pages

The Page documentation says that if we set deep to true it returns all the descendants. I also assumed the Sling sitemap generator team would use this API, but I was wrong: they used streams.


Published by Kiran Sg