Broken Page References AEM


Problem Statement:

How to get the list of all the broken references in AEM?


Requirement:

Get a List of all the broken references using MCP and provide the report


Introduction:

OOTB we get a Broken reference report provided by MCP, which can be used to get all the broken references in the content repo.

Broken Refernce Report

It’s highly recommended to run this process during

  1. off hours
  2. Don’t run on the root level
  3. Run it on 2nd level or 3rd level pages

How to run this process?

Provide Source path

Provide the regex so that it will consider only the references which point to /content or /etc (points to AEM)

You can also provide exclude properties to improve the traversal of nodes.

If you want to verify any broken links in the RTE fields or properties, then check the deep check checkbox and provide the properties list.

But the above process has a few issues.

  1. Html properties are not working as expected

We need a few customizations to this process by making a few changes to check HTML level references by adding JSOUP API

Add the following dependencies to your POM.xml

<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>com.adobe.acs</groupId>
    <artifactId>acs-aem-commons-bundle</artifactId>
    <scope>provided</scope>
</dependency>

Get the following Broken reference code into your local as shown below:

Add the following code as shown below:

if (htmlFields.contains(property.getKey())) {
            stream = stream.flatMap(val -> {
                try {
                    Document doc = Jsoup.parse(val);
                    Elements anchors = doc.select("a");
                    return anchors.stream().map(link -> link.attr("href"));
                } catch (Exception e) {
                    log.warn("Could not parse links from property value of {}", property.getKey(), e);
                    return Stream.empty();
                }
            });
        }
At Line number 207

When we run it on wknd site it would look something like this:

Broken Reference Report

AEM with Java streams

Problem statement:

How to use java streams in AEM? Can I use streams for iterating and resources?

Requirement:

Use Java streams to iterate child nodes, validating and resources and API’s.

Introduction:

There are a lot of benefits to using streams in Java, such as the ability to write functions at a more abstract level which can reduce code bugs, compact functions into fewer and more readable lines of code, and the ease they offer for parallelization

  • Streams have a strong affinity with functions
  • Streams encourage less mutability
  • Streams encourage looser coupling
  • Streams can succinctly express quite sophisticated behavior
  • Streams provide scope for future efficiency gains

Java Objects:

This class consists of static utility methods for operating on objects. These utilities include null-safe or null-tolerant methods for computing the hash code of an object, returning a string for an object, and comparing two objects.

if (Objects.nonNull(resource)) {
  resource.getValueMap().get("myproperty", StringUtils.EMPTY);
}

Java Optional:

Trying using Java Optional util, which is a box type that holds a reference to another object.

Is immutable and non serializable ant there is no public constructor and can only be present or absent

It is created by the of(), ofNullable(), empty() static method.

In the below example Optional resource is created and you can check whether the resource is present and if present then get the valuemap

Optional < Resource > res = Optional.ofNullable(resource);
if (res.isPresent()) {
    res.get().getValueMap().get("myproperty", StringUtils.EMPTY);
}

you can also call stream to get children’s as shown below:

Optional < Resource > res = Optional.ofNullable(resource);
if (res.isPresent()) {
    List < Resource > jam = res.stream().filter(Objects::nonNull).collect(Collectors.toList());
}

Java Stream Support:

Low-level utility methods for creating and manipulating streams. This class is mostly for library writers presenting stream views of data structures; most static stream methods intended for end users are in the various Stream classes.

In the below example we are trying to get a resource iterator to get all the child resources and map the resources to a page and filter using Objects and finally collect the list of pages.

Iterator < Resource > iterator = childResources.getChildren().iterator();
List < Page > pages = StreamSupport.stream(((Iterable < Resource > )() -> iterator).spliterator(), false)
  .map(currentPage.getPageManager()::getContainingPage).filter(Objects::nonNull)
  .collect(Collectors.toList());

We can also Optional utility to get the children resources or empty list to avoid all kinds of null pointer exceptions.

List < Resource > pagesList = Optional.ofNullable(resource.getChild(Teaser.NN_ACTIONS))
  .map(Resource::getChildren)
  .map(Iterable::spliterator)
  .map(s -> StreamSupport.stream(s, false))
  .orElseGet(Stream::empty)
  .collect(Collectors.toList());

We can also adapt the resource to Page API and call the listchilderens to get all the children and using stream support we are going to map the page paths into a list as shown below:

terator < Page > childIterator = childResources.adaptTo(Page.class).listChildren();
StreamSupport.stream(((Iterable < Page > )() -> childIterator).spliterator(), false)
  .filter(Objects::nonNull)
  .map(childPage -> childPage.getPath())
  .collect(Collectors.toList());

Does this works only on resource and page API?

No, we can also use Content Fragment and other API’s as well for example in the below code we are trying to iterate contentfragment and get all the variations of the contentfragment.

Optional < ContentFragment > contentFragment = Optional.ofNullable(resource.adaptTo(ContentFragment.class));
Iterator < VariationDef > versionIterator = contentFragment.get().listAllVariations();
List < String > variationsList = StreamSupport.stream(((Iterable < VariationDef > )() -> versionIterator).spliterator(), false)
  .filter(Objects::nonNull)
  .map(cfVariation -> cfVariation.getTitle())
  .collect(Collectors.toList());

You can also learn more about other tricks and techniques of Java Streams:

AEM Query builder using Java streams