AEM Tool – Create / Generate tool from scratch

Problem Statement:

Create a tool similar to AEM Operations or ACS Commons that gives easy access to maintenance processes and tasks and lets you run them.


Out of the box (OOTB), AEM comes with multiple tools. To access them, you navigate to the Tools section and select the appropriate section for the operation you want to perform.

For example:

  1. AEM operations
  2. Managing templates
  3. Cloud configurations
  4. ACS Commons tools, etc.

Tools are an essential part of AEM. They remove the dependency on Groovy scripts or other ad hoc scripts, and they can be managed or extended at any point in time.

Creating a tool from scratch takes a lot of time and effort, and once all the configurations are ready it takes even more time to develop the services and servlet that handle the business logic.

By using this tool generator, you can skip the configuration and initial setup and kick-start your own first tool from scratch.

The tool generator consists of:

  1. Sling model – to generate fields and handle inputs
  2. Servlet – to process request
  3. Component – to handle view (HTL), CSS, and JS

All of the above are generated as sample scaffolding, and the ready-to-check-in code is added to your code base.

How to use the tool?

1. Click on the link to download the tool generator package and install it on your local instance.
2. Once the package is installed, go to Sites -> Tools and select the Tool Generator section.
3. Select Tool Settings in the top right-hand corner and provide your local repository paths: Sling model path, servlet path, apps path, conf path, and CQ path (if it already exists).
4. Once all the settings are authored, save them.
5. Back on the generator page, provide your tool name and tool description.
6. If you have an existing tool, select Yes; otherwise select No and provide the Tool Section name.
7. Click on Create tool to create your new tool from scratch.

Check your repository for all the file changes as shown below:

You can also check your new tool component and other configs on CRXDE

You can also visit your new tool by accessing the tool section (sites -> Tools -> {Your section name})

For more info on how to add a Sling model field please visit the below link:

Once your Sling model, servlet, and other pieces are ready, make sure you add the following filter to your META-INF folder and keep its mode set to merge, as shown below:

This would avoid replacing/overriding any other tools like ACS Commons or your other repo tools.

Broken Page References AEM

Problem Statement:

How to get the list of all the broken references in AEM?


Get a list of all the broken references using MCP and generate a report.


Out of the box, MCP provides a Broken References report, which can be used to find all the broken references in the content repository.

Broken Reference Report

It’s highly recommended to follow these guidelines when running this process:

  1. Run it during off hours
  2. Don’t run it at the root level
  3. Run it on 2nd-level or 3rd-level pages

How to run this process?

  1. Provide the source path.
  2. Provide a regex so that only references pointing to /content or /etc (that is, paths inside AEM) are considered.
  3. Optionally, provide properties to exclude, which speeds up the traversal of nodes.
  4. To verify broken links in RTE fields or other HTML properties, check the deep check checkbox and provide the list of properties.
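
The effect of the regex can be illustrated with a small sketch. The pattern below is an assumption chosen to match the description above (anything under /content or /etc); it is not taken from the MCP process itself, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, not part of ACS Commons: keeps only values that look like AEM paths.
final class ReferenceFilterSketch {

    // Assumed pattern: a path that starts with /content/ or /etc/ followed by at least one character.
    static final Pattern AEM_PATH = Pattern.compile("^/(content|etc)/.+");

    // Returns the values that match the pattern, in their original order.
    static List<String> internalReferences(List<String> values) {
        return values.stream()
                .filter(v -> AEM_PATH.matcher(v).matches())
                .collect(Collectors.toList());
    }
}
```

With this pattern, external URLs and paths under /apps are ignored, so only references that should resolve inside the repository are checked.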

However, the above process has an issue:

  1. HTML properties are not handled as expected

To fix this, the process needs a small customization: references inside HTML values are checked using the jsoup API.

Add the following dependencies to your pom.xml:


Copy the Broken References report code into your local project as shown below:

Add the following code at line number 207:

if (htmlFields.contains(property.getKey())) {
    stream = stream.flatMap(val -> {
        try {
            // Parse the HTML value with jsoup and collect the href of every anchor tag
            Document doc = Jsoup.parse(val);
            Elements anchors = doc.select("a");
            return anchors.stream().map(link -> link.attr("href"));
        } catch (Exception e) {
            log.warn("Could not parse links from property value of {}", property.getKey(), e);
            return Stream.empty();
        }
    });
}

When we run it on the WKND site, the report looks something like this:

Broken Reference Report

Broken Asset References AEM

Problem Statement:

Get a list of all the assets that are no longer referenced.


Get the list of broken asset references so that the assets can be unpublished and removed from the repository, which improves system stability and performance.


How do assets get published?

  1. The author uploads the images and publishes the assets
  2. A launcher and workflow process the asset metadata and publish the pages
  3. Whenever a page that references assets is published, the publish step asks whether to replicate the referenced assets as well

What happens when the page is unpublished?

  1. When a page is deactivated, the assets it references are not deactivated, because those assets might also be referenced by other pages. Hence, out of the box, assets are not deactivated along with the page.
  2. If we clean up by deactivating and deleting old pages, the assets related to those pages might not be cleaned up.

Advantages of cleaning up old assets?

  1. Drastically reduces repository size
  2. Improves DAM Asset search
  3. Improves indexing

Get Publish Report using Assets Report:

  • Go to Tools -> Assets -> Reports
  • Click on Create and select Publish report
  • Provide the folder path, start date, and end date
  • Select the columns as per your requirements
  • Once the report is ready, it lists all the matching assets
  • Download the report (a CSV file) to see the final list of images

If images are unpublished, we can ask authors to review and delete them.

If images are published but have no references, we need a new process to identify them.
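
The idea behind such a process can be sketched in plain Java: start from the list of published asset paths and drop every path that appears inside some property value in the content tree. This is only a minimal in-memory illustration; the class and method names are hypothetical, and reading the property values from the repository is left out.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory illustration of finding published assets that nothing references.
final class UnreferencedAssetsSketch {

    // publishedAssets: asset paths taken from the publish report.
    // propertyValues: string property values collected from the content tree (assumed to be given).
    static List<String> findUnreferenced(List<String> publishedAssets, Collection<String> propertyValues) {
        Set<String> unreferenced = new LinkedHashSet<>(publishedAssets);
        for (String value : propertyValues) {
            // An asset counts as referenced when its path occurs anywhere in the value,
            // which also covers paths embedded in rich text markup.
            unreferenced.removeIf(value::contains);
        }
        return new ArrayList<>(unreferenced);
    }
}
```

Whatever remains in the returned list is published but not referenced, and is a candidate for review by the authors.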

MCP (Manage Controlled Processes) is both a dashboard for performing complex tasks and a rich API for defining these tasks as process definitions. In addition to kicking off new processes, users can also monitor running tasks, retrieve information about completed tasks, halt work, and so on.

Add the following Maven dependency to your pom.xml to extend MCP:


Broken Asset reference:

Add the following dependencies to your pom.xml


Register the new MCP process by creating a factory service that returns the implementation class:

package com.mysite.core.mcp;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import com.adobe.acs.commons.mcp.ProcessDefinitionFactory;
import com.day.cq.replication.Replicator;

@Component(service = ProcessDefinitionFactory.class)
public class BrokenAssetsFactory extends ProcessDefinitionFactory<BrokenAssets> {

    // Injected by OSGi and handed to every new process instance
    @Reference
    Replicator replicator;

    // Name shown in the MCP process list
    @Override
    public String getName() {
        return "Broken Asset References";
    }

    @Override
    public BrokenAssets createProcessDefinitionInstance() {
        return new BrokenAssets(replicator);
    }
}

Create an implementation class as shown below:

package com.mysite.core.mcp;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import javax.jcr.RepositoryException;
import org.apache.commons.collections4.ListUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.jackrabbit.util.Text;
import org.apache.sling.api.request.RequestParameter;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.PersistenceException;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.jetbrains.annotations.NotNull;
import com.adobe.acs.commons.data.CompositeVariant;
import com.adobe.acs.commons.data.Spreadsheet;
import com.adobe.acs.commons.fam.ActionManager;
import com.adobe.acs.commons.mcp.ProcessDefinition;
import com.adobe.acs.commons.mcp.ProcessInstance;
import com.adobe.acs.commons.mcp.form.Description;
import com.adobe.acs.commons.mcp.form.FileUploadComponent;
import com.adobe.acs.commons.mcp.form.FormField;
import com.adobe.acs.commons.mcp.form.RadioComponent;
import com.adobe.acs.commons.mcp.model.GenericReport;
import com.adobe.acs.commons.mcp.model.ManagedProcess;
import com.day.cq.replication.Replicator;

/**
 * Reports the assets from an uploaded spreadsheet that are not referenced under a content path.
 */
public class BrokenAssets extends ProcessDefinition {
    private static final String SOURCE_PATH = "Path";
    private static final String CONTAINS_QUERY = "CONTAINS(s.*, '%s') or ";
    private final GenericReport report = new GenericReport();
    List<EnumMap<ReportColumns, String>> reportData = Collections.synchronizedList(new ArrayList<>());
    List<String> assetsList = new LinkedList<>();

    public enum PublishMethod {
        @Description("Select this option to generate Broken References Report")
        BROKEN_REFERENCE_REPORT
    }

    Replicator replicatorService;

    @FormField(name = "Mapping Excel", component = FileUploadComponent.class)
    private RequestParameter mappingExcel;
    private Spreadsheet spreadsheet;

    @FormField(name = "Type",
            description = "Please select the appropriate option to execute",
            component = RadioComponent.EnumerationSelector.class,
            required = true,
            options = {"vertical", "default=BROKEN_REFERENCE_REPORT"})
    public PublishMethod publishMethod = PublishMethod.BROKEN_REFERENCE_REPORT;

    // Any further @FormField attributes of the four fields below were lost from the
    // original listing; only the attributes that survived are reproduced here.
    @FormField(name = "Content Path",
            description = "Content Path for search results")
    public String contentPath = "/content";

    @FormField(name = "Chunk count",
            description = "Max number of asset paths per search query")
    public int chunkCount = 3000;

    // The field name "Retries" is an assumption; the original line was lost.
    @FormField(name = "Retries",
            description = "Max number of retries per commit")
    public int retryCount = 2;

    @FormField(name = "Retry delay",
            description = "Delay between retries (in milliseconds)")
    public int retryWait = 500;

    public BrokenAssets(Replicator replicator) {
        replicatorService = replicator;
    }

    @Override
    public void init() throws RepositoryException {
        if (null != mappingExcel && mappingExcel.getSize() > 0) {
            try {
                // Read the spreadsheet and collect the values of its "Path" column
                spreadsheet = new Spreadsheet(mappingExcel, SOURCE_PATH).buildSpreadsheet();
                spreadsheet.getDataRowsAsCompositeVariants().forEach(row -> assetsList.add(getString(row, SOURCE_PATH)));
            } catch (IOException e) {
                throw new RepositoryException("Unable to process spreadsheet", e);
            }
        }
    }

    ManagedProcess instanceInfo;

    @Override
    public void buildProcess(ProcessInstance instance, ResourceResolver rr) throws LoginException, RepositoryException {
        instanceInfo = instance.getInfo();
        // The switch expression was lost in the original; the selected option name is assumed
        switch (publishMethod.name().toLowerCase(Locale.ENGLISH)) {
            case "broken_reference_report":
                instance.getInfo().setDescription("Collecting references");
                instance.defineAction("Searching Refs", rr, this::collectReference);
                break;
            default:
                break;
        }
    }

    @Override
    public void storeReport(ProcessInstance instance, ResourceResolver rr) throws RepositoryException, PersistenceException {
        report.setRows(reportData, ReportColumns.class);
        report.persist(rr, instance.getPath() + "/jcr:content/report");
    }

    public enum ReportColumns {
        ASSET_PATH
    }

    protected void collectReference(ActionManager manager) {
        // Body lost in the original; deferring the search with a resolver is an assumption
        manager.deferredWithResolver(this::collectReferences);
    }

    private void collectReferences(ResourceResolver resourceResolver) throws Exception {
        if (!assetsList.isEmpty()) {
            RetryUtils.withRetry(retryCount, retryWait, () -> {
                // Partition a copy so that removing entries from assetsList does not invalidate the chunks
                List<List<String>> chunkedAssetList = ListUtils.partition(new ArrayList<>(assetsList), chunkCount);
                int counter = 0;
                while (!chunkedAssetList.isEmpty() && counter < chunkedAssetList.size()) {
                    List<String> chunk = chunkedAssetList.get(counter);
                    @NotNull Iterator<Resource> resourceResults = buildSQLQueryAndFetchResults(resourceResolver, contentPath, chunk);
                    collectReferences(resourceResults, chunk);
                    counter++;
                }
                if (!assetsList.isEmpty()) {
                    // Assumed body (lost in the original): whatever is left has no reference and is reported
                    assetsList.forEach(this::reportResult);
                }
            });
        }
    }

    private void collectReferences(@NotNull Iterator<Resource> resourceResults, List<String> chunk) {
        while (resourceResults.hasNext()) {
            Resource resultRes = resourceResults.next();
            if (null != resultRes) {
                resultRes.getValueMap().entrySet().forEach(property -> {
                    if (!property.getKey().equalsIgnoreCase("dam:folderThumbnailPaths")) {
                        Object prop = property.getValue();
                        if (prop.getClass() == String[].class) {
                            List<String> propertyValue = Arrays.asList((String[]) prop);
                            if (!propertyValue.isEmpty()) {
                                List<String> matchingAsset = chunk.stream()
                                        .filter(assetPat -> propertyValue.stream().anyMatch(sam -> sam.contains(assetPat)))
                                        .collect(Collectors.toList());
                                if (!matchingAsset.isEmpty()) {
                                    // Assumed body (lost in the original): referenced assets leave the candidate list
                                    assetsList.removeAll(matchingAsset);
                                }
                            }
                        } else if (prop.getClass() == String.class) {
                            String propertyValue = (String) prop;
                            if (StringUtils.isNotEmpty(propertyValue)) {
                                Optional<String> matchingAsset = chunk.stream().filter(propertyValue::contains).findFirst();
                                if (matchingAsset.isPresent()) {
                                    // Assumed body (lost in the original): referenced assets leave the candidate list
                                    assetsList.remove(matchingAsset.get());
                                }
                            }
                        }
                    }
                });
            }
        }
    }

    private void reportResult(String path) {
        EnumMap<ReportColumns, String> row = new EnumMap<>(ReportColumns.class);
        row.put(ReportColumns.ASSET_PATH, path);
        reportData.add(row);
    }

    private @NotNull Iterator<Resource> buildSQLQueryAndFetchResults(ResourceResolver resolver, String contentPath, List<String> chunk) {
        // One CONTAINS clause per asset path; single quotes are doubled for JCR-SQL2
        String groupStr = chunk.stream()
                .map(r -> String.format(CONTAINS_QUERY, Text.escapeIllegalXpathSearchChars(r).replaceAll("'", "''")))
                .collect(Collectors.joining());
        // Drop the trailing "or" and parenthesize the clauses so the path restriction applies to all of them
        String querySt = "SELECT * FROM [nt:base] AS s WHERE ISDESCENDANTNODE([" + contentPath + "]) and ("
                + StringUtils.substringBeforeLast(groupStr, "or") + ")";
        return resolver.findResources(querySt, "JCR-SQL2");
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    private String getString(Map<String, CompositeVariant> row, String path) {
        CompositeVariant v = row.get(path.toLowerCase(Locale.ENGLISH));
        if (v != null) {
            return (String) v.getValueAs(String.class);
        } else {
            return null;
        }
    }
}

package com.mysite.core.mcp;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RetryUtils {
    private static final Logger log = LoggerFactory.getLogger(RetryUtils.class);

    public static interface CallToRetry {
        void process() throws Exception;
    }

    // Runs the call up to maxTimes, waiting intervalWait milliseconds between attempts
    public static boolean withRetry(int maxTimes, long intervalWait, CallToRetry call) throws Exception {
        if (maxTimes <= 0) {
            throw new IllegalArgumentException("Must run at least one time");
        }
        if (intervalWait <= 0) {
            throw new IllegalArgumentException("Initial wait must be at least 1");
        }
        Exception thrown = null;
        for (int i = 0; i < maxTimes; i++) {
            try {
                call.process();
                return true;
            } catch (Exception e) {
                thrown = e;
                log.info("Encountered failure on {} due to {}, attempt retry {} of {}", call.getClass().getName(), e.getMessage(), (i + 1), maxTimes, e);
            }
            try {
                Thread.sleep(intervalWait);
            } catch (InterruptedException wakeAndAbort) {
                // Restore the interrupt flag and stop retrying
                Thread.currentThread().interrupt();
                break;
            }
        }
        // Every attempt failed: rethrow the last failure
        throw thrown;
    }
}
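
To make the query built in buildSQLQueryAndFetchResults easier to picture, here is a minimal standalone sketch of the same string assembly for one chunk of asset paths. The class and method names are hypothetical, the Jackrabbit escaping call is replaced by plain quote doubling, and the parentheses around the CONTAINS clauses are a choice of this sketch so that the path restriction applies to every clause.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, dependency-free sketch of the JCR-SQL2 string built for one chunk of asset paths.
final class ReferenceQuerySketch {

    static String buildQuery(String contentPath, List<String> assetPaths) {
        if (assetPaths.isEmpty()) {
            throw new IllegalArgumentException("At least one asset path is required");
        }
        // One full-text clause per asset path; single quotes are doubled as JCR-SQL2 expects.
        String clauses = assetPaths.stream()
                .map(p -> "CONTAINS(s.*, '" + p.replace("'", "''") + "')")
                .collect(Collectors.joining(" or "));
        return "SELECT * FROM [nt:base] AS s WHERE ISDESCENDANTNODE([" + contentPath + "]) and (" + clauses + ")";
    }
}
```

For the content path /content and the two paths /content/dam/a.png and /content/dam/b.png, this yields a single query with two CONTAINS clauses joined by "or" inside the parentheses.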

Once the code is deployed, please go to the following URL and click on start process as shown below:


Running Asset Reference Process:

After building the code, you can see the new process showing up in MCP.

Broken Asset Reference Process

Copy the Path column into a new Excel sheet as shown below.

Path column copied into a new Excel file

Upload the sheet into the process and start it to see all the images that are published yet unreferenced, as shown below.

Why Chunk count?

Chunk count controls how many asset paths are grouped into a single JCR-SQL2 query, so a query never covers more paths than that limit (the value is configurable per environment; the form field above defaults to 3000). For example, with a chunk count of 4500 and 20000 asset paths, 20000 / 4500 = 4.44, which rounds up to 5, so the query runs at most five times to generate the report below.
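
The arithmetic can be checked with a short sketch. It uses plain Java lists instead of the ListUtils.partition call from the process, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how a chunk count translates into a number of queries.
final class ChunkSketch {

    // Number of queries needed: total paths divided by chunk size, rounded up.
    static int queryCount(int totalPaths, int chunkCount) {
        if (chunkCount <= 0) {
            throw new IllegalArgumentException("chunkCount must be positive");
        }
        return (totalPaths + chunkCount - 1) / chunkCount;
    }

    // Splits the list into consecutive chunks of at most chunkCount items.
    static <T> List<List<T>> partition(List<T> items, int chunkCount) {
        if (chunkCount <= 0) {
            throw new IllegalArgumentException("chunkCount must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkCount) {
            int end = Math.min(start + chunkCount, items.size());
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }
}
```

Calling queryCount(20000, 4500) gives 5, matching the example above: four full chunks of 4500 paths and a final chunk of 2000.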

Share the report with the content authoring team to validate whether the images are still required. If they are not, plan to clean them up using the clean-up process described below.

After the process completes, the example report can be downloaded and shared with the authors.

Clean up Process:

Authors don’t have to unpublish and delete individual images. Instead, they can use the process below: upload the Excel sheet with all the approved image paths to the process to deactivate and delete them.

AEM Publish / UnPublish / Delete List of pages – MCP Process