Maintenance – AEM best practices and development guide

Problem Statement:

Get a List of all the Assets which are missing references

Requirement:

Get the list of broken asset references to unpublish and remove them repo to improve the system stability and performance.

Introduction:

How do assets get published?

The author uploads the images and publishes the assets
Create a launcher and workflow which process assets metadata and publish the pages
Whenever we publish any pages and if the page has references to assets, then during publishing, it asks to replicate the references as well.

What happens when the page is unpublished?

When the page is deactivated, assets referenced to the page will not be deactivated because this asset might have reference to the other pages hence out of the box assets won’t be deactivated.
If we perform cleanup, deactivate and delete old pages, we might not be cleaning up assets related to this page.

Advantages of cleaning up old assets?

Drastically reduces repository size
Improves DAM Asset search
Improves indexing

Get Publish Report using Assets Report:

Go to Tools -> Assets -> Reports as shown below:

Click on create and click on Publish report

Provide folder path and start date and end date

Select the columns as per requirement

Finally, report will be ready with all the assets lists as shown below

Download the report to see the final list of images

If Images are unpublished then we can ask authors to review and delete them

If images are published but has no references to figure this out, we need a new process.

MCP (Manage Controlled Processes) is both a dashboard for performing complex tasks and a rich API for defining these tasks as process definitions. In addition to kicking off new processes, users can also monitor running tasks, retrieve information about completed tasks, halt work, and so on.

Add the following maven dependency to your pom to extend MCP

<dependency>
    <groupId>com.adobe.acs</groupId>
    <artifactId>acs-aem-commons-bundle</artifactId>
    <version>5.0.4</version>
    <scope>provided</scope>
</dependency>

Broken Asset reference:

Add the following dependencies to your pom.xml

        <dependency>
            <groupId>org.jetbrains</groupId>
            <artifactId>annotations</artifactId>
            <version>17.0.0</version>
        </dependency>
        <dependency>
            <groupId>com.adobe.acs</groupId>
            <artifactId>acs-aem-commons-bundle</artifactId>
            <version>5.2.0</version>
            <scope>provided</scope>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.13.1</version>
            <scope>provided</scope>
        </dependency>

Create a new MCP process by calling the MCP service and providing the implementation class:

package com.mysite.core.mcp;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import com.adobe.acs.commons.mcp.ProcessDefinitionFactory;
import com.day.cq.replication.Replicator;

@Component(service = ProcessDefinitionFactory.class)
public class BrokenAssetsFactory extends ProcessDefinitionFactory<BrokenAssets> {

    @Reference
    Replicator replicator;

    @Override
    public String getName() {
        return "Broken Asset References";
    }

    @Override
    public BrokenAssets createProcessDefinitionInstance() {
        return new BrokenAssets(replicator);
    }
}

Create an implementation class as shown below:

package com.mysite.core.mcp;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;
import javax.jcr.RepositoryException;
import org.apache.commons.collections4.ListUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.jackrabbit.util.Text;
import org.apache.sling.api.request.RequestParameter;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.PersistenceException;
import org.apache.sling.api.resource.Resource;
import org.apache.sling.api.resource.ResourceResolver;
import org.jetbrains.annotations.NotNull;
import com.adobe.acs.commons.data.CompositeVariant;
import com.adobe.acs.commons.data.Spreadsheet;
import com.adobe.acs.commons.fam.ActionManager;
import com.adobe.acs.commons.mcp.ProcessDefinition;
import com.adobe.acs.commons.mcp.ProcessInstance;
import com.adobe.acs.commons.mcp.form.Description;
import com.adobe.acs.commons.mcp.form.FileUploadComponent;
import com.adobe.acs.commons.mcp.form.FormField;
import com.adobe.acs.commons.mcp.form.RadioComponent;
import com.adobe.acs.commons.mcp.model.GenericReport;
import com.adobe.acs.commons.mcp.model.ManagedProcess;
import com.day.cq.replication.ReplicationStatus;
import com.day.cq.replication.Replicator;

/**
 * Relocate Pages and/or Sites using a parallelized move process
 */
public class BrokenAssets extends ProcessDefinition {
	
	private static final String SOURCE_PATH = "Path";
	private static final String CONTAINS_QUERY = "CONTAINS(s.*, '%s') or ";
	private final GenericReport report = new GenericReport();
	List<EnumMap<ReportColumns, String>> reportData = Collections.synchronizedList(new ArrayList<>());
	
	List<String> assetsList = new LinkedList<>();
	
    public enum PublishMethod {
        @Description("Select this option to generate Broken References Report")
        BROKEN_REFERENCE_REPORT
    }
    
    Replicator replicatorService;
    
    @FormField(name = "Mapping Excel", component = FileUploadComponent.class)
    private RequestParameter mappingExcel;
    private Spreadsheet spreadsheet;
    
    @FormField(name = "Type",
            description = "Please select the oprporiate option to execute",
            component = RadioComponent.EnumerationSelector.class,
            required = true,
            options = {"vertical", "default=BROKEN_REFERENCE_REPORT"})
    public PublishMethod publishMethod = PublishMethod.BROKEN_REFERENCE_REPORT;

	@FormField(name="Content Path",
			description="Content Path for search results",
			hint="/content",
			options={"default=/content"})
	public String contentPath = "/content";
    
    @FormField(name="Chunk count",
            description="Max number of chunk the search results",
            hint="4500",
            options={"default=3000"})
    public int chunkCount = 3000;
    
    @FormField(name="Retries",
            description="Max number of retries per commit",
            hint="2",
            options={"default=2"})
    public int retryCount = 2;
    
    @FormField(name="Retry delay",
            description="Delay between retries (in milliseconds)",
            hint="500,1000,...",
            options={"default=500"})
    public int retryWait = 500;
    
    public BrokenAssets(Replicator replicator) {
    	replicatorService = replicator;
	}

    @Override
    public void init() throws RepositoryException {    	
    	if(null != mappingExcel && mappingExcel.getSize() > 0) {
    		try {
                // Read spreadsheet
        		spreadsheet = new Spreadsheet(mappingExcel, SOURCE_PATH).buildSpreadsheet();
        		spreadsheet.getDataRowsAsCompositeVariants().forEach(row -> assetsList.add(getString(row, SOURCE_PATH)) );
            } catch (IOException e) {
                throw new RepositoryException("Unable to process spreadsheet", e);
            }
    	}    	    	    	
    }

    ManagedProcess instanceInfo;

    @Override
    public void buildProcess(ProcessInstance instance, ResourceResolver rr) throws LoginException, RepositoryException {
        instanceInfo = instance.getInfo();
        switch (publishMethod.name().toLowerCase()) {
	        case "broken_reference_report":
	        	instance.getInfo().setDescription("Collecting references");
	        	instance.defineAction("Searching Refs", rr, this::collectRefernce);
	            break;
        }        
    }

    @Override
    public void storeReport(ProcessInstance instance, ResourceResolver rr) throws RepositoryException, PersistenceException {    	
        report.setRows(reportData, ReportColumns.class);
        report.persist(rr, instance.getPath() + "/jcr:content/report");
    }

    public enum ReportColumns {
    	ASSET_PATH
    }
            
    protected void collectRefernce(ActionManager manager) {
    	manager.deferredWithResolver(this::collectReferences); 
    }
    
    private void collectReferences(ResourceResolver resourceResolver) throws Exception {    	
		if(!assetsList.isEmpty()) {						 
			RetryUtils.withRetry(retryCount, retryWait, () -> {
			    List<List<String>> chunkedAssetList = ListUtils.partition(assetsList, chunkCount);					
			    int counter = 0;
			    while(!chunkedAssetList.isEmpty() && counter < chunkedAssetList.size()) {
			        List<String> chunk = chunkedAssetList.get(counter);
			        counter++;
			        @NotNull Iterator<Resource> resourceResults = buildSQLQueryAndFetchResults(resourceResolver, contentPath, chunk);
			        collectReferences(resourceResults, chunk);
			    }
			});	
			if(!assetsList.isEmpty()) {
				assetsList.stream().forEach(this::reportResult);
			}
		}	        			
    }

	private void collectReferences(@NotNull Iterator<Resource> resourceResults, List<String> chunk) {
		while(resourceResults.hasNext()) {
			Resource resultRes = resourceResults.next();
			if (null != resultRes) {
				resultRes.getValueMap().entrySet().forEach(property -> {
					if(!property.getKey().equalsIgnoreCase("dam:folderThumbnailPaths")) {
						Object prop = property.getValue();
						if (prop.getClass() == String[].class) {
							List<String> propertyValue = Arrays.asList((String[]) prop);
							if (!propertyValue.isEmpty()) {
								List<String> matchingAsset = chunk.stream().filter(
										assetPat -> propertyValue.stream().anyMatch(sam -> sam.contains(assetPat)))
										.collect(Collectors.toList());
								if(!matchingAsset.isEmpty()) {
									chunk.removeIf(matchingAsset::contains);
									assetsList.removeIf(matchingAsset::contains);
								}
							}
						} else if (prop.getClass() == String.class) {
							String propertyValue = (String) prop;
							if (StringUtils.isNotEmpty(propertyValue) ) {
								Optional<String> matchingAsset = chunk.stream().filter(propertyValue::contains).findAny();
								if(matchingAsset.isPresent()) {
									chunk.remove(matchingAsset.get());
									assetsList.remove(matchingAsset.get());
								}
							}
						}
					}
				});
			}
		}
	}

	private void reportResult(String path) {
		EnumMap<ReportColumns, String> row = new EnumMap<>(ReportColumns.class);
		row.put(ReportColumns.ASSET_PATH, path);		
		reportData.add(row);
	}	
	
	private @NotNull Iterator<Resource> buildSQLQueryAndFetchResults(ResourceResolver resolver, String contentPath, List<String> chunk) {
        String groupStr = chunk.stream().map(r -> String.format(CONTAINS_QUERY, Text.escapeIllegalXpathSearchChars(r).replaceAll("'", "''"))).collect(Collectors.joining());        
		String querySt = "SELECT * FROM [nt:base] AS s WHERE ISDESCENDANTNODE(["+contentPath+"]) and "+ StringUtils.substringBeforeLast(groupStr, "or");
		return resolver.findResources(querySt, "JCR-SQL2");
    }
	
	@SuppressWarnings({"rawtypes", "unchecked"})
    private String getString(Map<String, CompositeVariant> row, String path) {
        CompositeVariant v = row.get(path.toLowerCase(Locale.ENGLISH));
        if (v != null) {
            return (String) v.getValueAs(String.class);
        } else {
            return null;
        }
    }
}

package com.mysite.core.mcp;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class RetryUtils {
    private static final Logger log = LoggerFactory.getLogger(RetryUtils.class);

    public static interface CallToRetry {
        void process() throws Exception;
    }

    public static boolean withRetry(int maxTimes, long intervalWait, CallToRetry call) throws Exception {
        if (maxTimes <= 0) {
            throw new IllegalArgumentException("Must run at least one time");
        }
        if (intervalWait <= 0) {
            throw new IllegalArgumentException("Initial wait must be at least 1");
        }
        Exception thrown = null;
        for (int i = 0; i < maxTimes; i++) {
            try {
                call.process();
                return true;
            } catch (Exception e) {
                thrown = e;
                log.info("Encountered failure on {} due to {}, attempt retry {} of {}", call.getClass().getName() , e.getMessage(), (i + 1), maxTimes, e);
            }
            try {
                Thread.sleep(intervalWait);
            } catch (InterruptedException wakeAndAbort) {
                break;
            }
        }
        throw thrown;
    }
}

Once the code is deployed, please go to the following URL and click on start process as shown below:

http://{domain}/apps/acs-commons/content/manage-controlled-processes.html

Running Asset Reference Process:

After building the code you can see the new Process showing up in MCP

Copy the Path column into the new Excel sheet as shown below

Upload into the process and start to see all the images which are published yet unreferenced as shown below

Why does Chunk count?

Chunk count helps the SQL 2 query to group by the paths, which will be maxing 4500 and it won’t take more than that (configurable based on the environment). However basically, if we have 20000 / 4500 = 4.44 ~ 5 we will be running the query max five times to generate the below report

Share the report with the content authors team to validate if images are required if not plan to clean up using

Example report and we can download and share with Authors

Clean up Process:

Authors don’t have to unpublish and delete individual images, then can you use the below process to upload the excel sheet with all the approved image paths and upload it to the process to deactivate and delete

AEM Publish / UnPublish / Delete List of pages – MCP Process

Tag: Maintenance

AEM Tool – Create / Generate tool from scratch

Problem Statement:

Introduction:

The tool generator consists of the:

How to use the tool?

1. Click on the link to download the tool generator package and install it into your local instance

2. Once the package is installed go to sites -> tools -> and select the Tool Generator section

3. Select Tool Settings on the top right hand corner and provide your local repository paths like sling model path, servlet path, apps path, conf path, and CQ path if it already exists

4. Once all the settings are authored save the settings

5. After coming to the generator page provide your tool name and tool description

6. If you have an existing tool then select Yes else Select No and provide the Tool Section name

7. Click on Create tool to create your new tool from scratch

Broken Page References AEM

Problem Statement:

Requirement:

Introduction:

How to run this process?

Broken Asset References AEM

Problem Statement:

Requirement:

Introduction:

What happens when the page is unpublished?

Advantages of cleaning up old assets?

Get Publish Report using Assets Report:

Broken Asset reference:

Running Asset Reference Process:

Clean up Process: