AEM Performance Optimization Activation/Replication

Problem Statement:

AEM Bulk Replication allows you to activate a series of pages and/or assets.

How can we make sure a bulk replication workflow does not degrade AEM performance (CPU or heap memory) or throttle the system?

Introduction:

AEM bulk replication or activation is performed on a series of pages and/or assets. We usually perform bulk replication on tree structures or lists of paths.

Use case:

  1. MSM (Multi-site management) – rolling out a series of pages or site
  2. Editable templates – add or remove components in the template structure and activate existing pages
  3. Bulk asset ingestion into the system
  4. Bulk redirect/vanity path update

The ACS Commons Throttled Task Runner is built on the Java management API for managing and monitoring the Java VM. It can be used to pause or terminate tasks based on runtime stats.

The Throttled Task Runner (a managed thread pool) provides a convenient way to run many AEM-specific activities in bulk. It checks the Throttled Task Runner bean and gets the current running stats while the actual work is being done.
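To make the "Java management API" part concrete, here is a minimal sketch of reading heap and CPU-related stats with the standard java.lang.management beans. The class and method names (JvmStats, heapUsedFraction, loadPerCore) are made up for illustration, and the attributes read here are an assumption; they are not necessarily the ones ACS Commons reads internally.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of ACS Commons.
final class JvmStats {

    /** Fraction of the heap currently in use, in the range 0..1. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when no limit is defined; fall back to the committed size.
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return (double) heap.getUsed() / limit;
    }

    /** System load average per core, or -1 when the platform does not report it. */
    static double loadPerCore() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        double load = os.getSystemLoadAverage(); // negative if unavailable
        return load < 0 ? -1 : load / os.getAvailableProcessors();
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.2f, load per core: %.2f%n",
                heapUsedFraction(), loadPerCore());
    }
}
```

Both values are normalised to a fraction so they can be compared against 0..1 style limits.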

OSGi Configuration:

The Throttled Task Runner is OSGi configurable, but please note that changing configuration while work is being processed results in resetting the worker pool and can lose active work.

Throttled Task Runner OSGi

Max threads: Recommended not to exceed the number of CPU cores. Default 4.

Max CPU %: Used to throttle activity when CPU exceeds this amount. Range is 0..1; -1 means disable this check.

Max Heap %: Used to throttle activity when heap usage exceeds this amount. Range is 0..1; -1 means disable this check.

Cooldown time: Time to wait (in ms) for CPU/memory cooldown between throttle checks.

Watchdog time: Maximum time allowed (in ms) per action before it is interrupted forcefully.
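The threshold semantics above (range 0..1, with -1 disabling a check) can be sketched as follows. This is only an illustration of the rule as described in the configuration text; ThrottleLimits and its methods are hypothetical names and not the ACS Commons implementation.

```java
// Hypothetical illustration of the Max CPU % / Max Heap % rules.
final class ThrottleLimits {
    private final double maxCpu;  // 0..1, or -1 to disable the CPU check
    private final double maxHeap; // 0..1, or -1 to disable the heap check

    ThrottleLimits(double maxCpu, double maxHeap) {
        this.maxCpu = maxCpu;
        this.maxHeap = maxHeap;
    }

    /** A negative limit disables the check; otherwise usage above the limit counts. */
    private static boolean exceeds(double usage, double limit) {
        return limit >= 0 && usage > limit;
    }

    /** True when either enabled check is over its limit. */
    boolean shouldThrottle(double cpuUsage, double heapUsage) {
        return exceeds(cpuUsage, maxCpu) || exceeds(heapUsage, maxHeap);
    }

    public static void main(String[] args) {
        ThrottleLimits limits = new ThrottleLimits(0.75, 0.85);
        System.out.println(limits.shouldThrottle(0.80, 0.40)); // true: CPU above 0.75
        System.out.println(limits.shouldThrottle(0.50, 0.40)); // false: both below limits
        // CPU check disabled with -1, heap still below its limit
        System.out.println(new ThrottleLimits(-1, 0.85).shouldThrottle(0.99, 0.40)); // false
    }
}
```

The limits 0.75 and 0.85 are example values only, not recommended defaults.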

JMX MBeans

Throttled Task Runner MBean

This is the core worker pool. All action managers share the same task runner pool, at least in the current implementation. The task runner can be paused or halted entirely, throwing out any unfinished work.

Throttled task runner JMX

How to use ACS Commons throttled task runner

Add the following dependency to your pom

Create a custom service or servlet as shown below:

Throttled Replication

Inside the custom workflow process method, check for low CPU and low memory before starting your task to avoid a performance impact on the system.
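The idea of waiting for low CPU and low memory before starting a task can be sketched in plain Java as a cooldown loop. CooldownGate, awaitCapacity, and the load probe are hypothetical names used only for illustration; in a real project the probe would be backed by the Throttled Task Runner rather than the fake one used here.

```java
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "wait until the system has capacity, then run the task".
final class CooldownGate {

    /**
     * Blocks while the probe reports overload, sleeping cooldownMs between checks.
     * Returns the number of cooldown sleeps that were needed.
     */
    static int awaitCapacity(BooleanSupplier overloaded, long cooldownMs) {
        int waits = 0;
        while (overloaded.getAsBoolean()) {
            try {
                Thread.sleep(cooldownMs);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up instead of starting the task.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for capacity", e);
            }
            waits++;
        }
        return waits;
    }

    public static void main(String[] args) {
        // Fake load probe: reports "overloaded" for the first two checks only.
        int[] checks = {0};
        int waits = awaitCapacity(() -> checks[0]++ < 2, 10);
        System.out.println("waited " + waits + " time(s) before starting the task"); // waited 2 time(s)
    }
}
```

The cooldown value plays the same role as the Cooldown time setting in the OSGi configuration: it bounds how often the stats are re-checked while the system is busy.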

For bulk replication (publish/unpublish/delete) assets or pages, please refer to the AEM Operation Replication Tool

For best practices on the AEM servlet please refer to the link.

AEM MCP based Bulk Replication

Problem statement:

How to bulk replicate (Publish/Unpublish) pages in AEM using MCP?

Requirement:

Run bulk replication with MCP instead of the OOTB tree activation tool, and use a different replication agent instead of the default agent.

Introduction:

For bulk replication we usually use the OOTB tree activation page and select the path at which to start the bulk replication. This tool supports activation only.

Activation Tree
start path and activate

MCP provides an easier way, with a better UI, to run bulk activation. You can also create a separate replication agent, other than the default agent, so that existing authoring replication and existing schedulers are not blocked.

You can use the MCP queue to replicate synchronously and the default queue to replicate asynchronously. For more than 10K pages, it is recommended to use the MCP 10K queue option.
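The queue choice can be sketched as a small counter-based rule. The enum values and the 10,000 limit are taken from the ListTreeReplication listing later in this article; the QueueChooser class itself is a hypothetical wrapper added only to make the rule runnable on its own.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper around the queue selection rule.
final class QueueChooser {
    enum QueueMethod { USE_PUBLISH_QUEUE, USE_MCP_QUEUE, MCP_AFTER_10K }

    private static final int ASYNC_LIMIT = 10_000;
    private final AtomicInteger replicationCount = new AtomicInteger();
    private final QueueMethod method;

    QueueChooser(QueueMethod method) {
        this.method = method;
    }

    /** Returns true when the next path should be replicated synchronously. */
    boolean nextIsSynchronous() {
        int counter = replicationCount.incrementAndGet();
        return method == QueueMethod.USE_MCP_QUEUE
                || (method == QueueMethod.MCP_AFTER_10K && counter >= ASYNC_LIMIT);
    }

    public static void main(String[] args) {
        QueueChooser chooser = new QueueChooser(QueueMethod.MCP_AFTER_10K);
        int asyncCount = 0;
        // Count how many paths go through the default (asynchronous) queue first.
        while (!chooser.nextIsSynchronous()) {
            asyncCount++;
        }
        System.out.println("asynchronous replications before switching: " + asyncCount); // 9999
    }
}
```

With MCP_AFTER_10K, small jobs stay on the default publish queue and only large jobs switch to synchronous replication from the 10,000th path onward.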

Select the Tree Activation:

Go to MCP page url: http://{domain}/apps/acs-commons/content/manage-controlled-processes.html

Select Tree Activation process

Provide the path of the page and select all for “What to Publish” and MCP Queue based on your requirement.

Provide an agent if you are using a different agent for bulk replication and select action to bulk publish or unpublish the pages.
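As a rough sketch of how a comma-separated agent field could be matched against replication agent ids, the following plain Java class treats each entry as a regular expression. AgentMatcher and the agent ids in the example are hypothetical; in AEM the equivalent check would live inside an AgentFilter, and a blank field (meaning "use the default agents") would be handled before reaching this code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical matcher for a comma-separated list of agent ids or regexes.
final class AgentMatcher {
    private final List<String> patterns;

    AgentMatcher(String agentsField) {
        // Lower-case, split on commas, and drop surrounding whitespace and empty entries.
        patterns = Arrays.stream(agentsField.toLowerCase(Locale.ENGLISH).split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** True when the agent id fully matches at least one configured pattern. */
    boolean accepts(String agentId) {
        String id = agentId.toLowerCase(Locale.ENGLISH);
        return patterns.stream().anyMatch(id::matches);
    }

    public static void main(String[] args) {
        AgentMatcher matcher = new AgentMatcher("bulk-publish, migration.*");
        System.out.println(matcher.accepts("bulk-publish")); // true
        System.out.println(matcher.accepts("migration-2"));  // true
        System.out.println(matcher.accepts("publish"));      // false
    }
}
```

Because String.matches requires a full match, the default agent id "publish" in this example is not accepted by the "bulk-publish" entry, which is the point of using a dedicated agent.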

Tree Activation Options

For excel sheet based activation use:

AEM Publish / UnPublish / Delete List of pages – MCP Process

AEM MCP Process for publishing, unpublishing, and deleting a list of pages

Problem Statement:

Can we publish, unpublish, or delete the list of pages mentioned in an Excel sheet?

Requirement:

This article discusses how to use the Manage Controlled Processes (MCP) dashboard in Adobe Experience Manager (AEM) to create a process for publishing, unpublishing, and deleting a list of pages in AEM based on an Excel sheet. The article provides code snippets for creating the ListTreeReplicationFactory service and implementation class called ListTreeReplication that reads the Excel sheet and performs the desired actions on the specified pages.

Create an MCP process to publish, unpublish, or delete the list of pages mentioned in an Excel sheet.

Introduction:

Product owners or authors often want to publish certain pages, such as offer, product, or article pages, based on business requirements. They also want to unpublish and delete pages that are no longer needed.

This process will be really helpful during Excel sheet based migration.

MCP (Manage Controlled Processes) is both a dashboard for performing complex tasks and a rich API for defining these tasks as process definitions. In addition to kicking off new processes, users can also monitor running tasks, retrieve information about completed tasks, halt work, and so on.

Add the following maven dependency to your pom to extend MCP

<dependency>
    <groupId>com.adobe.acs</groupId>
    <artifactId>acs-aem-commons-bundle</artifactId>
    <version>5.0.4</version>
    <scope>provided</scope>
</dependency>

Create a new process factory service called ListTreeReplicationFactory using the MCP process definition factory:

package com.mysite.mcp.process;

import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.Reference;
import com.adobe.acs.commons.mcp.ProcessDefinitionFactory;
import com.day.cq.replication.Replicator;

@Component(service = ProcessDefinitionFactory.class, immediate = true)
public class ListTreeReplicationFactory extends ProcessDefinitionFactory<ListTreeReplication> {

    @Reference
    Replicator replicator;

    @Override
    public String getName() {
        return "List Tree Activation";
    }

    @Override
    protected ListTreeReplication createProcessDefinitionInstance() {
        return new ListTreeReplication(replicator);
    }
}

Create the implementation class called ListTreeReplication and add the following logic to read the excel sheet

package com.mysite.mcp.process;

import java.io.IOException;
import java.util.*;
import java.util.concurrent.atomic.AtomicInteger;
import javax.jcr.RepositoryException;
import javax.jcr.Session;
import org.apache.commons.lang3.StringUtils;
import org.apache.sling.api.request.RequestParameter;
import org.apache.sling.api.resource.LoginException;
import org.apache.sling.api.resource.PersistenceException;
import org.apache.sling.api.resource.ResourceResolver;
import com.adobe.acs.commons.data.CompositeVariant;
import com.adobe.acs.commons.data.Spreadsheet;
import com.adobe.acs.commons.fam.ActionManager;
import com.adobe.acs.commons.mcp.ProcessDefinition;
import com.adobe.acs.commons.mcp.ProcessInstance;
import com.adobe.acs.commons.mcp.form.CheckboxComponent;
import com.adobe.acs.commons.mcp.form.FileUploadComponent;
import com.adobe.acs.commons.mcp.form.FormField;
import com.adobe.acs.commons.mcp.form.SelectComponent;
import com.adobe.acs.commons.mcp.form.TextfieldComponent;
import com.adobe.acs.commons.mcp.model.GenericReport;
import com.day.cq.replication.AgentFilter;
import com.day.cq.replication.ReplicationActionType;
import com.day.cq.replication.ReplicationOptions;
import com.day.cq.replication.Replicator;
import com.day.cq.wcm.api.Page;
import com.day.cq.wcm.api.PageManager;
import com.day.cq.wcm.api.WCMException;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ListTreeReplication extends ProcessDefinition {

    private static final Logger LOGGER = LoggerFactory.getLogger(ListTreeReplication.class);
    private static final String DESTINATION_PATH = "destination";
    private final GenericReport report = new GenericReport();
    private static final String REPORT_NAME = "TreeReplication-report";
    private static final String RUNNING = "Running ";
    private static final String EXECUTING_KEYWORD = " Tree Replication";

    protected enum QueueMethod {
        USE_PUBLISH_QUEUE, USE_MCP_QUEUE, MCP_AFTER_10K
    }

    protected enum ReplicationAction {
        PUBLISH, UNPUBLISH, DELETE
    }

    private static final int ASYNC_LIMIT = 10000;

    Replicator replicatorService;

    @FormField(name = "Replication Excel", component = FileUploadComponent.class)
    private RequestParameter repPathExcel;
    private Spreadsheet spreadsheet;

    @FormField(name = "Queueing Method",
            component = SelectComponent.EnumerationSelector.class,
            description = "For small publishing tasks, standard is sufficient.  For large folder trees, MCP is recommended.",
            options = "default=USE_MCP_QUEUE")
    QueueMethod queueMethod = QueueMethod.USE_MCP_QUEUE;

    @FormField(name = "Agents",
            component = TextfieldComponent.class,
            hint = "(leave blank for default agents)",
            description = "Publish agents to use, if blank then all default agents will be used. Multiple agents can be listed using commas or regex.")
    private String agents;
    List<String> agentList = new ArrayList<>();
    AgentFilter replicationAgentFilter;

    @FormField(name = "Action",
            component = SelectComponent.EnumerationSelector.class,
            description = "Publish, Unpublish or Delete?",
            options = "default=PUBLISH")
    ReplicationAction reAction = ReplicationAction.PUBLISH;

    @FormField(name = "Dry Run",
            component = CheckboxComponent.class,
            options = "checked",
            description = "If checked, only generate a report but don't perform the work"
    )
    private boolean dryRun = true;

    public ListTreeReplication(Replicator replicator) {
        replicatorService = replicator;
    }

    @Override
    public void init() throws RepositoryException {
    	try {
            // Read spreadsheet
            spreadsheet = new Spreadsheet(repPathExcel, DESTINATION_PATH).buildSpreadsheet();
        } catch (IOException e) {
            throw new RepositoryException("Unable to process spreadsheet", e);
        }
    }

    @Override
    public void buildProcess(ProcessInstance instance, ResourceResolver rr) throws LoginException, RepositoryException {
        report.setName(REPORT_NAME);
        instance.getInfo().setDescription(RUNNING + reAction + EXECUTING_KEYWORD);
        // Build the agent filter for every action so that unpublish also honours the Agents field.
        if (StringUtils.isEmpty(agents)) {
            replicationAgentFilter = AgentFilter.DEFAULT;
        } else {
            agentList = Arrays.asList(agents.toLowerCase(Locale.ENGLISH).split(","));
            // Each configured entry is treated as a regex and matched against the agent id.
            replicationAgentFilter = agent -> agentList.stream().anyMatch(p -> agent.getId().toLowerCase(Locale.ENGLISH).matches(p.trim()));
        }
        if (reAction == ReplicationAction.PUBLISH) {
            instance.defineCriticalAction("Activate tree structure", rr, this::activateTreeStructure);
        } else if (reAction == ReplicationAction.UNPUBLISH) {
            instance.defineCriticalAction("Deactivate tree structure", rr, this::deactivateTreeStructure);
        } else {
            instance.defineCriticalAction("Delete tree structure", rr, this::deleteTreeStructure);
        }
    }

    public enum ReportColumns {
        PATH, ACTION, DESCRIPTION
    }

    List<EnumMap<ReportColumns, String>> reportData = Collections.synchronizedList(new ArrayList<>());

    private void recordAction(String path, String action, String description) {
        EnumMap<ReportColumns, String> row = new EnumMap<>(ReportColumns.class);
        row.put(ReportColumns.PATH, path);
        row.put(ReportColumns.ACTION, action);
        row.put(ReportColumns.DESCRIPTION, description);
        reportData.add(row);
    }

    @Override
    public void storeReport(ProcessInstance instance, ResourceResolver rr) throws RepositoryException, PersistenceException {    	
        report.setRows(reportData, ReportColumns.class);
        report.persist(rr, instance.getPath() + "/jcr:content/report");
    }

    private void activateTreeStructure(ActionManager t) {
        spreadsheet.getDataRowsAsCompositeVariants().forEach(row -> performReplication(t, getString(row, DESTINATION_PATH)));
    }

    private void deactivateTreeStructure(ActionManager t) {
        spreadsheet.getDataRowsAsCompositeVariants().forEach(row -> performAsynchronousReplication(t, getString(row, DESTINATION_PATH)));
    }

    private void deleteTreeStructure(ActionManager t) {
        spreadsheet.getDataRowsAsCompositeVariants().forEach(row -> t.deferredWithResolver(r -> deletePage(r, getString(row, DESTINATION_PATH))));
    }

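    private void deletePage(ResourceResolver resourceResolver, String destinationPath) {
        // Honour the Dry Run checkbox: report the deletion without performing it.
        if (dryRun) {
            recordAction(destinationPath, "Deletion", "Dry run, page not deleted");
            return;
        }
        try {
            final PageManager pageManager = resourceResolver.adaptTo(PageManager.class);
            Page destinationPage = resourceResolver.resolve(destinationPath).adaptTo(Page.class);
            // Skip rows whose path does not resolve to a page instead of failing on null.
            if (pageManager == null || destinationPage == null) {
                recordAction(destinationPath, "Deletion", "Skipped, page not found");
                return;
            }
            pageManager.delete(destinationPage, false, true);
            recordAction(destinationPath, "Deletion", "Synchronous delete");
        } catch (WCMException e) {
            LOGGER.error("Unable to delete page {}", destinationPath, e);
        }
    }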
    
    AtomicInteger replicationCount = new AtomicInteger();

    private void performReplication(ActionManager t, String path) {
        int counter = replicationCount.incrementAndGet();
        if (queueMethod == QueueMethod.USE_MCP_QUEUE
                || (queueMethod == QueueMethod.MCP_AFTER_10K && counter >= ASYNC_LIMIT)) {
            performSynchronousReplication(t, path);
        } else {
            performAsynchronousReplication(t, path);
        }
    }

    private void performSynchronousReplication(ActionManager t, String path) {
        ReplicationOptions options = buildOptions();
        options.setSynchronous(true);
        scheduleReplication(t, options, path);
        recordAction(path, reAction == ReplicationAction.PUBLISH ? "Publish" : "Unpublish", "Synchronous replication");
    }

    private void performAsynchronousReplication(ActionManager t, String path) {
        ReplicationOptions options = buildOptions();
        options.setSynchronous(false);
        scheduleReplication(t, options, path);
        recordAction(path, reAction == ReplicationAction.PUBLISH ? "Publish" : "Unpublish", "Asynchronous replication");
    }

    private ReplicationOptions buildOptions() {
        ReplicationOptions options = new ReplicationOptions();
        options.setFilter(replicationAgentFilter);
        return options;
    }

    private void scheduleReplication(ActionManager t, ReplicationOptions options, String path) {
        if (!dryRun) {
            t.deferredWithResolver(rr -> {
                Session session = rr.adaptTo(Session.class);
                replicatorService.replicate(session, reAction == ReplicationAction.PUBLISH ? ReplicationActionType.ACTIVATE : ReplicationActionType.DEACTIVATE, path, options);
            });
        }
    }

    @SuppressWarnings({"rawtypes", "unchecked"})
    private String getString(Map<String, CompositeVariant> row, String path) {
        CompositeVariant v = row.get(path.toLowerCase(Locale.ENGLISH));
        if (v != null) {
            return (String) v.getValueAs(String.class);
        } else {
            return null;
        }
    }
}

Once the code is deployed, please go to the following URL and click on start process as shown below:

http://{domain}/apps/acs-commons/content/manage-controlled-processes.html

You will see a new process called List Tree Activation as shown below:

Click on the process

Create an excel sheet with a column called destination and add the page paths.
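Since every row is read through the destination column, blank cells or a misnamed header would hand empty or null paths to the process. A small pre-check like the one below could filter them out. This is a plain Java sketch with rows modelled as maps keyed by lower-cased header; DestinationColumn and pathsFrom are hypothetical names and do not use the ACS Commons Spreadsheet class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical pre-check for the "destination" column of the sheet.
final class DestinationColumn {
    static final String COLUMN = "destination";

    /** Collects trimmed, absolute page paths and skips blank or missing cells. */
    static List<String> pathsFrom(List<Map<String, String>> rows) {
        List<String> paths = new ArrayList<>();
        for (Map<String, String> row : rows) {
            String value = row.get(COLUMN);
            if (value != null && value.trim().startsWith("/")) {
                paths.add(value.trim());
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = Arrays.asList(
                Collections.singletonMap("destination", "/content/mysite/en/offers"),
                Collections.singletonMap("destination", "   "),          // blank cell
                Collections.singletonMap("path", "/content/mysite/en")); // wrong header
        System.out.println(pathsFrom(rows)); // [/content/mysite/en/offers]
    }
}
```

The example paths are placeholders; only the column name destination comes from the process code.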

List Replication sheet


Note: Make sure you unpublish the pages before you delete them; do not run delete before unpublish.

Excel sheet: