Replicating Existing S3 Objects with Batch Operations

When you create an S3 replication rule, it applies only to objects uploaded after the rule is configured. Objects that already existed in the bucket are not replicated. For most situations, that means a manual step is needed to bring existing objects in line—and S3 Batch Operations is the tool AWS provides for exactly that.

Batch Operations can also handle one-off replications when ongoing replication isn’t needed, making it a flexible option beyond just backfilling.


Choosing a Manifest

Before a batch job can run, it needs a manifest: a file that tells AWS which objects to process. There are three ways to provide one.

S3 Inventory Report — Creating an inventory configuration on the source bucket tells AWS to generate a manifest.json automatically. Navigate to the bucket, go to Management → Inventory configurations, and follow the steps to schedule an inventory. Note that the first report can take up to 48 hours to be delivered. Once generated, the resulting file can be used directly as the manifest.
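For reference, the inventory configuration can also be created programmatically. A minimal sketch of the configuration payload — the bucket names, ARN, and configuration ID are placeholders, and the boto3 call is shown commented out so the snippet runs without AWS access:

```python
# Sketch: an S3 Inventory configuration so AWS generates the manifest
# automatically. All names and ARNs below are placeholders.
inventory_config = {
    "Id": "batch-replication-manifest",    # hypothetical configuration name
    "IsEnabled": True,
    "IncludedObjectVersions": "Current",   # use "All" if version IDs are needed
    "Schedule": {"Frequency": "Daily"},
    "Destination": {
        "S3BucketDestination": {
            "Bucket": "arn:aws:s3:::example-inventory-bucket",  # placeholder
            "Format": "CSV",
        }
    },
}

# import boto3
# boto3.client("s3").put_bucket_inventory_configuration(
#     Bucket="example-source-bucket",      # placeholder
#     Id=inventory_config["Id"],
#     InventoryConfiguration=inventory_config,
# )
```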

CSV — For more control, a custom CSV file can be created listing the objects to include. At minimum, each row needs the bucket name and object key. A version ID can be added as a third column when needed. Object keys with spaces should be URL-encoded.
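The CSV format above can be generated with a few lines of code. A minimal sketch, assuming a hypothetical list of bucket/key pairs — note the URL-encoding of keys with spaces:

```python
# Sketch: write a CSV manifest for a batch job. Per the format above, each
# row is bucket,key (a version ID may be added as a third column), and keys
# containing spaces must be URL-encoded.
import csv
from urllib.parse import quote

objects = [  # hypothetical objects to include
    ("example-source-bucket", "reports/2023 summary.pdf"),
    ("example-source-bucket", "logs/app.log"),
]

with open("manifest.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for bucket, key in objects:
        writer.writerow([bucket, quote(key)])  # space becomes %20
```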

S3 Replication Configuration — If a replication rule is already set up, this option uses that configuration automatically. It’s the most convenient choice when backfilling objects to match an existing rule.


Copy vs. Replicate

Once the manifest is ready, the next decision is which operation to use. Both move data between buckets, but they behave differently.

Replicate uses existing replication rules to determine where objects go, making it the natural choice when a replication rule is already in place. It supports multiple destination buckets and requires version IDs in any CSV manifest.

Copy is better suited to one-off jobs. It supports only a single destination bucket, but offers a wider set of options for controlling how objects land in the target:

  • Storage class — Set the storage class of copied objects independently of the source.
  • Server-side encryption — Override the destination bucket’s default encryption settings if needed.
  • Object tags — Choose to copy, replace with new tags, or strip tags entirely.
  • Metadata — Copy, replace, or omit metadata. Since metadata can only be set at upload time, Copy is the only way to update it on existing objects.
  • Access control — Adjust ACLs on the copied objects, including cross-account access grants.
  • Object Lock — Apply a legal hold or a timed retention policy. Object Lock must already be enabled on the destination bucket before either option is available.

Configuring and Running the Job

After selecting the operation and configuring its options, a few job-level settings apply to all batch jobs:

  • Description — A label that appears in the Batch Operations console to help identify the job later.
  • Priority — A numeric value; when multiple jobs run simultaneously, jobs with higher priority numbers are processed first.
  • Completion report — A report can be generated at the end of the job, covering either all tasks or only failures. Enabling the failed-tasks report is useful for troubleshooting.
  • IAM role — The job runs under an IAM role that must have the necessary permissions for both the source and destination buckets. If tags, Object Lock, or other optional features are used, the role needs those permissions as well.
  • Job tags — Tags can be applied to the job itself for cost tracking or to enforce IAM policies restricting which jobs certain users can manage.
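These job-level settings, together with the Copy options described earlier, map onto the S3 Control create_job API. A minimal sketch of the request, assembled as a plain dict — the account ID, ARNs, and manifest ETag are all placeholders, and the boto3 call itself is commented out so the snippet runs without AWS access:

```python
# Sketch: parameters for a Copy batch job. Every ID and ARN is a placeholder.
copy_job = {
    "AccountId": "111122223333",               # placeholder account ID
    "ConfirmationRequired": True,              # job waits for "Run job"
    "Description": "Backfill existing objects",
    "Priority": 10,
    "RoleArn": "arn:aws:iam::111122223333:role/batch-ops-role",  # placeholder
    "Operation": {
        "S3PutObjectCopy": {
            "TargetResource": "arn:aws:s3:::example-destination-bucket",
            "StorageClass": "STANDARD_IA",     # set independently of the source
            "MetadataDirective": "COPY",       # or "REPLACE" to update metadata
        }
    },
    "Manifest": {
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::example-manifest-bucket/manifest.csv",
            "ETag": "example-etag",            # placeholder
        },
    },
    "Report": {                                # failed-tasks completion report
        "Enabled": True,
        "Bucket": "arn:aws:s3:::example-report-bucket",
        "Format": "Report_CSV_20180820",
        "ReportScope": "FailedTasksOnly",      # or "AllTasks"
        "Prefix": "batch-reports",
    },
    "Tags": [{"Key": "project", "Value": "replication-backfill"}],
}

# import boto3
# boto3.client("s3control").create_job(**copy_job)
```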

After reviewing the settings, clicking Create job submits it. The job will appear in the Batch Operations console with a status of Awaiting your confirmation to run. Open the job, click Run job, and confirm to start processing.
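The confirmation step can also be done programmatically rather than through the console. A small sketch using the S3 Control update_job_status API — the account ID and job ID are placeholders, and the call is commented out so the snippet runs without AWS access:

```python
# Sketch: move a submitted job out of the awaiting-confirmation state.
confirm_params = {
    "AccountId": "111122223333",       # placeholder
    "JobId": "example-job-id",         # placeholder
    "RequestedJobStatus": "Ready",     # equivalent to clicking "Run job"
}

# import boto3
# boto3.client("s3control").update_job_status(**confirm_params)
```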


Monitoring and Completion

Progress is visible from the job detail page. Once finished, the status updates to either Completed or Failed.

If the job fails, the reason is shown under Reason for termination. Permission issues are the most common cause — verifying that the IAM role can read the source bucket and write to the destination resolves most failures. If a completion report was configured, it will be linked at the bottom of the page with per-object status and error details.
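A downloaded completion report can be summarized with a short script. A minimal sketch with hypothetical sample rows — the column layout below follows the CSV report format (bucket, key, version ID, task status, error code, HTTP status, result message), but verify it against your own report:

```python
# Sketch: pull the failed tasks out of a completion report.
# The sample data is hypothetical.
import csv
import io

sample_report = """\
example-source-bucket,reports/a.pdf,,failed,AccessDenied,403,Access Denied
example-source-bucket,reports/b.pdf,,succeeded,,200,Successful
"""

failures = []
for row in csv.reader(io.StringIO(sample_report)):
    bucket, key, version_id, status, error_code, http_status, message = row
    if status == "failed":
        failures.append((key, error_code, message))

print(failures)  # [('reports/a.pdf', 'AccessDenied', 'Access Denied')]
```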

Completed jobs can’t be rerun, but they can be cloned. The Clone job button on the job page creates a new job with the same configuration, which is particularly handy after fixing a permissions issue and needing to try again without rebuilding from scratch.


Closing Thoughts

S3 Batch Operations fills an important gap in the replication workflow. Replication rules handle the ongoing work, but anything already in the bucket when the rule was created requires a separate step. Batch Operations handles that cleanly — and since it uses the same rules and IAM infrastructure already in place, the setup is usually straightforward once the manifest is ready.
