Falcon can replicate data across multiple clusters using distcp, at the frequency you specify in the feed entity. Falcon uses a pull-based replication mechanism: in every target cluster, for each source cluster, a coordinator is scheduled that pulls the data from the source cluster using distcp. For every feed instance that is replicated, Falcon sends a JMS message reporting the success or failure of that replication instance.
For example, in the following feed, two source clusters replicate data to a backup cluster:
<clusters>
<cluster name="Cluster1" type="source" partition="${cluster.name}" delay="days(2)">
<validity start="2011-11-01T00:00Z" end="2021-11-30T00:00Z"/>
</cluster>
<cluster name="Cluster2" type="source" partition="COUNTRY/${cluster.name}">
<validity start="2011-11-01T00:00Z" end="2021-11-30T00:00Z"/>
</cluster>
<cluster name="Backup" type="target">
<validity start="2011-11-01T00:00Z" end="2011-11-30T00:00Z"/>
</cluster>
</clusters>
Note: We recommend that the data path be as granular as the frequency of the feed. For example, if you are specifying the feed frequency in hours, provide a data path that is /${YEAR}/${MONTH}/${DAY}/${HOUR}.
In this example, two coordinators are scheduled to pull data into the target cluster, Backup: one coordinator pulls the data from a partition in Cluster1, and the other pulls from a partition in Cluster2. A replication delay of 2 days has been set for Cluster1, which means that replication from Cluster1 runs every 30 days with an offset of 2 days. For example, the feed instance that is scheduled for replication on November 30 becomes eligible on December 2.
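The delay arithmetic described above can be illustrated with a small sketch. This is not Falcon code; it is a minimal Python illustration that assumes the eligibility time is simply the instance time plus the delay. The function names `parse_delay` and `eligible_time` are hypothetical, the year 2011 is assumed for the sake of the example, and only the `minutes(n)`, `hours(n)`, and `days(n)` expression forms are handled.

```python
import re
from datetime import datetime, timedelta


def parse_delay(expr: str) -> timedelta:
    """Turn a delay expression such as 'days(2)' into a timedelta.

    Hypothetical helper: handles only minutes(n), hours(n), and days(n).
    """
    match = re.fullmatch(r"(minutes|hours|days)\((\d+)\)", expr.strip())
    if match is None:
        raise ValueError(f"unsupported delay expression: {expr!r}")
    unit, amount = match.groups()
    # The unit names match timedelta's keyword arguments one to one.
    return timedelta(**{unit: int(amount)})


def eligible_time(instance_time: datetime, delay_expr: str) -> datetime:
    """Assumed model: an instance becomes eligible at instance time + delay."""
    return instance_time + parse_delay(delay_expr)


if __name__ == "__main__":
    # Instance scheduled for November 30 (year assumed for illustration).
    instance = datetime(2011, 11, 30)
    print(eligible_time(instance, "days(2)"))  # 2011-12-02 00:00:00
```

With `delay="days(2)"`, as set for Cluster1, the November 30 instance becomes eligible on December 2, matching the description above. A source cluster with no delay attribute, such as Cluster2, would have no such offset.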
If you are using Falcon for Data Replication, explore the following topics:
Falcon Community Documentation on Language Expression

