Interface ExternalDepot

All Superinterfaces:
AutoCloseable, Closeable, RamaSerializable, Serializable, TaskGlobalObject

public interface ExternalDepot extends TaskGlobalObject
This interface is used to integrate external queue systems as depots inside Rama. This allows Rama topologies to consume those external queue systems directly. This interface exposes everything necessary to support the complete API for both stream and microbatch topologies. An ExternalDepot can be dynamically upscaled or downscaled, and Rama will automatically detect this to start consuming from new partitions or stop consuming from obsolete partitions. Rama references individual partitions of an external queue via a "partition index". It's critical that a partition index always identity the same partition of data (including after upscaling or downscaling) since Rama stores progress state for stream and microbatch topologies by partition index. Individual records on a partition are identified by an "offset". Rama expects offsets for records to monotonically increase by one for each record added. All methods on this interface return asynchronously via CompletableFuture. These methods cannot block the thread on which they're called, so if they must perform blocking operations, they should do so on a separate thread. This thread can be created and managed in the TaskGlobalObject.prepareForTask(int, com.rpl.rama.integration.TaskGlobalContext) method.
See Also:
  • Method Details

    • getNumPartitions

      CompletableFuture<Integer> getNumPartitions()
      Asynchronously returns the number of partitions in the external queue.
    • startOffset

      CompletableFuture<Long> startOffset(int partitionIndex)
      Asynchronously returns the start offset of the given partition index (inclusive). This method does not need to be implemented if consuming topologies do not use startFromBeginning() in their source options.
    • endOffset

      CompletableFuture<Long> endOffset(int partitionIndex)
      Asynchronously returns the end offset of the given partition index (exclusive).
    • offsetAfterTimestampMillis

      CompletableFuture<Long> offsetAfterTimestampMillis(int partitionIndex, long millis)
      Asynchronously returns the first offset of the given partition index after the given timestamp. If no such offset exists, this method should asynchronously return null. This method does not need to be implemented if consuming topologies do not use startFromOffsetAfterTimestamp(long millis) in their source options.
    • fetchFrom

      CompletableFuture<List> fetchFrom(int partitionIndex, long startOffset, long endOffset)
      Asynchronously returns the sequence of records between the given range of offsets.
      Parameters:
      partitionIndex - Partition to query
      startOffset - Start offset of range (inclusive)
      endOffset - End offset of range (exclusive)
    • fetchFrom

      CompletableFuture<List> fetchFrom(int partitionIndex, long startOffset)
      Asynchronously returns a chunk of records starting at the given offset. The implementation should decide what is a reasonable amount of data to fetch in a single call. This method is used by stream topologies to poll for new data or to catch up when far behind.
      Parameters:
      partitionIndex - Partition to query
      startOffset - Start offset of range (inclusive)