CPARK 1.0
A lightweight, distributed computing framework for C++ that offers a fast and general-purpose large data processing solution.
#include <base_rdd.h>
Classes | |
class | Iterator |
Public Types | |
using | Base = BaseSplit< CachedSplit< DerivedSplit, DerivedSplitIterator > > |
using | ValueType = std::iter_value_t< DerivedSplitIterator > |
using | CacheType = std::vector< ValueType > |
Public Member Functions | |
CachedSplit (ExecutionContext *context) | |
template<typename T , typename U > | |
CachedSplit (const CachedSplit< T, U > &other, bool copy_id, bool copy_dependencies) | |
Public Member Functions inherited from cpark::BaseSplit< CachedSplit< DerivedSplit, DerivedSplitIterator > > | |
BaseSplit (ExecutionContext *context) | |
BaseSplit (const BaseSplit< S > &prev) | |
BaseSplit (const BaseSplit< T > &other, bool copy_id, bool copy_dependencies) | |
BaseSplit & | operator= (const BaseSplit< S > &prev) |
auto | begin () const |
auto | end () const |
auto | dependencies () const noexcept |
void | addDependency (ExecutionContext::SplitId split_id) |
void | addDependency (const BaseSplit< T > &split) |
ExecutionContext::SplitId | id () const noexcept |
Public Attributes | |
friend | Base |
Additional Inherited Members | |
Protected Attributes inherited from cpark::BaseSplit< CachedSplit< DerivedSplit, DerivedSplitIterator > > | |
ExecutionContext * | context_ |
ExecutionContext::SplitId | split_id_ |
std::vector< ExecutionContext::SplitId > | dependencies_ |
A general cached split class that reads its data either through the iterator of DerivedSplit or from the execution context's cache, depending on the caching information held by the execution context.
DerivedSplit | The original split to wrap with a cache. |
DerivedSplitIterator | The const iterator type of DerivedSplit. Because of C++ template resolution details, this type cannot be deduced from DerivedSplit, so it is passed explicitly here. DerivedSplitIterator should be convertible from the type returned by DerivedSplit::beginImpl() const. Be EXTREMELY CAREFUL about the const-ness! |
CachedSplit is always a random_access_range, even if DerivedSplitIterator is not a random_access_iterator. It requires DerivedSplitIterator to be at least a forward_iterator.
When the cache of the split has not been calculated and an operation that DerivedSplit and DerivedSplitIterator do not support is called, CachedSplit immediately calculates the cache using DerivedSplitIterator and uses the cache's iterator afterwards. This is why CachedSplit always supports all random_access_range operations. For convenience, we call this behavior calculate-cache-on-miss.
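template<typename T , typename U >
CachedSplit (const CachedSplit< T, U > &other, bool copy_id, bool copy_dependencies) | inline |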
Copies from another CachedSplit (possibly with a different DerivedSplit type). The new CachedSplit has the same context. If copy_id is true, the new split has the same split id; otherwise it gets a new unique split id. If copy_dependencies is true, the dependencies are also copied.
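The effect of the two flags can be sketched with simplified stand-in types. `MiniContext`, `MiniSplit`, `newSplitId()`, and `copyFlagsDemo` are hypothetical names introduced only for illustration; in particular, how the real ExecutionContext hands out unique split ids is not described on this page and is assumed here to be a simple counter.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified context that issues unique split ids.
struct MiniContext {
  using SplitId = std::uint32_t;
  SplitId next_id = 0;
  SplitId newSplitId() { return next_id++; }  // assumed id scheme
};

// Hypothetical split mirroring the documented copy semantics.
class MiniSplit {
public:
  explicit MiniSplit(MiniContext* context)
      : context_(context), split_id_(context->newSplitId()) {}

  // The context is always shared; id and dependencies depend on the flags.
  MiniSplit(const MiniSplit& other, bool copy_id, bool copy_dependencies)
      : context_(other.context_),
        split_id_(copy_id ? other.split_id_ : other.context_->newSplitId()) {
    if (copy_dependencies) dependencies_ = other.dependencies_;
  }

  void addDependency(MiniContext::SplitId id) { dependencies_.push_back(id); }
  MiniContext::SplitId id() const noexcept { return split_id_; }
  MiniContext* context() const noexcept { return context_; }
  const std::vector<MiniContext::SplitId>& dependencies() const noexcept {
    return dependencies_;
  }

private:
  MiniContext* context_;
  MiniContext::SplitId split_id_;
  std::vector<MiniContext::SplitId> dependencies_;
};

// Hypothetical demo helper exercising both flag combinations.
inline void copyFlagsDemo() {
  MiniContext ctx;
  MiniSplit original(&ctx);
  original.addDependency(7);

  MiniSplit same(original, /*copy_id=*/true, /*copy_dependencies=*/true);
  assert(same.id() == original.id());
  assert(same.dependencies() == original.dependencies());

  MiniSplit fresh(original, /*copy_id=*/false, /*copy_dependencies=*/false);
  assert(fresh.id() != original.id());
  assert(fresh.dependencies().empty());
  assert(fresh.context() == original.context());
}
```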