CPARK 1.0
A lightweight, distributed computing framework for C++ that offers a fast, general-purpose solution for processing large data sets.
#include <cpark.h>
Public Types | |
using | RddId = uint32_t |
using | SplitId = uint32_t |
Public Member Functions | |
ExecutionContext ()=default | |
ExecutionContext (Config config) | |
void | setConfig (Config config) |
const Config & | getConfig () const noexcept |
RddId | getAndIncRddId () |
SplitId | getAndIncSplitId () |
bool | splitShouldCache (SplitId split_id) const noexcept |
bool | splitCached (SplitId split_id) const noexcept |
void | markDependency (SplitId from, SplitId to) noexcept |
const std::any & | getSplitCache (SplitId split_id) const |
template<typename CacheType , typename OriginalIterator > | |
std::shared_future< void > | startCalculationOrGetFuture (SplitId split_id, OriginalIterator begin, OriginalIterator end) |
An execution context (or environment) in which a set of cpark tasks runs. It holds the information needed to evaluate Rdds and run cpark tasks: the ids of Rdds and Splits, the cache information, the thread synchronization information, and the scheduler information. Each Rdd and Split belongs to exactly one execution context.
Users are responsible for making sure the execution context outlives every cpark task that uses it. TODO: Consider whether to use smart pointers. For a fundamental library, smart pointers might not be a good choice.
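The lifetime requirement above can be illustrated with a minimal sketch. `LifetimeSketchContext` and `makeLazyTasks` are hypothetical stand-ins, not part of cpark; they only assume that tasks keep a non-owning pointer to the context and are evaluated some time after they are created.

```cpp
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for ExecutionContext; only a Split id counter is modeled.
struct LifetimeSketchContext {
  std::uint32_t next_split_id = 0;
  std::uint32_t getAndIncSplitId() { return next_split_id++; }
};

// Builds lazy tasks that hold a non-owning pointer to the context.
// Nothing touches the context until a task is invoked, so the caller must
// keep `context` alive until the last task has run.
inline std::vector<std::function<std::uint32_t()>> makeLazyTasks(
    LifetimeSketchContext& context, int count) {
  std::vector<std::function<std::uint32_t()>> tasks;
  for (int i = 0; i < count; ++i) {
    tasks.push_back([ctx = &context] { return ctx->getAndIncSplitId(); });
  }
  return tasks;
}
```

In this sketch, declaring the context before the task list in the same scope is enough: the tasks are destroyed first, and none of them can run against a destroyed context.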
Represents a unique id for each Rdd inside this execution context. Note that Rdds are copyable; a copied Rdd has the same id as the original.
Represents a unique id for each Split inside this execution context. Note that Splits are copyable; a copied Split has the same id as the original.
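The id semantics described above can be sketched as follows. `SketchContext` and `SketchRdd` are hypothetical stand-ins written for illustration; the atomic counters and the starting value of 0 are assumptions, not confirmed details of the cpark implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ExecutionContext; only the id counters are modeled.
class SketchContext {
public:
  using RddId = std::uint32_t;
  using SplitId = std::uint32_t;

  // Each call returns the current counter value and then advances it,
  // so no two calls on the same context return the same id.
  RddId getAndIncRddId() { return next_rdd_id_.fetch_add(1); }
  SplitId getAndIncSplitId() { return next_split_id_.fetch_add(1); }

private:
  std::atomic<RddId> next_rdd_id_{0};      // assumed to start at 0
  std::atomic<SplitId> next_split_id_{0};  // assumed to start at 0
};

// Hypothetical Rdd-like handle: it takes a fresh id on construction and keeps
// that id when copied, because the default copy constructor copies id_ as-is.
struct SketchRdd {
  explicit SketchRdd(SketchContext& context)
      : context_{&context}, id_{context.getAndIncRddId()} {}

  SketchContext* context_;    // non-owning; the context must outlive the Rdd
  SketchContext::RddId id_;
};
```

Two `SketchRdd` objects constructed from the same context receive different ids, while a copy of either one reports the id of its original.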
ExecutionContext () |
default |
Creates an execution context with the default config.
ExecutionContext (Config config) |
inline explicit |
Creates an execution context from a config.
RddId getAndIncRddId () |
inline |
Returns the next unique Rdd id.
SplitId getAndIncSplitId () |
inline |
Returns the next unique Split id.
Returns the config of the execution context.
Returns the cache for the split, if it has already been cached.
Checks whether the split has already been cached.
Returns whether the split should be cached.
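The three cache queries above can be combined as in the following sketch. `SketchCacheContext`, `requestCache`, `storeCache`, and `readCachedInts` are hypothetical names introduced for illustration, and the map-based storage is an assumption; only the three query signatures are taken from the member list. The cached value is a `std::any`, so the reader has to know the stored type and recover it with `std::any_cast`.

```cpp
#include <any>
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the cache-related part of ExecutionContext.
class SketchCacheContext {
public:
  using SplitId = std::uint32_t;

  // Hypothetical helpers, used here only to populate the sketch.
  void requestCache(SplitId id) { should_cache_.insert(id); }
  void storeCache(SplitId id, std::any value) { caches_[id] = std::move(value); }

  // Queries with the same signatures as in the member list.
  bool splitShouldCache(SplitId id) const noexcept { return should_cache_.count(id) != 0; }
  bool splitCached(SplitId id) const noexcept { return caches_.count(id) != 0; }
  // In this sketch, asking for a split that is not cached throws std::out_of_range.
  const std::any& getSplitCache(SplitId id) const { return caches_.at(id); }

private:
  std::unordered_set<SplitId> should_cache_;
  std::unordered_map<SplitId, std::any> caches_;
};

// Reads a cached split as std::vector<int>, or returns an empty vector when
// nothing is cached yet. std::any_cast throws std::bad_any_cast if the stored
// type is not std::vector<int>.
inline std::vector<int> readCachedInts(const SketchCacheContext& ctx,
                                       SketchCacheContext::SplitId id) {
  if (!ctx.splitCached(id)) {
    return {};
  }
  return std::any_cast<const std::vector<int>&>(ctx.getSplitCache(id));
}
```

Checking `splitCached` before calling `getSplitCache` keeps the read path free of exceptions in the common case where a split has not been computed yet.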