CPARK 1.0
A lightweight, distributed computing framework for C++ that offers a fast, general-purpose solution for large-scale data processing.
Welcome to cpark: Supercharge Your Parallel Computing in C++

About cpark

cpark is a high-performance, parallel computing library for C++ developed by passionate students at Columbia University. Inspired by Apache Spark, our goal is to empower developers with a lightning-fast, easy-to-use framework for general-purpose, large-scale data processing in C++.

Authors

  • Mr. Shichen Xu
  • Mr. Jiakai Xu
  • Mr. Xintong Zhan

Features

  • Blazing Fast: Benchmarked at 80% faster than an equivalent single-threaded pipeline built with the standard C++ <ranges> library (see the comparison sketch after this list).
  • Local Multi-threading: Harness the power of parallel computing on your local machine with multi-threading support.
  • Ease of Use: Simple and intuitive API, making parallel computing accessible to all developers.
  • Scalability: Designed to scale smoothly with the number of parallel tasks on a single machine.
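
For context on the comparison above, the kind of pipeline cpark parallelizes looks roughly like the single-threaded standard <ranges> sketch below. This is only an illustration, not the exact benchmark code, and the fold is written by hand because C++20 <ranges> offers no built-in reduce.

    #include <iostream>
    #include <ranges>

    int main() {
      // Same shape of pipeline as the cpark example in "Getting Started",
      // but single-threaded and using only the standard library.
      auto view = std::views::iota(1, 1000)
                | std::views::transform([](int x) { return x * x; })
                | std::views::filter([](int x) { return x % 3 == 0; });

      long long sum = 0;
      for (int v : view) {
        sum += v;  // manual fold over the filtered squares
      }
      std::cout << "The computation result is " << sum << std::endl;
      return 0;
    }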

Getting Started

Here is a quick guide to get you started with cpark. For more detailed information, please refer to our step-by-step guide.

  1. Check Documentation Website:
    www.alexxu.tech/cpark
  2. Get our CPARK:
    git clone https://github.com/Alex-XJK/cpark.git
  3. Write your Code:
    #include <iostream>
    #include "generator_rdd.h"
    #include "transformed_rdd.h"
    #include "filter_rdd.h"
    #include "reduce.h"

    int main() {
      // Configure cpark to run 8 parallel tasks.
      cpark::Config default_config;
      default_config.setParallelTaskNum(8);
      cpark::ExecutionContext default_context{default_config};

      // Generate a sequence of integers, square each one,
      // keep the multiples of 3, and sum them up.
      auto result =
          cpark::GeneratorRdd(1, 1000, [&](auto i) -> auto { return i; }, &default_context) |
          cpark::Transform([](auto x) { return x * x; }) |
          cpark::Filter([](auto x) { return x % 3 == 0; }) |
          cpark::Reduce([](auto x, auto y) { return x + y; });

      std::cout << "The computation result is " << result << std::endl;
      return 0;
    }
  4. Configure your Compiler
    Our project uses the C++20 standard library <ranges>, so your compiler must support C++20 and be invoked with the corresponding standard flag (or newer), for example -std=c++20.
  5. Include the Headers
    Point your compiler at the cpark header directory, for example -I <path_to_cpark>/include. A complete example command is shown after this list.
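
Putting steps 4 and 5 together, a typical build command looks like the sketch below. The exact command depends on your compiler and directory layout, and the -pthread flag is an assumption based on cpark's local multi-threading support; adjust as needed.

    g++ -std=c++20 -I <path_to_cpark>/include -pthread main.cpp -o main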

Community and Support

  • Issue Tracker: Found a bug or have a feature request? Create an issue and let us know.
  • Contact: For administrative matters, you can reach us by e-mail at cpark@alexxu.tech.
  • Contribute: If you are interested in our project, we would appreciate a star on GitHub and welcome pull requests with your contributions.

Acknowledgments

We would like to express our gratitude to our project supervisor, Prof. Bjarne Stroustrup, the designer of C++, and to everyone who has contributed to making cpark a reality.