CPARK 1.0
A lightweight, distributed computing framework for C++ that offers a fast, general-purpose solution for processing large data.
CPARK development team - December 7, 2023
Visit the cpark documentation website for comprehensive information, guides, and examples.
In particular, the homepage gives a general introduction to the project and a brief guide to getting started. The Related Pages section contains a screenshot of a real-world shell session that builds a first program from scratch. The Topics page is the main entry point for detailed technical documentation of the different types of Creations, Transformations, and Actions, as explained in our design documentation.
Clone the cpark repository to your local machine:
Make sure you are on the main branch, which passes all the automated tests that our branch policy enforces whenever a pull request is merged into it.
With the cpark library on your computer, you can now build a program with it.
First, please create a C++ file, for example `cpark.cpp`, and add the following code:
Since our project uses the C++20 standard library header `<ranges>` and other new features, you must ensure that your local compiler supports C++20 and set the correct compilation flag, for example `-std=c++20`.
You also have to tell your compiler where to find the cpark header files.
So, for this example program, I can compile with `g++` on my local computer using the following command:
Execute the compiled example:
You should see the following result:
This number is not meaningful: the computation overflows, so the result is undefined behavior. We purposely chose such a large input for the performance comparison below.
Notice that in the example program, we were conservative and used only 2 CPU cores for the calculation. So now let's run the executable again and see how well it performs:
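For code where the result matters, overflow can be detected before it happens. The following sketch is independent of cpark and assumes a plain 32-bit signed accumulator. The names `checked_add` and `checked_sum_up_to` are hypothetical.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: returns a + b, or std::nullopt if the signed
// addition would overflow. The check runs before the addition, so no
// undefined behavior is ever triggered.
std::optional<std::int32_t> checked_add(std::int32_t a, std::int32_t b) {
  constexpr std::int32_t max = std::numeric_limits<std::int32_t>::max();
  constexpr std::int32_t min = std::numeric_limits<std::int32_t>::min();
  if (b > 0 && a > max - b) return std::nullopt;
  if (b < 0 && a < min - b) return std::nullopt;
  return a + b;
}

// Hypothetical helper: sums 1..n, stopping as soon as the running
// total no longer fits in 32 bits.
std::optional<std::int32_t> checked_sum_up_to(std::int32_t n) {
  std::int32_t total = 0;
  for (std::int32_t i = 1; i <= n; ++i) {
    const auto next = checked_add(total, i);
    if (!next) return std::nullopt;
    total = *next;
  }
  return total;
}
```

With a 32-bit accumulator, summing 1 through 65535 still fits (2147450880), while adding 65536 exceeds the maximum of 2147483647 and is reported as `std::nullopt`.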
The `time` command reports that the program used 173% CPU, which is close to the 2 cores we set as the parameter, and took a total of 0.036 seconds.
If your computer has more CPU cores, you can also change this parameter to try other parallelism levels.
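To see what a parallelism level means in practice, here is a hand-written sketch using only `std::thread` from the standard library. It is not how cpark is implemented. It only illustrates the general idea of splitting a range into one chunk per worker and then combining the partial results. The name `parallel_sum` and its parameters are hypothetical.

```cpp
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical helper: sums 1..n using num_threads worker threads.
// Unsigned arithmetic is used so that wrap-around is well defined.
std::uint64_t parallel_sum(std::uint64_t n, unsigned num_threads) {
  if (num_threads == 0) num_threads = 1;

  std::vector<std::uint64_t> partial(num_threads, 0);
  std::vector<std::thread> workers;
  workers.reserve(num_threads);

  const std::uint64_t chunk = n / num_threads;
  for (unsigned t = 0; t < num_threads; ++t) {
    const std::uint64_t first = t * chunk + 1;
    // The last worker also takes the remainder of the range.
    const std::uint64_t last = (t + 1 == num_threads) ? n : (t + 1) * chunk;
    workers.emplace_back([&partial, t, first, last] {
      std::uint64_t local = 0;
      for (std::uint64_t i = first; i <= last; ++i) local += i;
      partial[t] = local;  // each worker writes only its own slot
    });
  }
  for (auto& w : workers) w.join();

  // Reduce the per-thread partial sums into the final result.
  std::uint64_t total = 0;
  for (std::uint64_t p : partial) total += p;
  return total;
}
```

The result is the same for any thread count, for example `parallel_sum(1000, 2)` and `parallel_sum(1000, 7)` both return 500500. Only the running time and the CPU usage change. With `g++` on Linux you may need to add `-pthread` when compiling.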
There is another option. The documentation for this function shows that if we use its default value, the cpark library automatically detects the level of parallelism supported by the current hardware.
After changing the relevant parameter, we can run the experiment again.
It runs significantly faster, and the CPU usage is close to what I would expect from my computer.
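In standard C++, this kind of detection is typically done with `std::thread::hardware_concurrency()`. Whether cpark uses this exact call internally is an assumption, and the helper `resolve_parallelism` below is a hypothetical name used only to sketch the "default means auto-detect" pattern.

```cpp
#include <thread>

// Hypothetical helper: a requested value of 0 stands for "use the default",
// in which case the hardware is queried. Any other value is used as given.
unsigned resolve_parallelism(unsigned requested = 0) {
  if (requested != 0) return requested;

  // hardware_concurrency() may return 0 when the value cannot be determined,
  // so fall back to a single thread in that case.
  const unsigned detected = std::thread::hardware_concurrency();
  return detected != 0 ? detected : 1;
}
```

Calling `resolve_parallelism(2)` returns 2, while `resolve_parallelism()` returns whatever the current machine reports, and at least 1.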
Congratulations! You've successfully embarked on your journey with cpark, a high-performance parallel computing library. As you explore the documentation, you'll discover a wealth of features and capabilities that empower you to tackle complex computations with ease. From parallelized data generation to seamless result reduction, cpark is designed to streamline your code and unlock the full potential of your machine's capabilities.
Whether you're a seasoned developer or new to parallel computing, cpark offers a user-friendly yet powerful framework to elevate your C++ projects. Dive into the documentation, experiment with different pipelines, and leverage the flexibility of cpark to optimize your computations.
Happy coding, and may your parallel computations be as efficient as they are exciting!