When running `calculate_doublet_performance` or `calculate_doublet_performance_stochastic` (described in more detail in [deterministic doublet](deterministic_doublet.md) and [stochastic doublet](stochastic_doublet.md)), each combination of input reservoir properties is an independent simulation.
This makes the computation a good target for parallelization, where more hardware resources are used to run processes simultaneously and so decrease the execution time.
Traditionally, parallelizing code in Python has been tricky. Modules such as [multiprocessing](https://docs.python.org/3/library/multiprocessing.html) were developed to handle this task, but a lot of custom code usually still has to be written to get the right setup for a specific problem.
pythermogis, however, uses [xarray](https://docs.xarray.dev/en/latest/index.html) as its data framework, which under the hood uses [dask](https://www.dask.org/) to run parallel operations. For more details on how xarray utilizes dask for easy parallelization, we direct the reader to [Parallel Computing with Dask](https://docs.xarray.dev/en/latest/user-guide/dask.html).
This framework is already wired into pythermogis: the user simply has to define the `chunk_size` parameter when calling either `calculate_doublet_performance` or `calculate_doublet_performance_stochastic`, and the doublet simulations will then run in parallel.
See below for an explanation of what `chunk_size` is and how to determine the optimal size.
## What is chunk size and how to determine the optimal value?

dask parallelization works by applying an operation (in this case `simulate_doublet`) across 'chunks' of the input data, where each chunk is a collection of simulations that is processed in parallel with the other chunks.
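To illustrate the mechanism, the generic sketch below applies a per-sample function over a chunked xarray array with `apply_ufunc` and `dask="parallelized"`. This is not pythermogis' actual internals: `simulate_one`, the `porosity` input, and the chunk layout are all invented for illustration.

```python
import numpy as np
import xarray as xr

# Dummy stand-in for a per-sample doublet simulation (illustrative only).
# It operates elementwise on numpy arrays, so it can be applied per chunk.
def simulate_one(porosity):
    return porosity * 2.0  # placeholder computation

# 1000 "simulations" along one dimension, split into dask chunks of 100
porosity = xr.DataArray(
    np.linspace(0.05, 0.35, 1000), dims="sample"
).chunk({"sample": 100})

result = xr.apply_ufunc(
    simulate_one,
    porosity,
    dask="parallelized",   # each chunk becomes an independent dask task
    output_dtypes=[float],
)
print(result.compute().values[:3])
```

Here the choice of `{"sample": 100}` plays the same role as `chunk_size`: it fixes how many samples each parallel task processes.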
As an example, say we wish to compute 1000 doublet simulations. Our smallest possible `chunk_size` would be 1, meaning every simulation is sent as an independent job to a processor, while the largest `chunk_size` is 1000, meaning one job is sent to a processor and the simulations are run in series.
The first extreme is inefficient because there is a computational cost to chunking and de-chunking the input and the output, while the second is inefficient because every simulation runs in series. The optimal `chunk_size` will lie between these two values.
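A back-of-the-envelope cost model makes this tradeoff concrete. The per-chunk overhead, per-simulation cost, and worker count below are invented purely for illustration; the point is only the shape of the curve:

```python
import math

def total_time(n_simulations, chunk_size, n_workers=4,
               per_chunk_overhead=0.05, per_sim_cost=0.01):
    """Toy cost model: chunks are distributed over workers in waves."""
    n_chunks = math.ceil(n_simulations / chunk_size)
    time_per_chunk = per_chunk_overhead + chunk_size * per_sim_cost
    rounds = math.ceil(n_chunks / n_workers)  # waves of parallel chunks
    return rounds * time_per_chunk

for cs in (1, 10, 100, 1000):
    print(f"chunk_size={cs:>4}: modelled time {total_time(1000, cs):.2f} s")
```

With these made-up numbers the modelled time is worst at both extremes (chunk size 1 and 1000) and best somewhere in between, mirroring the behaviour described above.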
The following figure shows how different chunk sizes affect the overall compute time. It can be seen that the most efficient chunk size (for the hardware this example was run on) is 100-200 simulations per chunk.

To aid in assessing the optimal chunk size, and to reproduce the figure above, pythermogis provides a function you can run on your own hardware: `assess_optimal_chunk_size` (see [assess optimal chunk size](../reference/assess_optimal_chunk_size.md) for usage).
It runs the same set of doublet simulations with different chunk sizes and prints the results to the terminal so you can find which chunk size is optimal.
Each chunk size is run 3 times and the times are averaged to give the time taken for that chunk size.
Each attempt is reported in a form like the following (excerpt):

```python
# generate simulation samples across desired reservoir properties
print(
    f"parallel simulation, chunk size: {sample_chunk}, "
    f"took {np.mean(time_attempt):.1f} seconds to run {n_simulations} simulations, "
    f"{n_simulations / np.mean(time_attempt):.1f} samples per second"
)
```
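A self-contained sketch of such a timing loop is shown below. The workload is a dummy stand-in, and the chunk sizes tried are arbitrary; this is not the real internals of `assess_optimal_chunk_size`:

```python
import time
import numpy as np

def run_chunked_workload(n_simulations, chunk_size):
    """Dummy stand-in for running the simulations chunk by chunk."""
    for start in range(0, n_simulations, chunk_size):
        n = min(chunk_size, n_simulations - start)
        np.sqrt(np.random.rand(n, 50))  # placeholder per-chunk work

n_simulations = 1000
timings = {}
for sample_chunk in (10, 100, 500):
    attempts = []
    for _ in range(3):  # run each chunk size 3 times and average
        t0 = time.perf_counter()
        run_chunked_workload(n_simulations, sample_chunk)
        attempts.append(time.perf_counter() - t0)
    timings[sample_chunk] = np.mean(attempts)
    print(f"chunk size: {sample_chunk}, "
          f"mean time {timings[sample_chunk]:.4f} s over 3 runs")
```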