Dynamic Pools

Frequently allocating and deallocating memory can be quite costly, especially when you are making large allocations or allocating on different memory resources. To mitigate this, Umpire provides allocation strategies that can be used to customize how data is obtained from the system.

In this example, we will look at the umpire::strategy::DynamicPool strategy. This is a simple pooling algorithm that can fulfill requests for allocations of any size. To create a new Allocator using the umpire::strategy::DynamicPool strategy:

  auto allocator = rm.getAllocator(resource);

  auto pooled_allocator = 
    rm.makeAllocator<umpire::strategy::DynamicPool>(resource + "_pool",
                                                    allocator);

We have to provide a new name for the Allocator, as well as the underlying Allocator we wish to use to grab memory.
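
To make the idea of pooling concrete, here is a minimal sketch in standard C++ of a pool that reuses freed blocks instead of returning them to the system. It is not Umpire's DynamicPool implementation; the class `ToyPool`, its members, and the first-fit search are hypothetical and chosen only for illustration, with `std::malloc` standing in for the underlying Allocator.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical toy pool for illustration only; NOT Umpire's DynamicPool.
class ToyPool {
  struct Block {
    void* ptr;
    std::size_t capacity;
    bool in_use;
  };

public:
  ToyPool() = default;
  ToyPool(const ToyPool&) = delete;
  ToyPool& operator=(const ToyPool&) = delete;

  // Memory goes back to the system only when the pool is destroyed.
  ~ToyPool() {
    for (const Block& b : blocks_) std::free(b.ptr);
  }

  void* allocate(std::size_t bytes) {
    // First fit: reuse any free block that is large enough.
    for (Block& b : blocks_) {
      if (!b.in_use && b.capacity >= bytes) {
        b.in_use = true;
        return b.ptr;
      }
    }
    // No suitable block: fall back to the underlying allocator.
    void* p = std::malloc(bytes > 0 ? bytes : 1);
    if (p == nullptr) return nullptr;
    blocks_.push_back(Block{p, bytes, true});
    ++underlying_allocations_;
    return p;
  }

  // Mark the block as free; the memory stays in the pool for reuse.
  void deallocate(void* p) {
    for (Block& b : blocks_) {
      if (b.ptr == p) {
        b.in_use = false;
        return;
      }
    }
  }

  // Number of times the pool had to ask the underlying allocator.
  std::size_t underlying_allocations() const { return underlying_allocations_; }

private:
  std::vector<Block> blocks_;
  std::size_t underlying_allocations_ = 0;
};
```

With this sketch, allocating 1024 bytes, deallocating them, and then allocating 512 bytes touches the underlying allocator only once, because the second request is served from the block that was freed. That saving is the reason to place a pool on top of a resource whose allocations are expensive.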

Once you have an Allocator, you can allocate and deallocate memory as before, without needing to worry about the underlying algorithm used for the allocations:

  double* data = static_cast<double*>(
      pooled_allocator.allocate(SIZE*sizeof(double)));

  std::cout << "Allocated " << (SIZE*sizeof(double)) << " bytes using the "
    << pooled_allocator.getName() << " allocator...";

  pooled_allocator.deallocate(data);

Don’t forget, these strategies can be created on top of any valid Allocator:

#if defined(UMPIRE_ENABLE_CUDA)
  allocate_and_deallocate_pool("DEVICE");
  allocate_and_deallocate_pool("UM");
  allocate_and_deallocate_pool("PINNED");
#endif

Most Umpire users who allocate GPU memory will do so through the umpire::strategy::DynamicPool, which helps mitigate the cost of allocating memory on these devices.

You can tune the way that umpire::strategy::DynamicPool allocates memory using two parameters: the initial size and the minimum size. The initial size controls how large the first underlying allocation will be, regardless of the requested size. The minimum size controls the minimum size of any future underlying allocations. These two parameters can be passed when constructing a pool:

  auto pooled_allocator = 
    rm.makeAllocator<umpire::strategy::DynamicPool>(resource + "_pool",
                                                    allocator,
                                                    initial_size, /* default = 512 MiB */
                                                    min_block_size /* default = 1 MiB */);
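
One way to read these two parameters is as lower bounds on what the pool requests from the underlying Allocator whenever it needs more memory. The sketch below encodes that reading in standard C++. The function `underlying_request` is hypothetical and not part of Umpire, and it assumes that a request larger than the configured bound is passed through at its own size; the real pool may round or align sizes differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper, not an Umpire API: the size a pool would request
// from its underlying allocator when it cannot serve a request from
// memory it already holds.
std::size_t underlying_request(std::size_t requested,
                               bool first_allocation,
                               std::size_t initial_size,
                               std::size_t min_block_size)
{
  // The first underlying allocation is bounded below by initial_size,
  // every later one by min_block_size.
  const std::size_t lower_bound =
      first_allocation ? initial_size : min_block_size;
  // Assumption: an oversized request is passed through unchanged.
  return std::max(requested, lower_bound);
}

// Worked example with initial_size = 65536 and min_block_size = 512.
void sizing_example()
{
  const std::size_t initial = 65536;
  const std::size_t min_block = 512;

  // The first request of 8192 bytes still triggers a 65536-byte allocation.
  assert(underlying_request(8192, true, initial, min_block) == 65536);
  // A later 100-byte request that misses the pool is rounded up to 512.
  assert(underlying_request(100, false, initial, min_block) == 512);
  // A later request above the minimum is sized by the request itself.
  assert(underlying_request(100000, false, initial, min_block) == 100000);
}
```

Under this reading, a large initial size pays one expensive allocation up front, while the minimum size keeps later growth from degenerating into many small underlying allocations.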

Depending on where you are allocating data, you might want to use different sizes. It’s easy to construct multiple pools with different configurations:

  allocate_and_deallocate_pool("HOST", 65536, 512);
#if defined(UMPIRE_ENABLE_CUDA)
  allocate_and_deallocate_pool("DEVICE", (1024*1024*1024), (1024*1024));
  allocate_and_deallocate_pool("UM", (1024*64), 1024);
  allocate_and_deallocate_pool("PINNED", (1024*16), 1024);
#endif
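
The sizes above are written as raw byte counts and products, which can be hard to compare at a glance. As a purely illustrative sketch, the same four configurations can be expressed with named constants. The names `KiB`, `MiB`, `GiB`, `PoolConfig`, and `pool_configs` are hypothetical and not part of Umpire; only the resource names and the numbers come from the calls above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical named constants for readability.
constexpr std::size_t KiB = 1024;
constexpr std::size_t MiB = 1024 * KiB;
constexpr std::size_t GiB = 1024 * MiB;

// Hypothetical record holding one pool configuration.
struct PoolConfig {
  const char* resource;
  std::size_t initial_size;
  std::size_t min_block_size;
};

// The same values that are passed to allocate_and_deallocate_pool above.
constexpr PoolConfig pool_configs[] = {
  {"HOST",   64 * KiB, 512},
  {"DEVICE", 1 * GiB,  1 * MiB},
  {"UM",     64 * KiB, 1 * KiB},
  {"PINNED", 16 * KiB, 1 * KiB},
};

// Confirm that the named constants match the literals used in the text.
void check_pool_configs()
{
  assert(pool_configs[0].initial_size == 65536u);
  assert(pool_configs[1].initial_size ==
         static_cast<std::size_t>(1024) * 1024 * 1024);
  assert(pool_configs[1].min_block_size ==
         static_cast<std::size_t>(1024) * 1024);
  assert(pool_configs[2].initial_size ==
         static_cast<std::size_t>(1024) * 64);
  assert(pool_configs[3].initial_size ==
         static_cast<std::size_t>(1024) * 16);
}
```

A loop over `pool_configs` could then call `allocate_and_deallocate_pool` once per entry, with the CUDA resources still guarded by `UMPIRE_ENABLE_CUDA`.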

There are many other strategies that you can use, and we will look at some of them in this tutorial. A complete list of strategies can be found here.

//////////////////////////////////////////////////////////////////////////////
// Copyright (c) 2018, Lawrence Livermore National Security, LLC.
// Produced at the Lawrence Livermore National Laboratory
//
// Created by David Beckingsale, david@llnl.gov
// LLNL-CODE-747640
//
// All rights reserved.
//
// This file is part of Umpire.
//
// For details, see https://github.com/LLNL/Umpire
// Please also see the LICENSE file for MIT license.
//////////////////////////////////////////////////////////////////////////////
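#include <cstddef>   // std::size_t
#include <iostream>  // std::cout, std::endl
#include <string>    // std::string

#include "umpire/Allocator.hpp"
#include "umpire/ResourceManager.hpp"

#include "umpire/strategy/DynamicPool.hpp"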

void allocate_and_deallocate_pool(const std::string& resource)
{
  constexpr size_t SIZE = 1024;

  auto& rm = umpire::ResourceManager::getInstance();

  auto allocator = rm.getAllocator(resource);

  auto pooled_allocator = 
    rm.makeAllocator<umpire::strategy::DynamicPool>(resource + "_pool",
                                                    allocator);

  double* data = static_cast<double*>(
      pooled_allocator.allocate(SIZE*sizeof(double)));

  std::cout << "Allocated " << (SIZE*sizeof(double)) << " bytes using the "
    << pooled_allocator.getName() << " allocator...";

  pooled_allocator.deallocate(data);

  std::cout << " deallocated." << std::endl;
}

int main(int, char**) {
  allocate_and_deallocate_pool("HOST");

#if defined(UMPIRE_ENABLE_CUDA)
  allocate_and_deallocate_pool("DEVICE");
  allocate_and_deallocate_pool("UM");
  allocate_and_deallocate_pool("PINNED");
#endif

  return 0;
}
//////////////////////////////////////////////////////////////////////////////
// Copyright (c) 2018, Lawrence Livermore National Security, LLC.
// Produced at the Lawrence Livermore National Laboratory
//
// Created by David Beckingsale, david@llnl.gov
// LLNL-CODE-747640
//
// All rights reserved.
//
// This file is part of Umpire.
//
// For details, see https://github.com/LLNL/Umpire
// Please also see the LICENSE file for MIT license.
//////////////////////////////////////////////////////////////////////////////
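#include <cstddef>   // std::size_t
#include <iostream>  // std::cout, std::endl
#include <string>    // std::string

#include "umpire/Allocator.hpp"
#include "umpire/ResourceManager.hpp"

#include "umpire/strategy/DynamicPool.hpp"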

void allocate_and_deallocate_pool(
    const std::string& resource,
    std::size_t initial_size,
    std::size_t min_block_size)
{
  constexpr size_t SIZE = 1024;

  auto& rm = umpire::ResourceManager::getInstance();

  auto allocator = rm.getAllocator(resource);

  auto pooled_allocator = 
    rm.makeAllocator<umpire::strategy::DynamicPool>(resource + "_pool",
                                                    allocator,
                                                    initial_size, /* default = 512 MiB */
                                                    min_block_size /* default = 1 MiB */);

  double* data = static_cast<double*>(
      pooled_allocator.allocate(SIZE*sizeof(double)));

  std::cout << "Allocated " << (SIZE*sizeof(double)) << " bytes using the "
    << pooled_allocator.getName() << " allocator...";

  pooled_allocator.deallocate(data);

  std::cout << " deallocated." << std::endl;
}

int main(int, char**) {
  allocate_and_deallocate_pool("HOST", 65536, 512);
#if defined(UMPIRE_ENABLE_CUDA)
  allocate_and_deallocate_pool("DEVICE", (1024*1024*1024), (1024*1024));
  allocate_and_deallocate_pool("UM", (1024*64), 1024);
  allocate_and_deallocate_pool("PINNED", (1024*16), 1024);
#endif

  return 0;
}