Logging and Replay of Umpire Events

Logging

When debugging memory operation problems, it is sometimes helpful to enable Umpire’s logging facility. The logging functionality is enabled for default builds unless -DENABLE_LOGGING=’Off’ has been specified in which case it is disabled.

If Umpire logging is enabled, it may be controlled by setting the UMPIRE_LOG_LEVEL environment variable to Error, Warning, Info, or Debug. The Debug value is the most verbose.

When UMPIRE_LOG_LEVEL has been set, events will be logged to the standard output.

Replay

Umpire provides a lightweight replay capability that can be used to investigate performance of particular allocation patterns and reproduce bugs. By running an executable that uses Umpire with the environment variable UMPIRE_REPLAY set to On, Umpire will emit information for the following Umpire events:

Running with Replay

To enable Umpire replay, one may execute as follows:

UMPIRE_REPLAY="On" ./my_umpire_using_program > replay_log.json

will write Umpire replay events to the file replay_log.json that will contain the following kinds of information:

Interpretting Results - Version Event

The first event captured is the version event which shows the version information as follows:

{ "kind":"replay", "uid":27494, "timestamp":1558388052595911583, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 0 } }

Each line contains the following set of common elements:

kind

Always set to replay

uid

This is the MPI rank of the process generating the event for mpi programs or the PID for non-mpi.

timestamp

Set to the time when the event occurred.

event

Set to one of: version, makeMemoryResource, makeAllocator, allocate, or deallocate

payload

Optional and varies upon event type

result

Optional and varies upon event type

As can be seen, the major, minor, and patch version numbers are captured within the payload for this event.

makeMemoryResource Event

Next you will see events for the creation of the default memory resources provided by Umpire with the makeMemoryResource event:

{ "kind":"replay", "uid":27494, "timestamp":1558388052595934822, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 0 }, "result": { "memory_ptr": "0x200040000010" } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595939623, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 134 } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595943793, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 134 }, "result": { "memory_ptr": "0x200040000010" } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595947408, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 774 } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595951548, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 774 }, "result": { "memory_ptr": "0x2000400000a0" } }

The payload shows that a memory resource was created for HOST, DEVICE, PINNED, UM, and DEVICE_CONST respectively. The result is a reference to the object that was created within Umpire for that resource.

makeAllocator Event

The makeAllocator event occurs whenever a new allocator instance is being created. Each call to makeAllocator will generate a pair of JSON lines. The first line will show the intent of the call and the second line will show both the intent and the result. This is because the makeAllocator call can fail and keeping both the intent and result allows us to reproduce this failure later.

umpire::Allocator:

{ "kind":"replay", "uid":27494, "timestamp":1558388052595954839, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 470 } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595958585, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 470 }, "result": { "memory_ptr": "0x2000400003b0" } }

The payload shows how the allocator was constructed. The result shows the reference to the allocated object.

allocate Event

Like the makeAllocator event, the allocate event is captured as an intention/result pair so that an error may be replayed in the event that there is an allocation failure.

{ "kind":"replay", "uid":27494, "timestamp":1558388052595961448, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 546 } }
{ "kind":"replay", "uid":27494, "timestamp":1558388052595964866, "event": "allocate", "payload": { "allocator_ref": "0x108a8730", "size": 546 }, "result": { "memory_ptr": "0x200040000590" } }

The payload shows the object reference of the allocator and the size of the allocation request. The result shows the pointer to the memory allocated.

deallocate Event

{ "kind":"replay", "uid":27494, "timestamp":1558388052596397388, "event": "deallocate", "payload": { "allocator_ref": "0x108a8730", "memory_ptr": "0x200040000b90" } }

The payload shows the reference to the allocator object and the pointer to the allocated memory that is to be freed.

Replaying the session

Loading this file with the replay program will replay this sequence of umpire::Allocator creation, allocations, and deallocations:

./bin/replay -i replay_log.json