taichi.profiler#

class taichi.profiler.CuptiMetric(name='', header='unnamed_header', val_format='     {:8.0f} ', scale=1.0)#

A class for adding user-selected CUPTI metrics to KernelProfiler.

This is currently only available for the CUDA backend, i.e. you need ti.init(kernel_profiler=True, arch=ti.cuda). For the usage of this class, see the examples in set_kernel_profiler_metrics() and collect_kernel_profiler_metrics().

Parameters:

name (str) – name of the CUPTI metric to collect, default value: ''.
header (str) – column header for this metric when printing profiling results, default value: 'unnamed_header'.
val_format (str) – format string for printing the metric value, default value: '     {:8.0f} '.
scale (float) – scale applied to the metric value when printing, default value: 1.0.
Example:

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.profiler.set_kernel_profiler_toolkit('cupti')
>>> num_elements = 128*1024*1024

>>> x = ti.field(ti.f32, shape=num_elements)
>>> y = ti.field(ti.f32, shape=())
>>> y[None] = 0

>>> @ti.kernel
>>> def reduction():
>>>     for i in x:
>>>         y[None] += x[i]

>>> global_op_atom = ti.profiler.CuptiMetric(
>>>     name='l1tex__t_set_accesses_pipe_lsu_mem_global_op_atom.sum',
>>>     header=' global.atom ',
>>>     val_format='    {:8.0f} ')

>>> # add and set user defined metrics
>>> profiling_metrics = ti.profiler.get_predefined_cupti_metrics('global_access') + [global_op_atom]
>>> ti.profiler.set_kernel_profiler_metrics(profiling_metrics)

>>> for i in range(16):
>>>     reduction()
>>> ti.profiler.print_kernel_profiler_info('trace')

Note

For details about using CUPTI in Taichi, please visit https://docs.taichi-lang.org/docs/profiler#advanced-mode.

taichi.profiler.clear_kernel_profiler_info()#

Clear all KernelProfiler records.
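
Example (a minimal sketch; the fill kernel and field here are illustrative):

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> x = ti.field(ti.f32, shape=1024*1024)

>>> @ti.kernel
>>> def fill():
>>>     for i in x:
>>>         x[i] = 0.1

>>> fill()  # warm-up run
>>> ti.profiler.clear_kernel_profiler_info()  # drop the warm-up record
>>> for i in range(100):
>>>     fill()
>>> ti.profiler.print_kernel_profiler_info()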

taichi.profiler.clear_scoped_profiler_info()#

Clear the profiler's records of time elapsed on the host tasks.

Calls the function imported from C++: _ti_core.clear_profile_info()
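
Example (a minimal sketch; the compute kernel here is illustrative):

>>> import taichi as ti
>>> ti.init(arch=ti.cpu)
>>> var = ti.field(ti.f32, shape=1)

>>> @ti.kernel
>>> def compute():
>>>     var[0] = 1.0

>>> compute()
>>> ti.profiler.print_scoped_profiler_info()
>>> ti.profiler.clear_scoped_profiler_info()  # reset the host-side timing records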

taichi.profiler.collect_kernel_profiler_metrics(metric_list=default_cupti_metrics)#

Set temporary metrics that will be collected by the CUPTI toolkit within this context.

Parameters:

metric_list (list) – a list of CuptiMetric() instances, default value: default_cupti_metrics.

Example:

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.profiler.set_kernel_profiler_toolkit('cupti')
>>> num_elements = 128*1024*1024

>>> x = ti.field(ti.f32, shape=num_elements)
>>> y = ti.field(ti.f32, shape=())
>>> y[None] = 0

>>> @ti.kernel
>>> def reduction():
>>>     for i in x:
>>>         y[None] += x[i]

>>> # when called without an argument, Taichi prints its pre-defined metrics list
>>> ti.profiler.get_predefined_cupti_metrics()
>>> # get Taichi pre-defined metrics
>>> profiling_metrics = ti.profiler.get_predefined_cupti_metrics('device_utilization')

>>> global_op_atom = ti.profiler.CuptiMetric(
>>>     name='l1tex__t_set_accesses_pipe_lsu_mem_global_op_atom.sum',
>>>     header=' global.atom ',
>>>     val_format='    {:8.0f} ')
>>> # add user defined metrics
>>> profiling_metrics += [global_op_atom]

>>> # the metrics setting is temporary and will be cleared when exiting this context
>>> with ti.profiler.collect_kernel_profiler_metrics(profiling_metrics):
>>>     for i in range(16):
>>>         reduction()
>>>     ti.profiler.print_kernel_profiler_info('trace')

Note

The configuration of metric_list will be cleared when exiting this context.

taichi.profiler.get_kernel_profiler_total_time()#

Get elapsed time of all kernels recorded in KernelProfiler.

Returns:

total time in seconds.

Return type:

time (float)
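
Example (a minimal sketch; the fill kernel here is illustrative):

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> x = ti.field(ti.f32, shape=1024*1024)

>>> @ti.kernel
>>> def fill():
>>>     for i in x:
>>>         x[i] = 0.1

>>> for i in range(100):
>>>     fill()
>>> total_time = ti.profiler.get_kernel_profiler_total_time()
>>> print("total kernel elapsed time in seconds =", total_time)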

taichi.profiler.get_predefined_cupti_metrics(name='')#

Returns the pre-defined CUPTI metrics specified by name.

Accepted arguments are ‘global_access’, ‘shared_access’, ‘atomic_access’, ‘cache_hit_rate’, ‘device_utilization’.

Parameters:

name (str) – CUPTI metric group name.
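
Example (a minimal sketch; assumes the CUDA backend and the 'cupti' toolkit):

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.profiler.set_kernel_profiler_toolkit('cupti')

>>> # when called without an argument, Taichi prints its pre-defined metrics list
>>> ti.profiler.get_predefined_cupti_metrics()
>>> # retrieve a pre-defined metric group and register it
>>> profiling_metrics = ti.profiler.get_predefined_cupti_metrics('cache_hit_rate')
>>> ti.profiler.set_kernel_profiler_metrics(profiling_metrics)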

taichi.profiler.print_kernel_profiler_info(mode='count')#

Print the profiling results of Taichi kernels.

To enable this profiler, set kernel_profiler=True in ti.init(). In 'count' mode, the statistics (min, max, avg time) of the launched kernels are printed; in 'trace' mode, the records of the launched kernels with specific profiling metrics (time, memory load/store, core utilization, etc.) are printed. Defaults to 'count'.

Parameters:

mode (str) – the way to print profiling results.

Example:

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> var = ti.field(ti.f32, shape=1)

>>> @ti.kernel
>>> def compute():
>>>     var[0] = 1.0

>>> compute()
>>> ti.profiler.print_kernel_profiler_info()
>>> # equivalent calls :
>>> # ti.profiler.print_kernel_profiler_info('count')

>>> ti.profiler.print_kernel_profiler_info('trace')

Note

Currently the result of KernelProfiler could be incorrect on the OpenGL backend due to its lack of support for ti.sync().

For advanced mode of KernelProfiler, please visit https://docs.taichi-lang.org/docs/profiler#advanced-mode.

taichi.profiler.print_memory_profiler_info()#

Memory profiling tool for LLVM backends with full sparse support.

This profiler is automatically on.
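
Example (a minimal sketch on a sparse field; the pointer/dense layout here is illustrative):

>>> import taichi as ti

>>> ti.init(arch=ti.cpu)
>>> x = ti.field(ti.f32)
>>> ti.root.pointer(ti.i, 64).dense(ti.i, 32).place(x)

>>> @ti.kernel
>>> def activate():
>>>     for i in range(128):
>>>         x[i] = 1.0  # writing activates the corresponding sparse blocks

>>> activate()
>>> ti.profiler.print_memory_profiler_info()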

taichi.profiler.print_scoped_profiler_info()#

Print time elapsed on the host tasks in a hierarchical format.

This profiler is automatically on.

Calls the function imported from C++: _ti_core.print_profile_info()

Example:

>>> import taichi as ti
>>> ti.init(arch=ti.cpu)
>>> var = ti.field(ti.f32, shape=1)
>>> @ti.kernel
>>> def compute():
>>>     var[0] = 1.0
>>>     print("Setting var[0] =", var[0])
>>> compute()
>>> ti.profiler.print_scoped_profiler_info()

taichi.profiler.query_kernel_profiler_info(name)#

Query the elapsed time (min, avg, max) of a kernel on devices using the kernel name.

To enable this profiler, set kernel_profiler=True in ti.init().

Parameters:

name (str) – kernel name.

Returns:

with member variables (counter, min, max, avg)

Return type:

KernelProfilerQueryResult (class)

Example:

>>> import taichi as ti

>>> ti.init(ti.cpu, kernel_profiler=True)
>>> n = 1024*1024
>>> var = ti.field(ti.f32, shape=n)

>>> @ti.kernel
>>> def fill():
>>>     for i in range(n):
>>>         var[i] = 0.1

>>> fill()
>>> ti.profiler.clear_kernel_profiler_info() #[1]
>>> for i in range(100):
>>>     fill()
>>> query_result = ti.profiler.query_kernel_profiler_info(fill.__name__) #[2]
>>> print("kernel executed times =",query_result.counter)
>>> print("kernel elapsed time(min_in_ms) =",query_result.min)
>>> print("kernel elapsed time(max_in_ms) =",query_result.max)
>>> print("kernel elapsed time(avg_in_ms) =",query_result.avg)

Note

[1] To get the correct result, query_kernel_profiler_info() must be used in conjunction with clear_kernel_profiler_info().

[2] Currently the result of KernelProfiler could be incorrect on the OpenGL backend due to its lack of support for ti.sync().

taichi.profiler.set_kernel_profiler_metrics(metric_list=default_cupti_metrics)#

Set metrics that will be collected by the CUPTI toolkit.

Parameters:

metric_list (list) – a list of CuptiMetric() instances, default value: default_cupti_metrics.

Example:

>>> import taichi as ti

>>> ti.init(kernel_profiler=True, arch=ti.cuda)
>>> ti.profiler.set_kernel_profiler_toolkit('cupti')
>>> num_elements = 128*1024*1024

>>> x = ti.field(ti.f32, shape=num_elements)
>>> y = ti.field(ti.f32, shape=())
>>> y[None] = 0

>>> @ti.kernel
>>> def reduction():
>>>     for i in x:
>>>         y[None] += x[i]

>>> # when called without an argument, Taichi prints its pre-defined metrics list
>>> ti.profiler.get_predefined_cupti_metrics()
>>> # get Taichi pre-defined metrics
>>> profiling_metrics = ti.profiler.get_predefined_cupti_metrics('shared_access')

>>> global_op_atom = ti.profiler.CuptiMetric(
>>>     name='l1tex__t_set_accesses_pipe_lsu_mem_global_op_atom.sum',
>>>     header=' global.atom ',
>>>     val_format='    {:8.0f} ')
>>> # add user defined metrics
>>> profiling_metrics += [global_op_atom]

>>> # metrics setting will be retained until the next configuration
>>> ti.profiler.set_kernel_profiler_metrics(profiling_metrics)
>>> for i in range(16):
>>>     reduction()
>>> ti.profiler.print_kernel_profiler_info('trace')

Note

Metrics setting will be retained until the next configuration.

taichi.profiler.set_kernel_profiler_toolkit(toolkit_name='default')#

Set the toolkit used by KernelProfiler.

Currently, we only support toolkits: 'default' and 'cupti'.

Parameters:

toolkit_name (str) – name of the toolkit.

Returns:

whether the setting is successful or not.

Return type:

status (bool)

Example:

>>> import taichi as ti

>>> ti.init(arch=ti.cuda, kernel_profiler=True)
>>> x = ti.field(ti.f32, shape=1024*1024)

>>> @ti.kernel
>>> def fill():
>>>     for i in x:
>>>         x[i] = i

>>> ti.profiler.set_kernel_profiler_toolkit('cupti')
>>> for i in range(100):
>>>     fill()
>>> ti.profiler.print_kernel_profiler_info()

>>> ti.profiler.set_kernel_profiler_toolkit('default')
>>> for i in range(100):
>>>     fill()
>>> ti.profiler.print_kernel_profiler_info()