Xperf Basics: Recording a Trace (the easy way) (repost)

May 31, 2023, 2:58 PM • Tech Miscellany • 84 views

Some time ago I wrote a long and detailed post about how to record traces using xperf. The steps needed to record a trace were daunting. However, more recent versions of the Windows Performance Toolkit (WPT, the proper name for the xperf suite of tools) have made it a lot easier.

The new method for recording traces should handle most scenarios, and it is so much simpler that this post is far shorter than the previous one. In this post I describe the steps needed to record xperf/ETW traces so you can start using this impressive (and free) whole-system profiling tool.

A web search on “windows performance toolkit installer” finds a lot of discussions on how to install various old versions of WPT/xperf. Even the Microsoft download pages for old versions of WPT have not been updated to acknowledge the existence of newer versions. It’s easy to accidentally install an obsolete version.

As of today what you need to do is install the Windows Software Development Kit for Windows 8 RTM. The installer, available here, will let you install whatever components you want. In addition to Windows Performance Toolkit I also recommend installing Application Verifier and Debugging Tools.

The redistributable installers for these three components also get installed – you can find them (on 64-bit machines) in C:\Program Files (x86)\Windows Kits\8.0. Apparently the confusingly named “WPTx64-x86_en-us.msi” is a 32-bit installer that installs the 64-bit version of WPT. Wacky. The redistributables can be handy for sharing WPT with your coworkers, or even customers.

I’m sure that new versions will continue to be released, probably in the Windows SDK, so check for newer SDKs. You can check what version you have installed by running WPA and looking at the about box. Here is the about box for the version of WPA that I am currently running.

You can skip this step (building and registering the sample ETW providers) if you want – it’s only needed for recording traces with custom ETW providers. But if you use xperf significantly you’ll definitely want to do this – getting custom events into ETW traces makes them far easier to analyze. However, you don’t need this for your first trace.

You can download my sample user-mode ETW providers from . This .zip file has been updated since the original post. Configuration files for wprui and wpa were added, and a –inputlogger option was added that turns the executable into a very handy key logger – all mouse and keyboard events are emitted as ETW events which can be recorded using the steps in this article. This can be very helpful in trace investigations, but don’t use this illicitly.

When you unzip the file you’ll find a Visual Studio 2010 solution file. Build either configuration. You might want to poke around and look at the ReadMe.txt file and the provider manifest file (etwprovider.man). If you want to use these providers then after you build the project you’ll have to run etwregister.bat from an elevated command prompt. For more details (some of them obsolete) look here.

Apparently wprui can set the DisablePagingExecutive registry key for you so you don’t need to do this step. Just click OK if it says you need to.

~~If you are running 64-bit Windows (you are, aren’t you?) then there is a registry key that you need to set. And then you need to reboot. The registry key tells Windows to keep information needed for stack walking in non-pageable memory. If you run the command below from an elevated command prompt (yes, it is all one line that is excessively word-wrapped) and then reboot then your call stacks will thank you. Setting this registry key wastes a little bit of memory, but I don’t think it’s enough to matter so I always leave it set.~~

<strike>REG ADD "HKLM\System\CurrentControlSet\Control\Session Manager\Memory Management" -v DisablePagingExecutive -d 0x1 -t REG_DWORD -f</strike>

Recording traces is now much easier than it used to be because of wprui. This tool – ask for it by name in the start menu search box – is an actual UI for recording traces. The default settings for wprui are quite good, but for best results you should enable some custom user providers. This lets you see events like frame start times, keyboard input, or whatever custom events you want to put in your traces.

To add custom user providers you should run wprui and then click on the Add Profiles… button. This lets you configure arbitrary data (from system, user/event, or heap providers) to be recorded in the trace. The profile definition format is rather lightly documented, with only a couple of samples (look in the wprui install directory for example .wprp files), but with a bit of hacking and experimentation and some help from some friends at Microsoft I created the XML configuration file that I needed. It is in the .zip file referenced above and is called MultiProviderProfile.wprp. When you add this profile to wprui (click Add Profiles… and select the file) and enable it then the four custom providers used by the MultiProvider project are enabled in wprui. For details on viewing the recorded events in wpa see step 7. With MultiProviderProfile.wprp loaded and enabled wprui should look something like this:

When you click Start in wprui then tracing begins. By default the trace data is recorded to circular buffers in memory, which is usually ideal. In this mode xperf/wprui is constantly monitoring your system and it can be left running like this 24/7. The length of time recorded will depend on how busy your system is and how much memory is devoted to buffering.

Recording to a file is potentially useful – it lets you record arbitrarily long sessions – but this is also its downside. It’s easy to accidentally leave tracing enabled, and then your drive fills up with an ETL file so large you can’t ever load it. You can also do Light profiling instead of the default Verbose profiling. Generally this means that stacks are not recorded, so the overhead is lower, but it’s also harder to do analysis.

For initial experimentation you might want to run the MultiProvider sample from the .zip file listed above while recording the trace. This will emit custom events for wprui to record, if you have configured it to do that, and this will give you more interesting data to look at in the trace.

If you’re recording to memory then you can save those circular buffers to disk whenever you want. If you hit a performance problem – in any program – just click on Save (the Start button changes to Save while recording) to save the buffers to disk. Retroactive profiling is truly awesome.
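To make the circular-buffer idea concrete, here is a minimal Python sketch. It is only an illustration of why recording to memory allows retroactive profiling, not a description of how ETW implements its buffers; the class name, the tiny capacity, and the event names are all made up for the example.

```python
from collections import deque


class CircularTraceBuffer:
    """Toy stand-in for an in-memory trace buffer (hypothetical, not ETW's design)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest entry when it is full.
        self._events = deque(maxlen=capacity)

    def record(self, timestamp, name):
        # Recording never grows memory use past the fixed capacity.
        self._events.append((timestamp, name))

    def save(self):
        # "Save" snapshots whatever recent history is still in the buffer.
        return list(self._events)


buffer = CircularTraceBuffer(capacity=3)
for t in range(10):
    buffer.record(t, f"event-{t}")

saved = buffer.save()
print(saved)  # [(7, 'event-7'), (8, 'event-8'), (9, 'event-9')]
```

Only the most recent events survive, which is why the amount of history you get depends on how busy the system is and how much memory the buffers have.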

You can also use the system-global Ctrl+Win+C shortcut to trigger saving of the buffers. I recommend using this, especially when recording traces of full-screen games, since it lets you trigger the trace without the delay of alt+tabbing to wprui. Either way, after you ask to have the trace saved you’ll be taken to the window shown below. Take your time and write up a nice verbose description of what was happening when you recorded the trace. The trace is saved to disk in the background while you’re typing, and your comment will be added as a Mark (visible in the Marks table in the System Activity section) once you hit Save. This is your chance to annotate the trace with a description of what was happening.

Once you’ve saved the trace you can navigate to where it was saved and open it in wpa, which is now the recommended trace viewer. Details of how to analyze it are beyond the scope of this post, but there are a few points worth mentioning. The default view in wpa is quite austere (completely blank!), but you can configure that. If you copy the Startup.wpaProfile included in MultiProvider.zip to the WPA Files folder in your documents directory then WPA will start up with (in my opinion) more useful defaults:

The top graph will show the Multi Provider Generic Events which lets you see frame rate events or whatever else your program is emitting. I configured it to filter out any providers that don’t contain “Multi-” in order to reduce unwanted clutter. You can display all events by using the View Preset dropdown to change views, and you can display the data as a table to see event details more easily.
The next graph shows Window in Focus information, which is another useful way of orienting yourself in a trace – note that this graph will sometimes be blank due to bugs in this provider.
The next graph shows CPU Usage (Precise) data which is an extremely accurate measure of CPU consumption, derived from context switch records. I configured the columns so that if you display the table associated with this graph you can do idle-time analysis.
The final graph shows CPU Usage (Sampled) data. I configured it to display call stacks that are grouped by Process and Thread ID. This can be used for CPU busy analysis. You can use the View Preset dropdown to display data grouped by the Module and Function based preset that I created.
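As a rough illustration of why context switch records give an exact measure of CPU consumption, the Python sketch below adds up the time between each switch-in and the next switch on the same CPU. The record layout (timestamp in microseconds, CPU number, thread ID being switched in) and the use of thread ID 0 for idle are simplifying assumptions for this example, not the real ETW event format.

```python
from collections import defaultdict


def cpu_time_per_thread(switches, trace_end):
    """Sum running time per thread from time-sorted (timestamp_us, cpu, new_thread_id) records.

    The record shape is hypothetical; thread id 0 stands in for the idle thread.
    """
    running = {}  # cpu -> (thread_id, time it was switched in)
    totals = defaultdict(int)
    for timestamp, cpu, new_thread in switches:
        if cpu in running:
            old_thread, start = running[cpu]
            # The outgoing thread ran from its switch-in until this switch.
            totals[old_thread] += timestamp - start
        running[cpu] = (new_thread, timestamp)
    # Whatever is still running at the end of the trace gets the remaining time.
    for thread, start in running.values():
        totals[thread] += trace_end - start
    return dict(totals)


switches = [
    (0, 0, 101),    # thread 101 starts running on CPU 0
    (300, 0, 0),    # CPU 0 goes idle
    (500, 0, 102),  # thread 102 runs
    (900, 0, 101),  # thread 101 runs again
]
usage = cpu_time_per_thread(switches, trace_end=1000)
print(usage)  # {101: 400, 0: 200, 102: 400}
```

The time attributed to the idle entry is the starting point for idle-time analysis: it tells you when nothing was running, and the surrounding switches tell you which thread was waiting.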

You don’t have to use this startup file, but I strongly recommend configuring some sort of defaults to make trace analysis initially easier. You can save your customizations with Profiles -> Save Startup Profile.

For more information on using WPA, including a list of known bugs, see the article I wrote last year titled WPA-Xperf Trace Analysis Reimagined.

I have to confess that I don’t use wprui. I wrote my own trace recording UI which is tuned to my own needs and wprui is not good enough to tempt me to switch. Some of the features that wprui needs to acquire before I’ll consider switching include:

an (optional) input monitor to insert input events into traces – key loggers are very helpful. This could always be done as an external program
a way to manage (rename, annotate, view) the traces it has recorded
transparent compression and decompression of traces
one-click changing of the CPU sampling frequency
a way to adjust the buffer size, since on large-memory machines wprui’s buffers are excessively huge (2 GB? Really?)
automatic installation of the symbol loading DLLs – preferably the versions that don’t take hours to load symbols
trace file names with the year before the date so that they sort sensibly
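The last wish in the list is easy to demonstrate. The Python sketch below builds trace file names with the year first, so a plain alphabetical sort is also a chronological sort. The function name, the label parameter, and the exact name layout are invented for illustration; they are not what wprui or my own recording UI produces.

```python
from datetime import datetime


def trace_file_name(when, label="trace"):
    """Build a hypothetical trace name with year-month-day ordering."""
    # Most-significant field first means string order matches time order.
    return f"{when:%Y-%m-%d_%H-%M-%S}_{label}.etl"


names = [
    trace_file_name(datetime(2013, 1, 5, 9, 30, 0)),
    trace_file_name(datetime(2012, 12, 31, 23, 59, 59)),
    trace_file_name(datetime(2013, 1, 4, 18, 0, 0)),
]
ordered = sorted(names)
for name in ordered:
    print(name)
```

With a month-first or day-first name the December 2012 trace would sort after the January 2013 ones, which is exactly the annoyance the list item describes.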

Adding these features, and perhaps a few others, would get rid of the existing clunkiness which makes wprui slightly frustrating to use.

wprui does have some great advantages, however. It supports symbols for managed code and JavaScript, and it’s a huge improvement over using xperf from the command line to record traces. Recommended.

If you search for documentation on wprui you won’t find much. You need to search for wpr instead, as this covers documentation for wprui as well as the command line wpr tool. I’m not a fan of the documentation – too much clicking around for too little content – but you can find it here. It looks like WPT was supposed to ship with a documentation file – wpt.chm – but it is missing in action. Oops.

Michael Milirud’s Build 2011 talks, available here, show some of the features of wprui and wpa.

Microsoft recommends not checking too many of the wprui check boxes, in order to avoid increasing the data rate too much. However, they also don’t give details about what each checkbox controls, and I don’t want to experiment a lot, so I typically check the top three boxes, plus the Multi Provider one. Your mileage may vary.

You can use xperf -loggers for debugging configuration files. It dumps detailed information about all enabled providers and you can search through it to see if the providers you requested got enabled. If it says no loggers enabled then you’re probably running it from a non-administrator command prompt.
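If you do this check often, the search can be scripted. The Python sketch below is one possible approach under stated assumptions: it treats the xperf -loggers output as opaque text and does a case-insensitive substring search, so you have to pass whatever identifier (name or GUID) actually appears in the output on your machine. The helper names and the placeholder strings are hypothetical, and the placeholder text is not real xperf output.

```python
import subprocess


def missing_providers(loggers_output, expected):
    """Return the expected identifiers that do not appear in the dumped text."""
    text = loggers_output.lower()
    return [name for name in expected if name.lower() not in text]


def dump_loggers():
    """Run xperf -loggers and return its text output.

    Assumes xperf is on the PATH and the prompt is elevated; not called here.
    """
    result = subprocess.run(
        ["xperf", "-loggers"], capture_output=True, text=True, check=False
    )
    return result.stdout


# Placeholder text standing in for real output, with made-up provider names.
placeholder_output = "session text mentioning Multi-Example-A only"
not_found = missing_providers(
    placeholder_output, ["Multi-Example-A", "Multi-Example-B"]
)
print(not_found)  # ['Multi-Example-B']
```

On a real machine you would pass dump_loggers() as the first argument while a trace is recording, and an empty result would suggest that every provider you asked for was enabled.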

You can use “wpr -profiles” and “wpr -profiledetails” to see what the built-in profiles contain.

I ranted last week about the importance of discoverability – there’s not much point having keyboard shortcuts if your users can’t find them. The problem then was that the options for zooming in wpa were needlessly hidden. I think wprui takes a lack of discoverability to new heights. In order to find the global keyboard shortcut for wprui all I had to do was look in this file:

C:\Program Files (x86)\Windows Kits\8.0\Windows Performance Toolkit\wpr.config.xml

and in there I found this line:

Of course! Ctrl+Win+C is the global keyboard shortcut for recording a trace. Displaying it in the UI would have been too obvious.

Use wprui. Out of the box it is the easiest way of recording xperf traces. With a bit of configuration you can get it to record custom user events which make trace navigation much easier. Use the Ctrl+Win+C shortcut to trigger the recording of traces.

Original: https://www.cnblogs.com/Clingingboy/p/3519595.html
Author: Clingingboy
Title: Xperf Basics: Recording a Trace (the easy way) (repost)
