|
The synopsis part of the btrace documentation is highlighted with the
wrong format. Fix the format; there is no change to the contents.
Signed-off-by: Fukui Daichi <a.dog.will.talk@akane.waseda.jp>
Link: https://lore.kernel.org/r/20240117083651.954-1-a.dog.will.talk@akane.waseda.jp
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The memset call in check_cpu_map always clears sizeof(unsigned long *)
regardless of what size was allocated. Use calloc instead to allocate
the map so it's zeroed properly regardless of the size requested.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
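A minimal sketch of the bug class described above, not the actual blkparse code. The helper names `alloc_cpu_map` and `cpu_map_test` are hypothetical; the point is only that calloc() zeroes the whole allocation, while `memset(map, 0, sizeof(map))` on a pointer clears just the pointer's width.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical helper: allocate a bitmap with room for nr_bits bits.
 * calloc() zeroes every byte it returns, whatever the size requested.
 */
static unsigned long *alloc_cpu_map(size_t nr_bits)
{
	size_t nr_longs = (nr_bits + BITS_PER_LONG - 1) / BITS_PER_LONG;

	if (nr_longs == 0)
		nr_longs = 1;	/* avoid a zero-sized allocation */
	return calloc(nr_longs, sizeof(unsigned long));
}

/* Return 1 if the given bit is set in the map, 0 otherwise. */
static int cpu_map_test(const unsigned long *map, size_t bit)
{
	return (int)((map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1UL);
}
```

Rounding the zero-bit case up to one word is a choice made for this sketch only; the follow-up commit handles that situation by skipping the check on pipe input.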
|
|
When we're using pipe input, we don't track online CPUs and don't have a
cpu_map. When we start to show entries, check_sequence will be invoked.
If the first entry isn't sequence 1 (perhaps it's been dropped?), we'll
proceed to check_cpu_map. Since we haven't tracked online CPUs,
pdi->cpu_map_max will be 0 and we'll do a malloc(0). Then we'll start
setting bits corresponding to CPU numbers in memory we don't own. Since
there's nothing to check here, let's skip it on pipe input.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We found that blktrace gets stuck when a cgroup restricts the CPUs that
blktrace may use. The messages and stack are:
[root@localhost ~]# blktrace -w 10 -o- /dev/sda
FAILED to start thread on CPU 1: 22/Invalid argument
FAILED to start thread on CPU 2: 22/Invalid argument
[root@localhost ~]# cat /proc/1385110/stack
[<0>] __switch_to+0xe8/0x150
[<0>] futex_wait_queue_me+0xd4/0x158
[<0>] futex_wait+0xf4/0x230
[<0>] do_futex+0x470/0x900
[<0>] __arm64_sys_futex+0x13c/0x188
[<0>] el0_svc_common+0x80/0x200
[<0>] el0_svc_handler+0x78/0xe0
[<0>] el0_svc+0x10/0x260
[<0>] 0xffffffffffffffff
blktrace fails to start a thread because the thread cannot be bound to the
restricted CPU. In this case, blktrace doesn't schedule an alarm to set the
variable 'done' to 1 after the defined time.
We debugged the code and found the call trace below:
main()
==>run_tracers()
==>wait_tracers()
==>process_trace_bufs()
==>wait_empty_entries()
==>t_pthread_cond_wait()
blktrace was set to piped output, so the process is stuck in
wait_empty_entries(), waiting for the variable 'done' to be set to 1.
To fix the problem, set the variable 'done' to 1 in run_tracers() when
'nthreads_running' is not equal to 'ncpus'.
Signed-off-by: lijinlin <lijinlin3@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Lixiaokeng <lixiaokeng@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For correlating blktrace data with other information, it is useful to
know when the trace has been captured. Since the absolute timestamp
is contained in the blktrace file, just output it.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Link: https://lore.kernel.org/r/20210113112643.12893-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use more inclusive terminology in a couple places.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The kernel now provides PID information for TN_MESSAGE events. Print it.
Old kernels fill 0 there so the behavior is unaffected for them.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since Linux kernel commit 35fe6d763229 "block: use standard blktrace API
to output cgroup info for debug notes" the kernel can pass
__BLK_TA_CGROUP flag in the action field of generated events. Teach
iowatcher to ignore this information.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Use blktrace_api.h header instead of redefining the constants once more
in blkparse.c.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Since Linux kernel commit 35fe6d763229 "block: use standard blktrace API
to output cgroup info for debug notes" the kernel can pass
__BLK_TA_CGROUP flag in the action field of generated events. blkparse
does not account for this, so it gets confused by such events and
either ignores them or misreports them. Teach blkparse how to properly
process events with the __BLK_TA_CGROUP flag.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
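A rough sketch of what "processing the flag" can look like: test the bit, then strip it before comparing the action code. The constant values below are assumptions based on my reading of blktrace_api.h (cgroup flag at bit 8, action code in the low 16 bits), and the function names are hypothetical; check the header before relying on either.

```c
#include <stdint.h>

/* Assumed values; stand-ins for the definitions in blktrace_api.h. */
#define SKETCH_TA_CGROUP	(1u << 8)	/* stands in for __BLK_TA_CGROUP */
#define SKETCH_TA_MASK		0xffffu		/* low 16 bits hold the action */

/* Return 1 if the event carries cgroup information, 0 otherwise. */
static int has_cgroup_info(uint32_t action)
{
	return (action & SKETCH_TA_CGROUP) != 0;
}

/* Return the plain action code with the cgroup flag removed. */
static uint32_t base_action(uint32_t action)
{
	return action & SKETCH_TA_MASK & ~SKETCH_TA_CGROUP;
}
```

A parser that switches on `base_action()` instead of the raw low bits sees the same action code whether or not the kernel set the cgroup flag.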
|
|
When a split io completes, the sector and length of the completion event refer
to the last part of the original request. This is in conflict with the
blkparse manual page, makes the blkparse output difficult to read, and leads to
incorrect statistics. Fix up the sector and length of split completion events
to match the original request.
To achieve that, slightly extend the existing event tracking infrastructure to
track all parts of a split request. We could almost get by tracking only the
last part of a split, but that wouldn't quite work correctly for splits of
splits.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Currently, event tracking timestamps aren't initialized at all even though some
places in the code assume that a value of 0 indicates 'undefined'. However, 0
is the timestamp of the first event, so use -1ULL for 'undefined' instead.
In addition, make sure timestamps are only initialized once, and always check
if timestamps are defined before using them.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
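The sentinel idea above can be sketched as follows. The structure and function names are hypothetical and only illustrate the three rules in the message: initialize to an "undefined" value that cannot collide with timestamp 0, set each timestamp once, and check before use.

```c
#include <stdint.h>

#define TS_UNDEFINED UINT64_MAX	/* same value as -1ULL for a 64-bit type */

struct track_times {		/* hypothetical tracking record */
	uint64_t queue_time;
	uint64_t completion_time;
};

static void track_times_init(struct track_times *t)
{
	t->queue_time = TS_UNDEFINED;
	t->completion_time = TS_UNDEFINED;
}

/* Record the queue time only the first time it is seen. */
static void track_set_queue_time(struct track_times *t, uint64_t now)
{
	if (t->queue_time == TS_UNDEFINED)
		t->queue_time = now;
}

/* Store the queue-to-completion latency; -1 if a timestamp is undefined. */
static int track_q2c(const struct track_times *t, uint64_t *latency)
{
	if (t->queue_time == TS_UNDEFINED ||
	    t->completion_time == TS_UNDEFINED)
		return -1;
	*latency = t->completion_time - t->queue_time;
	return 0;
}
```

With 0 as the sentinel, an event queued at the very start of the trace would be indistinguishable from "never queued"; with the all-ones value it is not.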
|
|
Fix queue to completion tracking on devices other than md/dm: without this fix,
enabling tracking with the -t option on a non-md/dm device leads to "complete
not found" errors.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
For some reason, dev in struct per_dev_info isn't set in the log_track_
functions, and so the error messages report (0,0) as the device. Fix by using
device in struct blk_io_trace instead.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
to automatically handle close()
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Comparison to None should be 'expr is None'
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Do not use `len(SEQUENCE)` to determine if a sequence is empty
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Unnecessary "elif" after "return"
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Comparison to None should be 'expr is None'
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Do not use `len(SEQUENCE)` to determine if a sequence is empty
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
C0411: standard import "import getopt, glob, os, sys" should be placed
before "import six"
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
handle close()
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Vincent Legoll <vincent.legoll@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
rbtree.c is used by both binaries. It is possible that when make -C btt
is invoked rbtree.o does not exist yet, but is already scheduled by the
compilation of blkiomon. That could result in recompiling rbtree.o again
for btt/btt.
In that case, at install time, make will recompile blkiomon, which can
fail in Gentoo because the CC variable is not overridden by the ebuild
script at install time (see https://bugs.gentoo.org/705594).
Add a dependency on SUBDIRS to wait for all binaries in . to be compiled.
This guarantees that rbtree.o exists.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The functionality of printing out absolute timestamps has been
implemented in code but not documented in doc/blktrace.tex.
Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This patch fixes the wrong absolute timestamps when blkparse reads
data from files.
The blkparse command prints out wrong timestamps if all the following
conditions are met,
* The blkparse command reads data from files created by blktrace.
* "z" format option is set as OUTPUT DESCRIPTION.
ex.) blkparse xxx.blktrace.0 -f "%z\n"
* start_timestamp(=blktrace command started) != genesis_time(=first
I/O traced)
When blkparse reads data from pipe instead, it yields correct
timestamps.
The root cause of this issue comes from the fact that the time
difference between start_timestamp and genesis_time is not added when
blkparse reads data from files. When blkparse reads data from pipe,
the time-difference is added through find_genesis() function.
The following test cases show the contradictions in absolute
timestamps. Step 4 also shows that the issue is fixed by the
blkparse command with the proposed patch.
* Step 1: After invoking blktrace command, test I/O traffic was
generated by dd command as follows,
# date +%Y%m%d_%H%M%S_%N; dd if=/dev/sda3 of=/dev/null count=1 iflag=direct
20190919_092726_077032490
1+0 records in
1+0 records out
512 bytes copied, 0.00122329 s, 419 kB/s
The timestamp was recorded just before executing dd command. The
test I/O would have been traced right after 09:27:26.077032490 .
* Step 2: The blkparse command reads data from "pipe".
$ cat test.blktrace.* | blkparse - -f "%T.%t %z %C\n"
0.000000000 09:27:22.427592 kworker/0:0
0.000002080 09:27:22.427594 kworker/0:0
.
.
3.652263118 09:27:26.079855 dd
3.652265818 09:27:26.079857 dd
3.652274742 09:27:26.079866 dd
3.652277266 09:27:26.079869 dd
The first I/O by dd command showed the relative timestamp as
3.652263118 and the absolute timestamp as 09:27:26.079855, which is
right after the timestamp shown in the Step 1.
* Step 3: The blkparse command reads from the trace "file" created in
the Step 1.
$ blkparse test -f "%T.%t %z %C\n"
Input file test.blktrace.0 added
Input file test.blktrace.1 added
Input file test.blktrace.2 added
Input file test.blktrace.3 added
0.000000000 09:27:21.187304 kworker/0:0
0.000002080 09:27:21.187306 kworker/0:0
.
.
3.652263118 09:27:24.839567 dd
3.652265818 09:27:24.839570 dd
3.652274742 09:27:24.839578 dd
3.652277266 09:27:24.839581 dd
In the previous step (Step 2), the data was passed via pipe. In this
case, the blkparse command reads data from the same file, instead.
The first I/O by dd command showed the relative timestamp as
3.652263118 and the absolute timestamp as 09:27:24.839567, which is
a few seconds earlier than the absolute timestamp recorded in the
Step 1. The order of events and the absolute timestamps contradict.
* Step 4: The blkparse command with the proposed patch
(./blkparse_with_patch) reads data from the trace file created in
Step 1.
$ ./blkparse_with_patch test -f "%T.%t %z %C\n"
Input file test.blktrace.0 added
Input file test.blktrace.1 added
Input file test.blktrace.2 added
Input file test.blktrace.3 added
0.000000000 09:27:22.427592 kworker/0:0
0.000002080 09:27:22.427594 kworker/0:0
.
.
3.652263118 09:27:26.079855 dd
3.652265818 09:27:26.079857 dd
3.652274742 09:27:26.079866 dd
3.652277266 09:27:26.079869 dd
In this case, the absolute timestamps showed the same values as
in Step 2 (the case with pipe).
The time gap between genesis_time and start_timestamp is
corrected even when blkparse reads data from files.
Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
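A minimal sketch of the correction described above, assuming both trace-clock values are in nanoseconds and the wall-clock start is a struct timespec. The function name is hypothetical and this is not the blkparse implementation; it only shows shifting the absolute start time by (genesis_time - start_timestamp) with carry handling.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Hypothetical helper: shift the wall-clock start time by the gap
 * between the first traced event (genesis_ns) and the moment tracing
 * started (start_ns), both read from the trace clock in nanoseconds.
 */
static void shift_abs_start(struct timespec *abs_start,
			    uint64_t genesis_ns, uint64_t start_ns)
{
	int64_t delta = (int64_t)(genesis_ns - start_ns);
	int64_t nsec = (int64_t)abs_start->tv_nsec + delta % NSEC_PER_SEC;

	abs_start->tv_sec += delta / NSEC_PER_SEC;
	if (nsec < 0) {			/* borrow one second */
		nsec += NSEC_PER_SEC;
		abs_start->tv_sec -= 1;
	} else if (nsec >= NSEC_PER_SEC) {	/* carry one second */
		nsec -= NSEC_PER_SEC;
		abs_start->tv_sec += 1;
	}
	abs_start->tv_nsec = nsec;
}
```

In the outputs quoted above, the file case starts at 09:27:21.187304 and the pipe case at 09:27:22.427592, so the gap being applied is 1.240288 seconds.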
|
|
find_genesis() function has code to correct abs_start_time, which is
later used to calculate the absolute timestamps of each traced
records.
Put this code in a separate function, so that it can be used later by
the blkparse code. No functional change.
Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The functionality of printing out absolute timestamps has been
implemented in code but not documented in man pages.
When comparing the timings of related events with block I/O traces,
the absolute timestamps play a key role. I think that documenting
this might be beneficial to blktrace users.
The related commit was done in 2006 as follows,
> commit 7bd4fd0a4fca645bb50a641afac1e460a4e32dfd
> Author: Olaf Kirch <okir@lst.de>
> Date: Fri Dec 1 10:34:11 2006 +0100
>
> [PATCH] Add timestamp support
>
> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
>
URL of the above patch,
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git/commit/?id=7bd4fd0a4fca645bb50a641afac1e460a4e32dfd
Acked-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Hiroaki Mihara <hmihara@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit dd093eb1c48e ("Fix warnings on newer gcc") moved string buffers holding
device names during map file parse stage to stack. However, only pointers to
them are being stored in the allocated "struct map_dev" structure. These
pointers are invalid outside of scope of this function and in a different
thread context. Also "release_map_devs" function still tries to "free" them
later as if they were allocated on the heap.
Move the buffers back to the heap by instructing "fscanf" to allocate them
while parsing the file.
Alternatively, we could redefine the "struct map_dev" to include the whole
buffers instead of just pointers to them and free them as part of releasing the
whole "struct map_dev".
Fixes: dd093eb1c48e ("Fix warnings on newer gcc")
Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
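One way to have the scanf family allocate the strings is the POSIX.1-2008 `m` assignment-allocation modifier. The sketch below uses sscanf() on a single line rather than fscanf() on a file, and the structure and function names are hypothetical; whether the commit uses exactly this modifier is an assumption.

```c
#include <stdio.h>
#include <stdlib.h>

struct map_dev_sketch {		/* hypothetical stand-in for struct map_dev */
	char *from_dev;
	char *to_dev;
};

/*
 * Parse "from to" out of line. With the 'm' modifier sscanf() malloc()s
 * each string, so the pointers stay valid after this function returns
 * and can later be passed to free().
 * Returns 0 on success, -1 on a malformed line.
 */
static int parse_map_line(const char *line, struct map_dev_sketch *md)
{
	md->from_dev = NULL;
	md->to_dev = NULL;
	if (sscanf(line, "%ms %ms", &md->from_dev, &md->to_dev) != 2) {
		free(md->from_dev);	/* free(NULL) is a no-op */
		free(md->to_dev);
		md->from_dev = md->to_dev = NULL;
		return -1;
	}
	return 0;
}
```

Because the strings are heap-allocated, a release function that calls free() on them is correct again, which is the mismatch the commit message points out.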
|
|
Displays each program's data sorted by program name or by I/O event, such as
Queued, Read, Write and Complete. When -S is specified, -s is ignored.
The capital letters Q, R, W, C sort by KB; the lowercase q, r, w, c sort by number of I/Os.
N sorts programs by name, the same as -s.
To sort programs by how much data they queued, use:
blkparse -i sda.blktrace. -q -S Q -o sda.parse
Signed-off-by: Weiping Zhang <zhangweiping@didiglobal.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
iowatcher currently always spawns 8 rsvg-convert processes, no matter
how many CPUs a system has. I did some limited testing of different
numbers of rsvg-convert processes. Here are the results:
8 processes:
real 4m2.194s
user 23m36.665s
sys 0m38.523s
20 processes:
real 2m28.935s
user 24m51.817s
sys 0m49.227s
40 processes:
real 2m28.150s
user 24m56.994s
sys 0m49.621s
Note that this is the time it takes for a full run of iowatcher -- I
didn't separate out just the rsvg-convert portion.
Given the above results, it seems like a reasonable thing to spawn one
rsvg-convert process per cpu.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Hi,
Bryan Gurney reported iowatcher taking a *really* long time to generate
a movie for 16GiB worth of trace data. I took a look, and the io hash
was growing without bounds. The reason was that the I/O pattern looks
like this:
259,0 5 8 0.000208501 31708 A W 5435592 + 8 <- (259,1) 5433544
259,1 5 9 0.000209537 31708 Q W 5435592 + 8 [kvdo30:bioQ0]
259,1 5 10 0.000209880 31708 G W 5435592 + 8 [kvdo30:bioQ0]
259,0 5 11 0.000211064 31708 A W 5435600 + 8 <- (259,1) 5433552
259,1 5 12 0.000211347 31708 Q W 5435600 + 8 [kvdo30:bioQ0]
259,1 5 13 0.000212957 31708 M W 5435600 + 8 [kvdo30:bioQ0]
259,0 5 14 0.000213379 31708 A W 5435608 + 8 <- (259,1) 5433560
259,1 5 15 0.000213629 31708 Q W 5435608 + 8 [kvdo30:bioQ0]
259,1 5 16 0.000213937 31708 M W 5435608 + 8 [kvdo30:bioQ0]
...
259,1 5 107 0.000246274 31708 D W 5435592 + 256 [kvdo30:bioQ0]
For each of those Q events, an entry was created in the io_hash. Then,
upon I/O completion, only the first event (with the right starting
sector) was removed! The runtime overhead of just iterating the hash
chains was enormous.
The solution is to simply ignore the Q events, so long as there are D
events in the trace. If there are no D events, then go ahead and hash
the Q events as before. I'm hoping that if we only have Q and C, that
they will actually be aligned. If that's an incorrect assumption, we
could account merges in an rbtree. I'll defer that work until someone
can show me blktrace data that needs it.
The comments should be self explanatory. Review would be appreciated
as the code isn't well documented, and I don't know if I'm missing some
hidden assumption about the data.
Before applying this patch, iowatcher would take more than 12 hours to
complete. After the patch:
real 9m44.476s
user 41m35.426s
sys 3m29.106s
'nuf said.
Cheers,
Jeff
Reviewed-by: Chris Mason <clm@fb.com>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Many distributions are moving to python3 by default. Here's
an attempt to make the python scripts in blktrace python3-ready.
Most of this was done with automated tools. I hand fixed some
space-vs tab issues, and cast an array index to integer. It
passes rudimentary testing when run under python2.7 as well
as python3.
This doesn't do anything with the shebangs, it leaves them both
invoking whatever "env python" coughs up on the system.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Herbo Zhang reports:
I found a bug in blktrace/btt/devmap.c. The code is just as follows:
https://git.kernel.org/pub/scm/linux/kernel/git/axboe/blktrace.git/tree/btt/devmap.c?id=8349ad2f2d19422a6241f94ea84d696b21de4757
struct devmap {
struct list_head head;
char device[32], devno[32]; // #1
};
LIST_HEAD(all_devmaps);
static int dev_map_add(char *line)
{
struct devmap *dmp;
if (strstr(line, "Device") != NULL)
return 1;
dmp = malloc(sizeof(struct devmap));
if (sscanf(line, "%s %s", dmp->device, dmp->devno) != 2) { //#2
free(dmp);
return 1;
}
list_add_tail(&dmp->head, &all_devmaps);
return 0;
}
int dev_map_read(char *fname)
{
char line[256]; // #3
FILE *fp = my_fopen(fname, "r");
if (!fp) {
perror(fname);
return 1;
}
while (fscanf(fp, "%255[a-zA-Z0-9 :.,/_-]\n", line) == 1) {
if (dev_map_add(line))
break;
}
fclose(fp);
return 0;
}
The line buffer holds 256 bytes, but dmp->device and dmp->devno hold
only 32 bytes each. Strings longer than 32 bytes can be written into
dmp->device and dmp->devno, overflowing them.
We can trigger this bug as follows:
$ python -c "print 'A'*256" > ./test
$ btt -M ./test
*** Error in btt': free(): invalid next size (fast): 0x000055ad7349b250 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f7f158ce7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x7fe0a)[0x7f7f158d6e0a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f7f158da98c]
btt(+0x32e0)[0x55ad7306f2e0]
btt(+0x2c5f)[0x55ad7306ec5f]
btt(+0x251f)[0x55ad7306e51f]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7f7f15877830]
btt(+0x26b9)[0x55ad7306e6b9]
======= Memory map: ========
55ad7306c000-55ad7307f000 r-xp 00000000 08:14 3698139
/usr/bin/btt
55ad7327e000-55ad7327f000 r--p 00012000 08:14 3698139
/usr/bin/btt
55ad7327f000-55ad73280000 rw-p 00013000 08:14 3698139
/usr/bin/btt
55ad73280000-55ad73285000 rw-p 00000000 00:00 0
55ad7349a000-55ad734bb000 rw-p 00000000 00:00 0
[heap]
7f7f10000000-7f7f10021000 rw-p 00000000 00:00 0
7f7f10021000-7f7f14000000 ---p 00000000 00:00 0
7f7f15640000-7f7f15656000 r-xp 00000000 08:14 14942237
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7f15656000-7f7f15855000 ---p 00016000 08:14 14942237
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7f15855000-7f7f15856000 r--p 00015000 08:14 14942237
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7f15856000-7f7f15857000 rw-p 00016000 08:14 14942237
/lib/x86_64-linux-gnu/libgcc_s.so.1
7f7f15857000-7f7f15a16000 r-xp 00000000 08:14 14948477
/lib/x86_64-linux-gnu/libc-2.23.so
7f7f15a16000-7f7f15c16000 ---p 001bf000 08:14 14948477
/lib/x86_64-linux-gnu/libc-2.23.so
7f7f15c16000-7f7f15c1a000 r--p 001bf000 08:14 14948477
/lib/x86_64-linux-gnu/libc-2.23.so
7f7f15c1a000-7f7f15c1c000 rw-p 001c3000 08:14 14948477
/lib/x86_64-linux-gnu/libc-2.23.so
7f7f15c1c000-7f7f15c20000 rw-p 00000000 00:00 0
7f7f15c20000-7f7f15c46000 r-xp 00000000 08:14 14948478
/lib/x86_64-linux-gnu/ld-2.23.so
7f7f15e16000-7f7f15e19000 rw-p 00000000 00:00 0
7f7f15e42000-7f7f15e45000 rw-p 00000000 00:00 0
7f7f15e45000-7f7f15e46000 r--p 00025000 08:14 14948478
/lib/x86_64-linux-gnu/ld-2.23.so
7f7f15e46000-7f7f15e47000 rw-p 00026000 08:14 14948478
/lib/x86_64-linux-gnu/ld-2.23.so
7f7f15e47000-7f7f15e48000 rw-p 00000000 00:00 0
7ffdebe5c000-7ffdebe7d000 rw-p 00000000 00:00 0
[stack]
7ffdebebc000-7ffdebebe000 r--p 00000000 00:00 0
[vvar]
7ffdebebe000-7ffdebec0000 r-xp 00000000 00:00 0
[vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0
[vsyscall]
[1] 6272 abort btt -M test
Signed-off-by: Jens Axboe <axboe@kernel.dk>
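A common way to close this kind of overflow is to put explicit field widths in the sscanf() format. The sketch below is an illustration under that assumption, with hypothetical names; it is not necessarily the change that was committed.

```c
#include <stdio.h>

struct devmap_sketch {		/* hypothetical, mirrors the two buffers */
	char device[32], devno[32];
};

/*
 * A width of 31 leaves room for the terminating NUL in each 32-byte
 * buffer, so sscanf() can never write past either array.
 * Returns 0 when both fields were parsed, 1 otherwise.
 */
static int devmap_parse(const char *line, struct devmap_sketch *dmp)
{
	if (sscanf(line, "%31s %31s", dmp->device, dmp->devno) != 2)
		return 1;
	return 0;
}
```

Note that an over-long token is not rejected here: its first 31 characters land in `device` and the next 31 in `devno`. A stricter parser might treat such a line as malformed.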
|
|
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Remove the duplicated entry 'M' from the blkparse man page.
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Weiping Zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we run blktrace on the same device twice, the second instance fails
the ioctl(BLKTRACESETUP) and then calls __stop_tracer, which causes
the first blktrace to fail to access its debugfs entries. Add a check
to handle this case and avoid stopping the tracer unconditionally.
Signed-off-by: weiping zhang <zhangweiping@didichuxing.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When building in parallel, the btreplay/btrecord and btreplay/btreplay
targets cause make to kick off two jobs for `make -C btreplay` and they
sometimes end up clobbering each other. We could fix this by making one
a dependency of the other, but it's a bit cleaner to refactor things to
be based on subdirs. This way changes in subdirs also get noticed:
$ touch btreplay/*.[ch]
$ make
<btreplay is now correctly updated>
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Keep scanning the tree for overlapping IO otherwise Q2G and process
traces will be incorrect.
Let's assume we have 2 IOs:
  A                                       A+a
  |---------------------------------------|
          B                 B+b
          |-----------------|
In the red/black tree we have:
           o -> [A, A+a]
          / \
      left   right
        /     \
  [...]o       o -> [B, B+b]
With the current code, we would not be able to find [B, B+b] in the tree:
B is greater than A, so we won't go left.
B+b is smaller than A+a, so we won't go right either.
When we have a [X, X+x] IO to look for, we need to check the right
subtree when either:
X+x >= A+a (for merged IO)
or
X > A (for overlapping IO)
TEST=Check with a trace with overlapping IO: Q2C and Q2G are expected.
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
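The scenario above can be reproduced with a small sketch. It uses a plain binary tree ordered by start sector instead of btt's red/black tree, the names are hypothetical, and the right-descent test (query end at or beyond the node's start) is deliberately broader than the two conditions listed in the message, so it covers both of them.

```c
#include <stddef.h>
#include <stdint.h>

struct io_node {			/* hypothetical stand-in for a tree entry */
	uint64_t start, end;		/* sector range [start, end] */
	struct io_node *left, *right;	/* left: smaller start, right: equal or larger */
};

/* Count the nodes whose range lies entirely inside [s, e]. */
static int count_contained(const struct io_node *n, uint64_t s, uint64_t e)
{
	int found = 0;

	if (!n)
		return 0;
	if (s <= n->start && n->end <= e)
		found = 1;
	/* Smaller start sectors can only match if the query starts earlier. */
	if (s < n->start)
		found += count_contained(n->left, s, e);
	/* Larger start sectors can match whenever the query reaches them. */
	if (e >= n->start)
		found += count_contained(n->right, s, e);
	return found;
}
```

With [A, A+a] = [10, 100] at the root and [B, B+b] = [20, 40] as its right child, a lookup of [20, 40] descends right and finds the nested IO, which is the case the old search missed.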
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If we fail doing the BLKTRACESETUP ioctl, blktrace still marches on
and sets up the rest. This results in errors like the below:
blktrace /dev/sdf
BLKTRACESETUP(2) /dev/sdf failed: 5/Input/output error
Thread 1 failed open /sys/kernel/debug/block/(null)/trace1: 2/No such file or directory
Thread 3 failed open /sys/kernel/debug/block/(null)/trace3: 2/No such file or directory
Thread 2 failed open /sys/kernel/debug/block/(null)/trace2: 2/No such file or directory
[...]
FAILED to start thread on CPU 0: 1/Operation not permitted
FAILED to start thread on CPU 1: 1/Operation not permitted
FAILED to start thread on CPU 2: 1/Operation not permitted
and blktrace continues to run, though it can't do anything in this
state.
If the ioctl setup fails, just abort.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
When CPU number space is sparse, we don't start threads for non-existent
CPUs. As a result, there are no output files created for these CPUs
which confuses tools like blkparse which expect that CPU numbers are
contiguous. Create fake empty files for non-existent CPUs so that other
tools don't have to bother.
Note that in network mode, the server will create all files in the range
0..max_cpus automatically.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
We would like to generate the output file name without having a corresponding
iop structure. Reorganize the function to allow that. Also fix a couple of
possible overflows in file name generation while we are modifying
the code anyway.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
On some machines CPU numbers do not form a contiguous interval. In such
cases blktrace will fail to start threads for missing CPUs and exit
effectively rendering itself unusable.
Add support into blktrace to handle systems with sparse CPU numbers.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Some C libraries (notably uClibc) have the posix_spawn*() functions in
librt, so let's link iowatcher with -lrt.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
An earlier commit:
fb7f8667 blktrace: disable kill option - take 2
removed the "-k" option documentation, but left
it in the synopsis.
This is a bit unusual and unhelpful and probably
unintended; remove it from the synopsis as well.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Proper graph name is queue-depth, not queue_depth.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Trace label isn't properly separated with space from suffix (Read /
Write). Fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When user specifies trace files directly via -t option, it doesn't make
sense to prepend blktrace destination directory to them (it is
especially confusing if you specify absolute path names with -t option
and this logic breaks the path names). So avoid that.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently btt keeps the original IO in its RB-tree even if it sees new
IO that is beginning at the same sector. However such IO most likely
means that we have just lost the completion event for the IO that is
still in the tree. So in such case replacing the IO in RB-tree makes
more sense to avoid bogus IOs being reported as taking huge amount of
time.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently queue depth and latency graphs are generated from ISSUE and
COMPLETE events. For traces which miss the ISSUE events (e.g. from
device mapper) use QUEUE events instead. The result won't be as great
but it still conveys some useful information.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
When parsing blktrace data, process notify events even outside the
specified interval. This way we can learn about time stamps, process
names etc.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
So far we used maximum of the first trace for the maximum range of the
queue depth graph. Use maximum over all traces similarly as for other
line graphs.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Use the maximum of the rolling average as the upper end of the range for the
line graph to make better use of the available space in the plot.
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Just replace the malloc/memset with a calloc().
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Using __DATE__ and __TIME__ will break reproducible builds. The
resulting binary will change with each rebuild even if the source and
toolchain are identical.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
is_reap_done() must also check that SIGINT or SIGTERM have come, or
we hang forever with such backtraces after Ctrl-C:
(gdb) thr a a bt
Thread 3 (Thread 0x7fbff8ff9700 (LWP 12607)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402698 in replay_rec () at btreplay.c:1035
#2 0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x7fbfea7fc700 (LWP 12611)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402698 in replay_rec () at btreplay.c:1035
#2 0x00007fc001fe5454 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fc001d1eecd in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7fc00282e700 (LWP 12597)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000402303 in __wait_cv () at btreplay.c:413
#2 0x0000000000401ae8 in main () at btreplay.c:426
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
getpid() returns the PID of the process, but at least a TID must be provided.
If zero is passed, the calling thread is used, which is
exactly what is needed.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
The size should be provided, not the number of CPUs.
Signed-off-by: Roman Pen <r.peniaev@gmail.com>
Cc: Jens Axboe <axboe@fb.com>
Cc: <linux-btrace@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
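The two affinity points above (pass 0 to address the calling thread, and pass a size rather than a CPU count) can be sketched together with the glibc CPU_*_S macros. The function name is hypothetical, and which exact call each commit touched is not stated in the messages, so treat this as an illustration of the API rules only.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>

/*
 * Pin the calling thread to one CPU. ncpus is the number of CPUs the
 * set must be able to describe. Returns 0 on success, -1 on error.
 */
static int pin_self_to_cpu(int cpu, int ncpus)
{
	size_t size = CPU_ALLOC_SIZE(ncpus);	/* size in bytes, not a CPU count */
	cpu_set_t *set = CPU_ALLOC(ncpus);
	int ret;

	if (!set)
		return -1;
	CPU_ZERO_S(size, set);		/* takes the byte size */
	CPU_SET_S(cpu, size, set);
	ret = sched_setaffinity(0, size, set);	/* pid 0: the calling thread */
	CPU_FREE(set);
	return ret;
}
```

Passing the CPU count where the byte size is expected makes CPU_ZERO_S() clear more memory than was allocated once the count exceeds the allocation size.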
|
|
Currently, blktrace uses _SC_NPROCESSORS_CONF to find out the number of
CPUs. This is a problem, because if you reduce the number of online
CPUs by passing kernel parameter maxcpus, then blktrace fails to start
with the error:
FAILED to start thread on CPU 4: 22/Invalid argument
FAILED to start thread on CPU 5: 22/Invalid argument
...
The attached patch fixes it to use _SC_NPROCESSORS_ONLN.
Signed-off-by: Jens Axboe <axboe@fb.com>
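For reference, the two sysconf() queries differ as sketched below. The wrapper names are hypothetical, and falling back to 1 on failure is a choice made for this sketch only.

```c
#include <unistd.h>

/* Number of CPUs currently online, or 1 if the query fails. */
static long online_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_ONLN);

	return n > 0 ? n : 1;
}

/* Number of CPUs configured, which may exceed the online count. */
static long configured_cpus(void)
{
	long n = sysconf(_SC_NPROCESSORS_CONF);

	return n > 0 ? n : 1;
}
```

Starting one tracer thread per configured CPU fails for every CPU that is configured but not online, which matches the "Invalid argument" errors quoted above.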
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Avoids the build failures when sys/types.h does not get included
indirectly through other headers.
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
In get_ncpus, we default to using 4096 CPUs if _SC_NPROCESSORS_CONF isn't
enabled. If that is insufficient, sched_getaffinity will fail and we
retry after doubling the size of the cpu_set_t allocation. There's a typo
in there that means we don't actually double the size and will loop
forever allocating the same sized cpu_set_t instead.
Signed-off-by: Josef Cejka <jcejka@suse.com>
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
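The retry loop described above can be sketched like this. It is not the get_ncpus code: the function name, the small starting guess of 64, and the upper bound are assumptions for illustration. The essential line is the one that doubles the size before retrying.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdlib.h>

/*
 * Count the CPUs in the calling thread's affinity mask, growing the
 * cpu_set_t until the kernel accepts it. Returns -1 on failure.
 */
static int count_affinity_cpus(void)
{
	int nrcpus = 64;	/* deliberately small starting guess */

	while (nrcpus <= (1 << 20)) {
		size_t size = CPU_ALLOC_SIZE(nrcpus);
		cpu_set_t *set = CPU_ALLOC(nrcpus);
		int ret, saved_errno, count;

		if (!set)
			return -1;
		ret = sched_getaffinity(0, size, set);
		saved_errno = errno;
		count = ret == 0 ? CPU_COUNT_S(size, set) : -1;
		CPU_FREE(set);
		if (ret == 0)
			return count;
		if (saved_errno != EINVAL)
			return -1;
		nrcpus *= 2;	/* actually grow the set before retrying */
	}
	return -1;
}
```

Without the doubling step the loop keeps allocating a set of the same size and sched_getaffinity() keeps failing with EINVAL, which is the endless loop the message describes.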
|
|
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Kills the errors on unchecked return of system()
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Andrew says:
Here are some trivial tweaks which I found were needed or desirable while
adding iowatcher to the blktrace packaging in Fedora. They improve the
integration of iowatcher into the tree and reduce duplication of docs.
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
Conflicts:
iowatcher/Makefile
|
|
Merge the requirements bits of iowatcher/README into README
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
We were setting C=gcc instead of CC=gcc, and using -O0. Fix both.
Signed-off-by: Chris Mason <clm@fb.com>
|
|
This README is getting out-of-date and its contents are duplicated in
the iowatcher manpage which is up-to-date, so remove it to reduce
duplication of effort.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
iowatcher's manpage wasn't being installed with the other manpages so
add it to the doc directory.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Bump it up to a full 1.1 since we now include iowatcher.
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
|
|
Signed-off-by: Chris Mason <clm@fb.com>
|
|
Fix an unchecked strcpy and strcat in plot_io_movie():
$ ./iowatcher -t foo --movie -o foo.ogv -l $(printf 'x%.0s' {1..300})
[...]
*** buffer overflow detected ***: ./iowatcher terminated
There was also very similar code in plot_io() so a new function
plot_io_legend() was added to factor out the common string building code
and replace the buggy code with asprintf().
Also add a closedir() call to an error path in traces_list() to plug a
resource leak and make iowatcher Coverity-clean (ignoring some
false-positives).
Signed-off-by: Andrew Price <anprice@redhat.com>
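The commit uses asprintf() for the string building. As a rough illustration of the same idea in standard C, the sketch below sizes the buffer with snprintf() first; the function name and the "prefix label" format are assumptions, not the plot_io_legend() code:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build "<prefix> <label>" in a buffer sized to fit,
 * so an arbitrarily long label cannot overflow a fixed array.
 * The caller frees the result.
 */
char *build_legend(const char *prefix, const char *label)
{
	int len;
	char *buf;

	assert(prefix && label);
	len = snprintf(NULL, 0, "%s %s", prefix, label);	/* measure first */
	if (len < 0)
		return NULL;
	buf = malloc((size_t)len + 1);
	if (buf)
		snprintf(buf, (size_t)len + 1, "%s %s", prefix, label);
	return buf;
}
```

With this shape a 300-character label, as in the reproducer above, simply yields a longer heap string.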
|
|
Adding -Wmissing-prototypes showed some functions could be made static
and my 'findunused' script showed some functions weren't being called.
This patch was tested by building from scratch and running with various
combinations of options.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Bring the man page and usage string up-to-date with the new -p behaviour
and improve the formatting and content of the man page.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
For consistency and deduplication, use run_program in start_mpstat. Add
the ability to pass a path to run_program, which will be opened in the
spawned process and used as stdout, in order to capture mpstat output.
This fixes a tricky descriptor leak in start_mpstat which could have
caused a race condition if it was fixed with close().
Some output formatting tweaks have also been added and a bug from a
previous patch, where tracers were killed immediately when -p wasn't
specified, has been fixed.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Rework start_blktrace and use run_program to launch blktrace. Move the
argv-building into the function so that it's easier to work with and
clean it up a bit. Add a signal parameter to wait_program to optionally
kill the pid with a given signal before waiting for it.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Previously the --prog option required the program-to-be-run to be
specified as a single string. This meant that shell escaping would be
lost in translation and a sub-shell would be run. Rework --prog to not
take an argument and accept the arguments left after option processing
has ended as the argv for the program-to-be-run.
As we have the program as an argv, run_program2() can now be used to run
it, and now that run_program() is no longer used we can remove it and
remove the '2' from run_program2.
New usage example:
# iowatcher -p -t foo -d /dev/sda3 sleep 10
running blktrace blktrace -b 8192 -a queue -a complete -a issue -a notify -D . -d /dev/sda3 -o foo
running 'sleep' '10'
sleep exited with 0
...
Docs have been updated accordingly.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Until now run_program2() was a replacement for system() so it always
waited for the process to end before returning. To make this function
more useful move the waiting code into a separate function and add a
mechanism to expect a specific exit code.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
(Caught by Coverity.) tf->gdd_writes and tf->gdd_reads are arrays of
pointers so update their allocations to use the correct element size.
Signed-off-by: Andrew Price <anprice@redhat.com>
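The element-size point can be sketched as follows. The helper names and the row layout are hypothetical; the only claim is that an array of pointers must be sized by the pointer type, which sizeof(*table) gives without naming the type:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate an array of 'rows' pointers, each pointing
 * at 'row_bytes' zeroed bytes. sizeof(*table) is the size of one pointer,
 * which is the element size the outer array needs.
 */
unsigned char **alloc_rows(size_t rows, size_t row_bytes)
{
	unsigned char **table;
	size_t i;

	assert(rows > 0 && row_bytes > 0);
	table = calloc(rows, sizeof(*table));
	if (!table)
		return NULL;
	for (i = 0; i < rows; i++) {
		table[i] = calloc(1, row_bytes);
		if (!table[i]) {
			while (i--)
				free(table[i]);
			free(table);
			return NULL;
		}
	}
	return table;
}

/* Release everything allocated by alloc_rows(). */
void free_rows(unsigned char **table, size_t rows)
{
	size_t i;

	for (i = 0; i < rows; i++)
		free(table[i]);
	free(table);
}
```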
|
|
plot_io_movie() was calling create_movie_temp_dir() which unnecessarily
strdup()ed a string constant leaving plot_io_movie() to free it. Replace
the strdup() with a mutable char array and get rid of the free(). Merge
the few remaining lines which create the movie dir into plot_io_movie().
Also prune a duplicate declaration of start_mpstat() in tracers.h
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Now that combine_blktrace_devs() takes a list of traces it's fairly
generic so we might as well merge blktrace_to_dump() into it. The latter
can be replaced with a call using a list with a single entry.
combine_blktrace_devs() is renamed dump_traces() because that's what it
does.
Also eradicate the big global char array 'line' that was being used in a
bunch of places along with some more unnecessary strdup()s.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
The return value of posix_spawnp() was being checked but the exit status
of the child process was being ignored. This adds checks and error
reporting based on the status that waitpid returns.
Signed-off-by: Andrew Price <anprice@redhat.com>
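A minimal sketch of checking both the spawn result and the child's exit status. The helper name and its return convention are assumptions; posix_spawnp(), waitpid() and the W* macros are the standard POSIX interfaces:

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: run argv[0] (looked up in PATH) and wait for it.
 * Returns the child's exit code, or -1 if it could not be spawned or did
 * not exit normally.
 */
int run_and_wait(char *const argv[])
{
	pid_t pid;
	int status;

	assert(argv && argv[0]);
	if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
		return -1;
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status))
		return -1;	/* killed by a signal, for example */
	return WEXITSTATUS(status);
}
```

Because each argument is its own argv element, an argument containing spaces reaches the child intact, with no shell quoting involved.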
|
|
Similar to the fix for spaces in file names in commit 5d845e3, this
patch fixes processing of directories with spaces in their names by
using posix_spawnp() to run the blkparse command instead of system(). In
doing so, combine_blktrace_devs() and match_trace() have been reworked
to use a list structure instead of doing a lot of strdup()ing and string
appending.
Also make sure that trailing slashes are removed from the directory name
before attempting to use it as the base of the .dump filename.
Update the -t entry in the manpage to mention directory behaviour, too.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
When a requeue event happens we have to decrease the number of in-flight
requests. Otherwise they drift away.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Compiler was giving some warnings about signed vs unsigned comparisons.
Although these were harmless, make seconds unsigned because they really
are.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Skip events beyond max_seconds. This not only saves CPU time but also
prevents memory corruption because not all functions were checking that
given time is in the expected range. Also remove now unnecessary checks
in the called functions.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
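The range check can be sketched like this, assuming a per-second histogram with exactly max_seconds slots. The function name and the histogram are illustrative, not iowatcher's data structures:

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: count one event in a per-second histogram of
 * 'max_seconds' slots. Returns 1 if the event was counted, 0 if skipped.
 */
int account_event(unsigned int *per_second, unsigned int max_seconds,
		  unsigned long long time_ns)
{
	unsigned long long sec = time_ns / NSEC_PER_SEC;

	assert(per_second);
	if (sec >= max_seconds)
		return 0;	/* outside the plotted range: never index the array */
	per_second[sec]++;
	return 1;
}
```

Doing the check once at the entry point is what lets the per-function checks mentioned above go away.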
|
|
If the event is out of time range that should be plotted, do not add it.
It will corrupt memory...
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
|
|
blktrace_to_dump passes filenames containing spaces to blkparse via
system() so only the first chunk of the string is taken to be the
filename by the subprocess.
This switches to using posix_spawnp() so that we can present the
filename as an element of argv and avoid iowatcher failing in these
cases.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
Check the value of cur_mini_step is sane before using it as an index to
mini_step array.
Signed-off-by: Andrew Price <anprice@redhat.com>
|
|
The length of the label option wasn't being checked before strcpy()ing
it into a char[256]. Use strncpy instead.
Signed-off-by: Andrew Price <anprice@redhat.com>
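One detail worth keeping in mind with this kind of fix: strncpy() does not terminate the destination when the source is as long as the buffer. A minimal sketch, with a hypothetical helper name and the 256-byte size taken from the message above:

```c
#include <assert.h>
#include <string.h>

#define LABEL_LEN 256	/* matches the char[256] mentioned above */

/*
 * Hypothetical helper: bounded copy into a fixed label buffer. strncpy()
 * does not write a terminator when the source fills the buffer, so one is
 * stored explicitly.
 */
void copy_label(char dst[LABEL_LEN], const char *src)
{
	assert(dst && src);
	strncpy(dst, src, LABEL_LEN - 1);
	dst[LABEL_LEN - 1] = '\0';
}
```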
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Thanks to Andrew Price for sending me the corrected version
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
in the mpstat output
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Add -C option to specify ffmpeg video codec to use.
Allow ffmpeg video codec to be specified on the command line. "libx264" is
default.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Adds a man page for iowatcher. Borrows some documentation from the
README file but covers all of the options found in main.c
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
set_gdd_bit makes sure that we don't try to set bits past the max offset
we used to allocate our gdd array.
But, it only does this when the function is first called, and the whole
byte range for the IO we're recording may go past max offset. This adds
a check to be sure we stay in the right range.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
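A rough sketch of clamping the whole byte range rather than only its start. The bitmap layout (one bit per 'unit' bytes) and all names are assumptions for illustration and do not mirror the real set_gdd_bit():

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: mark every 'unit'-sized cell touched by the byte
 * range [offset, offset + len), never going past max_offset. 'bits' must
 * hold one bit per cell below max_offset. Returns the cells marked.
 */
size_t mark_range(unsigned char *bits, uint64_t max_offset, uint64_t unit,
		  uint64_t offset, uint64_t len)
{
	uint64_t end, cell;
	size_t marked = 0;

	assert(bits && unit > 0);
	if (offset >= max_offset)
		return 0;	/* the check made when the function is entered */
	/* Clamp the end of the IO as well, not only its start. */
	end = (len > max_offset - offset) ? max_offset : offset + len;
	for (cell = offset / unit; cell * unit < end; cell++) {
		bits[cell / 8] |= (unsigned char)(1u << (cell % 8));
		marked++;
	}
	return marked;
}
```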
|
|
When queue action was missing from a trace, handling of dispatch didn't
quite get things right and crashed due to NULL pointer dereference.
Fix it.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
We use an IO hash table to keep track of the IOs in flight, and this is
used to calculate the latencies from when we issue the IO to when
we complete the IO.
But if there are no completion events, io is never removed from the hash
table. It grows very large and slows down the run.
Since we already scan all the events looking for outliers, this commit
checks for each major type of event during the scan. If there are
no completion and no issue events, we don't bother inserting things
into the hash table.
If there are no completion events, we clean up during the issue event.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
ffmpeg is not available on all distributions, so include Theora
as an option, via png2theora, if the output movie filename ends
in .ogg or .ogv
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
directory
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
The current code allocates the buffer for path with strdup, so the size
of path equals the size of blktrace_dest_dir. The code that follows then
joins it with the filename of the dump file, which overflows the buffer
and triggers an issue like the following:
$ ./iowatcher -t trace.dump -o trace.svg
Unable to find trace file ./trace.dumpY
^
Refactoring join_path a bit to fix this issue.
Cc: Liu Bo <liub.liubo@gmail.com>
Signed-off-by: Yuanhan Liu <yliu.null@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
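A minimal sketch of a join that sizes the buffer for both parts. The name join_path_sketch is hypothetical and this is not the refactored join_path from the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated "dir/file" string.
 * The caller frees the result.
 */
char *join_path_sketch(const char *dir, const char *file)
{
	size_t len;
	char *path;

	assert(dir && file);
	len = strlen(dir) + 1 + strlen(file) + 1;	/* dir + '/' + file + NUL */
	path = malloc(len);
	if (path)
		snprintf(path, len, "%s/%s", dir, file);
	return path;
}
```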
|
|
Add 'D' for blktrace destination options so that we can save trace
in the destination directory.
Signed-off-by: Liu Bo <liub.liubo@gmail.com>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
stop_tracers modifies tp->is_done and thus must signal the condition
variable tracer_wait_unblock is waiting on to monitor tp->is_done.
Not doing so might cause the tool to deadlock if stop_tracers is
called while a tracer thread is in tracer_wait_unblock.
Signed-off-by: Robert Schiele <rschiele@gmail.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
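The rule behind this fix, that whoever changes the predicate must also signal the condition variable, can be sketched as follows. The struct and function names are illustrative stand-ins, not the blktrace code:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the tracer state. */
struct tracer_state {
	pthread_mutex_t mutex;
	pthread_cond_t cond;
	int is_done;
};

/* Block until is_done is set; the loop guards against spurious wakeups. */
void wait_until_done(struct tracer_state *ts)
{
	assert(ts);
	pthread_mutex_lock(&ts->mutex);
	while (!ts->is_done)
		pthread_cond_wait(&ts->cond, &ts->mutex);
	pthread_mutex_unlock(&ts->mutex);
}

/* Set is_done under the mutex and wake every waiter. */
void mark_done(struct tracer_state *ts)
{
	assert(ts);
	pthread_mutex_lock(&ts->mutex);
	ts->is_done = 1;
	pthread_cond_broadcast(&ts->cond);	/* without this, waiters sleep on */
	pthread_mutex_unlock(&ts->mutex);
}
```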
|
|
While looking for things that need porting for Aarch64,
barrier.h from blktrace was identified. However, a deeper
look shows that this file is not actually used anymore
in blktrace.
Remove unused file to avoid future confusion.
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
|
|
Currently, bno_plot.py uses os.execvp, which does not show enough information
when the executed command is not found. For example, when gnuplot is not found
bno_plot.py shows the following messages:
Traceback (most recent call last):
File "/usr/local/bin/bno_plot.py", line 123, in <module>
os.execvp(cmd[0], cmd)
File "/usr/lib64/python2.7/os.py", line 344, in execvp
_execvpe(file, args)
File "/usr/lib64/python2.7/os.py", line 368, in _execvpe
func(file, *argrest)
OSError: [Errno 2] No such file or directory
Users can't tell what happened directly from the message.
Instead of os.execvp, this patch uses os.system, which shows the following
message when gnuplot is not found:
sh: gnuplot: command not found
Signed-off-by: Eiichi Tsukata <devel@etsukata.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Some distros have changed CPU_SETSIZE in glibc to 4096 since that matches
the NR_CPUS in the linux kernel config file. Some distros have decided to
leave CPU_SETSIZE at 1024. This is a problem if you want to run that distro
on a very large machine.
CPU_SETSIZE is used by struct cpu_set_t. This means that to deal with more
than 1024 CPUs you must use the dynamic CPU sets, which involves converting
from things like CPU_SET to CPU_SET_S.
Cc: Jens Axboe <axboe@kernel.dk>
Modified by Jens to fix the CPU_{SET,ZERO}_S pointer mixup.
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We want to run on all online processors. However, if there is a hole in the
online cpumask this won't happen. We need the number of configured processors
instead of online.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
We should use the standard methods for getting the number of cpus in the
system when they are available. It is good practice to leave the old ways in
place for people stuck on older systems.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
The current method fails once we hit the first offlined CPU. This
will correct that case. However, this still underreports the number of CPUs
if the last CPUs are offlined.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
verify_blkparse has trouble with systems larger than 512.
There is also an issue in the scanning code that causes the CPU number to be
truncated to its first two digits, i.e. CPU 542 would be read as 54.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Nathan Zimmer <nzimmer@sgi.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If a block device has many requests smaller than 1K,
blkparse ignores such requests because it accounts for each request
in Kb.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
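The rounding effect can be shown with a small sketch. Both function names are hypothetical, and the second one is only one possible way to avoid the loss; the message does not say how the actual patch does it:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Convert each request to Kb before summing: anything below 1024 bytes
 * rounds down to zero and vanishes from the total.
 */
unsigned long long total_kb_per_request(const unsigned int *bytes, size_t n)
{
	unsigned long long kb = 0;
	size_t i;

	assert(bytes || n == 0);
	for (i = 0; i < n; i++)
		kb += bytes[i] >> 10;
	return kb;
}

/* Sum in bytes and convert once at the end. */
unsigned long long total_kb_from_bytes(const unsigned int *bytes, size_t n)
{
	unsigned long long total = 0;
	size_t i;

	assert(bytes || n == 0);
	for (i = 0; i < n; i++)
		total += bytes[i];
	return total >> 10;
}
```

For four 512-byte requests the first function reports 0 Kb and the second reports 2 Kb.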
|
|
Add support for displaying different processes with different color in
the IO graph and movie.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Jan Kara's updates for xzoom and yzoom
Conflicts:
main.c
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Sometimes this is useful to see how IO scheduler or storage itself
changes the IO submitted by the application.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Ticks on x axis used integral step and fixed number of ticks. That generates
wrong results e.g. for 13s long trace with 10 ticks... Allow the code to
somewhat alter the number of ticks and also use non-integral step.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Currently we report error when we find a trace record beyond max_seconds.
When we allow user to set end of displayed period, records after the end
of period are no longer a bug so just ignore them.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Later we will add min_seconds to complement this.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
There are lots of trace actions which do not carry a sector with them (e.g.
plug, unplug, ...). Thus sector is 0 for them and that results in trimming
of outliers from below never working. Fix the problem by accounting only
Queue events in the outlier statistics.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
constant
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Short variant of --movie is -m, not -p.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
The movie mode is updated to put extra plots on
the side.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
This is incomplete, but it will catch messages from
the flash driver to find the actual chip an IO
was sent to.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
The --movie option defaults to spindle mode now,
but you can choose --movie=rect or --movie=spindle
as well.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
better.
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Chris Mason <chris.mason@fusionio.com>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In our experiments blktrace/blkparse file names encode a lot of
information about the particular experiment. We noticed that blkparse
does not work for sufficiently long file names.
The reason is that per_cpu_info->fname[] is 128 bytes long. As a result,
in the setup_file() function only part of the file name gets into ->fname[].
Then stat() fails and we exit the function. Notice that no error is
printed in this case.
In the following patch the ->fname[] size is increased to the POSIX-defined
PATH_MAX.
Signed-off-by: Vasily Tarasov <tarasov@vasily.name>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
One was a real bug: i_time was assigned twice instead of c_time (which was
left uninitialized).
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Several places using strcpy would benefit from strncpy
for safety.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
sp was being incremented w/o initialization, but thankfully
not used otherwise. Just remove it.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
We malloc'd cpu_map, and then did:
cpu_map[CPU_IDX(cpu)] |= (1UL << CPU_BIT(cpu));
... not sure how that ever worked if cpu_map was not initialized!
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
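A minimal sketch of a bitmap that starts out zeroed. The MAP_* macros mirror the shape of the CPU_IDX/CPU_BIT expression quoted above, but their definitions and the helper names are assumptions:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAP_BITS	(CHAR_BIT * sizeof(unsigned long))
#define MAP_IDX(cpu)	((cpu) / MAP_BITS)
#define MAP_BIT(cpu)	((cpu) % MAP_BITS)

/* Hypothetical helper: zeroed bitmap with room for 'max_cpus' bits. */
unsigned long *alloc_cpu_map(size_t max_cpus)
{
	assert(max_cpus > 0);
	/* calloc() zeroes the words, so |= starts from a known state. */
	return calloc((max_cpus + MAP_BITS - 1) / MAP_BITS,
		      sizeof(unsigned long));
}

/* Set the bit for 'cpu'. */
void cpu_map_set(unsigned long *map, size_t cpu)
{
	map[MAP_IDX(cpu)] |= 1UL << MAP_BIT(cpu);
}

/* Return 1 if the bit for 'cpu' is set, 0 otherwise. */
int cpu_map_test(const unsigned long *map, size_t cpu)
{
	return (int)((map[MAP_IDX(cpu)] >> MAP_BIT(cpu)) & 1UL);
}
```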
|
|
Close the file used for btt's -M argument after
processing.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
In several cases space is allocated for a filename but
not freed if open of that file fails.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
No point in malloc()ing space if we just immediately overwrite
the pointer via strdup. That'll leak some space.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The file containing the list of devices was never closed
after processing was complete.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
On this error path, pdu_buf was never freed.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
The acts[] array is only N_ACTS elements, so we should not
ever set acts[N_ACTS]
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Check for setvbuf failure.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
|
|
This patch fixes two bugs in blktrace.
1. realloc is called on a wrong memory address (glibc reports heap
corruption if the user sends the output to a pipe, for example "blktrace
/dev/sdc -o -").
2. errno 0 is actually reported if debugfs is not mounted
Mikulas
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Add FLUSH/FUA support to blktrace. As FLUSH precedes WRITE and/or
FUA follows WRITE, use the same 'F' flag for both cases and
distinguish them by their (relative) position. The end results
look like (other flags might be shown also):
- WRITE: W
- WRITE_FLUSH: FW
- WRITE_FUA: WF
- WRITE_FLUSH_FUA: FWF
Note that we reuse TC_BARRIER due to lack of bit space of act_mask.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
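The positional use of 'F' can be sketched as a small string builder. The function name and the three boolean parameters are hypothetical and stand in for whatever flag bits the real code tests:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: build the flag string for a request. 'F' before 'W'
 * means a flush precedes the write, 'F' after 'W' means FUA.
 * 'out' needs room for at least 4 bytes.
 */
void fill_write_flags(char *out, int flush, int write, int fua)
{
	int i = 0;

	assert(out);
	if (flush)
		out[i++] = 'F';
	if (write)
		out[i++] = 'W';
	if (fua)
		out[i++] = 'F';
	out[i] = '\0';
}
```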
|
|
|
|
I noticed in some traces that I was seeing summaries like the following:
Total (sde):
Reads Queued: 76, 304KiB Writes Queued: 16,384, 1,048MiB
Read Dispatches: 76, 304KiB Write Dispatches: 2,210, 1,048MiB
Reads Requeued: 0 Writes Requeued: 0
Reads Completed: 76, 304KiB Writes Completed: 2,210, 1,048MiB
Read Merges: 0, 0KiB Write Merges: 14,174, 907,136KiB
PC Reads Queued: 0, 0KiB PC Writes Queued: 0, 0KiB
PC Read Disp.: 4, 0KiB PC Write Disp.: 0, 0KiB
PC Reads Req.: 0 PC Writes Req.: 0
PC Reads Compl.: 4 PC Writes Compl.: 2,210
IO unplugs: 2,124 Timer unplugs: 0
Note how there were no PC Writes dispatched, but there were 2210
completed. It turns out to be a minor typo in the code. The attached
patch fixes the reporting for me.
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
blk_io_trace->cpu is u32, so use be32_to_cpu instead.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Currently we only check the magic number to see whether
a blktrace is valid or not, but Bill Broadley hit a case
where the CPU info was wrong, with a value of
1725552676. So in resize_cpu_info we overflow when
calculating
size = new_count * sizeof(struct per_cpu_info);
and the program either segfaults or fails with an out of
memory error. Although this is more likely a kernel
problem, blkparse shouldn't segfault because of it.
So this patch checks whether the CPU stored in the
trace is the same as that of the file; if not, it warns
and skips it.
Cc: Jens Axboe <axboe@kernel.dk>
Reported-by: Bill Broadley <bill@broadley.org>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
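The patch itself validates the CPU number against the file. As a complementary illustration only, the overflow in a "count * element size" computation can also be caught directly; the helper below is hypothetical and not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute count * elem_size, refusing results that
 * do not fit in size_t. Returns 0 on success, -1 on overflow.
 */
int checked_array_size(size_t count, size_t elem_size, size_t *out)
{
	assert(out && elem_size > 0);
	if (count > SIZE_MAX / elem_size)
		return -1;
	*out = count * elem_size;
	return 0;
}
```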
|
|
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
To help users better deal with the log message
"You have dropped events, consider using a larger buffer size (-b)",
it's helpful to list the defaults for sub buffer management, without
flags.
Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Add blkiomon, btreplay/btrecord and btreplay/btreplay to
.gitignore so that they don't show up in "Untracked files".
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
idx isn't used, so remove it.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
In 38-rc2, there is a bug in mlock which makes the mlock
call in blktrace return an error (I have sent the corresponding
patch to lkml). So when we try to interrupt blktrace
with "ctrl+c", mlock loops forever and in the end I
have to use "kill -9" to kill it and then run "blktrace -k"
to stop the tracer. I don't think that is good.
Reproducing it is simple:
Use a 38-rc kernel, and run
blktrace /dev/sdx
then press "ctrl+c"; it doesn't exit.
So this patch adds a check for tp->is_done. If
is_done is set, break out of mlock so that we don't
loop forever in it. In case of a real mlock error,
I let it retry 10 times and it should succeed
after 10 tries in case of tp->is_done. If tp isn't set
or tp->is_done isn't set, it works like the original
design.
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
When we print statistics in blkiomon, we don't consider
the situation where the device has no corresponding action. For example,
if there is no disk read during the interval, the output on my box looks
like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg nan, var nan
With the fix, now it looks like:
sizes read (bytes): num 0, min -1, max 0, sum 0, squ 0, avg 0.0, var 0.0
Cc: Martin Peschke <mpeschke@linux.vnet.ibm.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
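The nan values come from dividing by a zero sample count. A minimal sketch of the guard, assuming the usual population variance formula; the function names are hypothetical and blkiomon's own formulas may differ:

```c
#include <assert.h>

/* Hypothetical helper: mean of 'num' samples whose sum is 'sum'. */
double stat_avg(unsigned long long num, unsigned long long sum)
{
	return num ? (double)sum / (double)num : 0.0;
}

/*
 * Hypothetical helper: variance from the sample count, the sum and the
 * sum of squares, returning 0.0 instead of nan when there are no samples.
 */
double stat_var(unsigned long long num, unsigned long long sum,
		unsigned long long squ)
{
	double avg;

	if (!num)
		return 0.0;
	avg = (double)sum / (double)num;
	return (double)squ / (double)num - avg * avg;
}
```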
|
|
|
|
With the newest kernel (say 2.6.37; some older ones should have a
similar problem), some cfq messages are added to blktrace, which breaks
the old blkparse.
See a simple example:
1. blktrace /dev/sdb -o -|blkparse -i -
2. Run the following command(/dev/sdb1 is mounted at /mnt/test_dir):
dd if=/mnt/test_dir/test of=/dev/null bs=4k count=1 iflag=direct
There are only 2 lines of output there:
8,16 0 1 0.000000000 13183 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000491 13183 Q R 114759 + 8 [dd]
And even if we run a command line like:
for((i=0;i<100;i++))do dd if=/mnt/ocfs2/test of=/dev/null bs=4k count=1 iflag=direct;done
We are only given the same 2 lines of output.
The real output should look like:
8,16 0 1 0.000000000 13319 A R 114759 + 8 <- (8,17) 114696
8,16 0 2 0.000000376 13319 Q R 114759 + 8 [dd]
8,16 0 0 0.000005931 0 m N cfq13319 alloced
8,16 0 3 0.000006259 13319 G R 114759 + 8 [dd]
8,16 0 4 0.000007143 13319 P N [dd]
8,16 0 5 0.000007817 13319 I R 114759 + 8 [dd]
8,16 0 0 0.000008491 0 m N cfq13319 insert_request
8,16 0 0 0.000009029 0 m N cfq13319 add_to_rr
...
The main reason is that in show_entries_rb we test sequences every time,
but for some messages, like those from cfq, the sequence number is always
0, which makes the old sequence check refuse all the logs after it.
So only check/store the sequence number if the entry isn't a message.
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
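The idea can be sketched as follows. The struct, its is_message field and the strict "previous + 1" rule are simplifying assumptions; the real check in blkparse is more involved:

```c
#include <assert.h>

/* Hypothetical, simplified trace entry. */
struct trace_entry {
	unsigned int sequence;
	int is_message;	/* non-zero for text messages such as the cfq ones */
};

/*
 * Hypothetical helper: return 1 if the entry may be shown, 0 if it is out
 * of order. Messages carry sequence 0, so they are neither checked nor
 * recorded as the last sequence seen.
 */
int sequence_ok(unsigned int *last_seq, const struct trace_entry *e)
{
	assert(last_seq && e);
	if (e->is_message)
		return 1;
	if (e->sequence != *last_seq + 1)
		return 0;
	*last_seq = e->sequence;
	return 1;
}
```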
|
|
btreplay.c:1332: warning: comparison between signed and unsigned integer
expressions
|
|
Fixup for RH bugzilla 595628.
Document btt options:
-m (--seeks-per-second);
-X (--easy-parse-avgs).
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595623
Document btreplay option -x (--acc-factor)
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595620.
Document undocumented blktrace options.
Update the man pages.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595615.
Document blkparse options:
-A, --set-mask,
-a, --act-mask,
-D, --input-directory
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|
|
Fixup for RH bugzilla 595419.
Document blkiomon option -d (--dump-lldd).
Add drv_data mask description to blktrace man page.
Signed-off-by: Edward Shishkin <edward@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
|