pax_global_header00006660000000000000000000000064120772655420014524gustar00rootroot0000000000000052 comment=069b9565b5a4b75e6d484f269ad5d9bd02dbd17f iowatcher-1.0/000077500000000000000000000000001207726554200133515ustar00rootroot00000000000000iowatcher-1.0/COPYING000066400000000000000000000432541207726554200144140ustar00rootroot00000000000000 GNU GENERAL PUBLIC LICENSE Version 2, June 1991 Copyright (C) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Lesser General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you". Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. 3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. 4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. 5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License. 7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 8. If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. 9. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. 10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. Copyright (C) This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) year name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than `show w' and `show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here is a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. , 1 April 1989 Ty Coon, President of Vice This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. iowatcher-1.0/Makefile000066400000000000000000000012341207726554200150110ustar00rootroot00000000000000C = gcc CFLAGS = -Wall -O0 -g -W ALL_CFLAGS = $(CFLAGS) -D_GNU_SOURCE -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 PROGS = iowatcher INSTALL = install prefix = /usr/local bindir = $(prefix)/bin export prefix INSTALL ALL = $(PROGS) $(PROGS): | depend all: $(ALL) %.o: %.c $(CC) -o $*.o -c $(ALL_CFLAGS) $< iowatcher: blkparse.o plot.o main.o tracers.o mpstat.o fio.o $(CC) $(ALL_CFLAGS) -o $@ $(filter %.o,$^) -lm depend: @$(CC) -MM $(ALL_CFLAGS) *.c 1> .depend clean: -rm -f *.o $(PROGS) .depend install: all $(INSTALL) -m 755 -d $(DESTDIR)$(bindir) $(INSTALL) -m 755 $(ALL) $(DESTDIR)$(bindir) ifneq ($(wildcard .depend),) include .depend endif iowatcher-1.0/README000066400000000000000000000061421207726554200142340ustar00rootroot00000000000000iowatcher graphs the results of a blktrace run. It has a few different modes: * Graph the result of an existing blktrace * Start a new blktrace * Start a new blktrace and a benchmark run * Make a movie of the IO from a given trace (only mp4 for now) Output: iowatcher can produce either svg files or mp4 movies. Most browsers can view the svg files, or you can use rsvg-view-3 from librsvg. rsvg-convert can turn the svgs into many other formats. Building: Type make and make install. We need ffmpeg or png2theora, and librsvg to make movies, otherwise there are no dependencies. The basic options: -d controls which device you are tracing. You can only trace one device at a time for now. It is sent directly to blktrace, and only needed when you are making a new trace. -t controls the name of the blktrace file. iowatcher uses a dump from blkparse, so -t tries to guess the name of the corresponding per CPU blktrace data files if the dump file doesn't already exist. If you want more than one trace in a given graph, you can specify -t more than once. -l Sets a label in the graph for a trace file. The labels are added in the same order the trace files are added. -m Create a movie. The file format depends on the extension used in the -o filename.* option. If you specify an .ogv or .ogg extension, the result will be Ogg Theora video, if png2theora is available. If you use an .mp4 extension, the result will be an mp4 video if ffmpeg is avilable. You can use any other extension, but the end result will be an mp4. You can use --movie=spindle or --movie=rect, which changes the style of the IO mapping. -T Set a title for the graph. This goes at the top of the image. -o output filename. The default is trace.svg. iowatcher is only able to create svg for now. -r control the duration in seconds for the rolling average. iowatcher tries to smooth out bumpy graphs by averaging the current second with seconds from the past. Longer numbers here give you flatter graphs. -O add a single graph to the output. By default all the graphs are included, but with -O you get only the graphs you ask for. -O may be used more than once. -N remove a single graph from the output. This may also be used more than once. Choices for -O and -N are: io, tput, latency, queue_depth, iops, cpu-sys, cpu-io, cpu-irq, cpu-user, cpu-soft Examples: # generate graph from the existing trace.dump iowatcher -t trace.dump -o trace.svg # skip the IO graph iowatcher -t trace.dump -o trace.svg -N io # only graph tput and latency iowatcher -t trace.dump -o trace.svg -O tput -O latency # generate a graph from two runs, and label them iowatcher -t ext4.dump -t xfs.dump -l Ext4 -l XFS -o trace.svg # Run a fio benchmark and store the trace in trace.dump # add a title to the top. Use /dev/sda for blktrace iowatcher -d /dev/sda -t trace.dump -T 'Fio Benchmark' -p 'fio some_job_file' # Make a movie from an existing trace iowatcher -t trace --movie -o trace.mp4 Please email chris.mason@fusionio.com with any questions iowatcher-1.0/blkparse.c000066400000000000000000000634011207726554200153240ustar00rootroot00000000000000/* * Copyright (C) 2012 Fusion-io * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public * License v2 as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA * * Parts of this file were imported from Jens Axboe's blktrace sources (also GPL) */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "plot.h" #include "blkparse.h" #include "list.h" #include "tracers.h" #define IO_HASH_TABLE_BITS 11 #define IO_HASH_TABLE_SIZE (1 << IO_HASH_TABLE_BITS) static struct list_head io_hash_table[IO_HASH_TABLE_SIZE]; static u64 ios_in_flight = 0; #define PROCESS_HASH_TABLE_BITS 7 #define PROCESS_HASH_TABLE_SIZE (1 << PROCESS_HASH_TABLE_BITS) static struct list_head process_hash_table[PROCESS_HASH_TABLE_SIZE]; extern int plot_io_action; extern int io_per_process; static const int line_len = 1024; static char line[1024]; /* * Trace categories */ enum { BLK_TC_READ = 1 << 0, /* reads */ BLK_TC_WRITE = 1 << 1, /* writes */ BLK_TC_FLUSH = 1 << 2, /* flush */ BLK_TC_SYNC = 1 << 3, /* sync */ BLK_TC_QUEUE = 1 << 4, /* queueing/merging */ BLK_TC_REQUEUE = 1 << 5, /* requeueing */ BLK_TC_ISSUE = 1 << 6, /* issue */ BLK_TC_COMPLETE = 1 << 7, /* completions */ BLK_TC_FS = 1 << 8, /* fs requests */ BLK_TC_PC = 1 << 9, /* pc requests */ BLK_TC_NOTIFY = 1 << 10, /* special message */ BLK_TC_AHEAD = 1 << 11, /* readahead */ BLK_TC_META = 1 << 12, /* metadata */ BLK_TC_DISCARD = 1 << 13, /* discard requests */ BLK_TC_DRV_DATA = 1 << 14, /* binary driver data */ BLK_TC_FUA = 1 << 15, /* fua requests */ BLK_TC_END = 1 << 15, /* we've run out of bits! */ }; #define BLK_TC_SHIFT (16) #define BLK_TC_ACT(act) ((act) << BLK_TC_SHIFT) #define BLK_DATADIR(a) (((a) >> BLK_TC_SHIFT) & (BLK_TC_READ | BLK_TC_WRITE)) /* * Basic trace actions */ enum { __BLK_TA_QUEUE = 1, /* queued */ __BLK_TA_BACKMERGE, /* back merged to existing rq */ __BLK_TA_FRONTMERGE, /* front merge to existing rq */ __BLK_TA_GETRQ, /* allocated new request */ __BLK_TA_SLEEPRQ, /* sleeping on rq allocation */ __BLK_TA_REQUEUE, /* request requeued */ __BLK_TA_ISSUE, /* sent to driver */ __BLK_TA_COMPLETE, /* completed by driver */ __BLK_TA_PLUG, /* queue was plugged */ __BLK_TA_UNPLUG_IO, /* queue was unplugged by io */ __BLK_TA_UNPLUG_TIMER, /* queue was unplugged by timer */ __BLK_TA_INSERT, /* insert request */ __BLK_TA_SPLIT, /* bio was split */ __BLK_TA_BOUNCE, /* bio was bounced */ __BLK_TA_REMAP, /* bio was remapped */ __BLK_TA_ABORT, /* request aborted */ __BLK_TA_DRV_DATA, /* binary driver data */ }; #define BLK_TA_MASK ((1 << BLK_TC_SHIFT) - 1) /* * Notify events. */ enum blktrace_notify { __BLK_TN_PROCESS = 0, /* establish pid/name mapping */ __BLK_TN_TIMESTAMP, /* include system clock */ __BLK_TN_MESSAGE, /* Character string message */ }; /* * Trace actions in full. Additionally, read or write is masked */ #define BLK_TA_QUEUE (__BLK_TA_QUEUE | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_BACKMERGE (__BLK_TA_BACKMERGE | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_FRONTMERGE (__BLK_TA_FRONTMERGE | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_GETRQ (__BLK_TA_GETRQ | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_SLEEPRQ (__BLK_TA_SLEEPRQ | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_REQUEUE (__BLK_TA_REQUEUE | BLK_TC_ACT(BLK_TC_REQUEUE)) #define BLK_TA_ISSUE (__BLK_TA_ISSUE | BLK_TC_ACT(BLK_TC_ISSUE)) #define BLK_TA_COMPLETE (__BLK_TA_COMPLETE| BLK_TC_ACT(BLK_TC_COMPLETE)) #define BLK_TA_PLUG (__BLK_TA_PLUG | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_UNPLUG_IO (__BLK_TA_UNPLUG_IO | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_UNPLUG_TIMER (__BLK_TA_UNPLUG_TIMER | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_INSERT (__BLK_TA_INSERT | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_SPLIT (__BLK_TA_SPLIT) #define BLK_TA_BOUNCE (__BLK_TA_BOUNCE) #define BLK_TA_REMAP (__BLK_TA_REMAP | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_ABORT (__BLK_TA_ABORT | BLK_TC_ACT(BLK_TC_QUEUE)) #define BLK_TA_DRV_DATA (__BLK_TA_DRV_DATA | BLK_TC_ACT(BLK_TC_DRV_DATA)) #define BLK_TN_PROCESS (__BLK_TN_PROCESS | BLK_TC_ACT(BLK_TC_NOTIFY)) #define BLK_TN_TIMESTAMP (__BLK_TN_TIMESTAMP | BLK_TC_ACT(BLK_TC_NOTIFY)) #define BLK_TN_MESSAGE (__BLK_TN_MESSAGE | BLK_TC_ACT(BLK_TC_NOTIFY)) #define BLK_IO_TRACE_MAGIC 0x65617400 #define BLK_IO_TRACE_VERSION 0x07 /* * The trace itself */ struct blk_io_trace { __u32 magic; /* MAGIC << 8 | version */ __u32 sequence; /* event number */ __u64 time; /* in nanoseconds */ __u64 sector; /* disk offset */ __u32 bytes; /* transfer length */ __u32 action; /* what happened */ __u32 pid; /* who did it */ __u32 device; /* device identifier (dev_t) */ __u32 cpu; /* on what cpu did it happen */ __u16 error; /* completion error */ __u16 pdu_len; /* length of data after this trace */ }; struct pending_io { /* sector offset of this IO */ u64 sector; /* dev_t for this IO */ u32 device; /* time this IO was dispatched */ u64 dispatch_time; /* time this IO was finished */ u64 completion_time; struct list_head hash_list; /* process which queued this IO */ u32 pid; }; struct pid_map { struct list_head hash_list; u32 pid; int index; char name[0]; }; #define MINORBITS 20 #define MINORMASK ((1 << MINORBITS) - 1) #define SECONDS(x) ((unsigned long long)(x) / 1000000000) #define NANO_SECONDS(x) ((unsigned long long)(x) % 1000000000) #define DOUBLE_TO_NANO_ULL(d) ((unsigned long long)((d) * 1000000000)) #define CHECK_MAGIC(t) (((t)->magic & 0xffffff00) == BLK_IO_TRACE_MAGIC) void init_io_hash_table(void) { int i; struct list_head *head; for (i = 0; i < IO_HASH_TABLE_SIZE; i++) { head = io_hash_table + i; INIT_LIST_HEAD(head); } } /* taken from the kernel hash.h */ static inline u64 hash_sector(u64 val) { u64 hash = val; /* Sigh, gcc can't optimise this alone like it does for 32 bits. */ u64 n = hash; n <<= 18; hash -= n; n <<= 33; hash -= n; n <<= 3; hash += n; n <<= 3; hash -= n; n <<= 4; hash += n; n <<= 2; hash += n; /* High bits are more random, so use them. */ return hash >> (64 - IO_HASH_TABLE_BITS); } static int io_hash_table_insert(struct pending_io *ins_pio) { u64 sector = ins_pio->sector; u32 dev = ins_pio->device; int slot = hash_sector(sector); struct list_head *head; struct pending_io *pio; head = io_hash_table + slot; list_for_each_entry(pio, head, hash_list) { if (pio->sector == sector && pio->device == dev) return -EEXIST; } list_add_tail(&ins_pio->hash_list, head); return 0; } static struct pending_io *io_hash_table_search(u64 sector, u32 dev) { int slot = hash_sector(sector); struct list_head *head; struct pending_io *pio; head = io_hash_table + slot; list_for_each_entry(pio, head, hash_list) { if (pio->sector == sector && pio->device == dev) return pio; } return NULL; } static struct pending_io *hash_queued_io(struct blk_io_trace *io) { struct pending_io *pio; int ret; pio = calloc(1, sizeof(*pio)); pio->sector = io->sector; pio->device = io->device; pio->pid = io->pid; ret = io_hash_table_insert(pio); if (ret < 0) { /* crud, the IO is there already */ free(pio); return NULL; } return pio; } static struct pending_io *hash_dispatched_io(struct blk_io_trace *io) { struct pending_io *pio; pio = io_hash_table_search(io->sector, io->device); if (!pio) { pio = hash_queued_io(io); if (!pio) return NULL; } pio->dispatch_time = io->time; return pio; } static struct pending_io *hash_completed_io(struct blk_io_trace *io) { struct pending_io *pio; pio = io_hash_table_search(io->sector, io->device); if (!pio) return NULL; return pio; } void init_process_hash_table(void) { int i; struct list_head *head; for (i = 0; i < PROCESS_HASH_TABLE_SIZE; i++) { head = process_hash_table + i; INIT_LIST_HEAD(head); } } static u32 hash_pid(u32 pid) { u32 hash = pid; hash ^= pid >> 3; hash ^= pid >> 3; hash ^= pid >> 4; hash ^= pid >> 6; return (hash & (PROCESS_HASH_TABLE_SIZE - 1)); } static struct pid_map *process_hash_search(u32 pid) { int slot = hash_pid(pid); struct list_head *head; struct pid_map *pm; head = process_hash_table + slot; list_for_each_entry(pm, head, hash_list) { if (pm->pid == pid) return pm; } return NULL; } static struct pid_map *process_hash_insert(u32 pid, char *name) { int slot = hash_pid(pid); struct pid_map *pm; int old_index = 0; char buf[16]; pm = process_hash_search(pid); if (pm) { /* Entry exists and name shouldn't be changed? */ if (!name || !strcmp(name, pm->name)) return pm; list_del(&pm->hash_list); old_index = pm->index; free(pm); } if (!name) { sprintf(buf, "[%u]", pid); name = buf; } pm = malloc(sizeof(struct pid_map) + strlen(name) + 1); pm->pid = pid; pm->index = old_index; strcpy(pm->name, name); list_add_tail(&pm->hash_list, process_hash_table + slot); return pm; } static void handle_notify(struct trace *trace) { struct blk_io_trace *io = trace->io; void *payload = (char *)io + sizeof(*io); u32 two32[2]; if (io->action == BLK_TN_PROCESS) { if (io_per_process) process_hash_insert(io->pid, payload); return; } if (io->action != BLK_TN_TIMESTAMP) return; if (io->pdu_len != sizeof(two32)) return; memcpy(two32, payload, sizeof(two32)); trace->start_timestamp = io->time; trace->abs_start_time.tv_sec = two32[0]; trace->abs_start_time.tv_nsec = two32[1]; if (trace->abs_start_time.tv_nsec < 0) { trace->abs_start_time.tv_sec--; trace->abs_start_time.tv_nsec += 1000000000; } } int next_record(struct trace *trace) { int skip = trace->io->pdu_len; u64 offset; trace->cur += sizeof(*trace->io) + skip; offset = trace->cur - trace->start; if (offset >= trace->len) return 1; trace->io = (struct blk_io_trace *)trace->cur; return 0; } void first_record(struct trace *trace) { trace->cur = trace->start; trace->io = (struct blk_io_trace *)trace->cur; } int is_io_event(struct blk_io_trace *test) { char *message; if (!(test->action & BLK_TC_ACT(BLK_TC_NOTIFY))) return 1; if (test->action == BLK_TN_MESSAGE) { int len = test->pdu_len; if (len < 3) return 0; message = (char *)(test + 1); if (strncmp(message, "fio ", 4) == 0) { return 1; } } return 0; } u64 find_last_time(struct trace *trace) { char *p = trace->start + trace->len; struct blk_io_trace *test; int search_len = 0; u64 found = 0; if (trace->len < sizeof(*trace->io)) return 0; p -= sizeof(*trace->io); while (p >= trace->start) { test = (struct blk_io_trace *)p; if (CHECK_MAGIC(test) && is_io_event(test)) { u64 offset = p - trace->start; if (offset + sizeof(*test) + test->pdu_len == trace->len) { return test->time; } } p--; search_len++; if (search_len > 8192) { break; } } /* searching backwards didn't work out, we'll have to scan the file */ first_record(trace); while (1) { if (is_io_event(trace->io)) found = trace->io->time; if (next_record(trace)) break; } first_record(trace); return found; } int parse_fio_bank_message(struct trace *trace, u64 *bank_ret, u64 *offset_ret, u64 *num_banks_ret) { char *s; char *next; char *message; struct blk_io_trace *test = trace->io; int len = test->pdu_len; u64 bank; u64 offset; u64 num_banks; if (!(test->action & BLK_TC_ACT(BLK_TC_NOTIFY))) return -1; if (test->action != BLK_TN_MESSAGE) return -1; /* the message is fio rw bank offset num_banks */ if (len < 3) return -1; message = (char *)(test + 1); if (strncmp(message, "fio r ", 6) != 0) return -1; message = strndup(message, len); s = strchr(message, ' '); if (!s) goto out; s++; s = strchr(s, ' '); if (!s) goto out; bank = strtoll(s, &next, 10); if (s == next) goto out; s = next; offset = strtoll(s, &next, 10); if (s == next) goto out; s = next; num_banks = strtoll(s, &next, 10); if (s == next) goto out; *bank_ret = bank; *offset_ret = offset; *num_banks_ret = num_banks; return 0; out: free(message); return -1; } static struct dev_info *lookup_dev(struct trace *trace, struct blk_io_trace *io) { u32 dev = io->device; int i; struct dev_info *di = NULL; for (i = 0; i < trace->num_devices; i++) { if (trace->devices[i].device == dev) { di = trace->devices + i; goto found; } } i = trace->num_devices++; if (i >= MAX_DEVICES_PER_TRACE) { fprintf(stderr, "Trace contains too many devices (%d)\n", i); exit(1); } di = trace->devices + i; di->device = dev; found: return di; } static void map_devices(struct trace *trace) { struct dev_info *di; u64 found; u64 map_start = 0; int i; first_record(trace); while (1) { if (!(trace->io->action & BLK_TC_ACT(BLK_TC_NOTIFY))) { di = lookup_dev(trace, trace->io); found = trace->io->sector << 9; if (found < di->min) di->min = found; found += trace->io->bytes; if (di->max < found) di->max = found; } if (next_record(trace)) break; } first_record(trace); for (i = 0; i < trace->num_devices; i++) { di = trace->devices + i; di->map = map_start; map_start += di->max - di->min; } } u64 map_io(struct trace *trace, struct blk_io_trace *io) { struct dev_info *di = lookup_dev(trace, io); u64 val = trace->io->sector << 9; return di->map + val - di->min; } void find_extreme_offsets(struct trace *trace, u64 *min_ret, u64 *max_ret, u64 *max_bank_ret, u64 *max_offset_ret) { u64 found = 0; u64 max = 0, min = ~(u64)0; u64 max_bank = 0; u64 max_bank_offset = 0; u64 num_banks = 0; map_devices(trace); first_record(trace); while (1) { if (!(trace->io->action & BLK_TC_ACT(BLK_TC_NOTIFY))) { found = map_io(trace, trace->io); if (found < min) min = found; found += trace->io->bytes; if (max < found) max = found; } else { u64 bank; u64 offset; if (!parse_fio_bank_message(trace, &bank, &offset, &num_banks)) { if (bank > max_bank) max_bank = bank; if (offset > max_bank_offset) max_bank_offset = offset; } } if (next_record(trace)) break; } first_record(trace); *min_ret = min; *max_ret = max; *max_bank_ret = max_bank; *max_offset_ret = max_bank_offset; } static void check_io_types(struct trace *trace) { struct blk_io_trace *io = trace->io; int action = io->action & BLK_TA_MASK; if (!(io->action & BLK_TC_ACT(BLK_TC_NOTIFY))) { switch (action) { case __BLK_TA_COMPLETE: trace->found_completion = 1; break; case __BLK_TA_ISSUE: trace->found_issue = 1; break; case __BLK_TA_QUEUE: trace->found_queue = 1; break; }; } } int filter_outliers(struct trace *trace, u64 min_offset, u64 max_offset, u64 *yzoom_min, u64 *yzoom_max) { int hits[11]; u64 max_per_bucket[11]; u64 min_per_bucket[11]; u64 bytes_per_bucket = (max_offset - min_offset + 1) / 10; int slot; int fat_count = 0; memset(hits, 0, sizeof(int) * 11); memset(max_per_bucket, 0, sizeof(u64) * 11); memset(min_per_bucket, 0xff, sizeof(u64) * 11); first_record(trace); while (1) { check_io_types(trace); if (!(trace->io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) && (trace->io->action & BLK_TA_MASK) == __BLK_TA_QUEUE) { u64 off = map_io(trace, trace->io) - min_offset; slot = (int)(off / bytes_per_bucket); hits[slot]++; if (off < min_per_bucket[slot]) min_per_bucket[slot] = off; off += trace->io->bytes; slot = (int)(off / bytes_per_bucket); hits[slot]++; if (off > max_per_bucket[slot]) max_per_bucket[slot] = off; } if (next_record(trace)) break; } first_record(trace); for (slot = 0; slot < 11; slot++) { if (hits[slot] > fat_count) { fat_count = hits[slot]; } } *yzoom_max = max_offset; for (slot = 10; slot >= 0; slot--) { double d = hits[slot]; if (d >= (double)fat_count * .05) { *yzoom_max = max_per_bucket[slot] + min_offset; break; } } *yzoom_min = min_offset; for (slot = 0; slot < 10; slot++) { double d = hits[slot]; if (d >= (double)fat_count * .05) { *yzoom_min = min_per_bucket[slot] + min_offset; break; } } return 0; } static char footer[] = ".blktrace.0"; static int footer_len = sizeof(footer); static void match_trace(char *name, char **traces) { int match_len; char *match; int footer_start; match_len = strlen(name); if (match_len <= footer_len) return; footer_start = match_len - footer_len; if (strcmp(name + footer_start + 1, footer) != 0) return; match = strdup(name); if (!match) goto enomem; match[footer_start + 1] = '\0'; snprintf(line, line_len, "%s -i '%s'", *traces ? *traces : "", match); free(match); match = strdup(line); if (!match) goto enomem; free(*traces); *traces = match; return; enomem: perror("memory allocation failed"); exit(1); return; } static char *combine_blktrace_devs(char *dir_name) { DIR *dir; char *traces = NULL; struct dirent *d; int len; int ret; dir = opendir(dir_name); if (!dir) return NULL; while (1) { d = readdir(dir); if (!d) break; len = strlen(d->d_name); if (len > footer_len) match_trace(d->d_name, &traces); } closedir(dir); if (!traces) return NULL; snprintf(line, line_len, "blkparse -O %s -D %s -d '%s.%s'", traces, dir_name, dir_name, "dump"); ret = system(line); if (ret) { fprintf(stderr, "blkparse failure %s\n", line); exit(1); } snprintf(line, line_len, "%s.%s", dir_name, "dump"); return strdup(line); } static char *find_trace_file(char *filename) { int ret; struct stat st; char *dot; char *try; int found_dir = 0; /* look for an exact match of whatever they pass in. * If it is a file, assume it is the dump file. * If a directory, remember that it existed so we * can combine traces in that directory later */ ret = stat(filename, &st); if (ret == 0) { if (S_ISREG(st.st_mode)) return strdup(filename); if (S_ISDIR(st.st_mode)) found_dir = 1; } /* * try tacking .dump onto the end and see if that already * has been generated */ snprintf(line, line_len, "%s.%s", filename, "dump"); ret = stat(line, &st); if (ret == 0) return strdup(line); /* * try to generate the .dump from all the traces in * a single dir. */ if (found_dir) { try = combine_blktrace_devs(filename); if (try) return try; } /* * try to generate the .dump from all the blktrace * files for a named trace */ try = strdup(filename); dot = strrchr(try, '.'); if (!dot || strcmp(".dump", dot) != 0) { if (dot && dot != try) *dot = '\0'; snprintf(line, line_len, "%s%s", try, ".blktrace.0"); ret = stat(line, &st); if (ret == 0) { blktrace_to_dump(try); snprintf(line, line_len, "%s.%s", try, "dump"); ret = stat(line, &st); if (ret == 0) { free(try); return strdup(line); } } } free(try); return NULL; } struct trace *open_trace(char *filename) { int fd; char *p; struct stat st; int ret; struct trace *trace; char *found_filename; trace = calloc(1, sizeof(*trace)); if (!trace) { fprintf(stderr, "unable to allocate memory for trace\n"); return NULL; } found_filename = find_trace_file(filename); if (!found_filename) { fprintf(stderr, "Unable to find trace file %s\n", filename); goto fail; } filename = found_filename; fd = open(filename, O_RDONLY); if (fd < 0) { fprintf(stderr, "Unable to open trace file %s err %s\n", filename, strerror(errno)); goto fail; } ret = fstat(fd, &st); if (ret < 0) { fprintf(stderr, "stat failed on %s err %s\n", filename, strerror(errno)); goto fail_fd; } p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0); if (p == MAP_FAILED) { fprintf(stderr, "Unable to mmap trace file %s, err %s\n", filename, strerror(errno)); goto fail_fd; } trace->fd = fd; trace->len = st.st_size; trace->start = p; trace->cur = p; trace->io = (struct blk_io_trace *)p; return trace; fail_fd: close(fd); fail: free(trace); return NULL; } static inline int tput_event(struct trace *trace) { if (trace->found_completion) return __BLK_TA_COMPLETE; if (trace->found_issue) return __BLK_TA_ISSUE; if (trace->found_queue) return __BLK_TA_QUEUE; return __BLK_TA_COMPLETE; } int action_char_to_num(char action) { switch (action) { case 'Q': return __BLK_TA_QUEUE; case 'D': return __BLK_TA_ISSUE; case 'C': return __BLK_TA_COMPLETE; } return -1; } static inline int io_event(struct trace *trace) { if (plot_io_action) return plot_io_action; if (trace->found_queue) return __BLK_TA_QUEUE; if (trace->found_issue) return __BLK_TA_ISSUE; if (trace->found_completion) return __BLK_TA_COMPLETE; return __BLK_TA_COMPLETE; } void add_tput(struct trace *trace, struct graph_line_data *writes_gld, struct graph_line_data *reads_gld) { struct blk_io_trace *io = trace->io; struct graph_line_data *gld; int action = io->action & BLK_TA_MASK; int seconds; if (io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) return; if (action != tput_event(trace)) return; if (BLK_DATADIR(io->action) & BLK_TC_READ) gld = reads_gld; else gld = writes_gld; seconds = SECONDS(io->time); if (seconds > gld->max_seconds) return; gld->data[seconds].sum += io->bytes; gld->data[seconds].count = 1; if (gld->data[seconds].sum > gld->max) gld->max = gld->data[seconds].sum; } #define GDD_PTR_ALLOC_STEP 16 static struct pid_map *get_pid_map(struct trace_file *tf, u32 pid) { struct pid_map *pm; if (!io_per_process) { if (!tf->io_plots) tf->io_plots = 1; return NULL; } pm = process_hash_insert(pid, NULL); /* New entry? */ if (!pm->index) { if (tf->io_plots == tf->io_plots_allocated) { tf->io_plots_allocated += GDD_PTR_ALLOC_STEP; tf->gdd_reads = realloc(tf->gdd_reads, tf->io_plots_allocated * sizeof(struct graph_dot_data *)); if (!tf->gdd_reads) abort(); tf->gdd_writes = realloc(tf->gdd_writes, tf->io_plots_allocated * sizeof(struct graph_dot_data *)); if (!tf->gdd_writes) abort(); memset(tf->gdd_reads + tf->io_plots_allocated - GDD_PTR_ALLOC_STEP, 0, GDD_PTR_ALLOC_STEP * sizeof(struct graph_dot_data *)); memset(tf->gdd_writes + tf->io_plots_allocated - GDD_PTR_ALLOC_STEP, 0, GDD_PTR_ALLOC_STEP * sizeof(struct graph_dot_data *)); } pm->index = tf->io_plots++; return pm; } return pm; } void add_io(struct trace *trace, struct trace_file *tf) { struct blk_io_trace *io = trace->io; int action = io->action & BLK_TA_MASK; u64 offset; int index; char *label; struct pid_map *pm; if (io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) return; if (action != io_event(trace)) return; offset = map_io(trace, io); pm = get_pid_map(tf, io->pid); if (!pm) { index = 0; label = ""; } else { index = pm->index; label = pm->name; } if (BLK_DATADIR(io->action) & BLK_TC_READ) { if (!tf->gdd_reads[index]) tf->gdd_reads[index] = alloc_dot_data(tf->min_seconds, tf->max_seconds, tf->min_offset, tf->max_offset, tf->stop_seconds, pick_color(), strdup(label)); set_gdd_bit(tf->gdd_reads[index], offset, io->bytes, io->time); } else if (BLK_DATADIR(io->action) & BLK_TC_WRITE) { if (!tf->gdd_writes[index]) tf->gdd_writes[index] = alloc_dot_data(tf->min_seconds, tf->max_seconds, tf->min_offset, tf->max_offset, tf->stop_seconds, pick_color(), strdup(label)); set_gdd_bit(tf->gdd_writes[index], offset, io->bytes, io->time); } } void add_pending_io(struct trace *trace, struct graph_line_data *gld) { int seconds; struct blk_io_trace *io = trace->io; int action = io->action & BLK_TA_MASK; double avg; struct pending_io *pio; if (io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) return; if (action == __BLK_TA_QUEUE) { if (trace->found_issue || trace->found_completion) hash_queued_io(trace->io); return; } if (action != __BLK_TA_ISSUE) return; seconds = SECONDS(io->time); if (seconds > gld->max_seconds) return; pio = hash_dispatched_io(trace->io); if (!pio) return; if (!trace->found_completion) { list_del(&pio->hash_list); free(pio); } ios_in_flight++; gld->data[seconds].sum += ios_in_flight; gld->data[seconds].count++; avg = (double)gld->data[seconds].sum / gld->data[seconds].count; if (gld->max < (u64)avg) { gld->max = avg; } } void add_completed_io(struct trace *trace, struct graph_line_data *latency_gld) { struct blk_io_trace *io = trace->io; int seconds; int action = io->action & BLK_TA_MASK; struct pending_io *pio; double avg; u64 latency; if (io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) return; if (action != __BLK_TA_COMPLETE) return; seconds = SECONDS(io->time); pio = hash_completed_io(trace->io); if (!pio) return; if (ios_in_flight > 0) ios_in_flight--; if (io->time >= pio->dispatch_time) { latency = io->time - pio->dispatch_time; latency_gld->data[seconds].sum += latency; latency_gld->data[seconds].count++; } list_del(&pio->hash_list); free(pio); avg = (double)latency_gld->data[seconds].sum / latency_gld->data[seconds].count; if (latency_gld->max < (u64)avg) { latency_gld->max = avg; } } void add_iop(struct trace *trace, struct graph_line_data *gld) { struct blk_io_trace *io = trace->io; int action = io->action & BLK_TA_MASK; int seconds; if (io->action & BLK_TC_ACT(BLK_TC_NOTIFY)) return; /* iops and tput use the same events */ if (action != tput_event(trace)) return; seconds = SECONDS(io->time); if (seconds > gld->max_seconds) return; gld->data[seconds].sum += 1; gld->data[seconds].count = 1; if (gld->data[seconds].sum > gld->max) gld->max = gld->data[seconds].sum; } void check_record(struct trace *trace) { handle_notify(trace); } iowatcher-1.0/blkparse.h000066400000000000000000000075331207726554200153350ustar00rootroot00000000000000/* * Copyright (C) 2012 Fusion-io * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public * License v2 as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA * * Parts of this file were imported from Jens Axboe's blktrace sources (also GPL) */ #ifndef __IOWATCH_BLKPARSE__ #define __IOWATCH_BLKPARSE__ #define MINORBITS 20 #define MINORMASK ((1 << MINORBITS) - 1) #define SECONDS(x) ((unsigned long long)(x) / 1000000000) #define NANO_SECONDS(x) ((unsigned long long)(x) % 1000000000) #define DOUBLE_TO_NANO_ULL(d) ((unsigned long long)((d) * 1000000000)) #define CHECK_MAGIC(t) (((t)->magic & 0xffffff00) == BLK_IO_TRACE_MAGIC) struct dev_info { u32 device; u64 min; u64 max; u64 map; }; #define MAX_DEVICES_PER_TRACE 64 struct trace { int fd; u64 len; char *start; char *cur; struct blk_io_trace *io; u64 start_timestamp; struct timespec abs_start_time; /* * flags for the things we find in the stream * we prefer different events for different things */ int found_issue; int found_completion; int found_queue; char *mpstat_start; char *mpstat_cur; u64 mpstat_len; int mpstat_fd; int mpstat_seconds; int mpstat_num_cpus; char *fio_start; char *fio_cur; u64 fio_len; int fio_fd; int fio_seconds; int num_devices; struct dev_info devices[MAX_DEVICES_PER_TRACE]; }; struct trace_file { struct list_head list; char *filename; char *label; struct trace *trace; int stop_seconds; /* Time when trace stops */ int min_seconds; /* Beginning of the interval we should plot */ int max_seconds; /* End of the interval we should plot */ u64 min_offset; u64 max_offset; char *reads_color; char *writes_color; char *line_color; struct graph_line_data *tput_writes_gld; struct graph_line_data *tput_reads_gld; struct graph_line_data *iop_gld; struct graph_line_data *latency_gld; struct graph_line_data *queue_depth_gld; int fio_trace; struct graph_line_data *fio_gld; /* Number of entries in gdd_writes / gdd_reads */ int io_plots; /* Allocated array size for gdd_writes / gdd_reads */ int io_plots_allocated; struct graph_dot_data **gdd_writes; struct graph_dot_data **gdd_reads; int mpstat_min_seconds; int mpstat_max_seconds; int mpstat_stop_seconds; struct graph_line_data **mpstat_gld; }; static inline unsigned int MAJOR(unsigned int dev) { return dev >> MINORBITS; } static inline unsigned int MINOR(unsigned int dev) { return dev & MINORMASK; } void init_io_hash_table(void); void init_process_hash_table(void); struct trace *open_trace(char *filename); u64 find_last_time(struct trace *trace); void find_extreme_offsets(struct trace *trace, u64 *min_ret, u64 *max_ret, u64 *max_bank_ret, u64 *max_offset_ret); int filter_outliers(struct trace *trace, u64 min_offset, u64 max_offset, u64 *yzoom_min, u64 *yzoom_max); int action_char_to_num(char action); void add_iop(struct trace *trace, struct graph_line_data *gld); void check_record(struct trace *trace); void add_completed_io(struct trace *trace, struct graph_line_data *latency_gld); void add_io(struct trace *trace, struct trace_file *tf); void add_tput(struct trace *trace, struct graph_line_data *writes_gld, struct graph_line_data *reads_gld); void add_pending_io(struct trace *trace, struct graph_line_data *gld); int next_record(struct trace *trace); void first_record(struct trace *trace); #endif iowatcher-1.0/fio.c000066400000000000000000000100511207726554200142670ustar00rootroot00000000000000/* * Copyright (C) 2013 Fusion-io * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public * License v2 as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ #include #include #include #include #include #include #include #include #include #include #include #include #include #include #include "plot.h" #include "blkparse.h" #include "list.h" #include "tracers.h" #include "fio.h" static int past_eof(struct trace *trace, char *cur) { if (cur >= trace->fio_start + trace->fio_len) return 1; return 0; } static int parse_fio_line(struct trace *trace, int *time, int *rate, int *dir, int *bs) { char *cur = trace->fio_cur; char *p; int *res[] = { time, rate, dir, bs, NULL }; int val; int i = 0; int *t; char *end = index(cur, '\n'); char *tmp; if (!end) return 1; tmp = strndup(cur, end - cur); if (!tmp) return 1; p = strtok(tmp, ","); while (p && *res) { val = atoi(p); t = res[i++]; *t = val; p = strtok(NULL, ","); } free(tmp); if (i < 3) return 1; return 0; } int next_fio_line(struct trace *trace) { char *next; char *cur = trace->fio_cur; next = strchr(cur, '\n'); if (!next) return 1; next++; if (past_eof(trace, next)) return 1; trace->fio_cur = next; return 0; } char *first_fio(struct trace *trace) { trace->fio_cur = trace->fio_start; return trace->fio_cur; } static void find_last_fio_time(struct trace *trace) { double d; int time, rate, dir, bs; int ret; int last_time = 0; if (trace->fio_len == 0) return; first_fio(trace); while (1) { ret = parse_fio_line(trace, &time, &rate, &dir, &bs); if (ret) break; if (dir <= 1 && time > last_time) last_time = time; ret = next_fio_line(trace); if (ret) break; } d = (double)time / 1000; trace->fio_seconds = ceil(d); return; } int read_fio(struct trace *trace, char *trace_name) { int fd; struct stat st; int ret; char *p; fd = open(trace_name, O_RDONLY); if (fd < 0) return 1; ret = fstat(fd, &st); if (ret < 0) { fprintf(stderr, "stat failed on %s err %s\n", trace_name, strerror(errno)); goto fail_fd; } p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0); if (p == MAP_FAILED) { fprintf(stderr, "Unable to mmap trace file %s, err %s\n", trace_name, strerror(errno)); goto fail_fd; } trace->fio_start = p; trace->fio_len = st.st_size; trace->fio_cur = p; trace->fio_fd = fd; find_last_fio_time(trace); first_fio(trace); return 0; fail_fd: close(fd); return 1; } struct trace *open_fio_trace(char *path) { int ret; struct trace *trace; trace = calloc(1, sizeof(*trace)); if (!trace) { fprintf(stderr, "unable to allocate memory for trace\n"); exit(1); } ret = read_fio(trace, path); if (ret) { free(trace); return NULL; } return trace; } int read_fio_event(struct trace *trace, int *time_ret, u64 *bw_ret, int *dir_ret) { char *cur = trace->fio_cur; int time, rate, dir, bs; int ret; if (past_eof(trace, cur)) return 1; ret = parse_fio_line(trace, &time, &rate, &dir, &bs); if (ret) return 1; time = floor((double)time / 1000); *time_ret = time; *bw_ret = (u64)rate * 1024; *dir_ret = dir; return 0; } int add_fio_gld(int time, u64 bw, struct graph_line_data *gld) { double val; if (time > gld->max_seconds) return 0; gld->data[time].sum += bw; gld->data[time].count++; val = ((double)gld->data[time].sum) / gld->data[time].count; if (val > gld->max) gld->max = ceil(val); return 0; } iowatcher-1.0/fio.h000066400000000000000000000017401207726554200143010ustar00rootroot00000000000000/* * Copyright (C) 2013 Fusion-io * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public * License v2 as published by the Free Software Foundation. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License * along with this program; if not, write to the Free Software * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ #ifndef __FIO__ #define __FIO__ int read_fio_event(struct trace *trace, int *time, u64 *bw, int *dir); int add_fio_gld(int time, u64 bw, struct graph_line_data *gld); int next_fio_line(struct trace *trace); struct trace *open_fio_trace(char *path); char *first_fio(struct trace *trace); #endif iowatcher-1.0/iowatcher.1000066400000000000000000000075251207726554200154310ustar00rootroot00000000000000.TH iowatcher 1 .SH NAME iowatcher - Create visualizations from blktrace results .SH SYNOPSIS .B iowatcher \fIOPTIONS...\fR .SH DESCRIPTION iowatcher graphs the results of a blktrace run. It can graph the result of an existing blktrace, start a new blktrace, or start a new blktrace and a benchmark run. It can then create an image or movie of the IO from a given trace. iowatcher can produce either SVG files or movies in mp4 format (with ffmpeg) or ogg format (with png2theora). .SH OPTIONS .TP \fB--help\fP Print a brief usage summary. .TP \fB-d, --device\fP Controls which device you are tracing. You can only trace one device at a time for now. It is sent directly to blktrace, and only needed when you are making a new trace. .TP \fB-D, --blktrace-destination\fP Destination for blktrace. .TP \fB-p, --prog\fP Program to run while blktrace is run. .TP \fB-K, --keep-movie-svgs\fP Keep the SVG files generated for movie mode. .TP \fB-t, --trace\fP Controls the name of the blktrace file. iowatcher uses a dump from blkparse, so -t tries to guess the name of the corresponding per CPU blktrace data files if the dump file doesn't already exist. If you want more than one trace in a given graph, you can specify -t more than once. .TP \fB-l, --label\fP