Age | Commit message | Author | Files | Lines
2015-01-21Add victim to replace simple_procee and page_typeHEADmasterzhilongx.liu3-0/+260
Victim will be used as a new test case for error injection. It provides a unified interface to export physical addresses for the CE/PFA/IFU/DCU tests, and even for eMCA. Signed-off-by: zhilongx.liu <zhilongx.liu@intel.com> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2014-04-09Add new doc to introduce how to create case fileChen, Gong2-2/+10
The casefile records which test cases will actually be run, so a proper introduction is necessary. Also fix a spelling mistake in runmcetest. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2014-04-09Fix the bugs in hwpoison related test casesChen, Gong2-4/+4
Fix two bugs in two test cases. 1) The test for disk file soft off-line often fails because the file is mmapped in shared mode. Change it to private mode so that it works in a wider range of test environments. 2) In run_soft.sh there is a spelling mistake that makes some test cases fail. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2014-04-09core_recovery: Fix the bug in DCU/IFU test caseChen, Gong1-2/+5
If the BIOS is bogus and error injection can't be executed as expected, the current test case fails. Fix this bug. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2014-04-09Enable notrigger for PFA test caseChen, Gong1-0/+1
Too many BIOSes are bogus so that we have to disable auto trigger mechanism for PFA test case. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2014-03-19Add eMCA test caseChen, Gong4-0/+190
eMCA is a new mechanism for reporting H/W errors, available since the IVB-EX platform. For now only eMCA Gen1 is supported, which means only CE errors can be reported through this path. Signed-off-by: Liu, ZhilongX <zhilongx.liu@intel.com> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2013-12-18Add extra hwpoison-inject load checkChen, Gong4-0/+20
Add load checker of hwpoison-inject module for all other hwpoison test cases besides run_hugepage_overcommit.sh. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2013-12-18Add hwpoison-inject load checkNaoya Horiguchi2-0/+18
Add load checker of hwpoison-inject module for test case run_hugepage_overcommit.sh. NOTE: Gong revisits this patch a little bit. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2013-12-18Add test case 'hugepage_overcommit'Naoya Horiguchi4-1/+71
After a successful hugetlb page migration by soft offline, the source page will either be freed into hugepage_freelists or buddy (over-commit page). If page is in buddy, page_hstate(page) will be NULL. It will hit a NULL pointer dereference in dequeue_hwpoisoned_huge_page().

[ 890.677918] BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
[ 890.685741] IP: [<ffffffff81163761>] dequeue_hwpoisoned_huge_page+0x131/0x1d0
[ 890.692861] PGD c23762067 PUD c24be2067 PMD 0
[ 890.697314] Oops: 0000 [#1] SMP

This test case is targeted for the bug reported by Jianguo Wu, where we have NULL pointer access when we have to free source hugepage under overcommitting situation. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
2013-07-18Improve test reliability for SRAR caseChen Gong1-0/+13
Remove possible EDAC driver to avoid interference. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-04-01fix "syntax error near unexpected token `>'"Shaoyong Wang2-4/+4
"&>>" can't be recognized on some Linux OS such as SuSE because it uses older BASH version, So use substitute mode. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-04-01Update the test script for BSP test caseShaoyong Wang2-11/+13
1. Don't use $ROOT to locate BSP directory, $TMP_DIR instead 2. Change the invoke sequence of variables (NUM_FAIL_CPU/NUM_PASS_CPU) to avoid any complaint. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-04-01check TMP_DIR path for all runtest.sh scriptsShaoyong Wang10-2/+66
To avoid temporary files being saved in the wrong directory when a test script is executed from its own directory, the TMP_DIR path should be determined before the test. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-02-06Fix "too many arguments" issue in mcemenuShaoyong Wang1-2/+2
The lack of double quotation leads to a grammar mistake. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
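The failure mode described above can be reproduced with a short sketch. The variable `choice` and the helper `is_selected` are hypothetical and are not taken from mcemenu; they only illustrate why an unquoted expansion that contains a space breaks `[ ... ]`.

```bash
#!/bin/bash
# Hypothetical value containing a space, e.g. two menu entries picked at once.
choice="SRAO SRAR"

# Unquoted, `[ $choice = SRAO ]` expands to `[ SRAO SRAR = SRAO ]`,
# and bash reports "[: too many arguments".
# Quoted, the whole value reaches `[` as a single word.
is_selected() {
    [ "$1" = "$2" ]
}

if is_selected "$choice" "SRAO SRAR"; then
    echo "selected: $choice"
fi
```

Quoting every expansion inside `[ ... ]` is the usual defensive habit, since it also covers the case where the variable is empty.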
2013-02-06Update apei-inj test caseShaoyong Wang2-22/+43
Fix incomplete dmesg information which is used for result analysis. Put related dmesg/mcelog log under path/to/apei-inj/log/. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-02-06Add test case for ACPI5.0 extension support for EINJShaoyong Wang3-0/+208
This test includes regular EINJ error injection test and Vendor Extension Specific Error Injection test with ACPI5.0 enabled BIOS. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-02-06hwpoison: check with CLD_KILLED|CLD_DUMPED instead of just CLD_KILLEDNaoya Horiguchi2-3/+5
tinjpage and ttranshuge can get SIGCHLD(CLD_DUMPED) from their child processes, but now they only check CLD_KILLED, so tests fail. This behavior of the kernel might not be wrong, because the default action of the SIGBUS is 'coredump', not 'terminate' (see comments in include/linux/signal.h). With this patch, we accept SIGCHLD(CLD_DUMPED) as a correct behavior. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-02-06Clean up hwpoison functional testsNaoya Horiguchi7-314/+341
Code displacement:
- moved common code into helper.sh to avoid duplicates,
- merged run-huge-test.sh into run_hugepage.sh and run-transhuge-test.sh into run_thp.sh.
Minor improvements:
- added sysctl vm.memory_failure_early_kill=0 in the setup of each testcase (some testcases change this global parameter, so it's safe to reset it to 0 to avoid interference between testcases),
- added freeing resources (shmems, semaphores) and unpoisoning in the cleanup of each testcase,
- added counter check ("HardwareCorrupted:" in /proc/meminfo)
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
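The counter check mentioned in this commit can be sketched as follows. This is not the code from helper.sh: the function name `hardware_corrupted_kb` and its optional file argument are assumptions made so the sketch can be tried against a sample file.

```bash
#!/bin/bash
# Print the HardwareCorrupted value (in kB) from a meminfo-style file.
# The argument defaults to /proc/meminfo; passing another path is only
# a convenience for trying the sketch without a real poisoned page.
hardware_corrupted_kb() {
    awk '/^HardwareCorrupted:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Possible use around a test case (needs root and memory-failure support):
#   sysctl vm.memory_failure_early_kill=0   # reset the global parameter
#   before=$(hardware_corrupted_kb)
#   ... inject an error ...
#   after=$(hardware_corrupted_kb)
#   [ "$after" -gt "$before" ] && echo "counter increased"
```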
2013-02-04Update page-types toolChen Gong1-15/+116
New page-types fixes some bugs and support THP, so update this tool for mce-test. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-01-18Update Makefile for hwpoison test caseChen Gong1-1/+1
One dot is missed in the Makefile so that GDB can't get symbol table from the binary when debugging. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-01-18Fix hugetlbfs mount path detectionLans Zhang1-2/+3
The type parameter in mount entry is random especially for pseudo filesystem, thus, we don't want a hardcode on it. Signed-off-by: Lans Zhang <jia.zhang@windriver.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-01-18Mount hugetlbfs for hwposion-hard testLans Zhang1-0/+6
anonymous hugepage, file backed hugepage and shared memory hugepage need a mounted hugetlbfs. Signed-off-by: Lans Zhang <jia.zhang@windriver.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-01-15Update missed file attributionChen Gong2-0/+0
Add missed file attribution for BSP test case. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2013-01-09Add BSP online/offline test caseShaoyong Wang3-0/+358
Basic BSP online/offline tests include 3 modes: PER-CPU mode GROUP-CPU mode and S3/S4 with CPU0 onlined or offlined, respectively. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-12-14Remove coverage test casesChen Gong2-2/+3
Coverage test cases are only for white-box testing during development of some RAS features in the kernel. They are now totally obsolete. Mask these test cases to avoid confusing users. They will be removed from the test suite after some time if no one complains. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-12-14mce-test: Add file attributes to openThomas Renninger1-1/+1
This fixes a compile warning. open(2) manpage says: ... mode specifies the permissions to use in case a new file is created. This argument must be supplied when O_CREAT is specified in flags; ... Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-12-14mce-test: OpenSUSE Build Service check wants to have the she-bang on topThomas Renninger1-1/+1
Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-12-14mce-test: Fix pread and MAP_ANONYMOUS usageThomas Renninger2-2/+8
_XOPEN_SOURCE=500 must be defined for pread but this will result in MAP_ANONYMOUS not being defined -> also define _BSD_SOURCE for MAP_ANONYMOUS Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-12-14mce-test: Fix she-bang in fs-metadata.sh and k-thread.shThomas Renninger2-4/+4
Some she-bang are missed in the bash header. Signed-off-by: Thomas Renninger <trenn@suse.de> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-11-08Update test result report information for ERST testChen Gong1-2/+3
Add some information pointing out possible reasons when failures occur. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-11-07Add workaround for dialog output because of its versionChen Gong1-0/+5
The output from special dialog version has double quote even if --separate-output is used. If so, rip them to ensure the output is like regular dialog output. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
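A minimal sketch of the workaround described above, assuming the affected dialog version prints each selected tag wrapped in double quotes. The helper name `strip_quotes` and the sample tags are hypothetical.

```bash
#!/bin/bash
# Delete every double-quote character from stdin, so quoted and
# unquoted dialog output end up looking the same.
strip_quotes() {
    tr -d '"'
}

# Hypothetical checklist output from a dialog version that quotes tags.
printf '"case1"\n"case2"\n' | strip_quotes
```

Output that has no quotes passes through unchanged, so the filter can be applied unconditionally.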
2012-11-07add check for parameter notrigger in APEI/SRAR test caseChen Gong1-1/+3
On some platforms the OS doesn't support the notrigger parameter. In this situation the injection procedure is dangerous because it may cause a system oops/crash. If this parameter is absent, the test should be terminated. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
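A check of this kind might look like the sketch below. The default path assumes debugfs is mounted at /sys/kernel/debug and that the EINJ interface lives under apei/einj; the function name `einj_has_notrigger` is hypothetical and not taken from the test case.

```bash
#!/bin/bash
# Succeed if the EINJ interface directory exposes a notrigger file.
# The default directory is an assumption about where debugfs is mounted.
einj_has_notrigger() {
    local einj_dir=${1:-/sys/kernel/debug/apei/einj}
    [ -e "$einj_dir/notrigger" ]
}

if ! einj_has_notrigger; then
    # A real test script would stop here instead of injecting.
    echo "notrigger is not supported, injection would be unsafe" >&2
fi
```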
2012-09-18Add timeout condition in the PFA testChen Gong1-0/+6
On some platforms PFA will not be triggered so that the PFA test can't finish. So the timeout functionality is necessary to avoid endless PFA test. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
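The timeout idea can be sketched as a small polling helper. This is an illustration only, not the code added to the PFA test; the name `wait_for` and the one-second polling interval are assumptions.

```bash
#!/bin/bash
# Run a check command repeatedly until it succeeds or the given number
# of seconds has elapsed. Returns 0 on success, 1 on timeout.
wait_for() {
    local timeout=$1
    shift
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        [ "$SECONDS" -ge "$deadline" ] && return 1
        sleep 1
    done
    return 0
}

# Hypothetical use: give up after 60 seconds if a log file never appears.
#   wait_for 60 test -s /path/to/pfa.log || echo "PFA was not triggered"
```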
2012-09-18Clear existed ERST record before the testChen Gong1-3/+15
If the ERST table is full, the test can't begin. To avoid this potential issue, if an ERST record exists, erase one record to release storage space and let the test go on. Because the ERST test may damage the data in the ERST table, please back up the valid ERST data to another safe place. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-09-12Auto remove EDAC module for PFA testChen Gong1-0/+14
Some EDAC modules stop mcelog from collecting the error log from the kernel mcelog buffer, which makes the mcelog PFA function ineffective. To avoid interference from the EDAC module, remove the specific EDAC module before the test and restore it after the test. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
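One way to find the modules to remove is sketched below. It assumes that matching "edac" in the module name is good enough, which may not hold for every platform; the function name `list_edac_modules` is hypothetical, and module dependency order is not handled.

```bash
#!/bin/bash
# List loaded modules whose name contains "edac", reading a
# /proc/modules-style file (the first field is the module name).
list_edac_modules() {
    awk '$1 ~ /edac/ { print $1 }' "${1:-/proc/modules}"
}

# Sketch of the remove/restore sequence (needs root, so left as comments):
#   saved=$(list_edac_modules)
#   for m in $saved; do rmmod "$m"; done
#   ... run the PFA test ...
#   for m in $saved; do modprobe "$m"; done
```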
2012-09-12Update pfa test caseChen Gong1-3/+20
On some platforms original PFA case can't work well because of no actual reading/writing action in time. This patch enhances the reading/writing operations to ensure the error can be triggered. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-08-24Use BASH as default shell script interpreterShaoyong Wang14-14/+14
Some test scripts can't be recognized well on some Linux OS, such as Ubuntu. Change default *sh* to *bash*. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@intel.com>
2012-08-21Add SRAR DCU/IFU functional test caseChen Gong7-0/+331
This patch adds two SRAR functional test cases (DCU & IFU). The SRAR test is highly BIOS dependent, so if the BIOS is bogus, the system will hang or panic. These two test cases are disabled by default; if one wants to test SRAR, please enable them. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-08-21update the check method for debugfsChen Gong4-15/+8
On some platforms old methods can't find debugfs correctly, so a new way via /proc/mounts is used to find debugfs path. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
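Looking up debugfs through /proc/mounts can be sketched as below. This is an illustration rather than the code in the commit; the function name `find_debugfs` and its optional file argument are assumptions. Matching on the filesystem type field avoids relying on the device name, which is arbitrary for pseudo filesystems.

```bash
#!/bin/bash
# Print the first debugfs mount point found in a mounts-style file.
# Fields in /proc/mounts: device, mount point, fs type, options, dump, pass.
find_debugfs() {
    awk '$3 == "debugfs" { print $2; exit }' "${1:-/proc/mounts}"
}

if debugfs=$(find_debugfs) && [ -n "$debugfs" ]; then
    echo "debugfs is mounted at $debugfs"
else
    echo "debugfs is not mounted" >&2
fi
```

Stopping at the first match also gives a single answer when debugfs is mounted more than once.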
2012-07-31Minor fixes for MCE-testChen Gong5-6/+26
Many minor fixes are added. Some for compatibility, some for enhancement, and the others for bug fixes. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-07-31Fix matching issue when selecting test cases from the menuShaoyong Wang1-1/+1
Old logic will filter out comment lines and the words containing on/off letters in case list files when executing case selecting. Signed-off-by: Shaoyong Wang <shaoyongx.wang@intel.com> Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-05-19update missed 'x' operation bitv1.0Chen Gong2-0/+0
mcemenu and runmcetest are shell files and should own 'x' bit. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2012-04-24Readd x bits for shell scriptsAndi Kleen39-0/+0
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2012-04-13Reorganize mce-testChen Gong203-1661/+4261
This new design reorganize entire structure of MCE-test. After applying new structure, MCE-test owns new unified output format and interface. In principle, during this change, no functional change. Only some minor fixes and updates are added, BTW, a few new test cases are merged such as PFA. Other test cases will be applied after this change is fused into current MCE-test. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2012-02-21add param_extension as default module parameter for EINJChen Gong2-2/+2
param_extension is a new module parameter to support param1/param2 as a BIOS extension for specific vendors. By default the tests need to enable this parameter to get param1/param2. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2012-01-12Clarify README a bitAndi Kleen1-3/+7
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2012-01-12add SRAR test case in the user context situationChen Gong5-1/+75
This is part of the SRAR test cases. It is used to test DCU error happening under user land and other CPUs working in the user context, kernel context, NMI context and IRQ context. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-05-19format erst-inject.cChen Gong1-154/+154
reformat erst-inject.c to make it to follow UNIX style Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-05-19minor fixes for erst testChen Gong2-2/+6
1) in the last patch after update makefile rule, I forget to update corresponding shell script. And the shell script mode attribute is not correct, too 2) update erst-inject tool to provide more friendly prompt Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-04-29Add ERST functional test case (V3)Chen Gong6-3/+607
This case is used to test read/write/clear operations on ERST. Note: please use this case on kernel >=2.6.39-rc1. For more detailed information please refer to the test case itself. BTW, this case doesn't consider situations such as duplicate or missing ids because current firmware has bugs. It will be updated after the firmware fixes this issue. V3 -> V2: Makefile without recursive make V2 -> V1: add copyright information Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-04-20fix guest_tmp usage in the kvm SRAO test caseChen Gong1-3/+3
The guest_tmp usage is totally wrong. It assumes the same directory exists on both the host and the guest. In fact, the definition is only correct for the guest system. Otherwise, the file guest_tmp can't be transferred to the host correctly. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-20no strict check when ssh/scp in the kvm SRAO testChen Gong1-2/+3
when first connecting to guest OS, guest OS will transfer its public key fingerprint to the host OS. To avoid interactive operation in the test procedure, no strict check is necessary. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
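A non-interactive connection of this kind is commonly set up with the OpenSSH client option `StrictHostKeyChecking=no`. The sketch below only prints the resulting commands; the guest address and file names are hypothetical, and whether the test case uses exactly this option is an assumption.

```bash
#!/bin/bash
# Options that stop ssh/scp from prompting about an unknown host key.
# StrictHostKeyChecking=no is a standard OpenSSH client option.
SSH_OPTS=(-o StrictHostKeyChecking=no)

# Hypothetical guest address and file names, for illustration only.
guest="root@192.0.2.10"

# echo is used so the sketch shows the commands without running them.
echo ssh "${SSH_OPTS[@]}" "$guest" "uname -r"
echo scp "${SSH_OPTS[@]}" "$guest:/tmp/guest_result" ./guest_result
```

Disabling the host key check is reasonable for a throwaway test guest, but it removes protection against connecting to the wrong machine, so it should stay confined to the test setup.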
2011-04-20minor fixes for kvm SRAO test againChen Gong2-2/+2
1) latest qemu monitor output format is changed, so update the condition check 2) it looks the starting anonymous memory addresses of simple_process can't be used as injection address. Just skip them. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-20minor fixes for KVM SRAO test caseChen Gong1-3/+7
Here is the fix list:
1) rc3.d shouldn't be the default start position. It should be assigned according to /etc/inittab.
2) when the test case quits unexpectedly, qemu should be killed, too.
3) delete an extra local parameter.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-20update the KVM-SRAO test README file and related filesChen Gong3-15/+243
Add more content into it to make it more readable and operable. Besides the update for README file. Some related patches are added into mce-test suite, too. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-20enhance the test flexibility for auto-testChen Gong3-32/+1028
Some operations in the procedure of creating the guest image can be done automatically, such as copying simple_process and the page-types tool into the guest image. Another update is about public/private keys. The original usage may break the path relationship because the user can set the public/private key file path independently, without HOST_DIR involved. But these settings are useless, so delete these options. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-20Some minor fixes for KVM SRAO test casesChen Gong2-27/+61
Here is the list:
1) EARLYKILL is defined but not used
2) some echo info outside functions includes "!", which confuses the shell and gives wrong output
3) add page-types check on the host side
4) some $mnt usages are dangerous; for example, $mnt$get_tmp will return a wrong path
5) fix a spelling error in variable QEMU_PID
6) update p2v -> x-gpa2hva according to Ying's latest QEMU patch
7) the usage text says host_run.sh can be executed directly, but in fact it can't. Add execute permission for it.
8) add "-h" description; option "h" should not be given a ":"
9) make the "-m" option behave consistently with the other options
10) add more condition checks before tests
11) simplify some statements
12) auto load the mce_inject module
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
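Item 8 refers to the getopts optstring, where a colon after a letter means that option takes an argument. A minimal sketch, assuming a parser with a plain `-h` flag and a `-m` option that takes a value; the function and variable names are hypothetical and not taken from host_run.sh.

```bash
#!/bin/bash
# In the optstring "hm:", the colon after m marks an option that
# requires an argument; h has no colon, so it is a plain flag.
# Writing "h:" would make getopts consume the next word as -h's argument.
parse_opts() {
    local opt OPTIND=1
    ram_size=""
    show_help=0
    while getopts "hm:" opt; do
        case "$opt" in
            h) show_help=1 ;;
            m) ram_size=$OPTARG ;;
            *) return 1 ;;
        esac
    done
    return 0
}

parse_opts -m 512 -h
echo "ram_size=$ram_size show_help=$show_help"
```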
2011-04-11Revert "Some minor fixes for KVM SRAO test cases"Andi Kleen2-59/+26
This reverts commit 916cfd584ec37aa3dec3aae25b265e2701b35246.
2011-04-11Revert "enhance the test flexibility for auto-test"Andi Kleen3-1028/+32
This reverts commit b09f37e5d0d93d33fd5930222cc106708d85e1ed.
2011-04-11Revert "update the KVM-SRAO test README file and related files"Andi Kleen3-211/+15
This reverts commit 5c854ab100dcbd6a445a0c07e2f35f40fefe2a59.
2011-04-05update the KVM-SRAO test README file and related filesChen Gong3-15/+211
Add more content into it to make it more readable and operable. Besides the update for README file. Some related patches are added into mce-test suite, too. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-05enhance the test flexibility for auto-testChen Gong3-32/+1028
Some operations in the procedure of creating the guest image can be done automatically, such as copying simple_process and the page-types tool into the guest image. Another update is about public/private keys. The original usage may break the path relationship because the user can set the public/private key file path independently, without HOST_DIR involved. But these settings are useless, so delete these options. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-04-05Some minor fixes for KVM SRAO test casesChen Gong2-26/+59
A quick review turned up some defects. Here is the list:
1) EARLYKILL is defined but not used
2) some echo info outside functions includes "!", which confuses the shell and gives wrong output
3) add page-types check on the host side
4) some $mnt usages are dangerous; for example, $mnt$get_tmp will return a wrong path
5) fix a spelling error in variable QEMU_PID
6) update p2v -> x-gpa2hva according to Ying's latest QEMU patch
7) the usage text says host_run.sh can be executed directly, but in fact it can't. Add execute permission for it.
8) add "-h" description; option "h" should not be given a ":"
9) make the "-m" option behave consistently with the other options
10) add more condition checks before tests
11) simplify some statements
12) auto load the mce_inject module
None of these fixes touch actual functionality. Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
2011-02-11page-poisoning.c: fix build warningAndi Kleen1-0/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2011-02-11Add hwpoison test for THP.Jin Dongming3-0/+530
THP is supported from v2.6.38-rc1. So add hwpoison test for testing it easier. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-11-24Cleanup write/read_hugepage()Jin Dongming1-7/+24
Make the parameters of write/read_hugepage() easier to understand, and add comments for write/read_hugepage(). Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-11-24Fix unsuitable avoid checking of write/read_hugepage().Jin Dongming1-2/+2
The addr of write/read_hugepage() is the mapping address of file. So no matter how many hugepages are mapped, addr will be the head address of all hugepages. The avoid of write/read_hugepage() is the address which does not want to be touched. So it could be the head address of any hugepage. So addr == avoid in write/read_hugepage() is not equal always except the avoid is the address of the first hugepage. This patch fixed it. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-11-24Fix write/read_hugepage() for copy-on-write.Jin Dongming1-2/+2
When the cowflag is valid, child process should copy all the hugepage of its parent. But now no matter what cowflag is, the child process will not do copy-on-write operation. It is because the parameter(size==0) of write_hugepage() make write_hugepage() do nothing. This problem is introduced by commit c6a4c3d950385063db705e520bc9b6cda9587f57 Author: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

With this patch, the state of parent and child processes will be like following:

         Before this patch                        After this patch
NO-COW   Parent and child processes are killed.   Same as before.
COW      Parent and child processes are killed.   Only parent process is killed.

(Here process is killed by memory-failure.)

Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-11-08Merge branch 'master' of git://git.kernel.org/pub/scm/utils/cpu/mce/mce-testAndi Kleen5-122/+471
2010-10-29Add missing utils.hAndi Kleen2-7/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-29tinjpage: add hugepage testcasesNaoya Horiguchi2-24/+170
si_addr_lsb check in sighandler() is also extended to hugepage shift. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-29thugetlb: add soft offline code to thugetlb.cNaoya Horiguchi1-89/+47
Soft offlining is driven by using options '-O' and '-x' Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-29tsoftinj: add hugetlb code on tsoftinj.cNaoya Horiguchi1-11/+66
Add three testcases for hugepage soft offlining. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-29add header file giving utility functions for hugepageNaoya Horiguchi1-0/+162
Add routines allocating/freeing hugepages of the following types: - hugepage on shared memory, - anonymous hugepage, - filebacked hugepage. And also add read/write helper functions. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-06Make addr_lsb failure a warning only for nowAndi Kleen1-2/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-10-06tinjpage: Test for correct si_addr_lsb field in signalsAndi Kleen1-0/+35
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-09-28KVM test fixesDean Nelson1-14/+75
This patch makes the following changes to the mce-test suite's kvm test. (git://git.kernel.org/pub/scm/utils/cpu/mce/mce-test.git)
- Re-enable the late kill option (-l) on host_run.sh.
- Add a virtual guest RAM size option (-m) to host_run.sh that gets passed to qemu-system-x86_64. This allows for testing guests >= 4069M in size.
- Allow for guest .img files to consist of LVM partitions.
Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-25tsimpleinj: enhance automatic test for hwpoison testChen Gong1-2/+13
add failure statistic and exit value check, so that it is easy to run automatic test. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20thugetlb.c: avoid extra newline in errorsAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20run-huge-test.sh: Fix typo in usage stringAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20stress, Makefile: make test should depend on make allLi, Haicheng1-1/+1
If hwpoison.sh is executed from the top level Makefile, it doesn't compile/install the required binaries. The Makefile in mce-test/stress works correctly.
...
Test aborted by unexpected error!
[error] !!! no bin subdir there !!!
Reported-by: Evan McNabb <emcnabb@redhat.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20stress, hwpoison.sh: fix fs_metadata workload for local test dirLi, Haicheng1-1/+1
This patch is to fix below problem: ... result summary: fs_metadata -- no test finished details: /root/git/mce-test/stress/log/fs_metadata/fs_metadata.log fsck.ext3 -- fsck on /dev/loop5 got pass totally 1 task-groups report failures ... ... [04-05 16:29:08] thread 0 starts with pid 25027 tee: ./hwpoison/fs_metadata/k-threads.pid: No such file or directory 25027 Signed-off-by: Evan McNabb <emcnabb@redhat.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20thugetlb: Declare wait()Andi Kleen1-0/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20thugetlb.c: Fix error reportingAndi Kleen1-2/+5
Supply correct err() and errmsg macro, don't use implicit ones from glibc with different prototype Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20thugetlb.c: Fix printf format stringAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20Add hugetlb test for hugetlb mca recovery testingNaoya Horiguchi3-1/+528
So far not hooked up to standard "make test" because the kernel patches are not in yet. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20random_offline: avoid extra unpoison pass on timeoutAndi Kleen1-0/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20random_offline: fix endless run without -t argumentAndi Kleen1-4/+6
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-05-20random_offline: give total success/failure statistics for testAndi Kleen1-2/+17
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-04-11tinjpage: Add more error checks for memory unmapsAndi Kleen1-6/+12
In case something goes wrong in the kernel with the poisoned mappings Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-04-11hwpoison.sh: use default child process num for each workload.Haicheng Li1-1/+2
If I run hwpoison.sh without the -C option I get the following errors:
./hwpoison.sh: line 366: [: -eq: unary operator expected
./hwpoison.sh: line 371: [: -gt: unary operator expected
./hwpoison.sh: line 372: [: -eq: unary operator expected
The reason is that g_children is empty, when it should be zero. Reported-by: Evan McNabb <emcnabb@redhat.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
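The error arises because an empty, unquoted variable disappears from the `[ ... ]` expression. A minimal sketch of one common remedy, a default value through parameter expansion, is shown below; the variable `children` is hypothetical, and this is not necessarily how the commit fixes hwpoison.sh.

```bash
#!/bin/bash
# Simulates running without -C: g_children was never given a value.
g_children=""

# `[ $g_children -eq 0 ]` would expand to `[ -eq 0 ]`, which is what
# produces "unary operator expected".
# ${var:-0} falls back to 0 when the variable is unset or empty.
children=${g_children:-0}

if [ "$children" -eq 0 ]; then
    echo "no child count given, using the default"
fi
```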
2010-04-11tinjpage: Flush stdout in shared page childAndi Kleen1-0/+1
Not strictly needed due to line buffering, but more future proof. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-04-11tinjpage: Reset failure counter in childAndi Kleen1-0/+2
This way the child won't fail if there were already other errors. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-04-11Clean up IPC resources on shared memory tests in tinjpageAndi Kleen1-16/+35
Based on a report from Evan McNabb Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13stress: add make test support.Haicheng Li2-1/+16
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13page-poisoning: code cleanup, free unused shared mem timely.Haicheng Li1-21/+30
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13page-poisoning: fix inaccurate result checking.Haicheng Li1-3/+2
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13page-poisoning: fix Bad Address issue in file_clean case.Haicheng Li1-1/+1
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: test mode, to run test in local dir other than on target device.Haicheng Li1-24/+56
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: minor fixes.Haicheng Li1-9/+15
- as ltp source shows, ltp-pan must be under $ltp_root/pan/
- use invalid() to exit when error is related to command option.
- add die() to let stress tester work fine with common func check_debugfs().
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: code cleanup to show more clear log and usage help.Haicheng Li1-12/+18
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: improve show_progress() to show more friendly logs.Haicheng Li1-3/+11
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: regular page unpoisoning support.Haicheng Li1-3/+42
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: to support Page Soft-Offlining testing.Haicheng Li1-20/+57
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: to support nfs, cifs, ocfs2, reiserfs, btrfs, and xfs.Haicheng Li1-39/+64
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-13hwpoison.sh: avoid unexpected page-state changing while stress testing.Haicheng Li1-0/+14
Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-03-02Work around ubuntu incompatibilities to the linux standardChen Gong13-14/+14
On the Ubuntu platform, sh is linked to dash so that all of these shell scripts can't run correctly. It needs to be substituted with BASH. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2010-02-19There is problem with commit 138d18351a725e7ef43e6ae4fb7c9405718a797d thatCyril Hrubis1-8/+6
added clearing and backing up old logs for kdump driver. Because the test cases cause a reboot and the script is re-run after each reboot, the test ends up in an infinite loop (as the setup stamp is moved). The second problem is with loading the mce-inject module. The kdump test driver is apparently run with "set -ex", so all lines that can return non-zero (and should not stop script execution) must be used only as part of a conditional. Signed-off-by: Andi Kleen <ak@linux.intel.com>
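The "set -e" rule described in this commit can be illustrated with a small sketch. The function names try_load_module and run_demo are hypothetical and do not come from the kdump driver; the command "false" stands in for a module load that fails.

```shell
#!/bin/bash
# Hypothetical helper: run a command that is allowed to fail.
# Because the command is the condition of an "if", a non-zero exit
# status does not trigger errexit ("set -e") in the calling script.
try_load_module() {
    if "$@"; then
        echo "loaded"
    else
        echo "skipped"
    fi
}

# Hypothetical demo, run in a subshell so "set -e" does not leak out.
run_demo() (
    set -e
    try_load_module false   # "false" stands in for a failing module load
    echo "still running"
)
```

Calling run_demo prints "skipped" and then "still running"; a bare failing command in place of the conditional would end the subshell before the last echo.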
2010-02-19Handle case of multiple debugfs being mountedAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
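One way to cope with debugfs being mounted more than once is to take only the first matching entry from the mount table. The sketch below is an assumption about the approach, not the code from this commit; the function name first_debugfs_mount is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the first debugfs mount point.
# $1: mount table to read (defaults to /proc/mounts).
# In that table, field 2 is the mount point and field 3 the filesystem type.
first_debugfs_mount() {
    # "exit" after the first match so that several debugfs mounts
    # still produce exactly one line of output.
    awk '$3 == "debugfs" { print $2; exit }' "${1:-/proc/mounts}"
}
```

Passing the mount table as a parameter keeps the helper usable on a saved copy of the table as well as on the live system.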
2010-02-04Add KVM RAS test suiteJiajia Zheng6-2/+422
This patch is to add KVM RAS test suite into mce-test, which is a collection of test scripts for testing the Linux kernel MCE processing features in KVM guest system. Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com>
2010-01-12Better precondition checking in the test suiteChen Gong3-7/+24
1. auto-load einj module before apei test begins and update APEI_IF definition to a proper place 2. fix typos in the check_debugfs 3. enhance the module check before stress test Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-24Fix debugfs mountingAndi Kleen1-2/+2
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-24some code cleanup againChen Gong4-1/+7
1. test path shouldn't be placed under "/"in the stress/hwpoison.sh 2. to clear the log history, backup old test log with different names. 3. add execution attribute for apei test case Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-24makefile cleanupChen Gong6-5/+12
Clean up some confusing execution paths. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-21path of g_ltppan should be updated after ltp root path setChen Gong1-2/+2
g_ltppan needs to be updated after g_ltproot is set. BTW, I consider g_ltppan should be under g_ltproot directly. It is more clear. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-21some codes cleanupChen Gong5-22/+41
1. more graceful output in check_debugfs
2. eliminate trivial usage of parameter "debugfs" in hwpoison.sh
3. add some additional checks before the driver kicks off; without them, one may see a message such as "Failed: MCE log is different from input" when in fact the only cause is that the mce_inject module has not been inserted.
Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-18The position of APEI_IF is not a const anymoreChen Gong2-2/+2
Update the definition of APEI_IF. Now it can be located anywhere. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-18update the definition of check_debugfsChen Gong1-2/+12
check_debugfs should not serve mce only. Also add a new function dedicated to mce. Signed-off-by: Chen Gong <gong.chen@linux.intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16Fix english/line wrapping in gcov warningAndi Kleen2-4/+6
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-16Automatically mount debugfs in mce testerAndi Kleen1-0/+7
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Add make test-kernel to tsrcAndi Kleen1-0/+3
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Add quickstart note to the READMEAndi Kleen1-0/+5
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Don't run kdump test in make test, but in make test-kdumpAndi Kleen2-6/+11
This essentially renames test-simple to test. Also some minor fixes to the Makefile. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Add standard "make test" in tsrc / run from top level MakefileAndi Kleen3-1/+24
This runs all the standard functional tests for a quick test in hwpoison Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14random_offline improvementsAndi Kleen1-5/+72
- Add way to specify random seed - Add timeout - Various new checks to be more user friendly - Use standard option parsing Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Add comment in tsimpleinj.cAndi Kleen1-0/+3
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Rename tinjpage-working to tsimpleinjAndi Kleen2-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Fix warning/add comment in tring.cAndi Kleen1-2/+4
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Add copyright headersAndi Kleen2-1/+24
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Clean x.html in tsrc tooAndi Kleen1-0/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-14Random soft offline test casesAndi Kleen4-1/+230
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-08Enable hwpoison filter if neededAndi Kleen1-0/+2
This is to handle kernel where the filter defaults to off. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-02Fix warnings in simple_process.cAndi Kleen1-2/+5
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-02Fix makefile rules for simple_processAndi Kleen2-5/+13
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-12-02APEI injection support for mce-testJiajia Zheng9-11/+324
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-28Fix tsrc Makefile clean target to clean everythingAndi Kleen1-4/+10
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-28tinjpage: add status printf to second test loopNaoya Horiguchi1-1/+3
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-28tinjpage: minor changes to shared memory test functionsAndi Kleen1-3/+4
- Fix indentation - Always report failure to parent Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-28tinjpage: add test case for mmap/ipv shared pagesNaoya Horiguchi1-0/+160
This patch adds the testcases for mmap/IPV shared pages. The purpose of these testcases is as follows:
- We can check whether a process A is killed as expected when it accesses a page shared with, and hwpoisoned by, another process B (the late killing case).
- We can check whether a process A is killed at once when another process B injects hwpoison into a page shared by both of them (the early killing case).
ChangeLog:
- Add synchronization code between parent and child process with a semaphore.
- Share the common function do_shared() between the mmap case and the IPV case.
- Add error check code.
Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27Add another missing ruleAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27Add proper dependencies to stress MakefilesAndi Kleen4-13/+21
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27fs-metadata workload: file system metadata test program.Haicheng Li6-0/+588
This is the fs-metadata workload. fs-metadata is designed to test i-node operations under heavy workload and make sure every i-node operation gets the expected result. In detail, it first generates a huge directory hierarchy on the target disk, then performs unlink operations on this directory hierarchy and duplicates a copy of the directory, and finally checks whether the two directories are the same, as expected. Acked-by: Andi Kleen <andi.kleen@intel.com> Signed-off-by: Shaohui Zheng <shaohui.zheng@intel.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27page-poisoning workload: multi-process based test program thru madvise syscall.Haicheng Li4-0/+925
The page-poisoning test program is an extension of the tinjpage test program with a multi-process model. It spawns thousands of processes that inject HWPoison errors into various pages simultaneously through the madvise syscall. Then it checks whether these errors get handled correctly, i.e. whether each test process receives or doesn't receive the SIGBUS signal as expected. In detail, page-poisoning is designed to cover all possible userspace page types via the following two test operations:
- anonymous page operations.
- file data operations.
Acked-by: Andi Kleen <andi.kleen@intel.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27HOWTO: documentation of MCE stress test suite.Haicheng Li2-0/+344
Documentation of MCE stress test suite. Reviewed-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-27hwpoison.sh: test driver of MCE stress test suiteHaicheng Li4-0/+969
The MCE stress test suite is a collection of tools and test scripts intended to stress test the Linux kernel MCA high level handlers, which include HWPoison page recovery, soft page offline, and so on. In general, this test suite is designed to do stress testing through various test interfaces, i.e. the madvise syscall, the HWPoison page injector, and the APEI injector (see the ACPI 4.0 spec). It supports most popular Linux file systems (FS); that is, there is an option for the user to specify which FS type the test should run on.
The MCE stress test suite consists of four parts: test driver, workload controller, customized workloads, and background workloads. The main test idea is as follows:
- The test driver launches various customized workloads to continuously generate lots of pages with expected page states. Note that all of these workloads know their expected results, which should not be affected by the Linux MCE high level handlers.
- The test driver then injects MCE errors into these pages through the madvise syscall, the HWPoison injector, or the APEI injector. While the Linux kernel handles these MCE errors, all the workloads continue running normally.
- After running for a long time, the test driver collects the test result of each workload to see if any unexpected failures happened. In this way, it can decide whether a bug was found.
- If any system panic or FS corruption happens, there must be a bug. This is the bottom line for deciding whether the test passes.
The test driver (a.k.a. hwpoison.sh) drives the whole test procedure. It is responsible for managing the test environment, setting up the error injection interface, controlling test progress, launching workloads, injecting page errors, as well as recording test logs and reporting the test result.
Acked-by: Andi Kleen <andi.kleen@intel.com> Signed-off-by: Haicheng Li <haicheng.li@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-26Avoid tsyncpage failures on NFSHaicheng Li1-1/+1
My debugging shows mmap() with MLOCKED flag will set page dirty again, then kernel handler would never enter into clean page handling logic. So use fsync() after mmap() to make the page clean. Signed-off-by: Haicheng Li <haicheng.li@intel.com> Tested-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-08Fix tprctl tester to actually workAndi Kleen1-5/+5
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-08Add prctl tester for hwpoisonAndi Kleen2-1/+99
(requires hwpoison-2.6.32) Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-10-07Add a symlink tsrc -> hwpoison to make it clearAndi Kleen1-0/+1
tsrc tests hwpoison Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-22Update the documentation howto.txtZheng Jiajia1-2/+17
Update the howto.txt. -recommend to stop cron before mce testing. -add an introduction to loop-mce-test as well. Signed-off-by: Zheng Jiajia <jiajia.zheng@intel.com>
2009-09-18Minor changes to tools/loop-mce-test.shHuang Ying1-2/+11
Some parameter changes and other minor changes. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-09-18Rename and chmod tools/loop-mce-testHuang Ying1-0/+0
Rename to tools/loop-mce-test.sh to follow naming convention. chmod +x. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-09-17loop-mce-test: Exit with error code on failureDean Nelson1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-17loop-mce-test: Add a test tool for running the mca test cases in a loop.Zheng Jiajia1-0/+36
Based on work from Dean Nelson Signed-off-by: Zheng Jiajia <jiajia.zheng@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16Merge branch 'master' of git://git.kernel.org/pub/scm/utils/cpu/mce/mce-testAndi Kleen18-20/+28
2009-09-16tinjpage: fix another printf to new formatAndi Kleen1-1/+2
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16tinjpage: mark currently broken in kernel errors optionalAndi Kleen1-4/+10
- all second errors are optional because the VFS reports only once - hole errors are optional because we can't propagate errors for holes Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16tinjpage: clean up output to be easier readableAndi Kleen1-11/+14
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16Add option to injpage to enable sniperAndi Kleen1-3/+34
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16Add GPL copyright header to tinjpageAndi Kleen1-1/+16
Also add Fengguang as author Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16Add mlock test cases to tinjpageAndi Kleen1-11/+43
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16Fix duplicated mcelog records for some cases on machine with SER_PHuang Ying17-17/+17
On machine with SER_P, machine_check_poll in kernel filters out MCE with MCI_STATUS_S instead of MCI_STATUS_UC. So for some test cases run on machine with/without SER_P, both UC and S should be set. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-09-16fix kdump name definitionChen Gong1-3/+11
SLE11 change the kdump name from "kdump" to "boot.kdump". So fix it in a usual way. Signed-off-by: Chen Gong <gong.chen@intel.com>
2009-09-04Update howto for kdump test driverJiajia Zheng1-6/+30
update the document for test with kdump test driver. Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com>
2009-08-31New test case to test conditional control in machine_check_pollJiajia Zheng10-3/+84
Add a new test group -- poll_noser, add three cases -- fatal_poll, srar_poll and uc_poll to test the conditional control statement in machine_check_poll. Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-22Merge branch 'master' of git://git.kernel.org/pub/scm/utils/cpu/mce/mce-testAndi Kleen16-446/+30
2009-08-22Update MADV_POISON value to new oneAndi Kleen3-5/+3
There was a conflict with the MADV_POISON value, so update to the new one. Note -- you need a new kernel for testing now. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-08-19Fix make test via adding missing config filesHuang Ying2-0/+10
Add config/simple.conf and config/kdump.conf to make "make test" work again. Only the minimal test cases that work on all machines are added to these config files. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-19Change set_fake_panic to use new interfaceHuang Ying1-1/+1
Because fake panic configuration file moved from sysfs to debugfs. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-19Fix some coding style issueHuang Ying5-19/+18
LTP uses Linux kernel coding style now. So fix some coding style issue reported by checkpatch.pl. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-19Remove document for tools/readcoreHuang Ying1-10/+1
Because it has been removed. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-19Remove tools/readcoreHuang Ying8-416/+0
It is not used now. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-17Merge branch 'master' of git://git.kernel.org/pub/scm/utils/cpu/mce/mce-testAndi Kleen3-23/+83
2009-08-17Fix compilation of tcases/ttable with recent kernelAndi Kleen2-0/+4
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-08-10Document updateHuang Ying2-22/+47
Cleanup README. Update kernel requirement in doc/howto.txt. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-10Add doc for panic_ucr/srao_*_noripvZheng, Jiajia1-1/+36
two cases -- srao_mem_scrub_noripv and srao_ewb_noripv were added into panic_ucr, update the document accordingly. Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com>
2009-08-07tinjpage: fix error checking in expecterr()Hidehiro Kawai1-17/+19
expecterr() and optionalerr() think there was an error if a given return value of a function call is 0, otherwise no error. But this assumption is not always true (e.g. write(2)). So we should check errors in caller side and then pass the result. Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-08-07tinjpage: use O_RDWR flag for open before writeHidehiro Kawai1-1/+1
In file dirty case, it tries to write some data to a test file opened with O_RDONLY. It gets an unexpected error. Fix it by opening the file with O_RDWR flag. Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-08-06Move two srao_*_noripv to panic_ucrHuang Ying4-8/+7
Because they cause system panic. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-08-06Fix panic type and exception message for the new cases of recoverable_ucrJin Dongming1-0/+2
The new cases are srao_ewb_noripv and srao_mem_scrub_noripv. After either of them is injected, the result of the case seems right. But the panic type and exception message, which are needed as the standard for checking the result of the cases, do not actually make sense because they are "NULL". So I modify them. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-08-04add two srao cases with RIPV unsetZheng, Jiajia3-0/+16
add two cases to recoverable_ucr. --srao_mem_scrub_noripv --srao_ewb_noripv Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com>
2009-07-28Update the document for the new cases.Jin Dongming2-2/+38
Two cases were added: one (srar_no_en) for panic_ucr and the other (ucna_over) for poll_ucr. Update the document for them. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-07-23Add two cases ucna_over and srar_no_enZheng, Jiajia5-1/+19
Add two cases: ucna_over in poll_ucr and srar_no_en in panic_ucr. Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-07-23Add document for soft-inj_recoverable_ucrZheng, Jiajia1-0/+138
Add document for test cases in recoverable_ucr. - add soft-inj_recoverable_ucr.txt Signed-off-by: Jiajia Zheng <jiajia.zheng@intel.com> Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-07-14Change default make see browser to firefoxAndi Kleen1-1/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-07-14Add Makefile target for tringAndi Kleen2-0/+5
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-07-14Update tsrc/README for kernel test programsAndi Kleen1-2/+5
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-07-14Add missing include file in tcases.cAndi Kleen1-0/+1
Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-07-14Fix compilation of tcases/ttable with recent kernelAndi Kleen10-4/+72
It was too difficult to compile with kernel includes, so add a hierarchy of fake kernel includes to stub out kernel functions. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-07-14Fix /tmp race in mce_shell.shAndi Kleen1-1/+2
And delete tmp file after shell exit Signed-off-by: Andi Kleen <ak@linux.intel.com>
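A common way to avoid a /tmp race and to clean up on exit is mktemp combined with an EXIT trap. The sketch below shows that general pattern and is not the code from mce_shell.sh; the function names and the "mce_shell.XXXXXX" template are assumptions.

```shell
#!/bin/bash
# Hypothetical helper: create a uniquely named temporary file.
# mktemp creates the file itself, so no other process can claim the
# name between choosing it and opening it.
make_private_tmp() {
    mktemp "${TMPDIR:-/tmp}/mce_shell.XXXXXX"
}

# Hypothetical demo, run in a subshell: the EXIT trap removes the
# temporary file when the subshell finishes.
run_with_tmp() (
    tmpfile=$(make_private_tmp) || exit 1
    trap 'rm -f "$tmpfile"' EXIT
    echo "scratch data" > "$tmpfile"
    echo "$tmpfile"      # report the name that was used
)
```

The name printed by run_with_tmp no longer exists once the function returns, because the trap has already removed the file.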
2009-06-23Add mce-shellHuang Ying1-0/+49
tools/mce_shell.sh simulates the environment of mce-test driver and test case script, used for debugging. mce-test library functions can be invoked by mce-shell interactively.
2009-06-18Update some of the document of MCE test toolJin Dongming6-55/+320
Update some of the document which is out of date. And add some new document of the test case which does not have document. - modified soft-inj_non-panic.txt - modified soft-inj_panic.txt - modified soft-inj_panic_npcc.txt - add soft-inj_panic_noser.txt - add soft-inj_panic_ucr.txt - add soft-inj_poll_ucr.txt Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-06-18Delete the reference cases which may be not necessary now.Jin Dongming2-16/+0
Some of the reference cases are the same as the test cases, so I think they are not necessary now. - delete the reference case of fatal - delete the reference case of fatal_over Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-06-15Fix soft-inj/panic_noser/uc_over_corrected caseHuang Ying2-5/+2
A corrected machine check should be logged in do_machine_check if the system is going to panic. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-06-15Fix kernel logging mechanism on busy machineHuang Ying2-2/+2
Kernel log output is too slow to be caught on some machines, and there is a random sleep mechanism for random testing. So we move the random sleep before kernel log extraction and extend the sleep time to at least 5 seconds. Reported-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
2009-06-15Remove test case group soft-inj/recoverableHuang Ying10-230/+0
Because it is obsolete now. Signed-off-by: Huang Ying <ying.huang@intel.com>
2009-06-15mcetest tool set the value of panic_on_oopsJin Dongming3-0/+8
The value of /proc/sys/kernel/panic_on_oops is set to "1" by default on some systems, so we want to resolve the trouble this causes for mcetest. Signed-off-by: Jin Dongming <jin.dongming@np.css.fujitsu.com>
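Setting such a control file while remembering its previous value could look like the sketch below. The commit message does not say which value mcetest writes, so the value is a parameter here; the function name set_panic_on_oops is hypothetical, and writing to the real file requires root.

```shell
#!/bin/bash
# Hypothetical helper: write a new value and print the previous one.
# $1: value to write
# $2: control file (defaults to /proc/sys/kernel/panic_on_oops)
set_panic_on_oops() {
    poo_ctl=${2:-/proc/sys/kernel/panic_on_oops}
    poo_old=$(cat "$poo_ctl") || return 1          # remember the old value
    printf '%s\n' "$1" > "$poo_ctl" || return 1    # write the new value
    printf '%s\n' "$poo_old"                       # let the caller restore it later
}
```

A caller can keep the printed value and pass it back to the same function after the test run to restore the original setting.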
2009-06-15Fix background testing signal processingHuang Ying1-4/+6
Start background testing in its own process group, so all processes in background testing process group can be killed with: kill -TERM -$pgrp
2009-06-03Fix gcov copy codeHuang Ying1-2/+4
It seems that /bin/cp does not work for debugfs seq files, so use /bin/cat instead.
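Copying by reading the file until end-of-file can be sketched as follows. The function name copy_by_reading is hypothetical. One plausible reason for the difference (an assumption, not stated in the commit) is that such virtual files may not report a meaningful size, whereas cat simply reads until the file ends.

```shell
#!/bin/bash
# Hypothetical helper: copy $1 to $2 by streaming its contents.
# cat reads until end-of-file and does not depend on the size
# reported for the source file.
copy_by_reading() {
    cat -- "$1" > "$2"
}
```

The same helper works for regular files, so a copy loop does not need to distinguish between the two kinds of source.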
2009-06-03Canonicalise $KSRC_DIR in kdump driverHuang Ying1-0/+6
$KSRC_DIR may be a symbolic link, which may break "find" when used on it. So convert it into canonical form before use; this also checks whether it is a valid directory.
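A minimal sketch of such a canonicalisation, assuming a POSIX shell, is shown below. The function name canonical_dir is hypothetical, and the commit may well use a different mechanism.

```shell
#!/bin/bash
# Hypothetical helper: print the physical path of directory $1.
# "cd -P" resolves symbolic links and fails if $1 is not a directory,
# so the same call both canonicalises and validates the path.
# The subshell keeps the caller's working directory unchanged.
canonical_dir() {
    ( cd -P -- "$1" 2>/dev/null && pwd -P )
}
```

A driver could then use something like `KSRC_DIR=$(canonical_dir "$KSRC_DIR") || exit 1` before handing the path to "find".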