Age | Commit message | Author | Files | Lines
7 hours | CI: create review.yml file (HEAD, master, main) | Kinga Stefaniuk | 4 | -0/+75
Introduce review.yml used by GitHub Actions. Add make probe, checkpatch and hardening-check on every pull request. Add a dependabot.yml file which checks for updates of the actions used in this repository. This makes it possible to automatically open a new PR with an action updated to the latest version. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com>
10 hours | mdadm: Change main repository to Github | Mariusz Tkaczyk | 2 | -66/+61
GitHub will now be used for tracking mdadm, so adjust README.md. Daily routines will be automated on GitHub, so there is no need to describe them. Adjust the release process: a release must be published to both repositories. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
34 hours | Wait for mdmon when it is stared via systemd | Kinga Stefaniuk | 4 | -5/+37
When mdmon is being started, it may need a few seconds to start. Until now, we did not wait for it. Introduce the wait_for_mdmon() function, which waits up to 5 seconds for mdmon to start completely. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
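The bounded wait described in this commit can be sketched as a polling loop with a deadline. This is only an illustration and not the actual wait_for_mdmon() code: the names wait_until_ready, ready_fn and ready_on_third_call, as well as the poll interval parameter, are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical readiness probe. In mdadm a check for a running mdmon
 * would play this role; it is not reproduced here. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll is_ready() every interval_ms until it returns true or timeout_ms
 * has been used up. Returns 0 when ready, -1 on timeout or bad interval.
 */
static int wait_until_ready(ready_fn is_ready, void *ctx,
			    int timeout_ms, int interval_ms)
{
	struct timespec delay;
	int waited_ms;

	if (interval_ms <= 0)
		return -1;

	delay.tv_sec = interval_ms / 1000;
	delay.tv_nsec = (long)(interval_ms % 1000) * 1000000L;

	for (waited_ms = 0; waited_ms <= timeout_ms; waited_ms += interval_ms) {
		if (is_ready(ctx))
			return 0;
		nanosleep(&delay, NULL);
	}
	return -1;
}

/* Demonstration probe: reports "ready" once it has been called 3 times. */
static bool ready_on_third_call(void *ctx)
{
	int *calls = ctx;

	return ++(*calls) >= 3;
}
```

A caller that wants the 5 second limit from the commit message would pass timeout_ms = 5000; the interval is a tuning choice that the commit message does not specify.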
34 hours | util.c: change devnm to const in mdmon functions | Kinga Stefaniuk | 2 | -4/+4
Devnm shall not be changed inside mdmon_running() and mdmon_pid() functions, change this parameter to const. Signed-off-by: Kinga Stefaniuk <kinga.stefaniuk@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
5 days | tests/23rdev-lifetime: fix a typo | Yu Kuai | 1 | -1/+1
"pill" was wrong; it should be "kill". The test still passes, but the test thread is not cleaned up. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
7 days | Makefile: Move -pie to LDFLAGS | Fabrice Fontaine | 1 | -2/+2
Move -pie from LDLIBS to LDFLAGS and make LDFLAGS configurable to allow the user to drop it by setting their own LDFLAGS (e.g. PIE could be enabled or disabled by the buildsystem such as buildroot). Suggested-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
7 days | tests/01raid6integ.broken can be removed | Xiao Ni | 1 | -7/+0
01raid6integ can be run successfully with kernel 6.9.0-rc3. So remove 01raid6integ.broken. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
7 days | tests/01r5integ.broken | Xiao Ni | 1 | -7/+0
01r5integ can be run successfully 152 times without error with kernel 6.9.0-rc4 and mdadm - v4.3-51-g52bead95. So remove this one broken case. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
7 days | tests/01r5fail enhance | Xiao Ni | 1 | -5/+1
After removing dev0, the recovery starts because the array already has a spare disk. It is good to check recovery at that point. But it is not right to check recovery after adding dev3, because the recovery may already have finished, depending on the recovery performance of the testing machine. If the recovery has finished, the check fails. Moreover, dev3 is only added as a spare disk, so we cannot expect a recovery to happen. So remove the code related to adding dev3. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: support RAID 10 with more than 4 drives | Mateusz Kusiak | 3 | -17/+45
VROC UEFI driver does not support RAID 10 with more than 4 drives. Add user prompts if such a layout is being created and for R0->R10 reshapes. Refactor the ask() function:
- simplify the code,
- remove dialog reattempts,
- do not pass the '?' sign in function calls,
- highlight the default option on output.
This patch completes adding support for R10D4+ to IMSM. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: simplify imsm_check_attributes() | Mateusz Kusiak | 1 | -90/+16
imsm_check_attributes() is too complex for what it really does. Remove repeated code and simplify the function. Fix function calls. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: define RAID_10 attribute | Mateusz Kusiak | 2 | -1/+7
Add MPB_ATTRIB_RAID10_EXT attribute to support RAID 10 with more than 4 drives. Allow more than 4 drives in imsm_orom_support_raid_disks_raid10(). This is one of last patches for introducing R10D4+ to imsm. Only small adjustments in reshape behaviours are needed. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: bump minimal version | Mateusz Kusiak | 1 | -49/+38
IMSM version 1.3 (called ATTRIBS) brought attributes used to define array properties which require support in driver. The goal of this change was to avoid changing version when adding new features. For some reasons migration has never been completed and currently (after 10 years of implementing) IMSM can use older versions. It is right time to finally switch it. There is no point in using old versions, use 1.3.00 as minimal one. Define JD_VERSION used by Windows driver. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: refactor RAID level handling | Mateusz Kusiak | 3 | -62/+138
Add imsm_level_ops struct for better handling and unifying raid level support. Add helper methods and move "orom_has_raid[...]" methods from header to source file. RAID 1e is not supported under Linux, remove RAID 1e associated code. Refactor imsm_analyze_change() and is_raid_level_supported(). Remove hardcoded check for 4 drives and make devNumChange a multiplier for RAID 10. Refactor printing supported raid levels. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | imsm: add support for literal RAID 10 | Mateusz Kusiak | 1 | -19/+48
As for now, IMSM supports only 4 drive RAID 1+0. This patch is first in series to add support for literal RAID 10 (with more than 4 drives) to imsm. Allow setting RAID 10 as raid level for imsm arrays. Add update_imsm_raid_level() to handle raid level updates. Set RAID10 as default level for imsm R0 to R10 migrations. Replace magic numbers with defined values for RAID level checks/assigns. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | mdadm: use struct context in reshape_super() | Mateusz Kusiak | 3 | -49/+105
reshape_super() takes too many arguments. Change passing params in favor of single struct. Add devname pointer and change direction members to struct shape and use it for reshape_super(). Create reshape_array_size() and reshape_array_non_size() to handle reshape_super() calls. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
9 days | mdadm: pass struct context for external reshapes | Mateusz Kusiak | 5 | -57/+37
This patch alters multiple function calls so the context is passed to external reshape functions. There are two main reasons behind it:
- it reduces the number of arguments passed and unifies them,
- imsm code will make use of the context in upcoming patches.
Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-15 | Create.c: fix uclibc build | Fabrice Fontaine | 1 | -0/+4
Define FALLOC_FL_ZERO_RANGE if needed as FALLOC_FL_ZERO_RANGE is only defined for aarch64 on uclibc-ng resulting in the following or1k build failure since commit 577fd10486d8d1472a6b559066f344ac30a3a391:
    Create.c: In function 'write_zeroes_fork':
    Create.c:155:35: error: 'FALLOC_FL_ZERO_RANGE' undeclared (first use in this function)
      155 |   if (fallocate(fd, FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE,
          |                     ^~~~~~~~~~~~~~~~~~~~
Fixes:
- http://autobuild.buildroot.org/results/0e04bcdb591ca5642053e1f7e31384f06581e989
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
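A guard of the kind this commit describes typically looks like the sketch below. The patch body is not shown in this log, so this is an assumption about its shape rather than a copy of it. The numeric values mirror the Linux UAPI header linux/falloc.h; the extra fallback for FALLOC_FL_KEEP_SIZE and the helper zero_range_mode() are hypothetical additions that keep the example self-contained.

```c
#include <fcntl.h>

/* Fallbacks for C libraries whose headers do not expose these flags.
 * The values match the Linux UAPI header <linux/falloc.h>. */
#ifndef FALLOC_FL_KEEP_SIZE
#define FALLOC_FL_KEEP_SIZE 0x01
#endif
#ifndef FALLOC_FL_ZERO_RANGE
#define FALLOC_FL_ZERO_RANGE 0x10
#endif

/* Hypothetical helper: the mode value used with fallocate() to zero a
 * range while leaving the file size unchanged. */
static int zero_range_mode(void)
{
	return FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE;
}
```

Because the definitions sit behind #ifndef, they are inert on C libraries that already provide the flags.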
2024-04-08 | mdadm: Add README.md | Mariusz Tkaczyk | 1 | -0/+83
Describe supported metadata types, add step-by-step patch sending instruction, mention minimally supported kernel version and licensing. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-08 | mdadm: Add MAINTAINERS.md | Mariusz Tkaczyk | 1 | -0/+44
Describe rules maintainer should follow. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-08 | mdadm: add CHANGELOG.md | Mariusz Tkaczyk | 28 | -1401/+368
Bring the changelog back to life. Remove ANNOUNCEs. The changelog uses markdown format, to have one style. All releases are migrated to the new changelog. It was an exercise I undertook to familiarize myself with the mdadm history. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-02 | imsm: drive encryption policy implementation | Blazej Kucman | 1 | -0/+73
IMSM cares about drive encryption state. It is not allowed to mix disks with different encryption state within one md device. This policy will verify that attempt to use disks with different encryption states will fail. Verification is performed for devices NVMe/SATA Opal and SATA. There is one exception, Opal SATA drives encryption is not checked when ENCRYPTION_NO_VERIFY key with "sata_opal" value is set in conf, for this reason such drives are treated as without encryption support. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
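The rule that disks with different encryption states must not be mixed can be illustrated with a small consistency check. This is a sketch under assumptions, not the IMSM policy code: the enum enc_status, its members and the function encryption_states_match() are hypothetical, and treating any difference in state as a mismatch is a simplification.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encryption states; mdadm's real enum is not reproduced. */
enum enc_status { ENC_UNENCRYPTED, ENC_LOCKED, ENC_UNLOCKED };

/*
 * Return true when every disk reports the same encryption state.
 * An empty or single-disk set trivially matches.
 */
static bool encryption_states_match(const enum enc_status *disks, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++)
		if (disks[i] != disks[0])
			return false;
	return true;
}
```

A caller would reject the create or add request when this returns false, which corresponds to the failure the commit message describes.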
2024-04-02 | imsm: print disk encryption information | Blazej Kucman | 4 | -4/+79
Print SATA/NVMe disk encryption information in --detail-platform. Encryption Ability and Status will be printed for each disk. There is one exception, Opal SATA drives encryption is not checked when ENCRYPTION_NO_VERIFY key with "sata_opal" value is set in conf, for this reason such drives are treated as without encryption support. To test this feature, drives SATA/NVMe with Opal support or SATA drives with encryption support have to be used.
Example outputs of --detail-platform:
Non Opal, encryption enabled, SATA drive:
    Port0 : /dev/sdc (CVPR050600G3120LGN)
    Encryption(Ability|Status): Other|Unlocked
NVMe drive without Opal support:
    NVMe under VMD : /dev/nvme2n1 (PHLF737302GB1P0GGN)
    Encryption(Ability|Status): None|Unencrypted
Unencrypted SATA drive with OPAL support:
- default allow_tpm, we will get an error from mdadm:
    Port6 : /dev/sdi (CVTS4246015V180IGN)
    mdadm: Detected SATA drive /dev/sdi with Trusted Computing support.
    mdadm: Cannot verify encryption state. Requires libata.tpm_enabled=1.
    mdadm: Failed to get drive encrytpion information.
- default "allow_tpm" and config entry "ENCRYPTION_NO_VERIFY sata_opal":
    Port6 : /dev/sdi (CVTS4246015V180IGN)
    Encryption(Ability|Status): None|Unencrypted
- added "libata.allow_tpm=1" to boot parameters(requires reboot), the status will be read correctly:
    Port6 : /dev/sdi (CVTS4246015V180IGN)
    Encryption(Ability|Status): SED|Unencrypted
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-02 | Add key ENCRYPTION_NO_VERIFY to conf | Blazej Kucman | 4 | -5/+50
Add ENCRYPTION_NO_VERIFY config key and allow to disable checking encryption status for given type of drives. The key is introduced because of SATA Opal disks for which TPM commands must be enabled in libata kernel module, (libata.allow_tpm=1), otherwise it is impossible to verify encryption status. TPM commands are disabled by default. Currently the key only supports the "sata_opal" value, if necessary, the functionality is ready to support more types of disks. This functionality will be used in the next patches. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-02 | Add reading SATA encryption information | Blazej Kucman | 4 | -0/+351
Functionality reads information about SATA disk encryption. Technical documentation used is given in the implementation. The implementation is able to recognize two encryption standards for SATA drives, OPAL and ATA security. If the SATA drive supports OPAL, encryption status and ability are determined based on the Opal Level 0 discovery response; for ATA security, they are based on the ATA identify response. If SATA supports OPAL, ability is set to "SED", for ATA security to "Other". SED (Self-Encrypting Drive) is commonly used to describe a drive which uses the OPAL or Enterprise standards developed by the Trusted Computing Group. Ability "Other" is used for ATA security because we rely only on information from ATA identify, which describes the overall state of encryption. It is allowed to mix disks with different encryption ability such as "SED" and "Other", and it is not a security gap. Motivation for adding this functionality is to block mixing of disks in IMSM arrays with encryption enabled and disabled. The main goal is to not allow stealing data by rebuilding an array to an unencrypted drive which can be read elsewhere. For SATA Opal drives, the libata allow_tpm parameter must be enabled, which is necessary for Opal Security commands to work; therefore, if the parameter is not enabled, a SATA Opal disk cannot be used when the encryption is checked by metadata. Implemented functions will be used in one of the next patches. In one of the next patches, a flag will be added to allow disabling SATA Opal encryption checking due to the allow_tpm kernel setting dependency. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-02 | Add reading Opal NVMe encryption information | Blazej Kucman | 3 | -2/+396
For NVMe devices with Opal support, encryption information, status and ability are determined based on Opal Level 0 discovery response. Technical documentation used is given in the implementation. Ability in general describes what type of encryption is supported, Status describes in what state the disk with encryption support is. The current patch includes only the implementation of reading encryption information, functions will be used in one of the next patches. Motivation for adding this functionality is to block mixing of disks in IMSM arrays with encryption enabled and disabled. The main goal is to not allow stealing data by rebuilding array to not encrypted drive which can be read elsewhere. Value ENA_OTHER from enum encryption_ability will be used in the next patch. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-04-02 | mdadm: Move pr_vrb define to mdadm.h | Blazej Kucman | 2 | -2/+2
Move pr_vrb define from super-intel.c to mdadm.h to make it widely available. This change will be used in the next patches. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-22 | Remove all "if zeros" pt.2 | Mateusz Kusiak | 3 | -29/+1
Commit e15e8b00cbce ("Remove all "if zeros"") did not remove all "if 0" code blocks. This commit is cleanup for that commit. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-22 | mdadm: fix grow segfault for IMSM | Mariusz Tkaczyk | 2 | -2/+8
If sc is not initialized, there is possibility that sc.pols is not zeroed and it causes segfault. Add missing initialization. Add missing dev_policy_free() in two places. Fixes: f656201188d7 ("mdadm: drop get_required_spare_criteria()") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-22 | sysfs: remove vers parameter from sysfs_set_array | Mateusz Kusiak | 4 | -8/+5
9003 was passed directly to sysfs_set_array() since md_get_version() always returned this value. md_get_version() was removed long ago. Remove dead version check from sysfs_set_array(). Remove "vers" argument and fix function calls. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-22 | mdadm: Fix native --detail --export | Mariusz Tkaczyk | 5 | -27/+52
Mentioned commit (see Fixes) causes that UUID is not swapped as expected for native superblock. Fix this problem. For detail, we should avoid superblock calls, we can have information about supertype from map, use that. Simplify fname_from_uuid() by removing dependencies to metadata handler, it is not needed. Decision is taken at compile time, expect super1 but this function is not used by super1. Add warning about that. Remove separator, it is always ':'. Fixes: 60c19530dd7c ("Detail: remove duplicated code") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-22 | mdadm: set swapuuid in all handlers | Mariusz Tkaczyk | 3 | -0/+4
It is not set, so it should be 0 but it may vary on compilation settings. Set it always to 0. metadata should care to set UUID and read in proper endianness so it doesn't follow super1 concept of swapuuid to depend on endianness. It is not an attempt to fix endianness issues. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-12 | util.c: add limits.h include for NAME_MAX definition | Alexander Kanavin | 1 | -1/+1
Add limits.h include for NAME_MAX definition. Signed-off-by: Alexander Kanavin <alex@linutronix.de> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-12 | udev.c: Do not require libudev.h if DNO_LIBUDEV | Mariusz Tkaczyk | 1 | -0/+3
libudev may not be present at all, do not require it. Reported-by: Boian Bonev <bbonev@ipacct.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-12 | mdadm: remove inventory file | Mariusz Tkaczyk | 1 | -284/+0
It is a file with repo content list. It is outdated already. Remove it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Revert "policy.c: Avoid to take spare without defined domain by imsm" | Mariusz Tkaczyk | 1 | -4/+0
This reverts commit 3bf9495270d7 ("policy.c: Avoid to take spare without defined domain by imsm"). IMSM does not require to be special now because it doesn't create disk controller domain. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm: drop get_disk_controller_domain() | Mariusz Tkaczyk | 2 | -28/+0
This function is unused now. Drop it. Controller for IMSM is a device policy and is separated from user defined domains. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | imsm: test_and_add_device_policies() implementation | Mariusz Tkaczyk | 2 | -34/+90
This patch removes get_disk_controller_domain_imsm() in favour of test_and_add_device_policies_imsm(). It is used by create, add and mdmonitor. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Monitor, Incremental: use device policies | Mariusz Tkaczyk | 4 | -8/+15
spare_criteria is expanded to contain policies which will be generated by handler's get_spare_criteria() function. It provides a way to test device for metadata specific policies earlier than during add_do_super(), when device is already removed from previous array/container for Monitor. For Incremental, it ensures that all criteria are tested when trying spare. It is not tested when device contains valid metadata. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Manage: check device policies in manage_add_external() | Mariusz Tkaczyk | 1 | -0/+8
Only IMSM is going to use device policies so it is added to manage_add_external(). Test policies before adding the drive to the container. The change blocks adding a new device to a container which already contains non-matching devices. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Create: Use device policies | Mariusz Tkaczyk | 1 | -6/+25
Generate and compare policies, abort if policies do not match. It is tested for both create modes, with container and disk list specified directly. It is used if supertype supports it. For a case when disk list is specified, container may contain more devices, so additional check on container is done to analyze all disks. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm: test_and_add device policies implementation | Mariusz Tkaczyk | 2 | -0/+100
Add support for three scenarios: - obtaining array wide policies via fd, - obtaining array wide policies via struct mdinfo, - getting policies for particular drive from the request. Add proper functions and make them extern. These functions are used in next patches. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm.h: Introduce custom device policies | Mariusz Tkaczyk | 1 | -18/+36
The approach proposed here is to test drive policies outside validate_geometry() separately per every drive and add determined policies to list. The implementation reuses dev_policy we have in mdadm. This concept addresses following problems: - test drives if they fit together to criteria required by metadata handler, - test all drives assigned to the container even if some of them are not target of the request, mdmon is free to use any drive in the same container, - extensibility, new policies can be added to handler easy, - fix issues related to imsm controller domain verifying. Add superswitch function. It is used in next patches. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm: introduce sysfs_get_container_devnm() | Mariusz Tkaczyk | 4 | -21/+39
There are at least two places where it is done directly, so replace them with a function. Print message about creating external array, add "/dev/" prefix to refer directly to devnode. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Manage: implement manage_add_external() | Mariusz Tkaczyk | 1 | -61/+86
Move external add code to separate function. It is easier to control error path now. Error messages are adjusted. No functional changes. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | Manage: fix check after dereference issue | Mariusz Tkaczyk | 1 | -14/+12
The code dereferences dev_st earlier without checking, which is reported as a SAST problem. dev_st is needed for attempt_re_add(), but it is executed only if dv->disposition != 'S', so move the disposition check up. tst is a must to reach this place, dup_super() has to return a valid pointer, so all that needs to be checked is whether load_super() returns a superblock. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm: drop get_required_spare_criteria() | Mariusz Tkaczyk | 4 | -97/+140
Only IMSM implements get_spare_criteria, so load_super() in get_required_spare_criteria() is dead code. It is moved inside metadata handler, because only IMSM implements it. Give possibility to provide devnode to be opened. With that we can hide load_container() used only to fill spare criteria inside handler and simplify implementation in generic code. Add helper function for testing spare criteria in Incremental and error messages. File descriptor in get_spare_criteria_imsm() is always opened on purpose. New functionality added in next patches will require it. For the same reason, function is moved to other place. No functional changes. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-03-11 | mdadm: Add functions for spare criteria verification | Mariusz Tkaczyk | 5 | -71/+67
It is done in a similar way in a few places. As a result, two almost identical functions (dev_size_from_id() and dev_sector_size_from_id()) are removed. Now, it uses the same file descriptor to send two ioctls. Two extern functions are added, in next patches disk_fd_matches_criteria() is used. The next optimization is inline zeroing of struct spare_criteria. With that, we don't need to reset values in get_spare_criteria_imsm(). A dedicated boolean field for checking if criteria are filled is added. We don't need to execute the code if it is not set. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
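The combination of a zero-initialized criteria struct and a dedicated "filled" flag can be sketched as follows. This is an illustration under assumptions, not mdadm's code: the struct spare_criteria_sketch, its field names and the function disk_matches_criteria() are hypothetical, as is the exact comparison rule (minimum size, equal sector size).

```c
#include <stdbool.h>

/* Hypothetical stand-in for mdadm's spare criteria structure. */
struct spare_criteria_sketch {
	bool criteria_set;            /* false until the handler fills it */
	unsigned long long min_size;  /* smallest acceptable device size */
	unsigned int sector_size;     /* required logical sector size */
};

/*
 * Return true when the device satisfies the criteria. When nothing has
 * been filled in, there is nothing to verify and the device passes.
 */
static bool disk_matches_criteria(const struct spare_criteria_sketch *sc,
				  unsigned long long dev_size,
				  unsigned int dev_sector_size)
{
	if (!sc->criteria_set)
		return true;
	if (dev_size < sc->min_size)
		return false;
	if (dev_sector_size != sc->sector_size)
		return false;
	return true;
}
```

Declaring the struct as `struct spare_criteria_sketch sc = { 0 };` leaves criteria_set false, so callers get the "nothing to check" path without any explicit reset.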
2024-03-06 | Detail: remove duplicated code | Kinga Tanska | 1 | -20/+13
Remove duplicated code from Detail(), where MD_UUID and MD_DEVNAME are being set. Superblock is no longer required to print system properties. Now it tries to obtain map in two ways. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | mdadm: move documentation to folder | Mariusz Tkaczyk | 3 | -0/+0
Move documentation text files to directory. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | mdadm: remove mkinitramfs stuff | Mariusz Tkaczyk | 2 | -177/+0
This script uses mdadm.static, which is known to have been abandoned (probably not working) for years. Mdadm is integrated with dracut and mkinitramfs these days. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | mdadm: remove mdadm.spec | Mariusz Tkaczyk | 1 | -47/+0
This file is outdated, distributions have their own specs. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | mdadm: remove makedist | Mariusz Tkaczyk | 1 | -96/+0
Archives are generated by kernel.org automation, no need to submit them manually, so remove legacy solution. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | mdadm: remove TODO | Mariusz Tkaczyk | 1 | -213/+0
This file has not been updated in 16 years. There is no reason to keep it. Remove it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | super-intel: respect IMSM_DEVNAME_AS_SERIAL flag | Kinga Tanska | 1 | -6/+6
IMSM_DEVNAME_AS_SERIAL flag was respected only when searching serial using nvme or scsi device wasn't successful. This flag shall be applied first, to have user settings with the highest priority. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-29 | Monitor: Allow no PID in check_one_sharer() | Mateusz Kusiak | 1 | -0/+5
Commit 5fb5479ad100 ("Monitor: open file before check in check_one_sharer()") introduced a regression that prohibits monitor from starting if PID file does not exist. Add check for no PID file. Add missing fclose(). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-23 | test: run tests on system level mdadm | Mateusz Kusiak | 2 | -10/+18
The tests run with MDADM_NO_SYSTEMCTL flag by default, however it has no effect on udev. In case of external metadata, even if flag is set, udev will trigger systemd to launch mdmon. This commit changes test execution level, so the tests are run on system level mdadm, meaning local build must be installed prior to running tests. Add warning that the tests are run on system level mdadm and local build must be installed first. Do not call mdadm with "quiet" as it makes it not display critical messages necessary for debug. Remove forcing speed_limit and add restoring system speed_limit_max after test execution. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-23 | mdmon: refactor md device name check in main() | Mateusz Kusiak | 1 | -10/+11
Refactor mdmon main function to verify if fd is valid prior to checking device name. This is due to static code analysis complaining after change b938519e7719 ("util: remove obsolete code from get_md_name"). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-23 | super1: check fd before passing to get_dev_size() in add_to_super1() | Mateusz Kusiak | 1 | -1/+4
Check if file descriptor is valid before passing it to get_dev_size() in add_to_super(). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-23 | Grow: remove dead condition in Grow_reshape() | Mateusz Kusiak | 1 | -5/+1
Remove dead "if" condition from Grow_reshape(). Sysfs read check is performed earlier in the code. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-23 | Monitor: open file before check in check_one_sharer() | Mateusz Kusiak | 1 | -8/+5
Open file before performing checks in check_one_sharer() to avoid file tampering. Remove redundant access check. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
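The "open first, then check" pattern avoids the window between a separate access() test and the later open in which the file could be swapped. The sketch below is an assumption-based illustration, not the check_one_sharer() code: read_pid_file() and its return convention are hypothetical. It also covers the two points raised by the later fix in this log ("Monitor: Allow no PID in check_one_sharer()"), namely that a missing file is not an error and that the stream must be closed.

```c
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical helper: read a PID from path.
 * Returns the PID, 0 when the file does not exist (nothing recorded),
 * or -1 on any other error or malformed content.
 */
static long read_pid_file(const char *path)
{
	long pid = -1;
	FILE *fp = fopen(path, "r"); /* open directly, no prior access() */

	if (!fp)
		return errno == ENOENT ? 0 : -1;

	/* All further checks operate on the stream that is already open. */
	if (fscanf(fp, "%ld", &pid) != 1 || pid <= 0)
		pid = -1;

	fclose(fp); /* close on every path that opened the file */
	return pid;
}
```

Every decision after fopen() is made on the already opened stream, so the file that is checked and the file that is read are the same object.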
2024-02-23 | mdadm: signal_s() init variables | Mateusz Kusiak | 1 | -3/+2
Init sigaction structs in signal_s(). This approach might throw warnings for GCC 4.x and lower. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
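Initializing a sigaction struct at its declaration might look like the sketch below. It is not the signal_s() implementation: install_handler(), on_signal() and got_signal are hypothetical names. The `{ 0 }` initializer is the construct that older compilers may flag with a missing-braces style warning, which is presumably what the commit message alludes to.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Flag set by the demonstration handler below. */
static volatile sig_atomic_t got_signal;

static void on_signal(int sig)
{
	(void)sig;
	got_signal = 1;
}

/*
 * Hypothetical wrapper: install handler for sig using a fully
 * initialized struct sigaction. Returns 0 on success, -1 on error.
 */
static int install_handler(int sig, void (*handler)(int))
{
	struct sigaction sa = { 0 }; /* no member is left indeterminate */

	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	return sigaction(sig, &sa, NULL);
}
```

Zeroing the struct up front means fields such as sa_flags hold a defined value even if a later edit forgets to assign them.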
2024-02-23 | Create: add_disk_to_super() fix resource leak | Mateusz Kusiak | 1 | -1/+5
Fixes resource leak in add_disk_to_super(). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-20 | Add understanding output section in man | Mateusz Kusiak | 1 | -1/+20
Add new section in man for explaining mdadm outputs. Describe checkpoint entry. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-20 | Grow: Move update_tail assign to Grow_reshape() | Mateusz Kusiak | 1 | -6/+7
Due to e919fb0af245 ("FIX: Enable metadata updates for raid0") code can't enter super-intel.c:3415, resulting in checkpoint not being saved to metadata for second volume in matrix raid array. This results in checkpoint being stuck at last value for the first volume. Move st->update_tail to Grow_reshape() so it is assigned for each volume. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-20 | Super-intel: Fix first checkpoint restart | Mateusz Kusiak | 1 | -0/+3
When imsm based array is stopped after reaching first checkpoint and then assembled, first checkpoint is reported as 0. This behaviour is valid only for initial checkpoint, if the array was stopped while performing some action. Last checkpoint value is not taken from metadata but always starts with 0 and it's incremented when sync_completed in sysfs changes. In simplification, read_and_act() is responsible for checkpoint updates and is executed each time sysfs checkpoint update happens. For first checkpoint it is executed twice and due to marking checkpoint before triggering any action on the array, it is impossible to read sync_completed from sysfs in just two iterations. The workaround to this is not marking any checkpoint for first sysfs checkpoint after RAID assembly, to preserve checkpoint value stored in metadata. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-20 | monitor: refactor checkpoint update | Mateusz Kusiak | 1 | -26/+25
"if" statements of checkpoint updates have too many responsibilities. This results in unclear code flow and duplicated code. Refactor checkpoint update code and simplify "if" statements. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-20 | Remove hardcoded checkpoint interval checking | Mateusz Kusiak | 1 | -16/+6
Mdmon assumes that kernel marks checkpoint every 1/16 of the volume size and that the checkpoints are equal in size. This is not true, kernel may mark checkpoints more frequently depending on several factors, including sync speed. This results in checkpoints reported by mdadm --examine falling behind the one reported by kernel. Remove hardcoded checkpoint interval checking. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-15 | Release mdadm-4.3 (tag: mdadm-4.3) | Mariusz Tkaczyk | 4 | -5/+5
Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-13 | mdadm: fix update=resync regression | Mariusz Tkaczyk | 1 | -0/+4
mdadm --assemble --update=resync started failing with the error "mdadm: --update=resync not understood for 1.x metadata". It is a regression. Add omitted branch to fix error. Resubmitted, original author is not responding. https://lore.kernel.org/linux-raid/ZZqJlCToUS3Qrl4J@bianca.dpss.psy.unipd.it/ Fixes: 7e8daba8b793 ("super1: refactor the code for enum") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-09Revert "mdadm: remove container_enough logic"Mariusz Tkaczyk4-1/+48
The mentioned patch changes how IMSM member arrays are assembled: they are updated by the incremental process of every new drive. Previously, member arrays were created and filled once, by the incremental process of the last drive. We found regressions with various impact. Unfortunately, initial testing didn't show them. The regressions are connected to drive appearance order and may not be reproducible on every configuration. There are at least two known issues for now: - sysfs attributes are filled using old metadata if there is an outdated drive and it is enumerated first. - rebuild may be aborted and restarted from the beginning after reboot, if the drive under rebuild is enumerated last. This reverts commit 4dde420fc3e24077ab926f79674eaae1b71de10b. It fixes checkpatch issues and reworks logic to remove empty "if" branch in Incremental. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-02-09super1: remove support for name= in configMariusz Tkaczyk5-94/+19
Only super1 provides "name=" to config. It is recorded in metadata, so there is no need to duplicate the same information. UUID is our main key. The name is not used by Incremental, and Assemble handles an empty name well because other supertypes don't set it in conf. Expecting the name in config to be the same as in metadata is bug prone. Config should be the place where the user can define customized settings. Remove printing "name=" from mdadm config creation commands. Ignore the name in config file to keep backward compatibility. Remove description from man mdadm.conf. Update 00conftest because "name" is no longer accepted. As the name is ignored, error for mdadm --detail is not printed. Reported-by: Stefan Fleischmann <sfle@kth.se> Fixes: e2eb503bd797 ("mdadm: Follow POSIX Portable Character Set") Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-01-24super-intel: Remove inaccessible codeMateusz Kusiak1-17/+0
Remove inaccessible "if" statement from imsm_set_array_state(). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-01-24Replace "none" with macroMateusz Kusiak14-29/+42
String "none" is used many times throughout the code. Replace "none" strings with predefined macro. Add str_is_none() for comparing strings with "none". Replace str(n)cmp calls with function. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
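A minimal sketch of the kind of helper this commit describes, not the code from the patch. Only the name str_is_none() comes from the commit message; the macro name STR_NONE, the const parameter and the NULL handling are assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical macro name; the commit only says a predefined macro replaces "none". */
#define STR_NONE "none"

/*
 * Return true when str is exactly "none".
 * Treating NULL as "not none" is an assumption of this sketch.
 */
static inline bool str_is_none(const char *str)
{
	return str != NULL && strcmp(str, STR_NONE) == 0;
}
```

A call site that used to read strcmp(buf, "none") == 0 would then read str_is_none(buf), which keeps the literal in one place.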
2024-01-24Define sysfs max buffer sizeMateusz Kusiak10-52/+54
sysfs_get_str() usages have inconsistent buffer sizes. This results in wild buffer declarations and redundant memory usage. Define maximum buffer size for sysfs strings. Replace wild sysfs string buffer sizes with a globally defined value. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-01-16tests: Gate tests for linear flavor with variable LINEARSong Liu10-3/+60
linear flavor is being removed in the kernel [1], so tests for the linear flavor will fail. Add detection for linear flavor and --disable-linear option, with the same logic as multipath. [1] https://lore.kernel.org/linux-raid/20231214222107.2016042-1-song@kernel.org/ Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2024-01-05manage: adjust checking subarray state in update_subarrayPawel Piatkowski1-1/+2
Only changing bitmap related consistency_policy requires subarray to be inactive. consistency_policy with PPL or NO_PPL value can be changed on active subarray. It fixes regression introduced in commit db10eab68e652f141169 ("Fix --update-subarray on active volume") Signed-off-by: Pawel Piatkowski <pawel.piatkowski@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2023-12-19Remove all "if zeros"Mateusz Kusiak5-127/+0
No more random encounters of "if zeros". Remove all "if 0" code blocks. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
2023-11-21udev: Move udev_block() and udev_unblock() into udev.cMateusz Grzonka6-37/+54
Add kernel style comments and better error handling. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-11-21Mdmonitor: Improve udev event handlingMateusz Grzonka8-113/+253
Mdmonitor is waiting for the udev queue to become empty. Even if the queue becomes empty, udev might still be processing the last event. However, we want to wait and wake up mdmonitor when udev has finished processing events. Also, the udev queue interface is considered legacy and should not be used outside of udev. Use udev monitor instead, and wake up mdmonitor on every event triggered by udev for md block device. We need to generate more change events from kernel, because they are missing in some situations, for example, when rebuild started. This will be addressed in a separate patch. Move udev specific code into separate functions, and place them in udev.c file. Also move use_udev() logic from lib.c into newly created file. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26Fix assembling RAID volume by using incrementalPawel Piatkowski1-6/+4
After change "mdadm: remove container_enough logic" IMSM volumes are started immediately. If volume is during reshape, then it will be blocked by block_subarray() during first mdadm -I <devname>. Assemble_container_content() for next disk will see the change because metadata version from sysfs and metadata doesn't match and will execute sysfs_set_array again. Then it fails to set same component_size, it is prohibited by kernel. If array is frozen then first sign from metadata version is different ("/" vs "-"), so exclude it from comparison. All we want is to double check that base properties are set and we don't need to call sysfs_set_array again. Signed-off-by: Pawel Piatkowski <pawel.piatkowski@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
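The comparison described above can be sketched as follows. This is an illustration under assumptions, not the patched code: the function name and the sample strings in the usage note are hypothetical, and the only fact taken from the commit message is that the first character of the metadata version ("/" vs "-") is excluded from the comparison.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Compare two metadata version strings while ignoring their first
 * character, which differs ("/" vs "-") when the array is frozen.
 * Hypothetical helper name.
 */
static bool same_version_ignoring_first(const char *sysfs_ver, const char *meta_ver)
{
	/* An empty string has no first character to skip. */
	if (sysfs_ver[0] == '\0' || meta_ver[0] == '\0')
		return sysfs_ver[0] == meta_ver[0];

	return strcmp(sysfs_ver + 1, meta_ver + 1) == 0;
}
```

With this helper, hypothetical values such as "/md127/0" and "-md127/0" compare as equal, so the caller can skip a second sysfs_set_array call.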
2023-10-26mdadm: remove container_enough logicPawel Piatkowski4-46/+1
Arrays without enough disk count will be assembled but not started. Now RAIDs will be assembled always (even if they are failed). RAID devices in all states will be assembled and exposed to mdstat. This change affects only IMSM (for ddf it wasn't used, container_enough was set to true always). Removed this logic from incremental_container as well with runstop checking because runstop condition is being verified in assemble_container_content function. Signed-off-by: Pawel Piatkowski <pawel.piatkowski@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm/super1: Add MD_FEATURE_RAID0_LAYOUT if kernel>=5.4Xiao Ni1-3/+16
Kernel v5.4 and later add one feature bit, MD_FEATURE_RAID0_LAYOUT. A layout must be specified for raid0 with more than one zone. But raid0 with one zone in fact also has a default layout. Currently, for raid0 with one zone, *unknown* layout can be seen when running the mdadm -D command. The reason is that mdadm doesn't set MD_FEATURE_RAID0_LAYOUT for raid0 with one zone. Then in kernel space, super_1_validate sets mddev->layout to -1 because MD_FEATURE_RAID0_LAYOUT is missing. In fact, the raid0 io path uses the default layout. Set raid0_need_layout to true if kernel_version>=v5.4. Fixes: 329dfc28debb ('Create: add support for RAID0 layouts.') Signed-off-by: Xiao Ni <xni@redhat.com> Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm/ddf: Abort when raid disk is smaller in getinfo_super_ddfXiao Ni1-2/+4
The metadata is corrupted when the raid_disk<0. So abort directly. This also can avoid a building error: super-ddf.c:1988:58: error: array subscript -1 is below array bounds of ‘struct phys_disk_entry[0]’ Suggested-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Ackedy-by: Xiao Ni <xni@redhat.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm/tests: Don't run mknod before losetupXiao Ni1-1/+0
Sometimes it can fail: losetup: /var/tmp/mdtest0: failed to set up loop device: No such device or address /dev/loop0 and /var/tmp/mdtest0 are already created before losetup. Because losetup can create device node by itself. So remove mknod. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26Fix race of "mdadm --add" and "mdadm --incremental"Li Xiao Keng1-8/+16
There is a raid1 with sda and sdb. And we add sdc to this raid, it may return -EBUSY. The main process of --add: 1. dev_open(sdc) in Manage_add 2. store_super1(st, di->fd) in write_init_super1 3. fsync(fd) in store_super1 4. close(di->fd) in write_init_super1 5. ioctl(ADD_NEW_DISK) Step 2 and 3 will add sdc to metadata of raid1. There will be udev(change of sdc) event after step4. Then "/usr/sbin/mdadm --incremental --export $devnode --offroot $env{DEVLINKS}" will be run, and the sdc will be added to the raid1. Then step 5 will return -EBUSY because it checks if device isn't claimed in md_import_device()->lock_rdev()->blkdev_get_by_dev() ->blkdev_get(). It will be confusing for users because sdc is added first time. The "incremental" will get map_lock before add sdc to raid1. So we add map_lock before write_init_super in "mdadm --add" to fix the race of "add" and "incremental". Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm/tests: Fix regular expression failureXiao Ni1-2/+2
The test fails because of the regular expression. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26Incremental: remove obsoleted calls to udisksColy Li1-51/+13
Utility udisks is removed from udev upstream, calling this obsoleted command in run_udisks() doesn't make any sense now. This patch removes the call chain of udisks, which includes routines run_udisks(), force_remove(), and 2 locations where force_remove() is called. Considering force_remove() is removed with udisks util, it is fair to remove Manage_stop() inside force_remove() as well. In the two modifications where the calls to force_remove() are removed, the failure from Manage_subdevs() can be safely ignored, because, 1) udisks doesn't exist, no need to check the return value to umount the file system by udisks and remove the component disk again. 2) After the 'I' incremental remove, there is another 'r' hot remove following up. The first incremental remove is a best-try effort. Therefore in this patch, where force_remove() is removed, the return value of calling Manage_subdevs() is not checked either. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Cc: Jes Sorensen <jes@trained-monkey.org> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm: Follow POSIX Portable Character SetMariusz Tkaczyk8-102/+113
When the user creates a device with a name that contains whitespace, mdadm timeouts and throws an error. This issue is caused by udev, which truncates /dev/md link until the first whitespace. This patch introduces prohibition of characters other than A-Za-z0-9.-_ in the device name. Also, it prohibits using leading "-" in device name, so name won't be confused with cli parameter. Set of allowed characters is taken from POSIX 3.280 Portable Character Set. Also, device name length now is limited to NAME_MAX. In some places, there are other requirements for string length (e.g. size up to MD_NAME_MAX for device name). This routine is made to follow POSIX and other, more strict limitations should be checked separately. We are aware of the risk of regression in exceptional cases (as escape_devname function is removed) that should be fixed by updating the array name. The POSIX validation is added for: - 'name' parameter in every mode. - first devlist entry, for Build, Create, Assemble, Manage, Grow. - config entries, both devname and "name=". Additionally, some manual cleanups are made. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
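The validation rules listed above (only A-Za-z0-9.-_, no leading "-", length limited to NAME_MAX) can be sketched as below. This is an illustration, not the function added by the patch: the name is_posix_name() is hypothetical, and rejecting NULL and the empty string is an assumption of this sketch.

```c
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#ifndef NAME_MAX
#define NAME_MAX 255 /* fallback if <limits.h> does not define it */
#endif

/* Hypothetical helper: check a device name against the rules in the commit message. */
static bool is_posix_name(const char *name)
{
	static const char allowed[] =
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"abcdefghijklmnopqrstuvwxyz"
		"0123456789.-_";
	size_t len;

	/* A leading "-" could be confused with a cli parameter. */
	if (name == NULL || name[0] == '\0' || name[0] == '-')
		return false;

	len = strlen(name);
	if (len > NAME_MAX)
		return false;

	/* strspn() returns the length of the leading run of allowed characters. */
	return strspn(name, allowed) == len;
}
```

Stricter limits mentioned in the message, such as MD_NAME_MAX for the device name, are deliberately left out and would be checked separately.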
2023-10-26mdadm: define ident_set_devname()Mariusz Tkaczyk5-33/+92
Use dedicated set method for ident->devname. Now, devname validation is done early for modes where device is created (Build, Create and Assemble). The rules used for devname validation are derived from config file. It could cause regression in exceptional cases where an existing device has a name which doesn't match criteria for Manage and Grow modes. It is low risk and those modes are not omitted from early devname validation. The user can use the main numbered devnode to avoid this problem. Messages exposed to user are changed so it might cause a regression in negative scenarios. Error codes are not changed. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm: refactor ident->name handlingMariusz Tkaczyk4-23/+104
Create dedicated setter for name in mddev_ident and propagate it. Following changes are made: - move duplicated code from config.c and mdadm.c into new function. - Add error enum in mdadm.h. - Use MD_NAME_MAX instead of hardcoded value in mddev_ident. - Use secure functions. - Add more detailed verification of the name. - make error messages reusable for cmdline and config: - for cmdline, these are errors so use pr_err(). - for config, these are just warnings, so use pr_info(). Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26mdadm: set ident.devname if applicableMariusz Tkaczyk4-71/+55
This patch tries to propagate the usage of struct mddev_ident for cmdline where it is applicable. To avoid regression, this value is derived from devlist->devname for applicable modes only. As a result, the whole structure is passed to some functions. It produces some changes for Build, Create and Assemble. No functional changes intended. The goal of the change is to unify devname validation which is done in next patches. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: create 00confnamesMariusz Tkaczyk2-0/+127
The test is an attempt to document current implementation of devnode and name handling for config entries. It is focused on incremental- default way of array assembling on boot. The expectations are aligned to current implementation for native metadata because it is the most complicated scenario- both variables can be set. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: create names_templateMariusz Tkaczyk2-69/+70
Create templates directory and names_template. Move code from 00createnames. This code will be reused for 00confnames in next patch. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test for raid456 deadlock againYu Kuai1-0/+34
This is a regression test for commit ("md/raid5: fix a deadlock in the case that reshape is interrupted"). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test that reshape can corrupt dataYu Kuai1-0/+35
This is a regression test for commit 1544e95c6dd8 ("md: fix data corruption for raid456 when reshape restart while grow up"). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test that raid456 can't assemble againYu Kuai1-0/+33
This is a regression test for commit 0aecb06e2249 ("md/raid5: don't allow replacement while reshape is in progress"). Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test that raid456 can't assembleYu Kuai1-0/+32
If recovery is interrupted and reshape is started, then this array can't assemble anymore. The problem is supposed to be fixed by [1]. [1] https://lore.kernel.org/linux-raid/20230529031045.1760883-1-yukuai1@huaweicloud.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test for raid456 deadlockYu Kuai1-0/+58
The deadlock is described in [1], as the last patch described, it's fixed first by [2], however this fix will be reverted and the deadlock is supposed to be fixed by [3]. [1] https://lore.kernel.org/linux-raid/5ed54ffc-ce82-bf66-4eff-390cb23bc1ac@molgen.mpg.de/T/#t [2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/ [3] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a regression test for raid10 deadlockYu Kuai2-0/+88
The deadlock is described in [1], it's fixed first by [2], however, it turns out this commit will trigger other problems[3], hence this commit will be reverted and the deadlock is supposed to be fixed by [1]. [1] https://lore.kernel.org/linux-raid/20230322064122.2384589-5-yukuai1@huaweicloud.com/ [2] https://lore.kernel.org/linux-raid/20220621031129.24778-1-guoqing.jiang@linux.dev/ [3] https://lore.kernel.org/linux-raid/20230322064122.2384589-2-yukuai1@huaweicloud.com/ Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: support to skip checking dmesgYu Kuai1-2/+6
Prepare to add a regression test for raid10 that require error injection to trigger error path, and kernel will complain about io error, checking dmesg for error log will make it impossible to pass this test. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26tests: add a new test for rdev lifetimeYu Kuai1-0/+34
This test adds and removes an underlying disk to and from a raid array concurrently, to verify that the following problem is fixed: run mdadm test 23rdev-lifetime at Fri Apr 28 03:25:30 UTC 2023 md: could not open device unknown-block(1,0). sysfs: cannot create duplicate filename '/devices/virtual/block/md0/md/dev-ram0' CPU: 26 PID: 10521 Comm: test Not tainted 6.3.0-rc2-00134-g7b3a8828043c #115 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.1-2.fc37 04/014 Call Trace: <TASK> dump_stack_lvl+0xe7/0x180 dump_stack+0x18/0x30 sysfs_warn_dup+0xa2/0xd0 sysfs_create_dir_ns+0x119/0x140 kobject_add_internal+0x143/0x4d0 kobject_add_varg+0x35/0x70 kobject_add+0x64/0xd0 bind_rdev_to_array+0x254/0x840 [md_mod] new_dev_store+0x14d/0x350 [md_mod] md_attr_store+0xc1/0x1a0 [md_mod] sysfs_kf_write+0x51/0x70 kernfs_fop_write_iter+0x188/0x270 vfs_write+0x27e/0x460 ksys_write+0x85/0x180 __x64_sys_write+0x21/0x30 do_syscall_64+0x6c/0xe0 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f26bacf5387 Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 84 RSP: 002b:00007ffe98d79e68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f26bacf5387 RDX: 0000000000000004 RSI: 000055bd10282bf0 RDI: 0000000000000001 RBP: 000055bd10282bf0 R08: 000000000000000a R09: 00007f26bad8b4e0 R10: 00007f26bad8b3e0 R11: 0000000000000246 R12: 0000000000000004 R13: 00007f26badc8520 R14: 0000000000000004 R15: 00007f26badc8700 </TASK> Signed-off-by: Yu Kuai <yukuai3@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-10-26Assemble: fix redundant memory freeKinga Tanska1-2/+0
Commit e9fb93af0f76 ("Fix memory leak in file Assemble") fixes a few memory leaks in Assemble, but it introduces a problem with assembling RAID volumes. It was caused by clearing metadata too early, not only on failure in the select_devices() function. This commit removes the redundant memory free. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Add compiler defenses flagsMateusz Grzonka1-12/+29
It is essential to avoid buffer overflows and similar bugs as much as possible. According to Intel rules we are obligated to verify certain compiler flags, so it will be much easier if they are added to the Makefile. Add gcc flags for prevention of buffer overflows, format string vulnerabilities, stack protection to prevent stack overwrites and aslr enablement through -fPIE. Also make the flags configurable. The changes were verified on gcc versions 7.5, 8.3, 9.2, 10 and 12.2. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: Add reading vmd register for finding imsm capabilityMateusz Grzonka3-6/+130
Currently mdadm does not find imsm capability when running inside VM. This patch adds the possibility to read from vmd register and check for capability, effectively allowing to use mdadm with imsm inside virtual machines. Additionally refactor find_imsm_capability() to make assignments in new lines. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01platform-intel: limit guid lengthKinga Tanska2-4/+4
Moving GUID_STR_MAX to header to use it as a length limitation for snprintf function. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Fix unsafe string functionsKinga Tanska4-9/+9
Add string length limitations where necessary to avoid buffer overflows. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Fix memory leak in file mdadmGuanqin Miao1-0/+4
When we test mdadm with asan, we found some memory leaks in mdadm.c We fix these memory leaks based on code logic. Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Fix memory leak in file ManageGuanqin Miao1-2/+11
When we test mdadm with asan, we found some memory leaks in Manage.c We fix these memory leaks based on code logic. v2: Fix free() of uninitialized 'tst' in abort path. Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Fix memory leak in file KillGuanqin Miao1-1/+8
When we test mdadm with asan, we found some memory leaks in Kill.c We fix these memory leaks based on code logic. Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Fix memory leak in file AssembleGuanqin Miao1-2/+12
When we test mdadm with asan, we found some memory leaks in Assemble.c We fix these memory leaks based on code logic. v2: Set st = NULL before jumping to loop Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01mdadm: Stop mdcheck_continue timer when mdcheck_start service can finish checkXiao Ni1-1/+7
mdcheck_continue is triggered by the mdcheck_start timer. It's used to continue the check action if the raid is too big and the mdcheck_start service can't finish the check action. If mdcheck_start can finish the check action, the mdcheck_continue service is not needed anymore. So stop it when the mdcheck_start service can finish the check action. Signed-off-by: Xiao Ni <xni@redhat.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01Add secure gethostname() wrapperBlazej Kucman6-8/+24
gethostname() func does not ensure null-terminated string if hostname is longer than buffer length. For security, a function s_gethostname() has been added to ensure that "\0" is added to the end of the buffer. Previously this had to be handled in each place of the gethostname() call. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
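A minimal sketch of such a wrapper, assuming a (buffer, length) parameter list and a pass-through return value; the commit message names s_gethostname() but does not give its signature, so both are assumptions, as is the argument check.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Call gethostname() and then force NUL termination, because a name
 * longer than the buffer may be left unterminated.
 * Signature and return convention are assumptions of this sketch.
 */
static int s_gethostname(char *buf, int buf_len)
{
	int ret;

	if (buf == NULL || buf_len <= 0)
		return -1;

	ret = gethostname(buf, (size_t)buf_len);

	/* Always terminate, whether or not the name was truncated. */
	buf[buf_len - 1] = '\0';
	return ret;
}
```

Callers can then treat the buffer as a C string without repeating the termination step at every call site.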
2023-09-01imsm: fix free space calculationsMariusz Tkaczyk1-20/+30
Between two volumes or between last volume and metadata at least IMSM_RESERVED_SECTORS gap must exist. Currently the gap can be doubled because metadata reservation contains IMSM_RESERVED_SECTORS too. Divide reserve variable into pre_reservation and post_reservation to be more flexible and decide separately if each reservation is needed. Pre_reservation is needed only when a volume is created and it is not a real first volume in a container (we can check that by extent_idx). This type of reservation is not needed for expand. Post_reservation is not needed only if real last volume is created or expanded because reservation is done with the metadata. The volume index in metadata cannot be trusted, because the real volume order can be reversed. It is safer to use extent table, it is sorted by start position. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: return free space after volume for expandMariusz Tkaczyk1-34/+37
merge_extents() routine searches for the biggest free space. For expand, it works only in standard cases where the last volume is expanded and the free space is determined after the last volume. Add volume index to extent struct and use that to determine size after super->current_vol during expand. Limitation to last volume is no longer needed. It unblocks scenarios where kill-subarray is used to remove first volume and later it is recreated (now it is the second volume, even if it is placed before existing one). Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: move expand verification code into new functionMariusz Tkaczyk1-86/+101
The code here is too complex. Move it to separate function and simplify it. Add more error messages. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: introduce round_member_size_to_mb()Mariusz Tkaczyk1-10/+21
Extract rounding logic to separate function. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: imsm_get_free_size() refactor.Mariusz Tkaczyk1-13/+14
Move minsize calculations up. Add error message if free size is too small. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-09-01imsm: move sum_extents calculations to merge_extents()Mariusz Tkaczyk1-18/+19
This logic is only used by merge_extents() code, there is no need to pass it as parameter. Move it up. Add proper description. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-08-07imsm: Fix possible segfault in check_no_platform()Mateusz Grzonka1-0/+5
conf_line() may return NULL, which is not handled and might cause segfault. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-05-08enable RAID for SATA under VMDKevin Friedberg3-13/+37
Detect when a SATA controller has been mapped under Intel Alderlake RST VMD, so that it can use the VMD controller's RAID capabilities. Create new device type SYS_DEV_SATA_VMD and list separate controller to prevent mixing with the NVMe SYS_DEV_VMD devices on the same VMD domain. Signed-off-by: Kevin Friedberg <kev.friedberg@gmail.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-05-08mdadm: numbered names verificationMariusz Tkaczyk4-17/+50
New functions added to remove literals and make the code reusable. Use parse_num() instead of is_numer(). Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-05-08mdadm: define is_devname_ignore()Mariusz Tkaczyk5-10/+20
Use function instead of direct checks across code. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-05-08mdadm: define DEV_NUM_PREFMariusz Tkaczyk3-7/+15
Use define instead of inlines. Add _LEN define. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-05-08mdadm: define DEV_MD_DIRMariusz Tkaczyk13-45/+54
It is used many times. Additionally define _LEN to avoid repeated strlen() calls when length is needed. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
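The define-plus-_LEN pattern can be sketched as follows. The path value "/dev/md/" and the helper has_dev_md_prefix() are assumptions for illustration; the commit message only names DEV_MD_DIR and says a _LEN define avoids repeated strlen() calls.

```c
#include <stdbool.h>
#include <string.h>

/* The path value is an assumption; the commit message does not state it. */
#define DEV_MD_DIR "/dev/md/"
/* sizeof counts the terminating NUL, so subtract one to get the string length. */
#define DEV_MD_DIR_LEN (sizeof(DEV_MD_DIR) - 1)

/* Hypothetical helper: true when path starts with DEV_MD_DIR. */
static bool has_dev_md_prefix(const char *path)
{
	return strncmp(path, DEV_MD_DIR, DEV_MD_DIR_LEN) == 0;
}
```

Because DEV_MD_DIR_LEN is a compile-time constant, prefix checks like the one above need no runtime strlen() on the literal.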
2023-04-10Remove the config files in mdcheck_start|continue serviceXiao Ni2-4/+0
We set MDADM_CHECK_DURATION in the mdcheck_start|continue.service files. And mdcheck doesn't use any configs from the config file. So we can remove the dependencies. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-04-10Bump minimum kernel version to 2.6.32Jes Sorensen5-45/+2
Summary: At this point it probably is reasonable to drop support for anything prior to 3.10. Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-04-10Fix some cases eyesore formattingJes Sorensen1-57/+60
Summary: No functional change .... just make it more readable. Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-04-10super1: fix truncation check for journal deviceHristo Venev1-2/+3
The journal device can be smaller than the component devices. Fixes: 171e9743881e ("super1: report truncated device") Signed-off-by: Hristo Venev <hristo@venev.name> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-04-10Fix null pointer for incremental in mdadmmiaoguanqin1-0/+3
When we execute mdadm --assemble, udev-md-raid-assembly.rules is triggered. Then, when we stop the array, we find a coredump for mdadm --incremental. The function stack is as follows: #0 enough (level=10, raid_disks=4, layout=258, clean=1, avail=avail@entry=0x0) at util.c:555 #1 0x0000562170c26965 in Incremental (devlist=<optimized out>, c=<optimized out>, st=0x5621729b6dc0) at Incremental.c:514 #2 0x0000562170bfb6ff in main (argc=<optimized out>, argv=<optimized out>) at mdadm.c:1762 Function enough() uses the array avail. avail is allocated in function count_active(), which may not allocate it, causing a coredump. Fix this coredump. Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com> Signed-off-by: lixiaokeng <lixiaokeng@huawei.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-23Create: Fix checking for container in update_metadataMateusz Grzonka1-1/+1
The commit 8a4ce2c05386 ("Create: Factor out add_disks() helpers") introduced a regression that caused timeouts and udev failing to create links. Steps to reproduce the issue were as following: $ mdadm -CR imsm -e imsm -n4 /dev/nvme[0-3]n1 $ mdadm -CR vol -l5 -n4 /dev/nvme[0-3]n1 --assume-clean I found the check for container was wrong because negation was missing. Fixes: 8a4ce2c05386 ("Create: Factor out add_disks() helpers") Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-23Revert "Revert "mdadm/systemd: remove KillMode=none from service file""Mariusz Tkaczyk1-1/+0
This reverts commit 28a083955c6f58f8e582734c8c82aff909a7d461. Resolved by commit 723d1df4946e ("mdmon: Improve switchroot interactions.") We are ready to drop it. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-20Improvements for IMSM_NO_PLATFORM testing.NeilBrown5-6/+55
Factor out IMSM_NO_PLATFORM testing into a single function that caches the result. Allow mdmon to explicitly set the result to "1" so that we don't need the ENV var in the unit file. Check if the kernel command line contains "mdadm.imsm.test=1" and in that case assert NO_PLATFORM. This simplifies testing in a virtual machine. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19mdopen: always try create_named_array()NeilBrown1-0/+1
mdopen() will use create_named_array() to ask the kernel to create the given md array, but only if it is given a number or name. If it is NOT given a name and is required to choose one itself using find_free_devnm() it does NOT use create_named_array(). On kernels with CONFIG_BLOCK_LEGACY_AUTOLOAD not set, this can result in failure to assemble an array. This is particularly seen when the "name" of the array begins with a host name different to the name of the host running the command. So add the missing call to create_named_array(). Link: https://bugzilla.kernel.org/show_bug.cgi?id=217074 Signed-off-by: NeilBrown <neilb@suse.de> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19mdmon: Improve switchroot interactions.NeilBrown6-9/+16
We need a new mdmon@mdfoo instance to run in the root filesystem after switch root, as /sys and /dev are removed from the initrd. systemd will not start a new unit with the same name running while the old unit is still active, and we want the two mdmon processes to overlap in time to avoid any risk of deadlock, which can happen when a write is attempted with no mdmon running. So we need a different unit name in the initrd than in the root. Apart from the name, everything else should be the same. This is easily achieved using a different instance name as the mdmon@.service unit file already supports multiple instances (for different arrays). So start "mdmon@mdfoo.service" from root, but "mdmon@initrd-mdfoo.service" from the initrd. udev can tell which circumstance is the case by looking for /etc/initrd-release. continue_from_systemd() is enhanced so that the "initrd-" prefix can be requested. Teach mdmon that a container name like "initrd/foo" should be treated just like "foo". Note that systemd passes the instance name "initrd-foo" as "initrd/foo". We don't need a similar mechanism at shutdown because dracut runs "mdmon --takeover --all" when appropriate. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19mdmon: Remove need for KillMode=noneNeilBrown1-1/+6
mdmon needs to keep running during the switchroot out of (at boot) and then back into (at shutdown) the initrd. It runs until a new mdmon takes over. KillMode=none is used to achieve this, with the help of --offroot which sets argv[0][0] to '@' which systemd understands. This is needed because mdmon is currently run in system-mdmon.slice which conflicts with shutdown.target so without KillMode=none mdmon would get killed early in shutdown when system-mdmon.slice is removed. As described in systemd.service(5), this conflict with shutdown can be resolved by explicitly requesting system.slice, which is a natural counterpart to DefaultDependencies=no. So add that, and also add IgnoreOnIsolate=true to avoid another possible source of an early death. With these we no longer need KillMode=none which the systemd developers have marked as "deprecated". Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19mdmon: change systemd unit file to use --foregroundNeilBrown1-2/+1
There is no value in mdmon forking when it is running under systemd - systemd can still track it anyway. So add --foreground option, and remove "Type=forking". Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19mdmon: don't test both 'all' and 'container_name'.NeilBrown1-7/+4
If 'all' is not set, then container_name must be NULL, as nothing else can set it. So simplify the test to ignore container_name. This makes the purpose of the code more obvious. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-19Use existence of /etc/initrd-release to detect initrd.NeilBrown1-9/+1
Since v183, systemd has used the existence of /etc/initrd-release to detect if it is running in an initrd, rather than looking at the magic number of the root filesystem's device. It is time for mdadm to do the same. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
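The detection described above reduces to a file-existence test. A minimal sketch in C, assuming hypothetical helper names (`marker_exists()` and `in_initrd_sketch()` are illustrative and are not taken from mdadm's sources):

```c
#include <unistd.h>

/* Hypothetical helper: returns 1 if the given path exists, 0 otherwise. */
static int marker_exists(const char *path)
{
	return access(path, F_OK) == 0;
}

/* Returns 1 when /etc/initrd-release exists, i.e. we appear to be in an initrd. */
static int in_initrd_sketch(void)
{
	return marker_exists("/etc/initrd-release");
}
```

Taking the path as a parameter in the inner helper keeps the check testable outside a real initrd.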
2023-03-19Define alignof using _Alignof when using C11 or newerKhem Raj1-1/+11
WG14 N2350 made it very clear that having type definitions within "offsetof" is undefined behavior [1]. This patch changes the implementation of the alignof_slot macro to use the builtin "_Alignof", which avoids the undefined behavior when using -std=c11 or newer. clang 16+ has started to flag this [2]. Fixes the build when using -std >= gnu11 and clang 16+. Older compilers (gcc < 4.9 or clang < 8) have a buggy _Alignof even though they may support C11, so exclude those compilers too. [1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm [2] https://reviews.llvm.org/D133574 Upstream-Status: Pending Signed-off-by: Khem Raj <raj.khem@gmail.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
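The idea can be illustrated with a small macro that selects `_Alignof` when the compiler reports C11 or newer. This is a simplified sketch: the macro name `ALIGNOF_SKETCH` is hypothetical, and the compiler-version exclusions mentioned in the commit message (gcc < 4.9, clang < 8) are deliberately left out.

```c
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 or newer: use the language keyword. */
#define ALIGNOF_SKETCH(type) _Alignof(type)
#else
/*
 * Pre-C11 fallback: offset of a member placed after a single char.
 * This defines a type inside offsetof, the pattern N2350 calls undefined.
 */
#define ALIGNOF_SKETCH(type) offsetof(struct { char c; type member; }, member)
#endif
```

For example, `ALIGNOF_SKETCH(char)` evaluates to 1 on either branch.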
2023-03-13manpage: Add --write-zeroes option to manpageLogan Gunthorpe1-1/+17
Document the new --write-zeroes option in the manpage. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-13tests/00raid5-zero: Introduce test to exercise --write-zeros.Logan Gunthorpe1-0/+12
Attempt to create a raid5 array with --write-zeros. If it is successful check the array to ensure it is in sync. If it is unsuccessful and an unsupported error is printed, skip the test. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-13mdadm: Add --write-zeros option for CreateLogan Gunthorpe4-2/+190
Add the --write-zeros option for Create which will send a write zeros request to all the disks before assembling the array. After zeroing the array, the disks will be in a known clean state and the initial sync may be skipped. Writing zeroes is best used when there is a hardware offload method to zero the data. But even still, zeroing can take several minutes on a large device. Because of this, all disks are zeroed in parallel using their own forked process and a message is printed to the user. The main process will proceed only after all the zeroing processes have completed successfully. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
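The fork-per-disk pattern described above might look roughly like the following. This is a sketch under stated assumptions, not mdadm's implementation: `run_parallel()`, `zero_ok()` and `zero_fail_second()` are hypothetical names, and the work callbacks are stand-ins for the actual zeroing of one disk.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for zeroing disk i; always succeeds. */
static int zero_ok(int i)
{
	(void)i;
	return 0;
}

/* Stand-in for zeroing disk i; fails for the second disk only. */
static int zero_fail_second(int i)
{
	return i == 1 ? -1 : 0;
}

/* Runs work(i) in one child per item; returns 0 only if every child exits 0. */
static int run_parallel(int count, int (*work)(int))
{
	int started = 0, failed = 0;

	for (int i = 0; i < count; i++) {
		pid_t pid = fork();

		if (pid < 0) {
			failed = 1;	/* could not start a worker */
			break;
		}
		if (pid == 0)
			_exit(work(i) ? 1 : 0);	/* child: report result via exit status */
		started++;
	}

	/* Parent: proceed only after all started children have finished. */
	while (started-- > 0) {
		int status;

		if (wait(&status) < 0 || !WIFEXITED(status) ||
		    WEXITSTATUS(status) != 0)
			failed = 1;
	}
	return failed ? -1 : 0;
}
```

The parent collects every child even after a failure, so no worker is left behind as a zombie.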
2023-03-13mdadm: Introduce pr_info()Logan Gunthorpe2-3/+6
Feedback was given to avoid informational pr_err() calls that print to stderr, even though that's done throughout the code. Using printf() directly doesn't maintain the same format (an "mdadm" prefix on every line). So introduce pr_info() which prints to stdout with the same format and use it for a couple of informational pr_err() calls in Create(). Future work can make this call used in more cases. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Acked-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
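A pair of macros along these lines would give both message kinds the same prefix while sending them to different streams. This is only a sketch: the `_sketch` names and the hard-coded `SKETCH_NAME` prefix are assumptions, and `##__VA_ARGS__` is a GNU C extension.

```c
#include <stdio.h>

/* Hypothetical fixed prefix used for illustration only. */
#define SKETCH_NAME "mdadm"

/* Informational messages go to stdout, errors to stderr, same prefix. */
#define pr_info_sketch(fmt, ...) \
	fprintf(stdout, SKETCH_NAME ": " fmt, ##__VA_ARGS__)
#define pr_err_sketch(fmt, ...) \
	fprintf(stderr, SKETCH_NAME ": " fmt, ##__VA_ARGS__)
```

With these, `pr_info_sketch("array %s started\n", "md0")` writes a prefixed line to stdout, so redirecting stderr no longer swallows purely informational output.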
2023-03-13Create: Factor out add_disks() helpersLogan Gunthorpe1-169/+213
The Create function is massive with a very large number of variables. Reading and understanding the function is almost impossible. To help with this, factor out the two pass loop that adds the disks to the array. This moves about 160 lines into three new helper functions and removes a bunch of local variables from the main Create function. The main new helper function add_disks() does the two pass loop and calls into add_disk_to_super() and update_metadata(). Factoring out the latter two helpers also helps to reduce a ton of indentation. No functional changes intended. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-13Create: remove safe_mode_delay local variableLogan Gunthorpe1-3/+1
Every .getinfo_super() call sets the info.safe_mode_delay variable to a constant value, so no matter what the current state is, that function will always set it to the same value. Create() calls .getinfo_super() multiple times while creating the array. The value is stored in a local variable for every disk in the loop to add disks (so the call for the last disk takes precedence). The local variable is then used in the call to sysfs_set_safemode(). This can be simplified by using info.safe_mode_delay directly. The info variable had .getinfo_super() called on it early in the function so, by the reasoning above, it will have the same value as the local variable which can thus be removed. Doing this allows for factoring out code from Create() in a subsequent patch. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-13Create: goto abort_locked instead of return 1 in error pathLogan Gunthorpe1-1/+1
The return 1 after the fstat_is_blkdev() check should be replaced with an error return that goes through the error path to unlock resources locked by this function. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Kinga Tanska <kinga.tanska@linux.intel.com> Reviewed-by: Xiao Ni <xni@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-08super-ddf.c: fix memleak in get_vd_num_of_subarray()Wu Guanghao1-2/+7
sra = sysfs_read() should be freed before returning from get_vd_num_of_subarray(). Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-08super-intel.c: fix memleak in find_disk_attached_hba()Wu Guanghao1-2/+2
If disk_path is set by diskfd_to_devpath(), we need to free(disk_path) before returning, otherwise there will be a memory leak. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-08super-intel.c: fix double free in load_imsm_mpb()Wu Guanghao1-0/+1
In load_imsm_mpb() there is potential double free issue on super->buf. The first location to free super->buf is from get_super_block() <== load_and_parse_mpb() <== load_imsm_mpb(): 4514 if (posix_memalign(&super->migr_rec_buf, MAX_SECTOR_SIZE, 4515 MIGR_REC_BUF_SECTORS*MAX_SECTOR_SIZE) != 0) { 4516 pr_err("could not allocate migr_rec buffer\n"); 4517 free(super->buf); 4518 return 2; 4519 } If the above error condition happens, super->buf is freed and value 2 is returned to get_super_block() eventually. Then in the following code block inside load_imsm_mpb(), 5289 error: 5290 if (!err) { 5291 s->next = *super_list; 5292 *super_list = s; 5293 } else { 5294 if (s) 5295 free_imsm(s); 5296 close_fd(&dfd); 5297 } at line 5295 when free_imsm() is called, super->buf is freed again from the call chain free_imsm() <== __free_imsm(), in following code block, 4651 if (super->buf) { 4652 free(super->buf); 4653 super->buf = NULL; 4654 } This patch sets super->buf as NULL after line 4517 in load_imsm_mpb() to avoid the potential double free(). (Coly Li helps to re-compose the commit log) Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Reviewed-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
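The general technique, clearing a pointer immediately after freeing it so that a later guarded cleanup does not free it again, can be sketched as follows. The struct and function names here are hypothetical and only mirror the shape of the code quoted in the commit message.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the superblock structure. */
struct sketch_super {
	void *buf;
	void *migr_rec_buf;
};

/* Error path: free buf and clear the pointer so later cleanup is safe. */
static void drop_buf(struct sketch_super *super)
{
	free(super->buf);
	super->buf = NULL;
}

/* General cleanup with the same guarded free as in the quoted code. */
static void free_super_sketch(struct sketch_super *super)
{
	if (super->buf) {
		free(super->buf);
		super->buf = NULL;
	}
	if (super->migr_rec_buf) {
		free(super->migr_rec_buf);
		super->migr_rec_buf = NULL;
	}
}
```

Calling `drop_buf()` followed by `free_super_sketch()` frees `buf` exactly once, because the second function sees a NULL pointer and skips it.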
2023-03-08Detail.c: fix memleak in Detail()Wu Guanghao1-0/+1
char *sysdev is allocated by xstrdup() but not freed in the for loop, which causes a memory leak. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-08util.c: fix memleak in parse_layout_faulty()Wu Guanghao1-0/+2
char *m is allocated by xstrdup() but not freed before return, which causes a memory leak. Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-08util.c: reorder code lines in parse_layout_faulty()Coly Li1-3/+6
Reorder the code lines in parse_layout_faulty() to make them easier to read; no logic change. Signed-off-by: Coly Li <colyli@suse.de> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-02Mdmonitor: Refactor check_one_sharer() for better error handlingMateusz Grzonka1-27/+62
Also check if autorebuild.pid is a symlink, which we shouldn't accept. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-02Mdmonitor: Refactor write_autorebuild_pid()Mateusz Grzonka1-19/+36
Add better error handling and check for symlinks when opening MDMON_DIR. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-02Add helpers to determine whether directories or files are soft linksMateusz Grzonka2-0/+47
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
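A symlink check of this kind is usually built on lstat(), which reports on the link itself rather than its target. A minimal sketch, assuming a hypothetical helper name and return convention (the helpers actually added by this commit may differ):

```c
#include <sys/stat.h>

/* Hypothetical helper: 1 if path is a symlink, 0 if not, -1 if lstat() fails. */
static int path_is_symlink(const char *path)
{
	struct stat st;

	if (lstat(path, &st) != 0)
		return -1;
	return S_ISLNK(st.st_mode) ? 1 : 0;
}
```

Returning -1 separately lets a caller distinguish "not a symlink" from "could not be examined".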
2023-03-02Mdmonitor: Add helper functionsMateusz Grzonka1-70/+158
Add functions: - is_email_event(), - get_syslog_event_priority(), - sprint_event_message(), with kernel style comments containing more detailed descriptions. Also update event syslog priorities to be consistent with man. MoveSpare event was described in man as priority info, while implemented as warning. Move event data into a struct, so that it is passed between different functions if needed. Sort function declarations alphabetically and remove redundant alert() declaration. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-03-02Mdmonitor: Pass events to alert() using enums instead of stringsMateusz Grzonka1-53/+83
Add events enum, and mapping_t struct, that maps them to strings, so that enums are passed around instead of strings. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
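An enum paired with a name table might look like the sketch below. The enum constants, struct name and lookup functions are hypothetical; only the event names "Fail" and "SpareActive" are taken from the events documented for mdadm's monitor mode.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of monitor events. */
enum sketch_event { EVENT_FAIL, EVENT_SPARE_ACTIVE, EVENT_UNKNOWN };

/* Maps an event name to its enum value; a NULL name ends the table. */
struct sketch_mapping {
	const char *name;
	enum sketch_event num;
};

static const struct sketch_mapping sketch_events[] = {
	{ "Fail", EVENT_FAIL },
	{ "SpareActive", EVENT_SPARE_ACTIVE },
	{ NULL, EVENT_UNKNOWN },
};

/* Returns the name for an event, or NULL if the event is not in the table. */
static const char *event_to_name(enum sketch_event e)
{
	for (const struct sketch_mapping *m = sketch_events; m->name; m++)
		if (m->num == e)
			return m->name;
	return NULL;
}

/* Returns the event for a name, or EVENT_UNKNOWN if the name is not found. */
static enum sketch_event name_to_event(const char *name)
{
	for (const struct sketch_mapping *m = sketch_events; m->name; m++)
		if (strcmp(m->name, name) == 0)
			return m->num;
	return EVENT_UNKNOWN;
}
```

Passing the enum between functions keeps comparisons cheap and lets the compiler catch misspelled event identifiers; the string is only needed at the point where a message is formatted.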
2023-03-02Mdmonitor: Make alert_info globalMateusz Grzonka1-63/+61
Move information about --test flag and hostname into alert_info. Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-02-28Fix NULL dereference in super_by_fdLi Xiao Keng2-1/+10
When we create 100 partitions (major is 259 not 254) in a raid device, mdadm may coredump: Core was generated by `/usr/sbin/mdadm --detail --export /dev/md1p7'. Program terminated with signal SIGSEGV, Segmentation fault. #0 __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74 74 VPCMPEQ (%rdi), %ymm0, %ymm1 (gdb) bt #0 __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74 #1 0x00007fbb9a7e4139 in __strcpy_chk (dest=dest@entry=0x55d55d6a13ac "", src=0x0, destlen=destlen@entry=32) at strcpy_chk.c:28 #2 0x000055d55ba1766d in strcpy (__src=<optimized out>, __dest=0x55d55d6a13ac "") at /usr/include/bits/string_fortified.h:79 #3 super_by_fd (fd=fd@entry=3, subarrayp=subarrayp@entry=0x7fff44dfcc48) at util.c:1289 #4 0x000055d55ba273a6 in Detail (dev=0x7fff44dfef0b "/dev/md1p7", c=0x7fff44dfe440) at Detail.c:101 #5 0x000055d55ba0de61 in misc_list (c=<optimized out>, ss=<optimized out>, dump_directory=<optimized out>, ident=<optimized out>, devlist=<optimized out>) at mdadm.c:1959 #6 main (argc=<optimized out>, argv=<optimized out>) at mdadm.c:1629 The direct cause is fd2devnm returning NULL, so add a check. Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com> Signed-off-by: Wu Guang Hao <wuguanghao3@huawei.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
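The backtrace shows strcpy() being handed a NULL source. A guarded copy of the following shape avoids that class of crash; the function name and the added length check are assumptions made for illustration, not the code of the actual fix.

```c
#include <string.h>

/*
 * Hypothetical helper: copies src into dest only when src is non-NULL
 * and fits, including the terminator. Returns 0 on success, -1 otherwise.
 */
static int copy_devnm(char *dest, size_t destlen, const char *src)
{
	if (src == NULL)
		return -1;	/* lookup failed, nothing to copy */
	if (strlen(src) >= destlen)
		return -1;	/* would overflow dest */
	strcpy(dest, src);
	return 0;
}
```

A caller can then treat -1 as "device name unavailable" and bail out instead of dereferencing the NULL pointer.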
2023-02-23Grow: fix can't change bitmap type from none to clustered.Heming Zhao1-1/+1
Commit a042210648ed ("disallow create or grow clustered bitmap with writemostly set") introduced this bug. We should use 'true' logic not '== 0' to deny setting up clustered array under WRITEMOSTLY condition. How to trigger ``` ~/mdadm # ./mdadm -Ss && ./mdadm --zero-superblock /dev/sd{a,b} ~/mdadm # ./mdadm -C /dev/md0 -l mirror -b clustered -e 1.2 -n 2 \ /dev/sda /dev/sdb --assume-clean mdadm: array /dev/md0 started. ~/mdadm # ./mdadm --grow /dev/md0 --bitmap=none ~/mdadm # ./mdadm --grow /dev/md0 --bitmap=clustered mdadm: /dev/md0 disks marked write-mostly are not supported with clustered bitmap ``` Signed-off-by: Heming Zhao <heming.zhao@suse.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-02-02Revert "mdadm/systemd: remove KillMode=none from service file"Mariusz Tkaczyk2-0/+2
This reverts commit 52c67fcdd6dadc4138ecad73e65599551804d445. The functionality is marked as deprecated but we don't have alternative solution yet. Shutdown hangs if OS is installed on external array: task:umount state:D stack: 0 pid: 6285 ppid: flags:0x00004084 Call Trace: __schedule+0x2d1/0x830 ? finish_wait+0x80/0x80 schedule+0x35/0xa0 md_write_start+0x14b/0x220 ? finish_wait+0x80/0x80 raid1_make_request+0x3c/0x90 [raid1] md_handle_request+0x128/0x1b0 md_make_request+0x5b/0xb0 generic_make_request_no_check+0x202/0x330 submit_bio+0x3c/0x160 Use it until new solution is implemented. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-05manage: move comment with function descriptionKinga Tanska1-28/+44
Move the function description from the function body to outside to obey kernel coding style. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04super-intel: make freesize not required for chunk size migrationKinga Tanska1-5/+5
Freesize needs to be set for migrations where the size of the RAID could change (expand). It tells how much free space is available to members. In chunk size migration freesize does not need to be set, so the pointer shouldn't be checked for existence. This commit moves the check into the condition that contains the size calculations, instead of always performing it at the first step. Fix the return value when the superblock is not set. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04incremental, manage: do not verify if remove is safeKinga Tanska2-4/+5
Function is_remove_safe() was introduced to verify if removing member device won't cause failed state of the array. This verification should be used only with set-faulty command. Add special mode indicating that Incremental removal was executed. If this mode is used do not execute is_remove_safe() routine. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Manage: do not check array state when drive is removedKinga Tanska1-2/+1
Array state doesn't need to be checked when drive is removed, but until now clean state was required. Result of the is_remove_safe() function will be independent from array state. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04mdadm/udev: Don't handle change event on raw devicesXiao Ni1-0/+8
The raw devices are ready when the add event happens and the raid can be assembled. So there is no need to handle change events, and handling them can cause problems. For example, the OS is installed on md0(/root) and md1(/home). md0 and md1 are created on partitions. When re-installing the OS, anaconda can't clear the storage configuration. It deletes one partition and does some jobs. The change event happens. Now the raid device is assembled again. It can't delete the other partitions. So in this patch, we don't handle change events on raw devices anymore. Signed-off-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04util: remove obsolete code from get_md_nameMateusz Kusiak2-39/+20
get_md_name() is used only with mdstat entries. Remove dead code and simplify the function. Remove redundant checks from mdmon.c. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04mdmon: fix segfaultMateusz Kusiak2-15/+13
Mdmon crashes if stat2devnm returns null. Use open_mddev to check if device is mddevice and get name using fd2devnm. Refactor container name handling. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Change char* to enum in context->update & refactor codeMateusz Kusiak3-57/+37
Storing update option in string is bad for frequent comparisons and error prone. Replace char array with enum so already existing enum is passed around instead of string. Adapt code to changes. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Manage&Incremental: code refactor, string to enumMateusz Kusiak5-33/+45
Prepare Manage and Incremental for later changing context->update to enum. Change update from string to enum in multiple functions and pass enum where already possible. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Change update to enum in update_super and update_subarrayMateusz Kusiak9-62/+52
Use already existing enum, change update_super and update_subarray update to enum globally. Refactor function references also. Remove code specific options from update_options. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04super-intel: refactor the code for enumMateusz Kusiak1-12/+25
It prepares super-intel for changing context->update to enum. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04super1: refactor the code for enumMateusz Kusiak1-61/+91
It prepares update_super1 for changing context->update to enum. Change if else statements into switch. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04super0: refactor the code for enumMateusz Kusiak1-39/+63
It prepares update_super0 for changing context->update to enum. Change if else statements to switch. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04super-ddf: Remove update_super_ddf.Mateusz Kusiak1-70/+0
This is not supported by ddf. It hides errors by returning success status for some updates. Remove update_super_ddf(). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Add code specific update options to enum.Mateusz Kusiak2-0/+36
Some of update options aren't taken from user input, but are hard-coded as strings. Include those options in enum. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04Fix --update-subarray on active volumeMateusz Kusiak2-5/+7
Options: bitmap, ppl and name should not be updated when array is active. Those features are mutually exclusive and share the same data area in IMSM (danger of overwriting by kernel). Remove check for active subarrays from super-intel. Since ddf is not supported, apply it globally for all options. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2023-01-04mdadm: Add option validation for --update-subarrayMateusz Kusiak4-74/+124
Subset of options available for "--update" is not same as for "--update-subarray". Define maps and enum for update options and use them instead of direct comparisons. Add proper error message. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-12-28mdadm: create ident_init()Mariusz Tkaczyk3-32/+36
Add a wrapper for repeated initializations in mdadm.c and config.c. Move includes up. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-12-28Grow: fix possible memory leak.Blazej Kucman1-1/+4
Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-12-28Update mdadm Monitor manual.Blazej Kucman1-21/+50
- describe monitor work modes, - clarify the turning off condition, - describe the mdmonitor.service as a preferred management way. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-12-28Monitor: block if monitor modes are combined.Blazej Kucman1-1/+6
Block monitoring start if --scan mode and MD devices list are combined. Signed-off-by: Blazej Kucman <blazej.kucman@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-12-28Mdmonitor: Split alert() into separate functionsMateusz Grzonka1-91/+95
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
2022-09-29Mdmonitor: Omit non-md devicesLukasz Florczak1-8/+4
The segfault fix commit [1] introduced a check whether a given device is an mddevice, but it terminated Mdmonitor if at least one of the given devices didn't fulfill that condition. As a result, the Mdmonitor service was no longer started on boot (with the --scan option) when the config contained a non-existent array entry. This commit omits non-md devices instead, so the scan option can still be used when the config is wrong and the Mdmonitor service can run on boot. Giving a list of devices to monitor that contains non-existing or non-md devices will result in monitoring only confirmed mddevices. [1] https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=e702f392959d1c2ad2089e595b52235ed97b4e18 Signed-off-by: Lukasz Florczak <lukasz.florczak@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-29mdadm: replace container level checking with inlineKinga Tanska10-20/+33
To unify all containers checks in code, is_container() function is added and propagated. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-16ReadMe: fix command-line helpMariusz Tkaczyk1-1/+1
Make command-line help consistent with manual page. Copied from Debian. Cc: Felix Lechner <felix.lechner@lease-up.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-16mdadm: Add Documentation entries to systemd servicesMariusz Tkaczyk7-1/+8
Add documentation section. Copied from Debian. Cc: Felix Lechner <felix.lechner@lease-up.com> Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-14mdadm: added support for Intel Alderlake RST on VMD platformOldřich Jedlička1-5/+13
Alderlake RST on VMD uses RstVmdV UEFI variable name, so detect it. Signed-off-by: Oldřich Jedlička <oldium.pro@gmail.com> Reviewed-by: Kinga Tanska <kinga.tanska@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-08Monitor: Fix statelist memory leaksPawel Baldysiak1-9/+31
Free statelist in error path in Monitor initialization. Signed-off-by: Pawel Baldysiak <pawel.baldysiak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-08Manage: Block unsafe member failingMateusz Kusiak1-1/+52
The kernel may or may not block mdadm from removing a member device if doing so would cause the array to fail. It depends on the raid personality implementation in the kernel. Add verification on the requested removal path (#mdadm --set-faulty command). Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-09-08mdadm: Correct typos, punctuation and grammar in manMateusz Grzonka1-90/+88
Signed-off-by: Mateusz Grzonka <mateusz.grzonka@intel.com> Reviewed-by: Wol <anthony@youngman.org.uk> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-29super1: report truncated deviceNeilBrown1-7/+28
When the metadata is at the start of the device, it is possible that it describes a device large than the one it is actually stored on. When this happens, report it loudly in --examine. .... Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL State : clean TRUNCATED DEVICE .... Also report in --assemble so that the failure which the kernel will report will be explained. mdadm: Device /dev/sdb is not large enough for data described in superblock mdadm: no RAID superblock on /dev/sdb mdadm: /dev/sdb has no superblock - assembly aborted Scenario can be demonstrated as follows: mdadm: Note: this array has metadata at the start and may not be suitable as a boot device. If you plan to store '/boot' on this device please ensure that your boot-loader understands md/v1.x metadata, or use --metadata=0.90 mdadm: Defaulting to version 1.2 metadata mdadm: array /dev/md/test started. mdadm: stopped /dev/md/test Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL State : clean TRUNCATED DEVICE Unused Space : before=1968 sectors, after=-2047 sectors DEVICE TOO SMALL State : clean TRUNCATED DEVICE Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-24Assemble: check if device is container before scheduling force-clean updateKinga Tanska1-3/+2
Up to now, using assemble with the force flag marked each array as clean. Force-clean should not be done for the container. This commit adds a check that the device is not a container before cleaning. Signed-off-by: Kinga Tanska <kinga.tanska@intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-24Grow: Split Grow_reshape into helper functionMateusz Kusiak3-59/+81
Grow_reshape should be split into helper functions given its size. - Add helper function for preparing reshape on external metadata. - Close cfd file descriptor. Signed-off-by: Mateusz Kusiak <mateusz.kusiak@intel.com> Acked-by: Coly Li <colyli@suse.de> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-23mdadm: Don't open md device for CREATE and ASSEMBLELogan Gunthorpe3-20/+33
The mdadm command tries to open the md device for most modes, first thing, no matter what. When running to create or assemble an array, in most cases, the md device will not exist, the open call will fail and everything will proceed correctly. However, when running tests, a create or assembly command may be run shortly after stopping an array and the old md device file may still be around. Then, if create_on_open is set in the kernel, a new md device will be created when mdadm does its initial open. When mdadm gets around to creating the new device with the new_array parameter it issues this error: mdadm: Fail to create md0 when using /sys/module/md_mod/parameters/new_array, fallback to creation via node This is because an mddev was already created by the kernel with the earlier open() call and thus the new one being created will fail with EEXIST. The array will still be created successfully due to falling back to the node creation method. However, the error message itself will fail any test that's running it. This issue is a race condition that is very rare, but a recent change in the kernel caused this to happen more frequently: about 1 in 50 times. To fix this, don't bother trying to open the md device for CREATE, ASSEMBLE and BUILD commands, as the file descriptor will never be used anyway even if it is successfully opened. The mdfd has not been used for these commands since: 7f91af49ad09 ("Delay creation of array devices for assemble/build/create") The checks that were done on the open device can be changed to being done with stat. Side note: it would be nice to disable create_on_open as well to help solve this, but it seems the work for this was never finished. By default, mdadm will create using the old node interface when a name is specified unless the user specifically puts names=yes in a config file, which doesn't seem to be common or desirable to require.
Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
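The idea of checking a device node with stat rather than open can be sketched as below. This is an illustration under stated assumptions, not the code from the commit: the `node_state` enum and the `probe_node` function are hypothetical names. The point is that stat(2) inspects the node without opening it, so it cannot trigger create_on_open in the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/* Hypothetical result codes, introduced for this example only. */
enum node_state {
    NODE_ABSENT, /* path does not exist */
    NODE_BLOCK,  /* path exists and is a block device */
    NODE_OTHER,  /* path exists but is not a block device */
    NODE_ERROR   /* stat failed for a reason other than ENOENT */
};

/*
 * Hypothetical helper: classify a device path using stat() only.
 * Unlike open(), this never opens the node.
 */
enum node_state probe_node(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return errno == ENOENT ? NODE_ABSENT : NODE_ERROR;
    return S_ISBLK(st.st_mode) ? NODE_BLOCK : NODE_OTHER;
}
```

A caller in create or assemble mode could, for example, treat `NODE_ABSENT` as the normal case and reject `NODE_OTHER`, all without holding a file descriptor on the md device.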
2022-08-23mdadm: move data_offset to struct shapeMariusz Tkaczyk4-26/+22
Data offset is a shape property so move it there to remove additional parameter from some functions. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-23mdadm: remove symlink optionMariusz Tkaczyk6-53/+1
The option is not used. Remove it from code. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-23tests: add test for namesMariusz Tkaczyk1-0/+93
Current behavior is not documented and tested. This test is a base for future improvements. It is enough to test it only with native metadata, because it is generic code. Generated properties are passed to metadata handler. Signed-off-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>
2022-08-23tests/00readonly: Run udevadm settle before setting roLogan Gunthorpe1-0/+1
In some recent kernel versions, 00readonly fails with: mdadm: failed to set readonly for /dev/md0: Device or resource busy ERROR: array is not read-only! This was traced down to a race condition with udev holding a reference to the block device at the same time as trying to set it read only. To fix this, call udevadm settle before setting the array read only. Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Jes Sorensen <jsorensen@fb.com>