commit 53c580fabdc9af3adc096511c0caeb0673ab181d Author: Adam Borowski <kilobyte@angband.pl> Date: Mon Apr 14 00:12:58 2025 +0200 Debian release 81-1 commit 5444ecacb6dd8cfd02b92a1ca286a34a6ac769c4 Author: Adam Borowski <kilobyte@angband.pl> Date: Mon Apr 14 00:30:29 2025 +0200 Update symbols files. I didn't bother assigning the exact version a symbol got added for releases not in Debian. commit 00a067db2646c5f5f1ee9a50ca33c50ff1a8f516 Merge: 00d115e 92d5203 Author: Adam Borowski <kilobyte@angband.pl> Date: Mon Apr 14 00:08:46 2025 +0200 Merge tag 'v81' into debian ndctl: release v81 This release incorporates functionality up to the 6.14 kernel. Highlights are usability improvements in daxctl and region creation and listing. Namespace creation gets more robust parameter handling and unit tests are improved. Commands: cxl/region: report max size for region creation ndctl/list: display region caps for any of BTT, PFN, DAX daxctl: output more information if memblock is unremovable ndctl/namespace: protect against overflow handling param.offset ndctl/namespace: avoid integer overflow in namespace validation ndctl/namespace: close file descriptor in do_xaction_namespace() ndctl/namespace: protect against under|over-flow w bad param.align Tests: test/cxl-events.sh: do not fail test until event counts are reported test/security.sh: add jq requirement check test/monitor.sh: make shell variable handling more robust test/monitor.sh: convert float to integer before increment API: cxl/lib: remove unimplemented symbol cxl_mapping_get_region Infrastructure: util/strbuf: remove unused cli infrastructure imports cxl/json: remove prefix from tracefs.h #include ndctl: release v80 This release incorporates functionality up through the 6.11 kernel. Highlights include support for listing CXL media-errors, usability fixups in daxctl create-device, the addition of firmware revision in CXL memdev listings, and misc unit test and build fixes. 
Commands: cxl-list: add --media-errors option cxl-list: always emit memdev firmware revision daxctl: fail create-device with extra parameters daxctl: remove unused options from create-device usage message ndctl load-keys: stop leaking file descriptors upon error Tests: cxl-poison.sh: new unit test cxl-events.sh: add test case for region info daxctl-create.sh: use bash math syntax to find available size daxctl-create.sh: use CXL DAX regions instead of efi_fake_mem rescan-partitions.sh: refine search for created partition API: cxl_memdev_trigger_poison_list cxl_region_trigger_poison_list ndctl: release v79 This release incorporates functionality up to and including the 6.9 kernel. Highlights include test and build fixes, a new cxl-wait-sanitize command, support for QOS Class in cxl-create-region, and a new cxl-set-alert-config command. Commands: cxl-create-region: Add QOS Class support cxl-wait-sanitize: New command cxl-disable-region: Add a new --force option cxl-set-alert-config: New command cxl-monitor: fix event_trace array parsing daxctl-destroy-device: fix accounting for number of devices destroyed Tests: cxl/test: use max_available_extent in cxl-destroy-region cxl: Add a test for qos_class in CXL test suite cxl/test: add 3-way HB interleave testcase to cxl-xor-region.sh cxl/test: add double quotes in cxl-xor-region.sh cxl/test: replace spaces with tabs in cxl-xor-region.sh test/daxctl-create.sh: remove region and dax device assumptions test/cxl-region-sysfs: fix a missing space syntax error test/cxl-region-sysfs.sh: use '[[ ]]' command to evaluate operands as arithmetic expressions ndctl/test: Add destroy region test cxl/test: Validate sanitize notifications cxl/test: validate the auto region in cxl-topology.sh cxl/test: replace a bad root decoder usage in cxl-xor-region.sh test/security.sh: test keyctl before excuting test/daxctl-devices.sh: increase the namespace size to 4GiB test/cxl-event: Skip cxl event testing if cxl-test is not available 
test/cxl-update-firmware: Fix checksum sysfs query APIs: daxctl_dev_is_system_ram_capable cxl_cmd_alert_config_set_corrected_pmem_err_prog_warn_threshold cxl_cmd_alert_config_set_corrected_volatile_mem_err_prog_warn_threshold cxl_cmd_alert_config_set_dev_over_temperature_prog_warn_threshold cxl_cmd_alert_config_set_dev_under_temperature_prog_warn_threshold cxl_cmd_alert_config_set_enable_alert_actions cxl_cmd_alert_config_set_life_used_prog_warn_threshold cxl_cmd_alert_config_set_valid_alert_actions cxl_cmd_new_set_alert_config cxl_memdev_get_pmem_qos_class cxl_memdev_get_ram_qos_class cxl_memdev_wait_sanitize cxl_port_decoders_committed cxl_region_qos_class_mismatch cxl_root_decoder_get_qos_class ndctl: release v78 This release incorporates functionality up to the 6.5 kernel. Highlights include fixes to cxl-monitor and ndctl-monitor, a new cxl-update-firmware command, various documentation fixes, and a new unit test for cxl events. Commands: cxl-update-firmware: new command to update firmware on CXL memdevs {cxl,ndctl}-monitor: fixes for option handling for logging Tests: cxl-events.sh: new test for CXL events APIs: cxl_cmd_fw_info_get_active_slot cxl_cmd_fw_info_get_fw_ver cxl_cmd_fw_info_get_num_slots cxl_cmd_fw_info_get_online_activate_capable cxl_cmd_fw_info_get_staged_slot cxl_cmd_new_get_fw_info cxl_memdev_cancel_fw_update cxl_memdev_fw_update_get_remaining cxl_memdev_fw_update_in_progress cxl_memdev_update_fw commit 00d115eab3917802c85d9f4cb26564a7b7696a63 Author: Adam Borowski <kilobyte@angband.pl> Date: Sun Apr 13 23:59:58 2025 +0200 Incorporate NMU by Chris Hofstaedtler. commit 92d5203077553bfc9f7bf1c219563db0fc28e660 Author: Alison Schofield <alison.schofield@intel.com> Date: Fri Mar 21 16:39:19 2025 -0700 ndctl: release v81 This release incorporates functionality up to the 6.14 kernel. Highlights are usability improvements in daxctl and region creation and listing. Namespace creation gets more robust parameter handling and unit tests are improved. 
Commands: cxl/region: report max size for region creation ndctl/list: display region caps for any of BTT, PFN, DAX daxctl: output more information if memblock is unremovable ndctl/namespace: protect against overflow handling param.offset ndctl/namespace: avoid integer overflow in namespace validation ndctl/namespace: close file descriptor in do_xaction_namespace() ndctl/namespace: protect against under|over-flow w bad param.align Tests: test/cxl-events.sh: do not fail test until event counts are reported test/security.sh: add jq requirement check test/monitor.sh: make shell variable handling more robust test/monitor.sh: convert float to integer before increment API: cxl/lib: remove unimplemented symbol cxl_mapping_get_region Infrastructure: util/strbuf: remove unused cli infrastructure imports cxl/json: remove prefix from tracefs.h #include commit d49ba4b2fe3c39e2f5d64d2d4d1ac319675a6944 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 6 15:50:14 2025 -0800 ndctl/namespace: protect against under|over-flow w bad param.align A coverity scan highlighted an integer underflow when param.align is 0, and an integer overflow when the parsing of param.align fails and returns ULLONG_MAX. Add explicit checks for both values. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/5f8a8a6cf332ec9ceb636180b9dd5cbf801f1e6e.1741304303.git.alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 901b60c249bab74277e97b6f36c946fcf391ef42 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 6 15:50:13 2025 -0800 ndctl/namespace: protect against overflow handling param.offset A param.offset is parsed using parse_size64() but the result is not checked for the error return ULLONG_MAX. If ULLONG_MAX is returned, follow-on calculations will lead to overflow. Add check for ULLONG_MAX upon return from parse_size64. Add check for overflow in subsequent PFN_MODE offset calculation. 
This issue was reported in a coverity scan. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/fd9b0fa9091490c71791ebd695ee48f8da12e5ec.1741304303.git.alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 43e762fec751ce3a61a93ade9467e529b6938641 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 6 15:50:12 2025 -0800 ndctl/dimm: do not increment a ULLONG_MAX slot value A coverity scan highlighted an overflow issue when the slot variable, an unsigned integer that is initialized to -1, is incremented and overflows. Initialize slot to 0 and increment slot in the for loop header. That keeps the comparison to a u32 as is and avoids overflow. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/04880bb53cbd400d9906ca2ac5042a9dc23b925f.1741304303.git.alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit d82c1dc842c0f3e34c74e10cfd6d8474ce9ce4bd Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 6 15:50:11 2025 -0800 ndctl/namespace: close file descriptor in do_xaction_namespace() A coverity scan highlighted a resource leak caused by not freeing the open file descriptor upon exit of do_xaction_namespace(). Move the fclose() to a 'goto out_close' and route all returns through that path. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/267483d9d16460ee4e5726c1675df4510d246ebc.1741304303.git.alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 9e6f8ca8fc13abb7c56fe4ed92d92d11ba608814 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 6 15:50:10 2025 -0800 ndctl/namespace: avoid integer overflow in namespace validation A coverity scan highlighted an integer overflow issue when testing if the size and align parameters make sense together. 
Before performing the multiplication, check that the result will not exceed the maximum value that an unsigned long long can hold. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/1b3cc602d61a1b0a5383a481452d216331e3477e.1741304303.git.alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit def79df43e763dd89973e0732dd49ee5f38416ac Author: Donet Tom <donettom@linux.vnet.ibm.com> Date: Thu Feb 20 00:20:29 2025 -0600 ndctl/list: display region caps for any of BTT, PFN, DAX If any one of BTT, PFN, or DAX is not present, but the other two are, then the region capabilities are not displayed in the ndctl list -R -C command. This is because util_region_capabilities_to_json() returns NULL if any one of BTT, PFN, or DAX is not present. In this patch, we have changed the logic to display all the region capabilities that are present. Test Results with CONFIG_BTT disabled ===================================== Without this patch ------------------ # ./ndctl list -R -C [ { "dev":"region1", "size":12884901888, "align":16777216, "available_size":11257511936, "max_available_extent":9630121984, "type":"pmem", "iset_id":14748366918514061582, "persistence_domain":"unknown" }, With this patch --------------- # ./ndctl list -R -C [ { "dev":"region1", "size":12884901888, "align":16777216, "available_size":11257511936, "max_available_extent":9630121984, "type":"pmem", "iset_id":14748366918514061582, "capabilities":[ { "mode":"fsdax", "alignments":[ 65536, 2097152, 1073741824 ] }, { "mode":"devdax", "alignments":[ 65536, 2097152, 1073741824 ] } ], "persistence_domain":"unknown" }, Fixes: 965fa02e372f ("util: Distribute 'filter' and 'json' helpers to per-tool objects") Signed-off-by: Donet Tom <donettom@linux.vnet.ibm.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Tested-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: 
https://lore.kernel.org/r/20250220062029.9789-1-donettom@linux.vnet.ibm.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 0c5c33bb37d9289181968da5890cd930a3259f9f Author: Alison Schofield <alison.schofield@intel.com> Date: Tue Feb 11 19:40:18 2025 -0800 util/strbuf: remove unused cli infrastructure imports The ndctl cli interface is built around an imported perf cli infrastructure which was originally from git. [1] A recent static analysis scan exposed an integer overflow issue in strbuf_read() and although that is fixable, the function is not used in ndctl. Further examination revealed additional unused functionality in the string buffer handling import and a subset of that has already been obsoleted from the perf cli. In the interest of not maintaining unused code, remove the unused code in util/strbuf.h,c. Ndctl, including cxl-cli and daxctl, are mature cli's so it seems ok to let this functionality go after ~10 years. In the interest of not touching what is not causing an issue, the entirety of the original import was not reviewed at this time. [1] 91677390f9e6 ("ndctl: import cli infrastructure from perf") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20250212034020.1865719-1-alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit cb21d22450c3a10be573aad3248b6a161dd37eb9 Author: Alison Schofield <alison.schofield@intel.com> Date: Fri Feb 14 18:13:16 2025 -0800 cxl/lib: remove unimplemented symbol cxl_mapping_get_region User reports this symbol was added but has never had an implementation causing their linker ld.lld to fail like so: ld.lld: error: version script assignment of 'LIBCXL_3' to symbol 'cxl_mapping_get_region' failed: symbol not defined This likely worked for some builds but not others because of different toolchains (linkers), compiler optimizations (garbage collection), or linker flags (ignoring or only warning on unused symbols). 
Clean this up by removing the symbol. Reposted here from github pull request: https://github.com/pmem/ndctl/pull/267/ Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20250215021319.1948097-1-alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit ac46afe33d54522c6cc31ffcf3d4e19374c3433d Author: Li Ming <ming.li@zohomail.com> Date: Thu Dec 5 00:14:56 2024 +0800 daxctl: output more information if memblock is unremovable If CONFIG_MEMORY_HOTREMOVE is disabled in the kernel, memblocks cannot be removed, so 'dax offline-memory all' outputs the error logs below: libdaxctl: offline_one_memblock: dax0.0: Failed to offline /sys/devices/system/node/node6/memory371/state: Invalid argument dax0.0: failed to offline memory: Invalid argument error offlining memory: Invalid argument offlined memory for 0 devices The log does not clearly show why the command failed. So check whether the target memblock is removable before offlining it by querying '/sys/devices/system/node/nodeX/memoryY/removable', and output a specific log if the memblock is unremovable. The output will then be: libdaxctl: offline_one_memblock: dax0.0: memory371 is unremovable dax0.0: failed to offline memory: Operation not supported error offlining memory: Operation not supported offlined memory for 0 devices Besides, delay setting up the string 'path' for the memblock offlining operation, because 'path' is stored in 'mem->mem_buf', a shared buffer that is also used by memblock_is_removable(). 
Signed-off-by: Li Ming <ming.li@zohomail.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20241204161457.1113419-1-ming.li@zohomail.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 331f5ffba5d92b13b3403fcaf3a2a873a088568e Author: Michal Suchanek <msuchanek@suse.de> Date: Sun Feb 9 10:03:46 2025 -0800 cxl/json: remove prefix from tracefs.h #include Distros vary on whether tracefs.h is placed in {prefix}/libtracefs/ or {prefix}/tracefs/. Since the library ships with pkgconfig info to determine the exact include path the #include statement can drop the tracefs/ prefix. This was previously found and fixed elsewhere: a59866328ec5 ("cxl/monitor: fix include paths for tracefs and traceevent") but was introduced anew with cxl media-error support in ndctl v80. Reposted here from github pull request: https://github.com/pmem/ndctl/pull/268/ [ alison: commit msg and log edits ] Fixes: 9873123fce03 ("cxl/list: collect and parse media_error records") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Marc Herbert <marc.herbert@intel.com> Link: https://lore.kernel.org/r/20250209180348.1773179-1-alison.schofield@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 62222a96407f7ba48cc744ba9d573f7be907117b Author: Ira Weiny <ira.weiny@intel.com> Date: Sat Dec 14 20:58:28 2024 -0600 test/cxl-events.sh: do not fail test until event counts are reported Testing revealed that a failed cxl-event test lacked details on the event counts. This was because the greps were failing the test rather than the check against the counts. Suppress the grep failure and rely on event count checks for pass/fail of the test. 
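The grep behaviour described above can be illustrated with a minimal sketch. It is not the code from test/cxl-events.sh: the helper names count_events and check_count are hypothetical, and the event names are only sample strings. `grep -c` prints 0 but exits with status 1 when nothing matches, which aborts a script running under `set -e` before any count is printed.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from test/cxl-events.sh): count matching lines
# without letting grep's "no match" exit status end the script early.
count_events() {
    local pattern=$1 logfile=$2
    # grep -c prints 0 and exits 1 when nothing matches; '|| true' hides
    # that status so the caller still receives the count.
    grep -c -- "$pattern" "$logfile" || true
}

# Hypothetical check: always report both counts, then pass or fail on them.
check_count() {
    local name=$1 expected=$2 logfile=$3 actual
    actual=$(count_events "$name" "$logfile")
    echo "$name: expected $expected, found $actual"
    [ "$actual" -eq "$expected" ]
}
```

With this shape, a mismatch still fails the test, but only after the expected and observed counts have been written to the log.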
[ alison: rm reference to DCD in commit log ] Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20241214-dcd-region2-v4-1-36550a97f8e2@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 7225fd2b31ca695f160d7b59aac8cd0f86fe84a4 Author: Ira Weiny <ira.weiny@intel.com> Date: Sat Dec 14 20:58:29 2024 -0600 cxl/region: report max size for region creation When creating a region if the size exceeds the max an error is printed. However, the max available space is not reported which makes it harder to determine what is wrong. Add the max size available to the output error. Signed-off-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/20241214-dcd-region2-v4-2-36550a97f8e2@intel.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 7cdc15d7c367651661ac7c6ff82ab7b955adee7a Author: Li Zhijian <lizhijian@fujitsu.com> Date: Mon Oct 14 14:49:51 2024 +0800 test/security.sh: add jq requirement check Add jq requirement check explicitly like others so that the test can be skipped when no jq is installed. Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> [ alison: edit commit msg ] Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20241014064951.1221095-1-lizhijian@fujitsu.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit e39826281a3a80f2ee2df16aa50e7249e4f1fb3c Author: Li Zhijian <lizhijian@fujitsu.com> Date: Fri Oct 18 09:30:20 2024 +0800 test/monitor.sh: make shell variable handling more robust monitor_dimms fails to handle strings with multiple dimms. Fixup that case and do a shellcheck cleanup of the script to avoid needlessly failing or omitting any test cases. SC2086 [1], aka. Double quote to prevent globbing and word splitting. 
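The word splitting that SC2086 warns about can be reproduced in a few lines. This is a sketch under assumptions, not an excerpt from test/monitor.sh: the variable value and the two function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sample value holding two DIMM names (hypothetical).
dimms="nmem0 nmem1"

# Unquoted: $dimms splits into two words, so '[' receives an extra argument,
# prints a diagnostic, and returns a non-zero error status.
unquoted_status() {
    [ -n $dimms ] 2>/dev/null
    echo $?
}

# Quoted: the whole string is a single argument and the test succeeds.
quoted_status() {
    [ -n "$dimms" ]
    echo $?
}

unquoted_status
quoted_status
```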
Previously, SC2086 violations caused errors in [[ ]] or [ ], for example $ grep -w line build/meson-logs/testlog.txt test/monitor.sh: line 99: [: too many arguments test/monitor.sh: line 99: [: nmem0: binary operator expected First, generate a diff with the shellcheck tool: $ shellcheck -i SC2086 -f diff test/monitor.sh In addition, remove the double quotes around $1 as shown below, because an empty "$1" passed to a command will expand to '' and cause an error. Example: $ ndctl/build/test/list-smart-dimm -b nfit_test.0 '' Error: unknown parameter "" - $NDCTL monitor -c "$monitor_conf" -l "$logfile" "$1" & + $NDCTL monitor -c "$monitor_conf" -l "$logfile" $1 & - jlist=$("$TEST_PATH"/list-smart-dimm -b "$smart_supported_bus" "$1") + jlist=$("$TEST_PATH"/list-smart-dimm -b "$smart_supported_bus" $1) - $NDCTL inject-smart "$monitor_dimms" "$1" + $NDCTL inject-smart "$monitor_dimms" $1 - [[ $1 == $notify_dimms ]] + [[ "$1" == "$notify_dimms" ]] - [ ! -z "$monitor_dimms" ] && break + [[ "$monitor_dimms" ]] && break [1] https://www.shellcheck.net/wiki/SC2086 [ alison: edited commit msg/log ] Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20241018013020.2523845-2-lizhijian@fujitsu.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 2e78b22f0e72f1641cedd3843ec7547c897f09cf Author: Li Zhijian <lizhijian@fujitsu.com> Date: Fri Oct 18 09:30:19 2024 +0800 test/monitor.sh: convert float to integer before increment The test log reported: test/monitor.sh: line 149: 40.0: syntax error: invalid arithmetic operator (error token is ".0") This stops the test prematurely. We never run the temperature inject test case of test_filter_dimmevent() because of the inability to increment the float. 
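A minimal sketch of the float-to-integer step follows. Bash arithmetic only accepts integers, so a value such as 40.0 has to lose its fractional part before it can be incremented. The helper name to_int is hypothetical, and the commit message does not say which conversion the script actually uses; the parameter expansion here is one possible way, and it truncates rather than rounds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip everything from the last '.' onward so that
# bash arithmetic accepts the value. An integer input passes through unchanged.
to_int() {
    printf '%s\n' "${1%.*}"
}

temp="40.0"                         # sample value taken from the reported error
next=$(( $(to_int "$temp") + 1 ))   # the increment is now valid integer math
echo "$next"
```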
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20241018013020.2523845-1-lizhijian@fujitsu.com Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 04815e5f8b87e02a4fb5a61aeebaa5cad25a15c3 Author: Alison Schofield <alison.schofield@intel.com> Date: Tue Oct 8 17:21:52 2024 -0700 ndctl: release v80 This release incorporates functionality up through the 6.11 kernel. Highlights include support for listing CXL media-errors, usability fixups in daxctl create-device, the addition of firmware revision in CXL memdev listings, and misc unit test and build fixes. Commands: cxl-list: add --media-errors option cxl-list: always emit memdev firmware revision daxctl: fail create-device with extra parameters daxctl: remove unused options from create-device usage message ndctl load-keys: stop leaking file descriptors upon error Tests: cxl-poison.sh: new unit test cxl-events.sh: add test case for region info daxctl-create.sh: use bash math syntax to find available size daxctl-create.sh: use CXL DAX regions instead of efi_fake_mem rescan-partitions.sh: refine search for created partition API: cxl_memdev_trigger_poison_list cxl_region_trigger_poison_list commit ed9c3c0b1fa5124e2a66346ce5b4e329d42e0035 Author: Alison Schofield <alison.schofield@intel.com> Date: Wed Aug 28 11:43:45 2024 -0700 test/rescan-partitions.sh: refine search for created partition Unit test rescan-partitions.sh can fail because the grep test looking for the expected partition is overly broad and can match multiple pmem devices. /root/ndctl/build/meson-logs/testlog.txt reports this failure: test/rescan-partitions.sh: failed at line 50 An example of an improper grep is: 'pmem10 pmem12 pmem1p1' when only 'pmem1p1' was expected Replace the faulty grep with a query of the lsblk JSON output that examines the children of this blockdev only and matches on size. 
This type of pesky issue is probably arising as the unit tests are being run in more complex environments and may also be due to other unit tests not properly cleaning up after themselves. No matter the cause this change makes this test more robust and that's a good thing! Reported-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20240828192620.302092-1-alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit dfa937e2eb261f1142e64996a643b516c63d9bbc Author: Alison Schofield <alison.schofield@intel.com> Date: Mon Aug 26 18:51:10 2024 -0700 test/daxctl-create.sh: use CXL DAX regions instead of efi_fake_mem This test tries to use DAX regions created from efi_fake_mem devices. A recent kernel change removed efi_fake_mem support causing this test to SKIP because no DAX regions can be found. Alas, a new source of DAX regions is available: CXL. Use that now. Other than selecting a different region provider, the functionality of the test remains the same. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/519161e23a43e530dbcffac203ecbbb897aa5342.1724813664.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit ceb596056aa612b88c4fb9fdebafb531d586ade7 Author: Alison Schofield <alison.schofield@intel.com> Date: Mon Aug 26 18:37:34 2024 -0700 test/daxctl-create.sh: use bash math syntax to find available size The check for 1GB of available space in a DAX region always returned true due to being wrapped inside a [[ ... ]] test, even when space wasn't available. That caused set size to fail. Update to use bash arithmetic evaluation instead. This issue likely went unnoticed because users allocated >= 1GB of efi_fake_mem. This fix is part of the transition to use CXL regions in this test as efi_fake_mem support is being removed from the kernel. 
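The always-true check can be sketched as follows. The exact expression in test/daxctl-create.sh is not quoted in the commit message, so this shows only one way such a bug can arise; the variable and function names are hypothetical. Inside [[ ]], a lone word is tested for being non-empty, and an arithmetic expansion always yields a non-empty "0" or "1".

```shell
#!/usr/bin/env bash
need=$((1 << 30))      # 1 GiB
avail=$((512 << 20))   # hypothetical region with only 512 MiB free

# Pitfall: [[ word ]] succeeds whenever the word is non-empty. The expansion
# produces "0" or "1", so this returns success regardless of the sizes.
has_space_buggy() {
    [[ $(( $1 >= need )) ]]
}

# Arithmetic evaluation: (( )) returns success only when the result is non-zero.
has_space() {
    (( $1 >= need ))
}

has_space_buggy "$avail" && echo "buggy check: space available"
has_space "$avail" || echo "arithmetic check: not enough space"
```

The second form fails early when less than 1 GiB is available, instead of letting a later set-size step fail.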
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/865e28870eb8c072c2e368362a6d86fc4fb9cb61.1724813664.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 67afbf10f5e0bda8dac3a3b5ac057ec147cfb440 Author: Li Zhijian <lizhijian@fujitsu.com> Date: Thu Jun 6 11:51:49 2024 +0800 daxctl: remove unused options in create-device usage message RECONFIG_OPTIONS and ZONE_OPTIONS are not implemented for create-device and they will be ignored by create-device. Remove them so that the usage message is identical to the manual. Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240606035149.1030610-2-lizhijian@fujitsu.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit f285461adb699a6bd6dea73c54bcf997f2fc8c6a Author: Li Zhijian <lizhijian@fujitsu.com> Date: Thu Jun 6 11:51:48 2024 +0800 daxctl: fail create-device if extra parameters are present Previously, an incorrect index (1) for create-device caused the first extra parameter to be ignored, which is wrong. For example: $ daxctl create-device region0 [ { "chardev":"dax0.1", "size":268435456, "target_node":1, "align":2097152, "mode":"devdax" } ] created 1 device where the user above meant to specify '-r region0'. Check extra parameters starting from index 0 to ensure no extra parameters are specified for create-device. 
Cc: Fan Ni <fan.ni@samsung.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240606035149.1030610-1-lizhijian@fujitsu.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit a5f50e55cb9aeb6d0356edd7496580d61ff44e6e Author: Miroslav Suchy <mirek+github@lomenotecka.cz> Date: Wed Aug 21 14:52:13 2024 -0700 ndctl.spec.in: use SPDX formula for license According to SPEC v2, the operator has to be in the upper case. Reposted here from github pull request: https://github.com/pmem/ndctl/pull/265/ Signed-off-by: Miroslav Suchy <mirek+github@lomenotecka.cz> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20240821220232.105990-1-alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 2df04c98be394d4a78c20d0699837fd3ae93de97 Author: Jerry James <loganjerry@gmail.com> Date: Wed Aug 21 14:25:32 2024 -0700 ndctl.spec.in: enable libtrace{event|fs} support for Fedora As noted in https://src.fedoraproject.org/rpms/ndctl/pull-request/2, the expression "0%{?rhel}" evaluates to zero on Fedora, so the conditional "%if 0%{?rhel} < 9" evaluates to true, since 0 is less than 9. The result is that ndctl builds for Fedora lack support for libtraceevent and libtracefs. Correct the expression. 
Reposted here from github pull request: https://github.com/pmem/ndctl/pull/266/ Signed-off-by: Jerry James <loganjerry@gmail.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240821214529.96966-1-alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit a461871665a2cb5c81e9db5a850a83af5a6381b1 Author: Jeff Moyer <jmoyer@redhat.com> Date: Tue Aug 20 14:26:41 2024 -0400 libndctl: major and minor numbers are unsigned Static analysis points out that the cast of bus->major and bus->minor to a signed type in the call to parent_dev_path could result in a negative number. I sincerely doubt we'll see major and minor numbers that large, but let's fix it. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20240820182705.139842-3-jmoyer@redhat.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 00119a66148912f2671d22e7a96d39c31832ff6a Author: Jeff Moyer <jmoyer@redhat.com> Date: Tue Aug 20 14:26:40 2024 -0400 ndctl/keys.c: don't leak fd in error cases Static analysis points out that fd is leaked in some cases. The change to the while loop is optional. I only did that to make the code consistent. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/20240820182705.139842-2-jmoyer@redhat.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit f1dc47e1f1dd0e93d1fe4e1e4056f663d52e7d30 Author: Alison Schofield <alison.schofield@intel.com> Date: Wed Jul 24 23:45:49 2024 -0700 cxl/list: add firmware_version to default memdev listings cxl list users may discover the firmware revision of a memory device by using the -F option to cxl list. 
That option uses the CXL GET_FW_INFO command and emits this json: "firmware":{ "num_slots":2, "active_slot":1, "staged_slot":1, "online_activate_capable":false, "slot_1_version":"BWFW VERSION 0", "fw_update_in_progress":false } Since device support for GET_FW_INFO is optional, the above method is not guaranteed. However, the IDENTIFY command is mandatory and provides the current firmware revision. Accessors already exist for retrieval from sysfs so simply add the new json member to the default memdev listing. This means users of the -F option will get the same info twice, if GET_FW_INFO is supported. [ { "memdev":"mem9", "pmem_size":268435456, "serial":0, "host":"0000:c0:00.0", "firmware_version":"BWFW VERSION 00" } ] Suggested-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Link: https://lore.kernel.org/r/20240725073050.219952-1-alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit eded1031991174b92b7c3b2196b7609452f9e8ab Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Feb 15 20:42:01 2024 -0800 cxl/test: add test case for region info to cxl-events.sh Events cxl_general_media and cxl_dram both report DPAs that may be mapped in a region. If the DPA is mapped, the trace event will include the HPA translation, region name, and region uuid. Add a test case that triggers these events with DPAs that map into a region. Verify the region is included in the trace event. 
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20240328043727.2186722-1-alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit a682dceef74cfd0fbe62ea4b7c9cb90b71e0b5c3 Author: Alison Schofield <alison.schofield@intel.com> Date: Wed Mar 13 18:17:54 2024 -0700 cxl/test: add cxl-poison.sh unit test Exercise cxl list, libcxl, and driver pieces of the get poison list pathway. Inject and clear poison using debugfs and use cxl-cli to read the poison list by memdev and by region. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Tested-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Link: https://lore.kernel.org/r/4212bf9d89e31a17f0092b84da473de2abf554a2.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit d7532bb049e0a59a491a90b81b34c4a66d02f7e6 Author: Alison Schofield <alison.schofield@intel.com> Date: Wed Oct 12 20:55:15 2022 -0700 cxl/list: add --media-errors option to cxl list The --media-errors option to 'cxl list' retrieves poison lists from memory devices supporting the capability and displays the returned media_error records in the cxl list json. This option can apply to memdevs or regions. Include media-errors in the -vvv verbose option. Example usage in the Documentation/cxl/cxl-list.txt update. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Link: https://lore.kernel.org/r/76eb7636d1aab2fecd60d18617828d004adb58d9.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 9873123fce03175ef92ec5c097ef9511ae67526d Author: Alison Schofield <alison.schofield@intel.com> Date: Sun Jan 14 16:21:09 2024 -0800 cxl/list: collect and parse media_error records Media_error records are logged as events in the kernel tracing subsystem. 
To prepare the media_error records for cxl list, enable tracing, trigger the poison list read, and parse the generated cxl_poison events into a json representation. Use the event_trace private parsing option to customize the json representation based on cxl-list calling options and event field settings. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Xingtao Yao <yaoxt.fnst@fujitsu.com> Link: https://lore.kernel.org/r/d267fb81f39c64979e47dd52391f458b0d9178e2.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit fc5afef917a46d7d8205ebe4fdb16e632628cd08 Author: Alison Schofield <alison.schofield@intel.com> Date: Wed Oct 12 16:34:58 2022 -0700 libcxl: add interfaces for GET_POISON_LIST mailbox commands CXL devices maintain a list of locations that are poisoned or result in poison if the addresses are accessed by the host. Per the spec (CXL 3.1 8.2.9.9.4.1), the device returns the Poison List as a set of Media Error Records that include the source of the error, the starting device physical address and length. Trigger the retrieval of the poison list by writing to the memory device sysfs attribute: trigger_poison_list. The CXL driver only offers triggering per memdev, so the trigger by region interface offered here is a convenience API that triggers a poison list retrieval for each memdev contributing to a region. int cxl_memdev_trigger_poison_list(struct cxl_memdev *memdev); int cxl_region_trigger_poison_list(struct cxl_region *region); The resulting poison records are logged as kernel trace events named 'cxl_poison'. 
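The trigger mechanism described above can be sketched in a few lines. This is only an illustration of the idea, not the libcxl implementation: the sysfs directory `/sys/bus/cxl/devices`, the written value `"1"`, and both Python function names are assumptions made for the example, and the list of memdevs backing a region is simply passed in by the caller.

```python
from pathlib import Path

# Assumed location of CXL devices in sysfs; the commit message does not give it.
CXL_SYSFS = Path("/sys/bus/cxl/devices")


def trigger_memdev_poison_list(memdev, sysfs_root=CXL_SYSFS):
    """Ask the driver to read the poison list of one memdev, e.g. "mem0".

    Writing "1" is an assumption about what the attribute accepts.
    """
    attr = Path(sysfs_root) / memdev / "trigger_poison_list"
    attr.write_text("1")


def trigger_region_poison_list(region_memdevs, sysfs_root=CXL_SYSFS):
    """Convenience wrapper: the driver only triggers per memdev, so a
    region-level trigger is a loop over the memdevs that back the region."""
    for memdev in region_memdevs:
        trigger_memdev_poison_list(memdev, sysfs_root)
```

Passing `sysfs_root` explicitly keeps the sketch usable against a scratch directory, which is handy when no CXL hardware or cxl_test module is available.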
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/ef5503682f5042e68f153824634a751b41d1342a.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 360b244b1de0ee38b24fd55d2634fdff8b104296 Author: Alison Schofield <alison.schofield@intel.com> Date: Sun Jan 14 16:18:41 2024 -0800 util/trace: add helpers to retrieve tep fields by type Add helpers to extract the value of an event record field given the field name. This is useful when the user knows the name and format of the field and simply needs to get it. The helpers also return the 'type'_MAX of the type when the field is Since this is in preparation for adding a cxl_poison private parser for 'cxl list --media-errors' support those specific required types: u8, u32, u64. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fan Ni <fan.ni@samsung.com> Link: https://lore.kernel.org/r/b6089a98199539eca9c89f81de19cede18468408.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit eff9f1f287a3d747232f7a9254386abb3526416b Author: Alison Schofield <alison.schofield@intel.com> Date: Tue Mar 26 18:31:04 2024 -0700 util/trace: pass an event_ctx to its own parse_event method Tidy-up the calling convention used in trace event parsing by passing the entire event_ctx to its parse_event method. This makes it explicit that a parse_event operates on an event_ctx object and it allows the parse_event function to access any members of the event_ctx structure. This is in preparation for adding a private parser requiring more context for cxl_poison events. 
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/da9be6ff7edcaef18470cc1579343fc08bc1dc1e.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit d40cac268b924acdaead1e65820c197f6e1b6c0e Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Nov 10 10:40:00 2022 -0800 util/trace: add an optional pid check to event parsing When parsing events, callers may only be interested in events that originate from the current process. Introduce an optional argument to the event trace context: event_pid. When event_pid is present, simply skip the parsing of events without a matching pid. It is not a failure to see other, non matching events. The initial use case for this is CXL device poison listings where only the media-error records requested by this process are wanted. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/78e904ed934820f217f96d19603acf64e322184a.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 9d234d8127ba256fb956b3240a248c4d730cf245 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Mar 21 10:09:29 2024 -0700 util/trace: move trace helpers from ndctl/cxl/ to ndctl/util/ A set of helpers used to parse kernel trace events were introduced in ndctl/cxl/ in support of the CXL monitor command. The work these helpers perform may be useful beyond CXL. Move them to the ndctl/util/ where other generic helpers reside. Replace cxl-ish naming with generic names and update the single user, cxl/monitor.c, to match. This move is in preparation for extending the helpers in support of cxl_poison trace events. 
Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/d1d60f8f475684e398fd0c415358c48105b42b45.1720241079.git.alison.schofield@intel.com/ Signed-off-by: Alison Schofield <alison.schofield@intel.com> commit 16f45755f991f4fb6d76fec70a42992426c84234 Author: Vishal Verma <vishal.l.verma@intel.com> Date: Thu May 2 14:48:01 2024 -0600 ndctl: release v79 This release incorporates functionality up to and including the 6.9 kernel. Highlights include test and build fixes, a new cxl-wait-sanitize command, support for QOS Class in cxl-create-region, and a new cxl-set-alert-config command. Commands: cxl-create-region: Add QOS Class support cxl-wait-sanitize: New command cxl-disable-region: Add a new --force option cxl-set-alert-config: New command cxl-monitor: fix event_trace array parsing daxctl-destroy-device: fix accounting for number of devices destroyed Tests: cxl/test: use max_available_extent in cxl-destroy-region cxl: Add a test for qos_class in CXL test suite cxl/test: add 3-way HB interleave testcase to cxl-xor-region.sh cxl/test: add double quotes in cxl-xor-region.sh cxl/test: replace spaces with tabs in cxl-xor-region.sh test/daxctl-create.sh: remove region and dax device assumptions test/cxl-region-sysfs: fix a missing space syntax error test/cxl-region-sysfs.sh: use '[[ ]]' command to evaluate operands as arithmetic expressions ndctl/test: Add destroy region test cxl/test: Validate sanitize notifications cxl/test: validate the auto region in cxl-topology.sh cxl/test: replace a bad root decoder usage in cxl-xor-region.sh test/security.sh: test keyctl before excuting test/daxctl-devices.sh: increase the namespace size to 4GiB test/cxl-event: Skip cxl event testing if cxl-test is not available test/cxl-update-firmware: Fix checksum sysfs query APIs: daxctl_dev_is_system_ram_capable cxl_cmd_alert_config_set_corrected_pmem_err_prog_warn_threshold cxl_cmd_alert_config_set_corrected_volatile_mem_err_prog_warn_threshold 
cxl_cmd_alert_config_set_dev_over_temperature_prog_warn_threshold cxl_cmd_alert_config_set_dev_under_temperature_prog_warn_threshold cxl_cmd_alert_config_set_enable_alert_actions cxl_cmd_alert_config_set_life_used_prog_warn_threshold cxl_cmd_alert_config_set_valid_alert_actions cxl_cmd_new_set_alert_config cxl_memdev_get_pmem_qos_class cxl_memdev_get_ram_qos_class cxl_memdev_wait_sanitize cxl_port_decoders_committed cxl_region_qos_class_mismatch cxl_root_decoder_get_qos_class commit add0c37bf687881c2239479f09d193ecb6b628c2 Author: Dan Williams <dan.j.williams@intel.com> Date: Wed May 1 14:25:53 2024 -0600 Build: Fix deprecated str.format() usage New versions of Meson throw a warning around ndctl's use of 'str.format': WARNING: Broken features used: * 1.3.0: {'str.format: Value other than strings, integers, bools, options, dictionaries and lists thereof.'} Fix this by explicit string concatenation for building paths for the version script, whence the warnings originated. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20240501-vv-build-fix-v1-1-792eecb2183b@intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 7c8c993b87ee8471b4c138de549c39d1267f0067 Author: Alison Schofield <alison.schofield@intel.com> Date: Tue Apr 23 19:54:03 2024 -0700 cxl/test: use max_available_extent in cxl-destroy-region Using .size in decoder selection can lead to a set_size failure with these error messages: cxl region: create_region: region8: set_size failed: Numerical result out of range [] cxl_core:alloc_hpa:555: cxl region8: HPA allocation error (-34) for size:0x0000000020000000 in CXL Window 0 [mem 0xf010000000-0xf04fffffff flags 0x200] Use max_available_extent for decoder selection instead. The test overlooked the region creation failure because the not 'null' comparison succeeds when cxl create-region command emits nothing. Use the ! 
comparator when checking the create-region result. When checking the ram_size output of cxl-list add a check for empty. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20240424025404.2343942-1-alison.schofield@intel.com Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit f533bd784859cb4fad62f4512f43b8310905cc8c Author: Vishal Verma <vishal.l.verma@intel.com> Date: Fri Apr 12 15:05:40 2024 -0600 daxctl/device.c: Fix error propagation in do_xaction_device() The loop through the provided list of devices in do_xaction_device() returns the status based on whatever the last device did. Since the order of processing devices, especially in cases like the 'all' keyword, can be effectively random, this can lead to the same command, and same effects, exiting with a different error code based on device ordering. This was noticed with flakiness in the daxctl-create.sh unit test. Its 'destroy-device all' command would either pass or fail based on the order it tried to destroy devices in. (Recall that until now, destroying a daxX.0 device would result in a failure). Make this slightly more consistent by saving a failed status in do_xaction_device if any iteration of the loop produces a failure. Return this saved status instead of returning the status of the last device processed. 
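The status handling described above can be sketched as follows. This is a language-neutral model written in Python, not the daxctl C code: `run_on_devices` and the `action` callback are hypothetical names, and keeping the first failure (rather than the last) is an assumption, since the commit message only says that a failed status is saved.

```python
def run_on_devices(devices, action):
    """Apply action to every device and report a stable overall status.

    Returns (status, processed): status is 0 if every device succeeded,
    otherwise the first negative status seen; processed counts successes.
    """
    saved_rc = 0
    processed = 0
    for dev in devices:
        rc = action(dev)
        if rc < 0:
            # Remember the failure but keep going, so the result no longer
            # depends on which device happens to be handled last.
            if saved_rc == 0:
                saved_rc = rc
            continue
        processed += 1
    return saved_rc, processed
```

With this shape, the same set of devices yields the same exit status regardless of enumeration order, which is the property the flaky 'destroy-device all' case lacked.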
Link: https://lore.kernel.org/r/20240412-vv-daxctl-fixes-v1-2-6e808174e24f@intel.com Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit d78f57ebdc0764cfcc161873449aab3f20520456 Author: Vishal Verma <vishal.l.verma@intel.com> Date: Fri Apr 12 15:05:39 2024 -0600 daxctl/device.c: Handle special case of destroying daxX.0 The kernel has special handling for destroying the 0th dax device under any given DAX region (daxX.0). It ensures the size is set to 0, but doesn't actually remove the device, instead it returns an EBUSY, indicating that this device cannot be removed. Add an expectation in daxctl's dev_destroy() helper to handle this case instead of returning the error - as far as the user is concerned, the size has been set to zero, and the destroy operation has been completed, even if the kernel indicated an EBUSY. Link: https://lore.kernel.org/r/20240412-vv-daxctl-fixes-v1-1-6e808174e24f@intel.com Cc: Dan Williams <dan.j.williams@intel.com> Cc: Alison Schofield <alison.schofield@intel.com> Reported-by: Ira Weiny <ira.weiny@intel.com> Reported-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 804ba8671d51539a73d9d0591010a771625ca053 Author: Dave Jiang <dave.jiang@intel.com> Date: Mon Apr 22 14:17:55 2024 -0700 ndctl: cxl: Remove dependency for attributes derived from IDENTIFY command A memdev may optionally not host a mailbox and therefore not able to execute the IDENTIFY command. Currently the kernel emits empty strings for some of the attributes instead of making them invisible in order to keep backward compatibility for CXL CLI. 
Remove the dependency of CXL CLI on the existence of these attributes and only expose them if they exist. Without the dependency the kernel will be able to make the non-existent attributes invisible. Link: https://lore.kernel.org/all/20230606121534.00003870@Huawei.com/ Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240422211755.417632-1-dave.jiang@intel.com Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 9e0225ae1eee136a61511036aa1c296ffae5d65b Merge: e0d0680 d39b151 Author: Vishal Verma <vishal.l.verma@intel.com> Date: Wed Apr 17 12:11:21 2024 -0600 Merge branch 'for-79/disable-region-check' into pending Add a check for memdev disable to see if there are active regions present before disabling the device. commit e0d0680bd3e554bd5f211e989480c5a13a023b2d Merge: 731ca1f 5e9157d Author: Vishal Verma <vishal.l.verma@intel.com> Date: Tue Mar 5 23:24:42 2024 -0700 Merge branch 'for-79/qos-class' into pending Starting in v6.8, the kernel exports a qos_class token for the root decoders (CFMWS) as well as for the CXL memory devices. The qos_class exported for a device is calculated by the driver during device probe. Currently a qos_class is exported for the volatile partition (ram) and another for the persistent partition (pmem). In the future qos_class will be exported for DCD regions. Display of qos_class is through the CXL CLI list command with -vvv for extra verbose. A qos_class check has also been added for region creation. A warning is emitted when the qos_class of a memory range of a CXL memory device being included in the CXL region assembly does not match the qos_class of the root decoder. Options are available to suppress the warning or to fail the region creation. This enabling provides guidance by flagging memory ranges whose use is not optimal for the performance of the CXL region being formed.
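The region-creation check described in this merge can be modelled roughly as below. It is a sketch of the decision only, under stated assumptions: `QosResult`, `check_qos`, and the `enforce` and `suppress_warning` parameters are hypothetical names for this example, and a single integer token per memory range is assumed.

```python
from enum import Enum


class QosResult(Enum):
    OK = "ok"      # classes match, or the mismatch warning was suppressed
    WARN = "warn"  # mismatch: emit a warning but still create the region
    FAIL = "fail"  # mismatch with enforcement: refuse to create the region


def check_qos(root_decoder_qos, memdev_qos, enforce=False, suppress_warning=False):
    """Compare the root decoder's qos_class with that of a memdev range."""
    if root_decoder_qos == memdev_qos:
        return QosResult.OK
    if enforce:
        return QosResult.FAIL
    if suppress_warning:
        return QosResult.OK
    return QosResult.WARN
```

In this model enforcement takes priority over suppression; whether the real tool orders the two options the same way is not stated in the log.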
commit 5e9157d6721a878757f0fe8a3c51f06f9e94934a Author: Dave Jiang <dave.jiang@intel.com> Date: Mon Mar 4 10:35:35 2024 -0700 cxl: Add a test for qos_class in CXL test suite Add a test, cxl-qos-class.sh, to verify qos_class attributes are set as expected using canned values by the cxl_test module. Root decoders should have qos_class attribute set. Memory devices should have ram_qos_class or pmem_qos_class set depending on which partitions are valid. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240304173618.1580662-5-dave.jiang@intel.com Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 8e46bb879d5a0716ebd738892b83fc5d7b8314c8 Author: Dave Jiang <dave.jiang@intel.com> Date: Mon Mar 4 10:35:34 2024 -0700 cxl: Add QoS class checks for region creation The CFMWS provides a QTG ID. The kernel driver creates a root decoder that represents the CFMWS. A qos_class attribute is exported via sysfs for the root decoder. One or more qos_class tokens are retrieved via QTG ID _DSM from the ACPI0017 device for a CXL memory device. The input for the _DSM is the read and write latency and bandwidth for the path between the device and the CPU. The numbers are constructed by the kernel driver for the _DSM input. When a device is probed, QoS class tokens are retrieved. This is useful for a hot-plugged CXL memory device that does not have regions created. Add a QoS check during region creation. If --enforce-qos/-Q is set and the qos_class doesn't match, the region creation will fail. 
Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240304173618.1580662-4-dave.jiang@intel.com Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 4bad24165cbd1eee74f96f7eaa8d70c3952c75ab Author: Dave Jiang <dave.jiang@intel.com> Date: Mon Mar 4 10:35:33 2024 -0700 cxl/lib: Add APIs to retrieve QoS class for memory devices Add libcxl APIs to retrieve the QoS class tokens for the memory devices. Two API calls are added. One for 'ram' or 'volatile' mode and another for 'pmem' or 'persistent' mode. Support also added for displaying the QoS class tokens through the 'cxl list' command. Reviewed-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20240304173618.1580662-3-dave.jiang@intel.com Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 731ca1f75a5f421249baef4f850b477e9867a00d Author: Justin Ernst <justin.ernst@hpe.com> Date: Thu Feb 29 17:11:51 2024 -0600 util/json: Use json_object_get_uint64() with uint64 support If HAVE_JSON_U64=1, utils/json.c:display_hex() can call json_object_get_int64() on a struct json_object created with json_object_new_uint64(). In the context of 'ndctl list --regions --human', this results in a static value of 0x7fffffffffffffff being displayed for iset_id, as seen in #217. Correct hex values are observed with the use of json_object_get_uint64(). To support builds against older json-c, use a new static inline function util_json_get_u64() to fallback to json_object_get_int64() if HAVE_JSON_U64=0. 
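A small model shows why the signed getter produces the static 0x7fffffffffffffff. The sketch assumes the signed accessor saturates at INT64_MAX when the stored unsigned value does not fit, which is consistent with the symptom reported above; the function names and the sample iset_id are made up for illustration and are not json-c or ndctl identifiers.

```python
INT64_MAX = 2**63 - 1


def signed_getter(stored_u64):
    """Model of reading an unsigned 64-bit value through a signed accessor:
    anything above INT64_MAX is clamped (assumed saturation behaviour)."""
    return min(stored_u64, INT64_MAX)


def unsigned_getter(stored_u64):
    """Model of the unsigned accessor: the value comes back unchanged."""
    return stored_u64


def display_hex(value):
    """Render the value the way a human-readable listing would."""
    return hex(value)


# Hypothetical iset_id with the top bit set, so it does not fit in int64.
iset_id = 0xE7F04AC404A6F6F0
print(display_hex(signed_getter(iset_id)))    # clamped value
print(display_hex(unsigned_getter(iset_id)))  # original value
```

Values that fit in a signed 64-bit integer come out the same through both paths, which is why the problem only shows up for ids with the top bit set.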
Link: https://github.com/pmem/ndctl/issues/217 Fixes: 691cd249 ("json: Add support for json_object_new_uint64()") Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Link: https://lore.kernel.org/r/20240229231151.358694-1-justin.ernst@hpe.com Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit a37665ad3f0024b6364e8f90d0804ecc0716ba85 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Feb 29 13:28:38 2024 -0800 cxl/documentation: tidy up cxl-wait-sanitize man page format Remove extra '==' to address these asciidoctor complaints: Generating Documentation/cxl/cxl-wait-sanitize with a custom command ERROR: cxl-wait-sanitize.txt: line 1: non-conforming manpage title ERROR: cxl-wait-sanitize.txt: line 3: name section expected WARNING: cxl-wait-sanitize.txt: line 4: unterminated example block WARNING: cxl-wait-sanitize.txt: line 26: unterminated listing block Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20240229212838.2006205-1-alison.schofield@intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit ffbbb0bc246d967d53821184047f1121e02f8a81 Author: Alison Schofield <alison.schofield@intel.com> Date: Thu Feb 15 22:06:10 2024 -0800 cxl/event_trace: parse arrays separately from strings Arrays are being parsed as strings based on a flag that seems like it would be the differentiator, ARRAY and STRING, but it is not. libtraceevent sets the flags for arrays and strings like this: array: TEP_FIELD_IS_[ARRAY | STRING] string: TEP_FIELD_IS_[ARRAY | STRING | DYNAMIC] Use TEP_FIELD_IS_DYNAMIC to discover the field type, otherwise arrays get parsed as strings and 'cxl monitor' returns gobbledygook in the array type fields. This fixes the "data" field of cxl_generic_events and the "uuid" field of cxl_poison. That cxl_poison uuid format can be further improved by using the trace type (__field_struct uuid_t) in the CXL kernel driver. 
The parser will automatically pick up that new type, as illustrated in the "hdr_uuid" of cxl_generic_media event trace above. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/20240216060610.1951127-1-alison.schofield@intel.com Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit d39b151aa61ac540758b405b3adacd796b2d1bab Author: Dave Jiang <dave.jiang@intel.com> Date: Thu Nov 30 14:51:37 2023 -0700 cxl: Add check for regions before disabling memdev Add a check for memdev disable to see if there are active regions present before disabling the device. This is necessary now that regions are present, to fulfill the TODO that was left there. The best way to determine if a region is active is to see if there are decoders enabled for the mem device. This is also best effort as the state is only a snapshot the kernel provides and is not atomic WRT the memdev disable operation. The expectation is that the admin issuing the command has full control of the mem device and there are no other agents also attempting to control the device. Reviewed-by: Quanquan Cao <caoqq@fujitsu.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170138109724.2882696.123294980050048623.stgit@djiang5-mobl3 Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 7f364779c59d85e718f4eb1f449f68988f97f4fb Author: Dave Jiang <dave.jiang@intel.com> Date: Tue Nov 28 13:43:51 2023 -0700 cxl: Save the number of decoders committed to a port Save the number of decoders committed to a port exposed by the kernel to the libcxl cxl_port context. The attribute is helpful for determining if a region is active. Add libcxl API to retrieve the number of decoders committed. Add the decoders_committed attribute to the port for cxl list command.
Link: https://lore.kernel.org/linux-cxl/169645700414.623072.3893376765415910289.stgit@djiang5-mobl3/T/#t Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/170120423159.2725915.14670830315829916850.stgit@djiang5-mobl3 Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 4d767c0c9b91d254e8ff0d7f0d3be04a498ad9f0 Merge: 162d269 367d4b5 Author: Vishal Verma <vishal.l.verma@intel.com> Date: Fri Feb 16 16:06:53 2024 -0700 Merge branch 'for-79/cxl-region-sysfs-test-fix' into pending Add a couple of fixes for the cxl-region-sysfs.sh unit test. commit 162d2690c39ebc4338c90ba7e4235d38700049ac Author: Vishal Verma <vishal.l.verma@intel.com> Date: Thu Jan 11 16:00:53 2024 -0700 test/daxctl-create.sh: remove region and dax device assumptions The daxctl-create.sh test had some hard-coded assumptions about what dax device it expects to find, and what region number it will be under. This usually worked when the unit test environment only had efi_fake_mem devices as the sources of hmem memory. With CXL however, the region numbering namespace is shared with CXL regions, often pushing the efi_fake_mem region to something other than 'region0'. Remove any region and device number assumptions from this test so it works regardless of how regions get enumerated. Link: https://lore.kernel.org/r/20240111-vv-daxctl-create-v2-1-1052c8390c5d@intel.com Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com> commit 367d4b55f9d8c8dd00318e16b1bdf0809dd1cb64 Author: Li Zhijian <lizhijian@fujitsu.com> Date: Wed Dec 13 16:25:56 2023 +0800 test/cxl-region-sysfs: fix a missing space syntax error Currently the cxl-region-sysfs.sh test runs to completion and passes, but with syntax errors in the log. 
It turns out that because the test is checking for a positive condition as a failure, that also happens to mask the syntax errors. Fix the syntax and note that this also happens to unblock a test case that was being hidden by this error. Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Link: https://lore.kernel.org/r/20231213082556.1401741-2-lizhijian@fujitsu.com Acked-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
Among the 2 Debian patches available in version 81-1 of the package, we noticed the following issues: