From bippy-4986f5686161 Mon Sep 17 00:00:00 2001
From: Lee Jones <lee@kernel.org>
To: <linux-cve-announce@vger.kernel.org>
Reply-to: <cve@kernel.org>, <linux-kernel@vger.kernel.org>
Subject: CVE-2023-52490: mm: migrate: fix getting incorrect page mapping during page migration
X-Developer-Signature: v=1; a=openpgp-sha256; l=4186; i=lee@kernel.org;
h=from:subject; bh=ZBVwCCKlBVdibLgpACGKWQea2cD/umY1ae/Nn0Q8Hto=;
b=owEBbQKS/ZANAwAKAVGvii+H/HdhAcsmYgBl4KhO42Ovx9dA7GIA3sRzrAAKCIpJ7uo9xj5ZV
OIoHAWYRRSJAjMEAAEKAB0WIQR2tsk1o74gmpTwh0hRr4ovh/x3YQUCZeCoTgAKCRBRr4ovh/x3
Yck7D/9QFZQ9nrH8y5sBwu7rSTdxz11WKFSJ8E1Xr0ADpJ9lMbX6kwhFZ1GXCOdLPdhcGLXYATd
fm6/MLghu2dk969938PT01N2IKu5VUUoeyZ7/AMaFXsSpgTPKT1KLHpu5zENvt0Q9oJew5avSXs
h20b5ahQ7/Zx9lngQCsNuvuohcwwrcdzdnuCbkSywi51rsyt9M1xRHD4ffR7zUEf09HAPS8svKE
BAqGp81B76FcwhQnD3yjGlVPRuDj3uMNVDfByLYgqPo7GD+UMOdqrolRMqILVQ9NWZULUdIhcrc
3Ih+w+2sIJpmecy3HK1YRdv0YOs2twGnSvlnMDM+HGmvrtJGSCn49govS4XP2/7dNLYR851d5bx
3FP185LMyTd20FDpac77t/F9O5o0CtAwMlpoH5HjHvIBy6zBRmxL8MtnwFQp3pebEtCiIJPCUXo
6kH0fnrEGwp8mktvBX+HNFOk8CRm0hOVqhPd4IVLDboZxO8J+4ElAKF4YVwaJNhqOtnAAS7WRn8
Jn393lJ5R7dEPHLIEC3GN8Ad+/+fsqsuaJ3IfumCUNGjG69tvhLwJlKqEgfCfsMu+KF0+XXIIHv
9wiqIYQmBGG1lFkMN4stRtBi5l+VMS4cq1A0uqyTLjqY/e0ER32OBXmL7iCEwIOfvVQhcg7y6Al
IY5/ZkXePkOZS+g==
X-Developer-Key: i=lee@kernel.org; a=openpgp;
fpr=76B6C935A3BE209A94F0874851AF8A2F87FC7761
Description
===========
In the Linux kernel, the following vulnerability has been resolved:
mm: migrate: fix getting incorrect page mapping during page migration
While running stress-ng tests, we hit the kernel crash below after a few hours:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
pc : dentry_name+0xd8/0x224
lr : pointer+0x22c/0x370
sp : ffff800025f134c0
......
Call trace:
dentry_name+0xd8/0x224
pointer+0x22c/0x370
vsnprintf+0x1ec/0x730
vscnprintf+0x2c/0x60
vprintk_store+0x70/0x234
vprintk_emit+0xe0/0x24c
vprintk_default+0x3c/0x44
vprintk_func+0x84/0x2d0
printk+0x64/0x88
__dump_page+0x52c/0x530
dump_page+0x14/0x20
set_migratetype_isolate+0x110/0x224
start_isolate_page_range+0xc4/0x20c
offline_pages+0x124/0x474
memory_block_offline+0x44/0xf4
memory_subsys_offline+0x3c/0x70
device_offline+0xf0/0x120
......
After analyzing the vmcore, I found that this issue is caused by page migration.
The scenario is that one thread is doing page migration and uses the
target page's ->mapping field to save the 'anon_vma' pointer between page unmap and
page move; at that point the target page is locked and its refcount is 1.
Concurrently, another stress-ng thread is performing memory hotplug,
attempting to offline the target page that is being migrated. It discovers that
the refcount of this target page is 1, which prevents the offline operation, and so
proceeds to dump the page. However, page_mapping() of the target page may
return an incorrect file mapping and crash the system in dump_mapping(), since
the target page's ->mapping only holds the 'anon_vma' pointer without the
PAGE_MAPPING_ANON flag set.
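The misreading described above can be modeled in userspace. The following is only a minimal sketch, not kernel code: struct model_page, MODEL_MAPPING_ANON and both helper functions are hypothetical stand-ins, and the only thing assumed from the kernel is that the anonymous flag lives in the lowest bit of ->mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PAGE_MAPPING_ANON low-bit tag. */
#define MODEL_MAPPING_ANON ((uintptr_t)0x1)

/* Hypothetical, heavily reduced model of a page: only the ->mapping word. */
struct model_page {
    uintptr_t mapping;
};

/*
 * Model of a page_mapping()-style lookup: a value tagged as anonymous
 * yields no file mapping; anything else is handed back as if it were
 * a file mapping pointer.
 */
static uintptr_t model_page_mapping(const struct model_page *page)
{
    if (page->mapping & MODEL_MAPPING_ANON)
        return 0;
    return page->mapping;
}

/*
 * Store an untagged 'anon_vma' stand-in in ->mapping, as the migration
 * path did, and report whether the lookup hands it back as a file mapping.
 */
static int model_untagged_is_misread(void)
{
    static long anon_vma_stand_in;          /* aligned, so the low bit is clear */
    uintptr_t raw = (uintptr_t)&anon_vma_stand_in;

    assert((raw & MODEL_MAPPING_ANON) == 0);

    struct model_page dst = { .mapping = raw };
    return model_page_mapping(&dst) == raw; /* misread as a file mapping */
}
```

In this model the untagged pointer comes back as a "file mapping", which is the value a dump routine would then dereference as the wrong type.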
There are several ways to fix this issue:
(1) Setting the PAGE_MAPPING_ANON flag for target page's ->mapping when saving
'anon_vma', but this can confuse PageAnon() for PFN walkers, since the target
page has not built mappings yet.
(2) Getting the page lock to call page_mapping() in __dump_page() to avoid crashing
the system, however, there are still some PFN walkers that call page_mapping()
without holding the page lock, such as compaction.
(3) Using the target page's ->private field to save the 'anon_vma' pointer and 2 bits
of page state, just as page->mapping records an anonymous page, which removes
the page_mapping() impact on PFN walkers and is also a simple approach.
So I choose option 3 to fix this issue, and this can also fix other potential
issues for PFN walkers, such as compaction.
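Option 3 can be sketched as a userspace model, assuming the 'anon_vma' pointer is at least 4-byte aligned so that its two low bits are free to carry state. This is an illustration only, not the code from the fix commits: struct model_folio, the MODEL_* names, the meaning given to the two state bits, and both helpers are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the 2 bits of saved page state. */
enum {
    MODEL_WAS_MAPPED  = 1 << 0,
    MODEL_WAS_MLOCKED = 1 << 1,
    MODEL_OLD_STATES  = MODEL_WAS_MAPPED | MODEL_WAS_MLOCKED,
};

/* Hypothetical reduced model of the migration target page. */
struct model_folio {
    uintptr_t mapping;       /* left untouched, so PFN walkers see no mapping */
    uintptr_t private_data;  /* carries the pointer plus the state bits */
};

/* Save the pointer and state between "unmap" and "move". */
static void model_record(struct model_folio *dst, unsigned int state,
                         const void *anon_vma)
{
    uintptr_t ptr = (uintptr_t)anon_vma;

    assert((ptr & MODEL_OLD_STATES) == 0);                 /* low bits free */
    assert((state & ~(unsigned int)MODEL_OLD_STATES) == 0); /* only 2 bits */
    dst->private_data = ptr | state;
}

/* Recover both values and clear the field again. */
static void model_extract(struct model_folio *dst, unsigned int *state,
                          const void **anon_vma)
{
    uintptr_t value = dst->private_data;

    *anon_vma = (const void *)(value & ~(uintptr_t)MODEL_OLD_STATES);
    *state = (unsigned int)(value & MODEL_OLD_STATES);
    dst->private_data = 0;
}
```

Because ->mapping stays zero for the whole window in this model, a lockless reader of the mapping field never sees a half-valid pointer, which is the property option 3 is chosen for.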
The Linux kernel CVE team has assigned CVE-2023-52490 to this issue.
Affected and fixed versions
===========================
Issue introduced in 6.3 with commit 64c8902ed441 and fixed in 6.6.15 with commit 9128bfbc5c80
Issue introduced in 6.3 with commit 64c8902ed441 and fixed in 6.7.3 with commit 3889a418b6eb
Issue introduced in 6.3 with commit 64c8902ed441 and fixed in 6.8-rc1 with commit d1adb25df711
Please see https://www.kernel.org for a full list of currently supported
kernel versions by the kernel community.
Unaffected versions might change over time as fixes are backported to
older supported kernel versions. The official CVE entry at
https://cve.org/CVERecord/?id=CVE-2023-52490
will be updated if fixes are backported; please check it for the most
up-to-date information about this issue.
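The version ranges listed in this section can be expressed as a small check. This is a rough sketch that covers upstream version numbers only and encodes nothing beyond the three introduced/fixed pairs above; the function name is hypothetical, release candidates are not modeled, and vendor kernels that backport fixes under a different version number will not be classified correctly.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if an upstream kernel version
 * major.minor.patch falls inside the affected ranges listed above
 * (introduced in 6.3; fixed in 6.6.15, 6.7.3 and 6.8).
 */
static bool model_is_affected(int major, int minor, int patch)
{
    assert(major >= 0 && minor >= 0 && patch >= 0);

    if (major != 6)
        return false;          /* before 6.3 or after the 6.8 fix */
    if (minor < 3 || minor >= 8)
        return false;          /* not yet introduced, or fixed in mainline */
    if (minor == 6)
        return patch < 15;     /* fixed in 6.6.15 */
    if (minor == 7)
        return patch < 3;      /* fixed in 6.7.3 */
    return true;               /* 6.3.y to 6.5.y: no fixed release listed */
}
```

For example, model_is_affected(6, 6, 14) is true while model_is_affected(6, 6, 15) is false.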
Affected files
==============
The file(s) affected by this issue are:
mm/migrate.c
Mitigation
==========
The Linux kernel CVE team recommends that you update to the latest
stable kernel version for this, and many other bugfixes. Individual
changes are never tested alone, but rather are part of a larger kernel
release. Cherry-picking individual commits is not recommended or
supported by the Linux kernel community at all. If, however, updating to
the latest release is impossible, the individual changes to resolve this
issue can be found at these commits:
https://git.kernel.org/stable/c/9128bfbc5c80d8f4874dd0a0424d1f5fb010df1b
https://git.kernel.org/stable/c/3889a418b6eb9a1113fb989aaadecf2f64964767
https://git.kernel.org/stable/c/d1adb25df7111de83b64655a80b5a135adbded61