Linux » Linux Kernel : Security Vulnerabilities, CVEs, (Memory corruption) CVSS score >= 5
In the Linux kernel, the following vulnerability has been resolved:
perf: RISCV: Fix panic on pmu overflow handler
(1 << idx) of int is not desired when setting bits in unsigned long
overflowed_ctrs, use BIT() instead. This panic happens when running
'perf record -e branches' on sophgo sg2042.
[ 273.311852] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000098
[ 273.320851] Oops [#1]
[ 273.323179] Modules linked in:
[ 273.326303] CPU: 0 PID: 1475 Comm: perf Not tainted 6.6.0-rc3+ #9
[ 273.332521] Hardware name: Sophgo Mango (DT)
[ 273.336878] epc : riscv_pmu_ctr_get_width_mask+0x8/0x62
[ 273.342291] ra : pmu_sbi_ovf_handler+0x2e0/0x34e
[ 273.347091] epc : ffffffff80aecd98 ra : ffffffff80aee056 sp : fffffff6e36928b0
[ 273.354454] gp : ffffffff821f82d0 tp : ffffffd90c353200 t0 : 0000002ade4f9978
[ 273.361815] t1 : 0000000000504d55 t2 : ffffffff8016cd8c s0 : fffffff6e3692a70
[ 273.369180] s1 : 0000000000000020 a0 : 0000000000000000 a1 : 00001a8e81800000
[ 273.376540] a2 : 0000003c00070198 a3 : 0000003c00db75a4 a4 : 0000000000000015
[ 273.383901] a5 : ffffffd7ff8804b0 a6 : 0000000000000015 a7 : 000000000000002a
[ 273.391327] s2 : 000000000000ffff s3 : 0000000000000000 s4 : ffffffd7ff8803b0
[ 273.398773] s5 : 0000000000504d55 s6 : ffffffd905069800 s7 : ffffffff821fe210
[ 273.406139] s8 : 000000007fffffff s9 : ffffffd7ff8803b0 s10: ffffffd903f29098
[ 273.413660] s11: 0000000080000000 t3 : 0000000000000003 t4 : ffffffff8017a0ca
[ 273.421022] t5 : ffffffff8023cfc2 t6 : ffffffd9040780e8
[ 273.426437] status: 0000000200000100 badaddr: 0000000000000098 cause: 000000000000000d
[ 273.434512] [<ffffffff80aecd98>] riscv_pmu_ctr_get_width_mask+0x8/0x62
[ 273.441169] [<ffffffff80076bd8>] handle_percpu_devid_irq+0x98/0x1ee
[ 273.447562] [<ffffffff80071158>] generic_handle_domain_irq+0x28/0x36
[ 273.454151] [<ffffffff8047a99a>] riscv_intc_irq+0x36/0x4e
[ 273.459659] [<ffffffff80c944de>] handle_riscv_irq+0x4a/0x74
[ 273.465442] [<ffffffff80c94c48>] do_irq+0x62/0x92
[ 273.470360] Code: 0420 60a2 6402 5529 0141 8082 0013 0000 0013 0000 (6d5c) b783
[ 273.477921] ---[ end trace 0000000000000000 ]---
[ 273.482630] Kernel panic - not syncing: Fatal exception in interrupt
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
aoe: fix the potential use-after-free problem in aoecmd_cfg_pkts
This patch is against CVE-2023-6270. The description of cve is:
A flaw was found in the ATA over Ethernet (AoE) driver in the Linux
kernel. The aoecmd_cfg_pkts() function improperly updates the refcnt on
`struct net_device`, and a use-after-free can be triggered by racing
between the free on the struct and the access through the `skbtxq`
global queue. This could lead to a denial of service condition or
potential code execution.
In aoecmd_cfg_pkts(), it always calls dev_put(ifp) when skb initial
code is finished. But the net_device ifp will still be used in
later tx()->dev_queue_xmit() in kthread. Which means that the
dev_put(ifp) should NOT be called in the success path of skb
initial code in aoecmd_cfg_pkts(). Otherwise tx() may run into
use-after-free because the net_device is freed.
This patch removed the dev_put(ifp) in the success path in
aoecmd_cfg_pkts(), and added dev_put() after skb xmit in tx().
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
wifi: ath9k: delay all of ath9k_wmi_event_tasklet() until init is complete
The ath9k_wmi_event_tasklet() used in ath9k_htc assumes that all the data
structures have been fully initialised by the time it runs. However, because of
the order in which things are initialised, this is not guaranteed to be the
case, because the device is exposed to the USB subsystem before the ath9k driver
initialisation is completed.
We already committed a partial fix for this in commit:
8b3046abc99e ("ath9k_htc: fix NULL pointer dereference at ath9k_htc_tx_get_packet()")
However, that commit only aborted the WMI_TXSTATUS_EVENTID command in the event
tasklet, pairing it with an "initialisation complete" bit in the TX struct. It
seems syzbot managed to trigger the race for one of the other commands as well,
so let's just move the existing synchronisation bit to cover the whole
tasklet (setting it at the end of ath9k_htc_probe_device() instead of inside
ath9k_tx_init()).
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
wifi: wilc1000: prevent use-after-free on vif when cleaning up all interfaces
wilc_netdev_cleanup currently triggers a KASAN warning, which can be
observed on interface registration error path, or simply by
removing the module/unbinding device from driver:
echo spi0.1 > /sys/bus/spi/drivers/wilc1000_spi/unbind
==================================================================
BUG: KASAN: slab-use-after-free in wilc_netdev_cleanup+0x508/0x5cc
Read of size 4 at addr c54d1ce8 by task sh/86
CPU: 0 PID: 86 Comm: sh Not tainted 6.8.0-rc1+ #117
Hardware name: Atmel SAMA5
unwind_backtrace from show_stack+0x18/0x1c
show_stack from dump_stack_lvl+0x34/0x58
dump_stack_lvl from print_report+0x154/0x500
print_report from kasan_report+0xac/0xd8
kasan_report from wilc_netdev_cleanup+0x508/0x5cc
wilc_netdev_cleanup from wilc_bus_remove+0xc8/0xec
wilc_bus_remove from spi_remove+0x8c/0xac
spi_remove from device_release_driver_internal+0x434/0x5f8
device_release_driver_internal from unbind_store+0xbc/0x108
unbind_store from kernfs_fop_write_iter+0x398/0x584
kernfs_fop_write_iter from vfs_write+0x728/0xf88
vfs_write from ksys_write+0x110/0x1e4
ksys_write from ret_fast_syscall+0x0/0x1c
[...]
Allocated by task 1:
kasan_save_track+0x30/0x5c
__kasan_kmalloc+0x8c/0x94
__kmalloc_node+0x1cc/0x3e4
kvmalloc_node+0x48/0x180
alloc_netdev_mqs+0x68/0x11dc
alloc_etherdev_mqs+0x28/0x34
wilc_netdev_ifc_init+0x34/0x8ec
wilc_cfg80211_init+0x690/0x910
wilc_bus_probe+0xe0/0x4a0
spi_probe+0x158/0x1b0
really_probe+0x270/0xdf4
__driver_probe_device+0x1dc/0x580
driver_probe_device+0x60/0x140
__driver_attach+0x228/0x5d4
bus_for_each_dev+0x13c/0x1a8
bus_add_driver+0x2a0/0x608
driver_register+0x24c/0x578
do_one_initcall+0x180/0x310
kernel_init_freeable+0x424/0x484
kernel_init+0x20/0x148
ret_from_fork+0x14/0x28
Freed by task 86:
kasan_save_track+0x30/0x5c
kasan_save_free_info+0x38/0x58
__kasan_slab_free+0xe4/0x140
kfree+0xb0/0x238
device_release+0xc0/0x2a8
kobject_put+0x1d4/0x46c
netdev_run_todo+0x8fc/0x11d0
wilc_netdev_cleanup+0x1e4/0x5cc
wilc_bus_remove+0xc8/0xec
spi_remove+0x8c/0xac
device_release_driver_internal+0x434/0x5f8
unbind_store+0xbc/0x108
kernfs_fop_write_iter+0x398/0x584
vfs_write+0x728/0xf88
ksys_write+0x110/0x1e4
ret_fast_syscall+0x0/0x1c
[...]
David Mosberger-Tan initial investigation [1] showed that this
use-after-free is due to netdevice unregistration during vif list
traversal. When unregistering a net device, since the needs_free_netdev has
been set to true during registration, the netdevice object is also freed,
and as a consequence, the corresponding vif object too, since it is
attached to it as private netdevice data. The next occurrence of the loop
then tries to access freed vif pointer to the list to move forward in the
list.
Fix this use-after-free thanks to two mechanisms:
- navigate in the list with list_for_each_entry_safe, which allows to
safely modify the list as we go through each element. For each element,
remove it from the list with list_del_rcu
- make sure to wait for RCU grace period end after each vif removal to make
sure it is safe to free the corresponding vif too (through
unregister_netdev)
Since we are in a RCU "modifier" path (not a "reader" path), and because
such path is expected not to be concurrent to any other modifier (we are
using the vif_mutex lock), we do not need to use RCU list API, that's why
we can benefit from list_for_each_entry_safe.
[1] https://lore.kernel.org/linux-wireless/ab077dbe58b1ea5de0a3b2ca21f275a07af967d2.camel@egauge.net/
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
firmware: arm_scmi: Fix double free in SMC transport cleanup path
When the generic SCMI code tears down a channel, it calls the chan_free
callback function, defined by each transport. Since multiple protocols
might share the same transport_info member, chan_free() might want to
clean up the same member multiple times within the given SCMI transport
implementation. In this case, it is SMC transport. This will lead to a NULL
pointer dereference at the second time:
| scmi_protocol scmi_dev.1: Enabled polling mode TX channel - prot_id:16
| arm-scmi firmware:scmi: SCMI Notifications - Core Enabled.
| arm-scmi firmware:scmi: unable to communicate with SCMI
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
| Mem abort info:
| ESR = 0x0000000096000004
| EC = 0x25: DABT (current EL), IL = 32 bits
| SET = 0, FnV = 0
| EA = 0, S1PTW = 0
| FSC = 0x04: level 0 translation fault
| Data abort info:
| ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
| CM = 0, WnR = 0, TnD = 0, TagAccess = 0
| GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
| user pgtable: 4k pages, 48-bit VAs, pgdp=0000000881ef8000
| [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
| Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
| Modules linked in:
| CPU: 4 PID: 1 Comm: swapper/0 Not tainted 6.7.0-rc2-00124-g455ef3d016c9-dirty #793
| Hardware name: FVP Base RevC (DT)
| pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--)
| pc : smc_chan_free+0x3c/0x6c
| lr : smc_chan_free+0x3c/0x6c
| Call trace:
| smc_chan_free+0x3c/0x6c
| idr_for_each+0x68/0xf8
| scmi_cleanup_channels.isra.0+0x2c/0x58
| scmi_probe+0x434/0x734
| platform_probe+0x68/0xd8
| really_probe+0x110/0x27c
| __driver_probe_device+0x78/0x12c
| driver_probe_device+0x3c/0x118
| __driver_attach+0x74/0x128
| bus_for_each_dev+0x78/0xe0
| driver_attach+0x24/0x30
| bus_add_driver+0xe4/0x1e8
| driver_register+0x60/0x128
| __platform_driver_register+0x28/0x34
| scmi_driver_init+0x84/0xc0
| do_one_initcall+0x78/0x33c
| kernel_init_freeable+0x2b8/0x51c
| kernel_init+0x24/0x130
| ret_from_fork+0x10/0x20
| Code: f0004701 910a0021 aa1403e5 97b91c70 (b9400280)
| ---[ end trace 0000000000000000 ]---
Simply check for the struct pointer being NULL before trying to access
its members, to avoid this situation.
This was found when a transport doesn't really work (for instance no SMC
service), the probe routines then tries to clean up, and triggers a crash.
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
wifi: mt76: mt7921e: fix use-after-free in free_irq()
From commit a304e1b82808 ("[PATCH] Debug shared irqs"), there is a test
to make sure the shared irq handler should be able to handle the unexpected
event after deregistration. For this case, let's apply MT76_REMOVED flag to
indicate the device was removed and do not run into the resource access
anymore.
BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e]
Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115
CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 #10
Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I
EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024
Call Trace:
<TASK>
dump_stack_lvl+0x6f/0xa0
print_address_description.constprop.0+0x1f/0x190
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
kasan_report.cold+0x7f/0x11b
? mt7921_irq_handler+0xd8/0x100 [mt7921e]
mt7921_irq_handler+0xd8/0x100 [mt7921e]
free_irq+0x627/0xaa0
devm_free_irq+0x94/0xd0
? devm_request_any_context_irq+0x160/0x160
? kobject_put+0x18d/0x4a0
mt7921_pci_remove+0x153/0x190 [mt7921e]
pci_device_remove+0xa2/0x1d0
__device_release_driver+0x346/0x6e0
driver_detach+0x1ef/0x2c0
bus_remove_driver+0xe7/0x2d0
? __check_object_size+0x57/0x310
pci_unregister_driver+0x26/0x250
__do_sys_delete_module+0x307/0x510
? free_module+0x6a0/0x6a0
? fpregs_assert_state_consistent+0x4b/0xb0
? rcu_read_lock_sched_held+0x10/0x70
? syscall_enter_from_user_mode+0x20/0x70
? trace_hardirqs_on+0x1c/0x130
do_syscall_64+0x5c/0x80
? trace_hardirqs_on_prepare+0x72/0x160
? do_syscall_64+0x68/0x80
? trace_hardirqs_on_prepare+0x72/0x160
entry_SYSCALL_64_after_hwframe+0x44/0xae
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
net: hns3: fix kernel crash when 1588 is received on HIP08 devices
The HIP08 devices does not register the ptp devices, so the
hdev->ptp is NULL, but the hardware can receive 1588 messages,
and set the HNS3_RXD_TS_VLD_B bit, so, if match this case, the
access of hdev->ptp->flags will cause a kernel crash:
[ 5888.946472] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
[ 5888.946475] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
...
[ 5889.266118] pc : hclge_ptp_get_rx_hwts+0x40/0x170 [hclge]
[ 5889.272612] lr : hclge_ptp_get_rx_hwts+0x34/0x170 [hclge]
[ 5889.279101] sp : ffff800012c3bc50
[ 5889.283516] x29: ffff800012c3bc50 x28: ffff2040002be040
[ 5889.289927] x27: ffff800009116484 x26: 0000000080007500
[ 5889.296333] x25: 0000000000000000 x24: ffff204001c6f000
[ 5889.302738] x23: ffff204144f53c00 x22: 0000000000000000
[ 5889.309134] x21: 0000000000000000 x20: ffff204004220080
[ 5889.315520] x19: ffff204144f53c00 x18: 0000000000000000
[ 5889.321897] x17: 0000000000000000 x16: 0000000000000000
[ 5889.328263] x15: 0000004000140ec8 x14: 0000000000000000
[ 5889.334617] x13: 0000000000000000 x12: 00000000010011df
[ 5889.340965] x11: bbfeff4d22000000 x10: 0000000000000000
[ 5889.347303] x9 : ffff800009402124 x8 : 0200f78811dfbb4d
[ 5889.353637] x7 : 2200000000191b01 x6 : ffff208002a7d480
[ 5889.359959] x5 : 0000000000000000 x4 : 0000000000000000
[ 5889.366271] x3 : 0000000000000000 x2 : 0000000000000000
[ 5889.372567] x1 : 0000000000000000 x0 : ffff20400095c080
[ 5889.378857] Call trace:
[ 5889.382285] hclge_ptp_get_rx_hwts+0x40/0x170 [hclge]
[ 5889.388304] hns3_handle_bdinfo+0x324/0x410 [hns3]
[ 5889.394055] hns3_handle_rx_bd+0x60/0x150 [hns3]
[ 5889.399624] hns3_clean_rx_ring+0x84/0x170 [hns3]
[ 5889.405270] hns3_nic_common_poll+0xa8/0x220 [hns3]
[ 5889.411084] napi_poll+0xcc/0x264
[ 5889.415329] net_rx_action+0xd4/0x21c
[ 5889.419911] __do_softirq+0x130/0x358
[ 5889.424484] irq_exit+0x134/0x154
[ 5889.428700] __handle_domain_irq+0x88/0xf0
[ 5889.433684] gic_handle_irq+0x78/0x2c0
[ 5889.438319] el1_irq+0xb8/0x140
[ 5889.442354] arch_cpu_idle+0x18/0x40
[ 5889.446816] default_idle_call+0x5c/0x1c0
[ 5889.451714] cpuidle_idle_call+0x174/0x1b0
[ 5889.456692] do_idle+0xc8/0x160
[ 5889.460717] cpu_startup_entry+0x30/0xfc
[ 5889.465523] secondary_start_kernel+0x158/0x1ec
[ 5889.470936] Code: 97ffab78 f9411c14 91408294 f9457284 (f9400c80)
[ 5889.477950] SMP: stopping secondary CPUs
[ 5890.514626] SMP: failed to stop secondary CPUs 0-69,71-95
[ 5890.522951] Starting crashdump kernel...
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
quota: Fix potential NULL pointer dereference
Below race may cause NULL pointer dereference
P1 P2
dquot_free_inode quota_off
drop_dquot_ref
remove_dquot_ref
dquots = i_dquot(inode)
dquots = i_dquot(inode)
srcu_read_lock
dquots[cnt]) != NULL (1)
dquots[type] = NULL (2)
spin_lock(&dquots[cnt]->dq_dqb_lock) (3)
....
If dquot_free_inode(or other routines) checks inode's quota pointers (1)
before quota_off sets it to NULL(2) and use it (3) after that, NULL pointer
dereference will be triggered.
So let's fix it by using a temporary pointer to avoid this issue.
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
nfs: fix panic when nfs4_ff_layout_prepare_ds() fails
We've been seeing the following panic in production
BUG: kernel NULL pointer dereference, address: 0000000000000065
PGD 2f485f067 P4D 2f485f067 PUD 2cc5d8067 PMD 0
RIP: 0010:ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
Call Trace:
<TASK>
? __die+0x78/0xc0
? page_fault_oops+0x286/0x380
? __rpc_execute+0x2c3/0x470 [sunrpc]
? rpc_new_task+0x42/0x1c0 [sunrpc]
? exc_page_fault+0x5d/0x110
? asm_exc_page_fault+0x22/0x30
? ff_layout_free_layoutreturn+0x110/0x110 [nfs_layout_flexfiles]
? ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
? ff_layout_cancel_io+0x6f/0x90 [nfs_layout_flexfiles]
pnfs_mark_matching_lsegs_return+0x1b0/0x360 [nfsv4]
pnfs_error_mark_layout_for_return+0x9e/0x110 [nfsv4]
? ff_layout_send_layouterror+0x50/0x160 [nfs_layout_flexfiles]
nfs4_ff_layout_prepare_ds+0x11f/0x290 [nfs_layout_flexfiles]
ff_layout_pg_init_write+0xf0/0x1f0 [nfs_layout_flexfiles]
__nfs_pageio_add_request+0x154/0x6c0 [nfs]
nfs_pageio_add_request+0x26b/0x380 [nfs]
nfs_do_writepage+0x111/0x1e0 [nfs]
nfs_writepages_callback+0xf/0x30 [nfs]
write_cache_pages+0x17f/0x380
? nfs_pageio_init_write+0x50/0x50 [nfs]
? nfs_writepages+0x6d/0x210 [nfs]
? nfs_writepages+0x6d/0x210 [nfs]
nfs_writepages+0x125/0x210 [nfs]
do_writepages+0x67/0x220
? generic_perform_write+0x14b/0x210
filemap_fdatawrite_wbc+0x5b/0x80
file_write_and_wait_range+0x6d/0xc0
nfs_file_fsync+0x81/0x170 [nfs]
? nfs_file_mmap+0x60/0x60 [nfs]
__x64_sys_fsync+0x53/0x90
do_syscall_64+0x3d/0x90
entry_SYSCALL_64_after_hwframe+0x46/0xb0
Inspecting the core with drgn I was able to pull this
>>> prog.crashed_thread().stack_trace()[0]
#0 at 0xffffffffa079657a (ff_layout_cancel_io+0x3a/0x84) in ff_layout_cancel_io at fs/nfs/flexfilelayout/flexfilelayout.c:2021:27
>>> prog.crashed_thread().stack_trace()[0]['idx']
(u32)1
>>> prog.crashed_thread().stack_trace()[0]['flseg'].mirror_array[1].mirror_ds
(struct nfs4_ff_layout_ds *)0xffffffffffffffed
This is clear from the stack trace, we call nfs4_ff_layout_prepare_ds()
which could error out initializing the mirror_ds, and then we go to
clean it all up and our check is only for if (!mirror->mirror_ds). This
is inconsistent with the rest of the users of mirror_ds, which have
if (IS_ERR_OR_NULL(mirror_ds))
to keep from tripping over this exact scenario. Fix this up in
ff_layout_cancel_io() to make sure we don't panic when we get an error.
I also spot checked all the other instances of checking mirror_ds and we
appear to be doing the correct checks everywhere, only unconditionally
dereferencing mirror_ds when we know it would be valid.
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
comedi: comedi_8255: Correct error in subdevice initialization
The refactoring done in commit 5c57b1ccecc7 ("comedi: comedi_8255: Rework
subdevice initialization functions") to the initialization of the io
field of struct subdev_8255_private broke all cards using the
drivers/comedi/drivers/comedi_8255.c module.
Prior to 5c57b1ccecc7, __subdev_8255_init() initialized the io field
in the newly allocated struct subdev_8255_private to the non-NULL
callback given to the function, otherwise it used a flag parameter to
select between subdev_8255_mmio and subdev_8255_io. The refactoring
removed that logic and the flag, as subdev_8255_mm_init() and
subdev_8255_io_init() now explicitly pass subdev_8255_mmio and
subdev_8255_io respectively to __subdev_8255_init(), only
__subdev_8255_init() never sets spriv->io to the supplied
callback. That spriv->io is NULL leads to a later BUG:
BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 0 P4D 0
Oops: 0010 [#1] SMP PTI
CPU: 1 PID: 1210 Comm: systemd-udevd Not tainted 6.7.3-x86_64 #1
Hardware name: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffa3f1c02d7b78 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff91f847aefd00 RCX: 000000000000009b
RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff91f840f6fc00
RBP: ffff91f840f6fc00 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 000000000000005f R12: 0000000000000000
R13: 0000000000000000 R14: ffffffffc0102498 R15: ffff91f847ce6ba8
FS: 00007f72f4e8f500(0000) GS:ffff91f8d5c80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffffffffffffd6 CR3: 000000010540e000 CR4: 00000000000406f0
Call Trace:
<TASK>
? __die_body+0x15/0x57
? page_fault_oops+0x2ef/0x33c
? insert_vmap_area.constprop.0+0xb6/0xd5
? alloc_vmap_area+0x529/0x5ee
? exc_page_fault+0x15a/0x489
? asm_exc_page_fault+0x22/0x30
__subdev_8255_init+0x79/0x8d [comedi_8255]
pci_8255_auto_attach+0x11a/0x139 [8255_pci]
comedi_auto_config+0xac/0x117 [comedi]
? __pfx___driver_attach+0x10/0x10
pci_device_probe+0x88/0xf9
really_probe+0x101/0x248
__driver_probe_device+0xbb/0xed
driver_probe_device+0x1a/0x72
__driver_attach+0xd4/0xed
bus_for_each_dev+0x76/0xb8
bus_add_driver+0xbe/0x1be
driver_register+0x9a/0xd8
comedi_pci_driver_register+0x28/0x48 [comedi_pci]
? __pfx_pci_8255_driver_init+0x10/0x10 [8255_pci]
do_one_initcall+0x72/0x183
do_init_module+0x5b/0x1e8
init_module_from_file+0x86/0xac
__do_sys_finit_module+0x151/0x218
do_syscall_64+0x72/0xdb
entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f72f50a0cb9
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 47 71 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffd47e512d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
RAX: ffffffffffffffda RBX: 0000562dd06ae070 RCX: 00007f72f50a0cb9
RDX: 0000000000000000 RSI: 00007f72f52d32df RDI: 000000000000000e
RBP: 0000000000000000 R08: 00007f72f5168b20 R09: 0000000000000000
R10: 0000000000000050 R11: 0000000000000246 R12: 00007f72f52d32df
R13: 0000000000020000 R14: 0000562dd06785c0 R15: 0000562dcfd0e9a8
</TASK>
Modules linked in: 8255_pci(+) comedi_8255 comedi_pci comedi intel_gtt e100(+) acpi_cpufreq rtc_cmos usbhid
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 0018:ffffa3f1c02d7b78 EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff91f847aefd00 RCX: 000000000000009b
RDX: 0000000000000003 RSI: 0000000000000001 RDI: ffff91f840f6fc00
RBP: ffff91f840f6fc00 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 000000000000005f R12: 0000000000000000
R13: 0000000000000000 R14: ffffffffc0102498 R15: ffff91f847ce6ba8
FS:
---truncated---
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
spi: lpspi: Avoid potential use-after-free in probe()
fsl_lpspi_probe() is allocating/disposing memory manually with
spi_alloc_host()/spi_alloc_target(), but uses
devm_spi_register_controller(). In case of error after the latter call the
memory will be explicitly freed in the probe function by
spi_controller_put() call, but used afterwards by "devm" management outside
probe() (spi_unregister_controller() <- devm_spi_unregister() below).
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
...
Call trace:
kernfs_find_ns
kernfs_find_and_get_ns
sysfs_remove_group
sysfs_remove_groups
device_remove_attrs
device_del
spi_unregister_controller
devm_spi_unregister
release_nodes
devres_release_all
really_probe
driver_probe_device
__device_attach_driver
bus_for_each_drv
__device_attach
device_initial_probe
bus_probe_device
deferred_probe_work_func
process_one_work
worker_thread
kthread
ret_from_fork
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
rds: tcp: Fix use-after-free of net in reqsk_timer_handler().
syzkaller reported a warning of netns tracker [0] followed by KASAN
splat [1] and another ref tracker warning [1].
syzkaller could not find a repro, but in the log, the only suspicious
sequence was as follows:
18:26:22 executing program 1:
r0 = socket$inet6_mptcp(0xa, 0x1, 0x106)
...
connect$inet6(r0, &(0x7f0000000080)={0xa, 0x4001, 0x0, @loopback}, 0x1c) (async)
The notable thing here is 0x4001 in connect(), which is RDS_TCP_PORT.
So, the scenario would be:
1. unshare(CLONE_NEWNET) creates a per netns tcp listener in
rds_tcp_listen_init().
2. syz-executor connect()s to it and creates a reqsk.
3. syz-executor exit()s immediately.
4. netns is dismantled. [0]
5. reqsk timer is fired, and UAF happens while freeing reqsk. [1]
6. listener is freed after RCU grace period. [2]
Basically, reqsk assumes that the listener guarantees netns safety
until all reqsk timers are expired by holding the listener's refcount.
However, this was not the case for kernel sockets.
Commit 740ea3c4a0b2 ("tcp: Clean up kernel listener's reqsk in
inet_twsk_purge()") fixed this issue only for per-netns ehash.
Let's apply the same fix for the global ehash.
[0]:
ref_tracker: net notrefcnt@0000000065449cc3 has 1/1 users at
sk_alloc (./include/net/net_namespace.h:337 net/core/sock.c:2146)
inet6_create (net/ipv6/af_inet6.c:192 net/ipv6/af_inet6.c:119)
__sock_create (net/socket.c:1572)
rds_tcp_listen_init (net/rds/tcp_listen.c:279)
rds_tcp_init_net (net/rds/tcp.c:577)
ops_init (net/core/net_namespace.c:137)
setup_net (net/core/net_namespace.c:340)
copy_net_ns (net/core/net_namespace.c:497)
create_new_namespaces (kernel/nsproxy.c:110)
unshare_nsproxy_namespaces (kernel/nsproxy.c:228 (discriminator 4))
ksys_unshare (kernel/fork.c:3429)
__x64_sys_unshare (kernel/fork.c:3496)
do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
...
WARNING: CPU: 0 PID: 27 at lib/ref_tracker.c:179 ref_tracker_dir_exit (lib/ref_tracker.c:179)
[1]:
BUG: KASAN: slab-use-after-free in inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966)
Read of size 8 at addr ffff88801b370400 by task swapper/0/0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
print_report (mm/kasan/report.c:378 mm/kasan/report.c:488)
kasan_report (mm/kasan/report.c:603)
inet_csk_reqsk_queue_drop (./include/net/inet_hashtables.h:180 net/ipv4/inet_connection_sock.c:952 net/ipv4/inet_connection_sock.c:966)
reqsk_timer_handler (net/ipv4/inet_connection_sock.c:979 net/ipv4/inet_connection_sock.c:1092)
call_timer_fn (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127 kernel/time/timer.c:1701)
__run_timers.part.0 (kernel/time/timer.c:1752 kernel/time/timer.c:2038)
run_timer_softirq (kernel/time/timer.c:2053)
__do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:554)
irq_exit_rcu (kernel/softirq.c:427 kernel/softirq.c:632 kernel/softirq.c:644)
sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1076 (discriminator 14))
</IRQ>
Allocated by task 258 on cpu 0 at 83.612050s:
kasan_save_stack (mm/kasan/common.c:48)
kasan_save_track (mm/kasan/common.c:68)
__kasan_slab_alloc (mm/kasan/common.c:343)
kmem_cache_alloc (mm/slub.c:3813 mm/slub.c:3860 mm/slub.c:3867)
copy_net_ns (./include/linux/slab.h:701 net/core/net_namespace.c:421 net/core/net_namespace.c:480)
create_new_namespaces (kernel/nsproxy.c:110)
unshare_nsproxy_name
---truncated---
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
net/bnx2x: Prevent access to a freed page in page_pool
Fix race condition leading to system crash during EEH error handling
During EEH error recovery, the bnx2x driver's transmit timeout logic
could cause a race condition when handling reset tasks. The
bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(),
which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload()
SGEs are freed using bnx2x_free_rx_sge_range(). However, this could
overlap with the EEH driver's attempt to reset the device using
bnx2x_io_slot_reset(), which also tries to free SGEs. This race
condition can result in system crashes due to accessing freed memory
locations in bnx2x_free_rx_sge()
799 static inline void bnx2x_free_rx_sge(struct bnx2x *bp,
800 struct bnx2x_fastpath *fp, u16 index)
801 {
802 struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
803 struct page *page = sw_buf->page;
....
where sw_buf was set to NULL after the call to dma_unmap_page()
by the preceding thread.
EEH: Beginning: 'slot_reset'
PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing...
bnx2x 0011:01:00.0: enabling device (0140 -> 0142)
bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload
Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0080000025065fc
Oops: Kernel access of bad area, sig: 11 [#1]
.....
Call Trace:
[c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable)
[c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0
[c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550
[c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60
[c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170
[c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0
[c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64
To solve this issue, we need to verify page pool allocations before
freeing.
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
net: sparx5: Fix use after free inside sparx5_del_mact_entry
Based on the static analyzis of the code it looks like when an entry
from the MAC table was removed, the entry was still used after being
freed. More precise the vid of the mac_entry was used after calling
devm_kfree on the mac_entry.
The fix consists in first using the vid of the mac_entry to delete the
entry from the HW and after that to free it.
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
igc: avoid returning frame twice in XDP_REDIRECT
When a frame can not be transmitted in XDP_REDIRECT
(e.g. due to a full queue), it is necessary to free
it by calling xdp_return_frame_rx_napi.
However, this is the responsibility of the caller of
the ndo_xdp_xmit (see for example bq_xmit_all in
kernel/bpf/devmap.c) and thus calling it inside
igc_xdp_xmit (which is the ndo_xdp_xmit of the igc
driver) as well will lead to memory corruption.
In fact, bq_xmit_all expects that it can return all
frames after the last successfully transmitted one.
Therefore, break for the first not transmitted frame,
but do not call xdp_return_frame_rx_napi in igc_xdp_xmit.
This is equally implemented in other Intel drivers
such as the igb.
There are two alternatives to this that were rejected:
1. Return num_frames as all the frames would have been
transmitted and release them inside igc_xdp_xmit.
While it might work technically, it is not what
the return value is meant to represent (i.e. the
number of SUCCESSFULLY transmitted packets).
2. Rework kernel/bpf/devmap.c and all drivers to
support non-consecutively dropped packets.
Besides being complex, it likely has a negative
performance impact without a significant gain
since it is anyway unlikely that the next frame
can be transmitted if the previous one was dropped.
The memory corruption can be reproduced with
the following script which leads to a kernel panic
after a few seconds. It basically generates more
traffic than a i225 NIC can transmit and pushes it
via XDP_REDIRECT from a virtual interface to the
physical interface where frames get dropped.
#!/bin/bash
INTERFACE=enp4s0
INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex`
sudo ip link add dev veth1 type veth peer name veth2
sudo ip link set up $INTERFACE
sudo ip link set up veth1
sudo ip link set up veth2
cat << EOF > redirect.bpf.c
SEC("prog")
int redirect(struct xdp_md *ctx)
{
return bpf_redirect($INTERFACE_IDX, 0);
}
char _license[] SEC("license") = "GPL";
EOF
clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o
sudo ip link set veth2 xdp obj redirect.bpf.o
cat << EOF > pass.bpf.c
SEC("prog")
int pass(struct xdp_md *ctx)
{
return XDP_PASS;
}
char _license[] SEC("license") = "GPL";
EOF
clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o
sudo ip link set $INTERFACE xdp obj pass.bpf.o
cat << EOF > trafgen.cfg
{
/* Ethernet Header */
0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46,
0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF,
const16(ETH_P_IP),
/* IPv4 Header */
0b01000101, 0, # IPv4 version, IHL, TOS
const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
const16(2), # IPv4 ident
0b01000000, 0, # IPv4 flags, fragmentation off
64, # IPv4 TTL
17, # Protocol UDP
csumip(14, 33), # IPv4 checksum
/* UDP Header */
10, 0, 1, 1, # IP Src - adapt as needed
10, 0, 1, 2, # IP Dest - adapt as needed
const16(6666), # UDP Src Port
const16(6666), # UDP Dest Port
const16(1008), # UDP length (UDP header 8 bytes + payload length)
csumudp(14, 34), # UDP checksum
/* Payload */
fill('W', 1000),
}
EOF
sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
net/ipv6: avoid possible UAF in ip6_route_mpath_notify()
syzbot found another use-after-free in ip6_route_mpath_notify() [1]
Commit f7225172f25a ("net/ipv6: prevent use after free in
ip6_route_mpath_notify") was not able to fix the root cause.
We need to defer the fib6_info_release() calls after
ip6_route_mpath_notify(), in the cleanup phase.
[1]
BUG: KASAN: slab-use-after-free in rt6_fill_node+0x1460/0x1ac0
Read of size 4 at addr ffff88809a07fc64 by task syz-executor.2/23037
CPU: 0 PID: 23037 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-01035-gea7f3cfaa588 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x167/0x540 mm/kasan/report.c:488
kasan_report+0x142/0x180 mm/kasan/report.c:601
rt6_fill_node+0x1460/0x1ac0
inet6_rt_notify+0x13b/0x290 net/ipv6/route.c:6184
ip6_route_mpath_notify net/ipv6/route.c:5198 [inline]
ip6_route_multipath_add net/ipv6/route.c:5404 [inline]
inet6_rtm_newroute+0x1d0f/0x2300 net/ipv6/route.c:5517
rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6597
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7f73dd87dda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f73de6550c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f73dd9ac050 RCX: 00007f73dd87dda9
RDX: 0000000000000000 RSI: 0000000020000140 RDI: 0000000000000005
RBP: 00007f73dd8ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f73dd9ac050 R15: 00007ffdbdeb7858
</TASK>
Allocated by task 23037:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:372 [inline]
__kasan_kmalloc+0x98/0xb0 mm/kasan/common.c:389
kasan_kmalloc include/linux/kasan.h:211 [inline]
__do_kmalloc_node mm/slub.c:3981 [inline]
__kmalloc+0x22e/0x490 mm/slub.c:3994
kmalloc include/linux/slab.h:594 [inline]
kzalloc include/linux/slab.h:711 [inline]
fib6_info_alloc+0x2e/0xf0 net/ipv6/ip6_fib.c:155
ip6_route_info_create+0x445/0x12b0 net/ipv6/route.c:3758
ip6_route_multipath_add net/ipv6/route.c:5298 [inline]
inet6_rtm_newroute+0x744/0x2300 net/ipv6/route.c:5517
rtnetlink_rcv_msg+0x885/0x1040 net/core/rtnetlink.c:6597
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2543
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x7ea/0x980 net/netlink/af_netlink.c:1367
netlink_sendmsg+0xa3b/0xd70 net/netlink/af_netlink.c:1908
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2584
___sys_sendmsg net/socket.c:2638 [inline]
__sys_sendmsg+0x2b0/0x3a0 net/socket.c:2667
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 16:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x4e/0x60 mm/kasan/generic.c:640
poison_slab_object+0xa6/0xe0 m
---truncated---
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
nvme-fc: do not wait in vain when unloading module
The module exit path has race between deleting all controllers and
freeing 'left over IDs'. To prevent double free a synchronization
between nvme_delete_ctrl and ida_destroy has been added by the initial
commit.
There is some logic around trying to prevent from hanging forever in
wait_for_completion, though it does not handling all cases. E.g.
blktests is able to reproduce the situation where the module unload
hangs forever.
If we completely rely on the cleanup code executed from the
nvme_delete_ctrl path, all IDs will be freed eventually. This makes
calling ida_destroy unnecessary. We only have to ensure that all
nvme_delete_ctrl code has been executed before we leave
nvme_fc_exit_module. This is done by flushing the nvme_delete_wq
workqueue.
While at it, remove the unused nvme_fc_wq workqueue too.
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-17
Updated
2024-04-17
In the Linux kernel, the following vulnerability has been resolved:
spi: cadence-qspi: fix pointer reference in runtime PM hooks
dev_get_drvdata() gets used to acquire the pointer to cqspi and the SPI
controller. Neither embed the other; this lead to memory corruption.
On a given platform (Mobileye EyeQ5) the memory corruption is hidden
inside cqspi->f_pdata. Also, this uninitialised memory is used as a
mutex (ctlr->bus_lock_mutex) by spi_controller_suspend().
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
net: ip_tunnel: prevent perpetual headroom growth
syzkaller triggered following kasan splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
Read of size 1 at addr ffff88812fb4000e by task syz-executor183/5191
[..]
kasan_report+0xda/0x110 mm/kasan/report.c:588
__skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
skb_flow_dissect_flow_keys include/linux/skbuff.h:1514 [inline]
___skb_get_hash net/core/flow_dissector.c:1791 [inline]
__skb_get_hash+0xc7/0x540 net/core/flow_dissector.c:1856
skb_get_hash include/linux/skbuff.h:1556 [inline]
ip_tunnel_xmit+0x1855/0x33c0 net/ipv4/ip_tunnel.c:748
ipip_tunnel_xmit+0x3cc/0x4e0 net/ipv4/ipip.c:308
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
__dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x42c/0x5d0 net/core/neighbour.c:1592
...
ip_finish_output2+0x833/0x2550 net/ipv4/ip_output.c:235
ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
..
iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1dbc/0x33c0 net/ipv4/ip_tunnel.c:831
ipgre_xmit+0x4a1/0x980 net/ipv4/ip_gre.c:665
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
...
The splat occurs because skb->data points past skb->head allocated area.
This is because neigh layer does:
__skb_pull(skb, skb_network_offset(skb));
... but skb_network_offset() returns a negative offset and __skb_pull()
arg is unsigned. IOW, we skb->data gets "adjusted" by a huge value.
The negative value is returned because skb->head and skb->data distance is
more than 64k and skb->network_header (u16) has wrapped around.
The bug is in the ip_tunnel infrastructure, which can cause
dev->needed_headroom to increment ad infinitum.
The syzkaller reproducer consists of packets getting routed via a gre
tunnel, and route of gre encapsulated packets pointing at another (ipip)
tunnel. The ipip encapsulation finds gre0 as next output device.
This results in the following pattern:
1). First packet is to be sent out via gre0.
Route lookup found an output device, ipip0.
2).
ip_tunnel_xmit for gre0 bumps gre0->needed_headroom based on the future
output device, rt.dev->needed_headroom (ipip0).
3).
ip output / start_xmit moves skb on to ipip0. which runs the same
code path again (xmit recursion).
4).
Routing step for the post-gre0-encap packet finds gre0 as output device
to use for ipip0 encapsulated packet.
tunl0->needed_headroom is then incremented based on the (already bumped)
gre0 device headroom.
This repeats for every future packet:
gre0->needed_headroom gets inflated because previous packets' ipip0 step
incremented rt->dev (gre0) headroom, and ipip0 incremented because gre0
needed_headroom was increased.
For each subsequent packet, gre/ipip0->needed_headroom grows until
post-expand-head reallocations result in a skb->head/data distance of
more than 64k.
Once that happens, skb->network_header (u16) wraps around when
pskb_expand_head tries to make sure that skb_network_offset() is unchanged
after the headroom expansion/reallocation.
After this skb_network_offset(skb) returns a different (and negative)
result post headroom expansion.
The next trip to neigh layer (or anything else that would __skb_pull the
network header) makes skb->data point to a memory location outside
skb->head area.
v2: Cap the needed_headroom update to an arbitarily chosen upperlimit to
prevent perpetual increase instead of dropping the headroom increment
completely.
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
drivers: perf: ctr_get_width function for legacy is not defined
With parameters CONFIG_RISCV_PMU_LEGACY=y and CONFIG_RISCV_PMU_SBI=n
linux kernel crashes when you try perf record:
$ perf record ls
[ 46.749286] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 46.750199] Oops [#1]
[ 46.750342] Modules linked in:
[ 46.750608] CPU: 0 PID: 107 Comm: perf-exec Not tainted 6.6.0 #2
[ 46.750906] Hardware name: riscv-virtio,qemu (DT)
[ 46.751184] epc : 0x0
[ 46.751430] ra : arch_perf_update_userpage+0x54/0x13e
[ 46.751680] epc : 0000000000000000 ra : ffffffff8072ee52 sp : ff2000000022b8f0
[ 46.751958] gp : ffffffff81505988 tp : ff6000000290d400 t0 : ff2000000022b9c0
[ 46.752229] t1 : 0000000000000001 t2 : 0000000000000003 s0 : ff2000000022b930
[ 46.752451] s1 : ff600000028fb000 a0 : 0000000000000000 a1 : ff600000028fb000
[ 46.752673] a2 : 0000000ae2751268 a3 : 00000000004fb708 a4 : 0000000000000004
[ 46.752895] a5 : 0000000000000000 a6 : 000000000017ffe3 a7 : 00000000000000d2
[ 46.753117] s2 : ff600000028fb000 s3 : 0000000ae2751268 s4 : 0000000000000000
[ 46.753338] s5 : ffffffff8153e290 s6 : ff600000863b9000 s7 : ff60000002961078
[ 46.753562] s8 : ff60000002961048 s9 : ff60000002961058 s10: 0000000000000001
[ 46.753783] s11: 0000000000000018 t3 : ffffffffffffffff t4 : ffffffffffffffff
[ 46.754005] t5 : ff6000000292270c t6 : ff2000000022bb30
[ 46.754179] status: 0000000200000100 badaddr: 0000000000000000 cause: 000000000000000c
[ 46.754653] Code: Unable to access instruction at 0xffffffffffffffec.
[ 46.754939] ---[ end trace 0000000000000000 ]---
[ 46.755131] note: perf-exec[107] exited with irqs disabled
[ 46.755546] note: perf-exec[107] exited with preempt_count 4
This happens because in the legacy case the ctr_get_width function was not
defined, but it is used in arch_perf_update_userpage.
Also remove extra check in riscv_pmu_ctr_get_width_mask
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
gtp: fix use-after-free and null-ptr-deref in gtp_newlink()
The gtp_link_ops operations structure for the subsystem must be
registered after registering the gtp_net_ops pernet operations structure.
Syzkaller hit 'general protection fault in gtp_genl_dump_pdp' bug:
[ 1010.702740] gtp: GTP module unloaded
[ 1010.715877] general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] SMP KASAN NOPTI
[ 1010.715888] KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
[ 1010.715895] CPU: 1 PID: 128616 Comm: a.out Not tainted 6.8.0-rc6-std-def-alt1 #1
[ 1010.715899] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-alt1 04/01/2014
[ 1010.715908] RIP: 0010:gtp_newlink+0x4d7/0x9c0 [gtp]
[ 1010.715915] Code: 80 3c 02 00 0f 85 41 04 00 00 48 8b bb d8 05 00 00 e8 ed f6 ff ff 48 89 c2 48 89 c5 48 b8 00 00 00 00 00 fc ff df 48 c1 ea 03 <80> 3c 02 00 0f 85 4f 04 00 00 4c 89 e2 4c 8b 6d 00 48 b8 00 00 00
[ 1010.715920] RSP: 0018:ffff888020fbf180 EFLAGS: 00010203
[ 1010.715929] RAX: dffffc0000000000 RBX: ffff88800399c000 RCX: 0000000000000000
[ 1010.715933] RDX: 0000000000000001 RSI: ffffffff84805280 RDI: 0000000000000282
[ 1010.715938] RBP: 000000000000000d R08: 0000000000000001 R09: 0000000000000000
[ 1010.715942] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88800399cc80
[ 1010.715947] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000400
[ 1010.715953] FS: 00007fd1509ab5c0(0000) GS:ffff88805b300000(0000) knlGS:0000000000000000
[ 1010.715958] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1010.715962] CR2: 0000000000000000 CR3: 000000001c07a000 CR4: 0000000000750ee0
[ 1010.715968] PKRU: 55555554
[ 1010.715972] Call Trace:
[ 1010.715985] ? __die_body.cold+0x1a/0x1f
[ 1010.715995] ? die_addr+0x43/0x70
[ 1010.716002] ? exc_general_protection+0x199/0x2f0
[ 1010.716016] ? asm_exc_general_protection+0x1e/0x30
[ 1010.716026] ? gtp_newlink+0x4d7/0x9c0 [gtp]
[ 1010.716034] ? gtp_net_exit+0x150/0x150 [gtp]
[ 1010.716042] __rtnl_newlink+0x1063/0x1700
[ 1010.716051] ? rtnl_setlink+0x3c0/0x3c0
[ 1010.716063] ? is_bpf_text_address+0xc0/0x1f0
[ 1010.716070] ? kernel_text_address.part.0+0xbb/0xd0
[ 1010.716076] ? __kernel_text_address+0x56/0xa0
[ 1010.716084] ? unwind_get_return_address+0x5a/0xa0
[ 1010.716091] ? create_prof_cpu_mask+0x30/0x30
[ 1010.716098] ? arch_stack_walk+0x9e/0xf0
[ 1010.716106] ? stack_trace_save+0x91/0xd0
[ 1010.716113] ? stack_trace_consume_entry+0x170/0x170
[ 1010.716121] ? __lock_acquire+0x15c5/0x5380
[ 1010.716139] ? mark_held_locks+0x9e/0xe0
[ 1010.716148] ? kmem_cache_alloc_trace+0x35f/0x3c0
[ 1010.716155] ? __rtnl_newlink+0x1700/0x1700
[ 1010.716160] rtnl_newlink+0x69/0xa0
[ 1010.716166] rtnetlink_rcv_msg+0x43b/0xc50
[ 1010.716172] ? rtnl_fdb_dump+0x9f0/0x9f0
[ 1010.716179] ? lock_acquire+0x1fe/0x560
[ 1010.716188] ? netlink_deliver_tap+0x12f/0xd50
[ 1010.716196] netlink_rcv_skb+0x14d/0x440
[ 1010.716202] ? rtnl_fdb_dump+0x9f0/0x9f0
[ 1010.716208] ? netlink_ack+0xab0/0xab0
[ 1010.716213] ? netlink_deliver_tap+0x202/0xd50
[ 1010.716220] ? netlink_deliver_tap+0x218/0xd50
[ 1010.716226] ? __virt_addr_valid+0x30b/0x590
[ 1010.716233] netlink_unicast+0x54b/0x800
[ 1010.716240] ? netlink_attachskb+0x870/0x870
[ 1010.716248] ? __check_object_size+0x2de/0x3b0
[ 1010.716254] netlink_sendmsg+0x938/0xe40
[ 1010.716261] ? netlink_unicast+0x800/0x800
[ 1010.716269] ? __import_iovec+0x292/0x510
[ 1010.716276] ? netlink_unicast+0x800/0x800
[ 1010.716284] __sock_sendmsg+0x159/0x190
[ 1010.716290] ____sys_sendmsg+0x712/0x880
[ 1010.716297] ? sock_write_iter+0x3d0/0x3d0
[ 1010.716304] ? __ia32_sys_recvmmsg+0x270/0x270
[ 1010.716309] ? lock_acquire+0x1fe/0x560
[ 1010.716315] ? drain_array_locked+0x90/0x90
[ 1010.716324] ___sys_sendmsg+0xf8/0x170
[ 1010.716331] ? sendmsg_copy_msghdr+0x170/0x170
[ 1010.716337] ? lockdep_init_map
---truncated---
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix double free of anonymous device after snapshot creation failure
When creating a snapshot we may do a double free of an anonymous device
in case there's an error committing the transaction. The second free may
result in freeing an anonymous device number that was allocated by some
other subsystem in the kernel or another btrfs filesystem.
The steps that lead to this:
1) At ioctl.c:create_snapshot() we allocate an anonymous device number
and assign it to pending_snapshot->anon_dev;
2) Then we call btrfs_commit_transaction() and end up at
transaction.c:create_pending_snapshot();
3) There we call btrfs_get_new_fs_root() and pass it the anonymous device
number stored in pending_snapshot->anon_dev;
4) btrfs_get_new_fs_root() frees that anonymous device number because
btrfs_lookup_fs_root() returned a root - someone else did a lookup
of the new root already, which could some task doing backref walking;
5) After that some error happens in the transaction commit path, and at
ioctl.c:create_snapshot() we jump to the 'fail' label, and after
that we free again the same anonymous device number, which in the
meanwhile may have been reallocated somewhere else, because
pending_snapshot->anon_dev still has the same value as in step 1.
Recently syzbot ran into this and reported the following trace:
------------[ cut here ]------------
ida_free called for id=51 which is not allocated.
WARNING: CPU: 1 PID: 31038 at lib/idr.c:525 ida_free+0x370/0x420 lib/idr.c:525
Modules linked in:
CPU: 1 PID: 31038 Comm: syz-executor.2 Not tainted 6.8.0-rc4-syzkaller-00410-gc02197fc9076 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
RIP: 0010:ida_free+0x370/0x420 lib/idr.c:525
Code: 10 42 80 3c 28 (...)
RSP: 0018:ffffc90015a67300 EFLAGS: 00010246
RAX: be5130472f5dd000 RBX: 0000000000000033 RCX: 0000000000040000
RDX: ffffc90009a7a000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: ffffc90015a673f0 R08: ffffffff81577992 R09: 1ffff92002b4cdb4
R10: dffffc0000000000 R11: fffff52002b4cdb5 R12: 0000000000000246
R13: dffffc0000000000 R14: ffffffff8e256b80 R15: 0000000000000246
FS: 00007fca3f4b46c0(0000) GS:ffff8880b9500000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f167a17b978 CR3: 000000001ed26000 CR4: 0000000000350ef0
Call Trace:
<TASK>
btrfs_get_root_ref+0xa48/0xaf0 fs/btrfs/disk-io.c:1346
create_pending_snapshot+0xff2/0x2bc0 fs/btrfs/transaction.c:1837
create_pending_snapshots+0x195/0x1d0 fs/btrfs/transaction.c:1931
btrfs_commit_transaction+0xf1c/0x3740 fs/btrfs/transaction.c:2404
create_snapshot+0x507/0x880 fs/btrfs/ioctl.c:848
btrfs_mksubvol+0x5d0/0x750 fs/btrfs/ioctl.c:998
btrfs_mksnapshot+0xb5/0xf0 fs/btrfs/ioctl.c:1044
__btrfs_ioctl_snap_create+0x387/0x4b0 fs/btrfs/ioctl.c:1306
btrfs_ioctl_snap_create_v2+0x1ca/0x400 fs/btrfs/ioctl.c:1393
btrfs_ioctl+0xa74/0xd40
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:871 [inline]
__se_sys_ioctl+0xfe/0x170 fs/ioctl.c:857
do_syscall_64+0xfb/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7fca3e67dda9
Code: 28 00 00 00 (...)
RSP: 002b:00007fca3f4b40c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007fca3e7abf80 RCX: 00007fca3e67dda9
RDX: 00000000200005c0 RSI: 0000000050009417 RDI: 0000000000000003
RBP: 00007fca3e6ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fca3e7abf80 R15: 00007fff6bf95658
</TASK>
Where we get an explicit message where we attempt to free an anonymous
device number that is not currently allocated. It happens in a different
code path from the example below, at btrfs_get_root_ref(), so this change
may not fix the case triggered by sy
---truncated---
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
iommufd: Fix protection fault in iommufd_test_syz_conv_iova
Syzkaller reported the following bug:
general protection fault, probably for non-canonical address 0xdffffc0000000038: 0000 [#1] SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000001c0-0x00000000000001c7]
Call Trace:
lock_acquire
lock_acquire+0x1ce/0x4f0
down_read+0x93/0x4a0
iommufd_test_syz_conv_iova+0x56/0x1f0
iommufd_test_access_rw.isra.0+0x2ec/0x390
iommufd_test+0x1058/0x1e30
iommufd_fops_ioctl+0x381/0x510
vfs_ioctl
__do_sys_ioctl
__se_sys_ioctl
__x64_sys_ioctl+0x170/0x1e0
do_syscall_x64
do_syscall_64+0x71/0x140
This is because the new iommufd_access_change_ioas() sets access->ioas to
NULL during its process, so the lock might be gone in a concurrent racing
context.
Fix this by doing the same access->ioas sanity as iommufd_access_rw() and
iommufd_access_pin_pages() functions do.
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix double-free on socket dismantle
when MPTCP server accepts an incoming connection, it clones its listener
socket. However, the pointer to 'inet_opt' for the new socket has the same
value as the original one: as a consequence, on program exit it's possible
to observe the following splat:
BUG: KASAN: double-free in inet_sock_destruct+0x54f/0x8b0
Free of addr ffff888485950880 by task swapper/25/0
CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 6.8.0-rc1+ #609
Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0 07/26/2013
Call Trace:
<IRQ>
dump_stack_lvl+0x32/0x50
print_report+0xca/0x620
kasan_report_invalid_free+0x64/0x90
__kasan_slab_free+0x1aa/0x1f0
kfree+0xed/0x2e0
inet_sock_destruct+0x54f/0x8b0
__sk_destruct+0x48/0x5b0
rcu_do_batch+0x34e/0xd90
rcu_core+0x559/0xac0
__do_softirq+0x183/0x5a4
irq_exit_rcu+0x12d/0x170
sysvec_apic_timer_interrupt+0x6b/0x80
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:cpuidle_enter_state+0x175/0x300
Code: 30 00 0f 84 1f 01 00 00 83 e8 01 83 f8 ff 75 e5 48 83 c4 18 44 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc fb 45 85 ed <0f> 89 60 ff ff ff 48 c1 e5 06 48 c7 43 18 00 00 00 00 48 83 44 2b
RSP: 0018:ffff888481cf7d90 EFLAGS: 00000202
RAX: 0000000000000000 RBX: ffff88887facddc8 RCX: 0000000000000000
RDX: 1ffff1110ff588b1 RSI: 0000000000000019 RDI: ffff88887fac4588
RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000043080
R10: 0009b02ea273363f R11: ffff88887fabf42b R12: ffffffff932592e0
R13: 0000000000000004 R14: 0000000000000000 R15: 00000022c880ec80
cpuidle_enter+0x4a/0xa0
do_idle+0x310/0x410
cpu_startup_entry+0x51/0x60
start_secondary+0x211/0x270
secondary_startup_64_no_verify+0x184/0x18b
</TASK>
Allocated by task 6853:
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
__kasan_kmalloc+0xa6/0xb0
__kmalloc+0x1eb/0x450
cipso_v4_sock_setattr+0x96/0x360
netlbl_sock_setattr+0x132/0x1f0
selinux_netlbl_socket_post_create+0x6c/0x110
selinux_socket_post_create+0x37b/0x7f0
security_socket_post_create+0x63/0xb0
__sock_create+0x305/0x450
__sys_socket_create.part.23+0xbd/0x130
__sys_socket+0x37/0xb0
__x64_sys_socket+0x6f/0xb0
do_syscall_64+0x83/0x160
entry_SYSCALL_64_after_hwframe+0x6e/0x76
Freed by task 6858:
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x12c/0x1f0
kfree+0xed/0x2e0
inet_sock_destruct+0x54f/0x8b0
__sk_destruct+0x48/0x5b0
subflow_ulp_release+0x1f0/0x250
tcp_cleanup_ulp+0x6e/0x110
tcp_v4_destroy_sock+0x5a/0x3a0
inet_csk_destroy_sock+0x135/0x390
tcp_fin+0x416/0x5c0
tcp_data_queue+0x1bc8/0x4310
tcp_rcv_state_process+0x15a3/0x47b0
tcp_v4_do_rcv+0x2c1/0x990
tcp_v4_rcv+0x41fb/0x5ed0
ip_protocol_deliver_rcu+0x6d/0x9f0
ip_local_deliver_finish+0x278/0x360
ip_local_deliver+0x182/0x2c0
ip_rcv+0xb5/0x1c0
__netif_receive_skb_one_core+0x16e/0x1b0
process_backlog+0x1e3/0x650
__napi_poll+0xa6/0x500
net_rx_action+0x740/0xbb0
__do_softirq+0x183/0x5a4
The buggy address belongs to the object at ffff888485950880
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
64-byte region [ffff888485950880, ffff8884859508c0)
The buggy address belongs to the physical page:
page:0000000056d1e95e refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888485950700 pfn:0x485950
flags: 0x57ffffc0000800(slab|node=1|zone=2|lastcpupid=0x1fffff)
page_type: 0xffffffff()
raw: 0057ffffc0000800 ffff88810004c640 ffffea00121b8ac0 dead000000000006
raw: ffff888485950700 0000000000200019 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888485950780: fa fb fb
---truncated---
Max CVSS
5.5
EPSS Score
0.04%
Published
2024-04-04
Updated
2024-04-04
In the Linux kernel, the following vulnerability has been resolved:
powerpc/pseries/iommu: IOMMU table is not initialized for kdump over SR-IOV
When kdump kernel tries to copy dump data over SR-IOV, LPAR panics due
to NULL pointer exception:
Kernel attempted to read user page (0) - exploit attempt? (uid: 0)
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc000000020847ad4
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: mlx5_core(+) vmx_crypto pseries_wdt papr_scm libnvdimm mlxfw tls psample sunrpc fuse overlay squashfs loop
CPU: 12 PID: 315 Comm: systemd-udevd Not tainted 6.4.0-Test102+ #12
Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_008) hv:phyp pSeries
NIP: c000000020847ad4 LR: c00000002083b2dc CTR: 00000000006cd18c
REGS: c000000029162ca0 TRAP: 0300 Not tainted (6.4.0-Test102+)
MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48288244 XER: 00000008
CFAR: c00000002083b2d8 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 1
...
NIP _find_next_zero_bit+0x24/0x110
LR bitmap_find_next_zero_area_off+0x5c/0xe0
Call Trace:
dev_printk_emit+0x38/0x48 (unreliable)
iommu_area_alloc+0xc4/0x180
iommu_range_alloc+0x1e8/0x580
iommu_alloc+0x60/0x130
iommu_alloc_coherent+0x158/0x2b0
dma_iommu_alloc_coherent+0x3c/0x50
dma_alloc_attrs+0x170/0x1f0
mlx5_cmd_init+0xc0/0x760 [mlx5_core]
mlx5_function_setup+0xf0/0x510 [mlx5_core]
mlx5_init_one+0x84/0x210 [mlx5_core]
probe_one+0x118/0x2c0 [mlx5_core]
local_pci_probe+0x68/0x110
pci_call_probe+0x68/0x200
pci_device_probe+0xbc/0x1a0
really_probe+0x104/0x540
__driver_probe_device+0xb4/0x230
driver_probe_device+0x54/0x130
__driver_attach+0x158/0x2b0
bus_for_each_dev+0xa8/0x130
driver_attach+0x34/0x50
bus_add_driver+0x16c/0x300
driver_register+0xa4/0x1b0
__pci_register_driver+0x68/0x80
mlx5_init+0xb8/0x100 [mlx5_core]
do_one_initcall+0x60/0x300
do_init_module+0x7c/0x2b0
At the time of LPAR dump, before kexec hands over control to kdump
kernel, DDWs (Dynamic DMA Windows) are scanned and added to the FDT.
For the SR-IOV case, default DMA window "ibm,dma-window" is removed from
the FDT and DDW added, for the device.
Now, kexec hands over control to the kdump kernel.
When the kdump kernel initializes, PCI busses are scanned and IOMMU
group/tables created, in pci_dma_bus_setup_pSeriesLP(). For the SR-IOV
case, there is no "ibm,dma-window". The original commit: b1fc44eaa9ba,
fixes the path where memory is pre-mapped (direct mapped) to the DDW.
When TCEs are direct mapped, there is no need to initialize IOMMU
tables.
iommu_table_setparms_lpar() only considers "ibm,dma-window" property
when initiallizing IOMMU table. In the scenario where TCEs are
dynamically allocated for SR-IOV, newly created IOMMU table is not
initialized. Later, when the device driver tries to enter TCEs for the
SR-IOV device, NULL pointer execption is thrown from iommu_area_alloc().
The fix is to initialize the IOMMU table with DDW property stored in the
FDT. There are 2 points to remember:
1. For the dedicated adapter, kdump kernel would encounter both
default and DDW in FDT. In this case, DDW property is used to
initialize the IOMMU table.
2. A DDW could be direct or dynamic mapped. kdump kernel would
initialize IOMMU table and mark the existing DDW as
"dynamic". This works fine since, at the time of table
initialization, iommu_table_clear() makes some space in the
DDW, for some predefined number of TCEs which are needed for
kdump to succeed.
Max CVSS
5.5
EPSS Score
0.05%
Published
2024-04-04
Updated
2024-04-04