| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
mm, swap: fix potential UAF issue for VMA readahead
Since commit 78524b05f1a3 ("mm, swap: avoid redundant swap device
pinning"), the common helper for allocating and preparing a folio in the
swap cache layer no longer tries to get a swap device reference
internally, because all callers of __read_swap_cache_async are already
holding a swap entry reference. The repeated swap device pinning isn't
needed on the same swap device.
Caller of VMA readahead is also holding a reference to the target entry's
swap device, but VMA readahead walks the page table, so it might encounter
swap entries from other devices, and call __read_swap_cache_async on
another device without holding a reference to it.
So it is possible to cause a UAF when swapoff of device A raced with
swapin on device B, and VMA readahead tries to read swap entries from
device A. It's not easy to trigger, but in theory, it could cause real
issues.
Make VMA readahead try to get the device reference first if the swap
device is a different one from the target entry. |
| In the Linux kernel, the following vulnerability has been resolved:
exfat: fix improper check of dentry.stream.valid_size
We found an infinite loop bug in the exFAT file system that can lead to a
Denial-of-Service (DoS) condition. When a dentry in an exFAT filesystem is
malformed, the following system calls — SYS_openat, SYS_ftruncate, and
SYS_pwrite64 — can cause the kernel to hang.
Root cause analysis shows that the size validation code in exfat_find()
does not check whether dentry.stream.valid_size is negative. As a result,
the system calls mentioned above can succeed and eventually trigger the DoS
issue.
This patch adds a check for negative dentry.stream.valid_size to prevent
this vulnerability. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/secretmem: fix use-after-free race in fault handler
When a page fault occurs in a secret memory file created with
`memfd_secret(2)`, the kernel will allocate a new folio for it, mark the
underlying page as not-present in the direct map, and add it to the file
mapping.
If two tasks cause a fault in the same page concurrently, both could end
up allocating a folio and removing the page from the direct map, but only
one would succeed in adding the folio to the file mapping. The task that
failed undoes the effects of its attempt by (a) freeing the folio again
and (b) putting the page back into the direct map. However, by doing
these two operations in this order, the page becomes available to the
allocator again before it is placed back in the direct mapping.
If another task attempts to allocate the page between (a) and (b), and the
kernel tries to access it via the direct map, it would result in a
supervisor not-present page fault.
Fix the ordering to restore the direct map before the folio is freed. |
| In the Linux kernel, the following vulnerability has been resolved:
Bluetooth: MGMT: cancel mesh send timer when hdev removed
mesh_send_done timer is not canceled when hdev is removed, which causes
crash if the timer triggers after hdev is gone.
Cancel the timer when MGMT removes the hdev, like other MGMT timers.
Should fix the BUG: sporadically seen by BlueZ test bot
(in "Mesh - Send cancel - 1" test).
Log:
------
BUG: KASAN: slab-use-after-free in run_timer_softirq+0x76b/0x7d0
...
Freed by task 36:
kasan_save_stack+0x24/0x50
kasan_save_track+0x14/0x30
__kasan_save_free_info+0x3a/0x60
__kasan_slab_free+0x43/0x70
kfree+0x103/0x500
device_release+0x9a/0x210
kobject_put+0x100/0x1e0
vhci_release+0x18b/0x240
------ |
| In the Linux kernel, the following vulnerability has been resolved:
f2fs: fix to invalidate dcc->f2fs_issue_discard in error path
Syzbot reports a NULL pointer dereference issue as below:
__refcount_add include/linux/refcount.h:193 [inline]
__refcount_inc include/linux/refcount.h:250 [inline]
refcount_inc include/linux/refcount.h:267 [inline]
get_task_struct include/linux/sched/task.h:110 [inline]
kthread_stop+0x34/0x1c0 kernel/kthread.c:703
f2fs_stop_discard_thread+0x3c/0x5c fs/f2fs/segment.c:1638
kill_f2fs_super+0x5c/0x194 fs/f2fs/super.c:4522
deactivate_locked_super+0x70/0xe8 fs/super.c:332
deactivate_super+0xd0/0xd4 fs/super.c:363
cleanup_mnt+0x1f8/0x234 fs/namespace.c:1186
__cleanup_mnt+0x20/0x30 fs/namespace.c:1193
task_work_run+0xc4/0x14c kernel/task_work.c:177
exit_task_work include/linux/task_work.h:38 [inline]
do_exit+0x26c/0xbe0 kernel/exit.c:795
do_group_exit+0x60/0xe8 kernel/exit.c:925
__do_sys_exit_group kernel/exit.c:936 [inline]
__se_sys_exit_group kernel/exit.c:934 [inline]
__wake_up_parent+0x0/0x40 kernel/exit.c:934
__invoke_syscall arch/arm64/kernel/syscall.c:38 [inline]
invoke_syscall arch/arm64/kernel/syscall.c:52 [inline]
el0_svc_common+0x138/0x220 arch/arm64/kernel/syscall.c:142
do_el0_svc+0x48/0x164 arch/arm64/kernel/syscall.c:206
el0_svc+0x58/0x150 arch/arm64/kernel/entry-common.c:636
el0t_64_sync_handler+0x84/0xf0 arch/arm64/kernel/entry-common.c:654
el0t_64_sync+0x18c/0x190 arch/arm64/kernel/entry.S:581
The root cause of this issue is in error path of f2fs_start_discard_thread(),
it missed to invalidate dcc->f2fs_issue_discard, later kthread_stop() may
access invalid pointer. |
| In the Linux kernel, the following vulnerability has been resolved:
media: dvb-usb: fix memory leak in dvb_usb_adapter_init()
Syzbot reports a memory leak in "dvb_usb_adapter_init()".
The leak is due to not accounting for and freeing current iteration's
adapter->priv in case of an error. Currently if an error occurs,
it will exit before incrementing "num_adapters_initalized",
which is used as a reference counter to free all adap->priv
in "dvb_usb_adapter_exit()". There are multiple error paths that
can exit from before incrementing the counter. Including the
error handling paths for "dvb_usb_adapter_stream_init()",
"dvb_usb_adapter_dvb_init()" and "dvb_usb_adapter_frontend_init()"
within "dvb_usb_adapter_init()".
This means that in case of an error in any of these functions the
current iteration is not accounted for and the current iteration's
adap->priv is not freed.
Fix this by freeing the current iteration's adap->priv in the
"stream_init_err:" label in the error path. The rest of the
(accounted for) adap->priv objects are freed in dvb_usb_adapter_exit()
as expected using the num_adapters_initalized variable.
Syzbot report:
BUG: memory leak
unreferenced object 0xffff8881172f1a00 (size 512):
comm "kworker/0:2", pid 139, jiffies 4294994873 (age 10.960s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
backtrace:
[<ffffffff844af012>] dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:75 [inline]
[<ffffffff844af012>] dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:184 [inline]
[<ffffffff844af012>] dvb_usb_device_init.cold+0x4e5/0x79e drivers/media/usb/dvb-usb/dvb-usb-init.c:308
[<ffffffff830db21d>] dib0700_probe+0x8d/0x1b0 drivers/media/usb/dvb-usb/dib0700_core.c:883
[<ffffffff82d3fdc7>] usb_probe_interface+0x177/0x370 drivers/usb/core/driver.c:396
[<ffffffff8274ab37>] call_driver_probe drivers/base/dd.c:542 [inline]
[<ffffffff8274ab37>] really_probe.part.0+0xe7/0x310 drivers/base/dd.c:621
[<ffffffff8274ae6c>] really_probe drivers/base/dd.c:583 [inline]
[<ffffffff8274ae6c>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:752
[<ffffffff8274af6a>] driver_probe_device+0x2a/0x120 drivers/base/dd.c:782
[<ffffffff8274b786>] __device_attach_driver+0xf6/0x140 drivers/base/dd.c:899
[<ffffffff82747c87>] bus_for_each_drv+0xb7/0x100 drivers/base/bus.c:427
[<ffffffff8274b352>] __device_attach+0x122/0x260 drivers/base/dd.c:970
[<ffffffff827498f6>] bus_probe_device+0xc6/0xe0 drivers/base/bus.c:487
[<ffffffff82745cdb>] device_add+0x5fb/0xdf0 drivers/base/core.c:3405
[<ffffffff82d3d202>] usb_set_configuration+0x8f2/0xb80 drivers/usb/core/message.c:2170
[<ffffffff82d4dbfc>] usb_generic_driver_probe+0x8c/0xc0 drivers/usb/core/generic.c:238
[<ffffffff82d3f49c>] usb_probe_device+0x5c/0x140 drivers/usb/core/driver.c:293
[<ffffffff8274ab37>] call_driver_probe drivers/base/dd.c:542 [inline]
[<ffffffff8274ab37>] really_probe.part.0+0xe7/0x310 drivers/base/dd.c:621
[<ffffffff8274ae6c>] really_probe drivers/base/dd.c:583 [inline]
[<ffffffff8274ae6c>] __driver_probe_device+0x10c/0x1e0 drivers/base/dd.c:752 |
| In the Linux kernel, the following vulnerability has been resolved:
USB: usbtmc: Fix direction for 0-length ioctl control messages
The syzbot fuzzer found a problem in the usbtmc driver: When a user
submits an ioctl for a 0-length control transfer, the driver does not
check that the direction is set to OUT:
------------[ cut here ]------------
usb 3-1: BOGUS control dir, pipe 80000b80 doesn't match bRequestType fd
WARNING: CPU: 0 PID: 5100 at drivers/usb/core/urb.c:411 usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411
Modules linked in:
CPU: 0 PID: 5100 Comm: syz-executor428 Not tainted 6.3.0-syzkaller-12049-g58390c8ce1bd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
RIP: 0010:usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411
Code: 7c 24 40 e8 1b 13 5c fb 48 8b 7c 24 40 e8 21 1d f0 fe 45 89 e8 44 89 f1 4c 89 e2 48 89 c6 48 c7 c7 e0 b5 fc 8a e8 19 c8 23 fb <0f> 0b e9 9f ee ff ff e8 ed 12 5c fb 0f b6 1d 12 8a 3c 08 31 ff 41
RSP: 0018:ffffc90003d2fb00 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffff8880789e9058 RCX: 0000000000000000
RDX: ffff888029593b80 RSI: ffffffff814c1447 RDI: 0000000000000001
RBP: ffff88801ea742f8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88802915e528
R13: 00000000000000fd R14: 0000000080000b80 R15: ffff8880222b3100
FS: 0000555556ca63c0(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f9ef4d18150 CR3: 0000000073e5b000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
usb_start_wait_urb+0x101/0x4b0 drivers/usb/core/message.c:58
usb_internal_control_msg drivers/usb/core/message.c:102 [inline]
usb_control_msg+0x320/0x4a0 drivers/usb/core/message.c:153
usbtmc_ioctl_request drivers/usb/class/usbtmc.c:1954 [inline]
usbtmc_ioctl+0x1b3d/0x2840 drivers/usb/class/usbtmc.c:2097
To fix this, we must override the direction in the bRequestType field
of the control request structure when the length is 0. |
| In the Linux kernel, the following vulnerability has been resolved:
spi: atmel-quadspi: Free resources even if runtime resume failed in .remove()
An early error exit in atmel_qspi_remove() doesn't prevent the device
unbind. So this results in an spi controller with an unbound parent
and unmapped register space (because devm_ioremap_resource() is undone).
So using the remaining spi controller probably results in an oops.
Instead unregister the controller unconditionally and only skip hardware
access and clk disable.
Also add a warning about resume failing and return zero unconditionally.
The latter has the only effect to suppress a less helpful error message by
the spi core. |
| In the Linux kernel, the following vulnerability has been resolved:
regmap-irq: Fix out-of-bounds access when allocating config buffers
When allocating the 2D array for handling IRQ type registers in
regmap_add_irq_chip_fwnode(), the intent is to allocate a matrix
with num_config_bases rows and num_config_regs columns.
This is currently handled by allocating a buffer to hold a pointer for
each row (i.e. num_config_bases). After that, the logic attempts to
allocate the memory required to hold the register configuration for
each row. However, instead of doing this allocation for each row
(i.e. num_config_bases allocations), the logic erroneously does this
allocation num_config_regs number of times.
This scenario can lead to out-of-bounds accesses when num_config_regs
is greater than num_config_bases. Fix this by updating the terminating
condition of the loop that allocates the memory for holding the register
configuration to allocate memory only for each row in the matrix.
Amit Pundir reported a crash that was occurring on his db845c device
due to memory corruption (see "Closes" tag for Amit's report). The KASAN
report below helped narrow it down to this issue:
[ 14.033877][ T1] ==================================================================
[ 14.042507][ T1] BUG: KASAN: invalid-access in regmap_add_irq_chip_fwnode+0x594/0x1364
[ 14.050796][ T1] Write of size 8 at addr 06ffff8081021850 by task init/1
[ 14.242004][ T1] The buggy address belongs to the object at ffffff8081021850
[ 14.242004][ T1] which belongs to the cache kmalloc-8 of size 8
[ 14.255669][ T1] The buggy address is located 0 bytes inside of
[ 14.255669][ T1] 8-byte region [ffffff8081021850, ffffff8081021858) |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm: also call xfrm_state_delete_tunnel at destroy time for states that were never added
In commit b441cf3f8c4b ("xfrm: delete x->tunnel as we delete x"), I
missed the case where state creation fails between full
initialization (->init_state has been called) and being inserted on
the lists.
In this situation, ->init_state has been called, so for IPcomp
tunnels, the fallback tunnel has been created and added onto the
lists, but the user state never gets added, because we fail before
that. The user state doesn't go through __xfrm_state_delete, so we
don't call xfrm_state_delete_tunnel for those states, and we end up
leaking the FB tunnel.
There are several codepaths affected by this: the add/update paths, in
both net/key and xfrm, and the migrate code (xfrm_migrate,
xfrm_state_migrate). A "proper" rollback of the init_state work would
probably be doable in the add/update code, but for migrate it gets
more complicated as multiple states may be involved.
At some point, the new (not-inserted) state will be destroyed, so call
xfrm_state_delete_tunnel during xfrm_state_gc_destroy. Most states
will have their fallback tunnel cleaned up during __xfrm_state_delete,
which solves the issue that b441cf3f8c4b (and other patches before it)
aimed at. All states (including FB tunnels) will be removed from the
lists once xfrm_state_fini has called flush_work(&xfrm_state_gc_work). |
| In the Linux kernel, the following vulnerability has been resolved:
platform/x86: int3472: Fix double free of GPIO device during unregister
regulator_unregister() already frees the associated GPIO device. On
ThinkPad X9 (Lunar Lake), this causes a double free issue that leads to
random failures when other drivers (typically Intel THC) attempt to
allocate interrupts. The root cause is that the reference count of the
pinctrl_intel_platform module unexpectedly drops to zero when this
driver defers its probe.
This behavior can also be reproduced by unloading the module directly.
Fix the issue by removing the redundant release of the GPIO device
during regulator unregistration. |
| In the Linux kernel, the following vulnerability has been resolved:
fbdev: Add bounds checking in bit_putcs to fix vmalloc-out-of-bounds
Add bounds checking to prevent writes past framebuffer boundaries when
rendering text near screen edges. Return early if the Y position is off-screen
and clip image height to screen boundary. Break from the rendering loop if the
X position is off-screen. When clipping image width to fit the screen, update
the character count to match the clipped width to prevent buffer size
mismatches.
Without the character count update, bit_putcs_aligned and bit_putcs_unaligned
receive mismatched parameters where the buffer is allocated for the clipped
width but cnt reflects the original larger count, causing out-of-bounds writes. |
| In the Linux kernel, the following vulnerability has been resolved:
NFSD: Define actions for the new time_deleg FATTR4 attributes
NFSv4 clients won't send legitimate GETATTR requests for these new
attributes because they are intended to be used only with CB_GETATTR
and SETATTR. But NFSD has to do something besides crashing if it
ever sees a GETATTR request that queries these attributes.
RFC 8881 Section 18.7.3 states:
> The server MUST return a value for each attribute that the client
> requests if the attribute is supported by the server for the
> target file system. If the server does not support a particular
> attribute on the target file system, then it MUST NOT return the
> attribute value and MUST NOT set the attribute bit in the result
> bitmap. The server MUST return an error if it supports an
> attribute on the target but cannot obtain its value. In that case,
> no attribute values will be returned.
Further, RFC 9754 Section 5 states:
> These new attributes are invalid to be used with GETATTR, VERIFY,
> and NVERIFY, and they can only be used with CB_GETATTR and SETATTR
> by a client holding an appropriate delegation.
Thus there does not appear to be a specific server response mandated
by specification. Taking the guidance that querying these attributes
via GETATTR is "invalid", NFSD will return nfserr_inval, failing the
request entirely. |
| In the Linux kernel, the following vulnerability has been resolved:
comedi: check device's attached status in compat ioctls
Syzbot identified an issue [1] that crashes kernel, seemingly due to
unexistent callback dev->get_valid_routes(). By all means, this should
not occur as said callback must always be set to
get_zero_valid_routes() in __comedi_device_postconfig().
As the crash seems to appear exclusively in i386 kernels, at least,
judging from [1] reports, the blame lies with compat versions
of standard IOCTL handlers. Several of them are modified and
do not use comedi_unlocked_ioctl(). While functionality of these
ioctls essentially copy their original versions, they do not
have required sanity check for device's attached status. This,
in turn, leads to a possibility of calling select IOCTLs on a
device that has not been properly setup, even via COMEDI_DEVCONFIG.
Doing so on unconfigured devices means that several crucial steps
are missed, for instance, specifying dev->get_valid_routes()
callback.
Fix this somewhat crudely by ensuring device's attached status before
performing any ioctls, improving logic consistency between modern
and compat functions.
[1] Syzbot report:
BUG: kernel NULL pointer dereference, address: 0000000000000000
...
CR2: ffffffffffffffd6 CR3: 000000006c717000 CR4: 0000000000352ef0
Call Trace:
<TASK>
get_valid_routes drivers/comedi/comedi_fops.c:1322 [inline]
parse_insn+0x78c/0x1970 drivers/comedi/comedi_fops.c:1401
do_insnlist_ioctl+0x272/0x700 drivers/comedi/comedi_fops.c:1594
compat_insnlist drivers/comedi/comedi_fops.c:3208 [inline]
comedi_compat_ioctl+0x810/0x990 drivers/comedi/comedi_fops.c:3273
__do_compat_sys_ioctl fs/ioctl.c:695 [inline]
__se_compat_sys_ioctl fs/ioctl.c:638 [inline]
__ia32_compat_sys_ioctl+0x242/0x370 fs/ioctl.c:638
do_syscall_32_irqs_on arch/x86/entry/syscall_32.c:83 [inline]
... |
| In the Linux kernel, the following vulnerability has been resolved:
net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop
In cake_drop(), qdisc_tree_reduce_backlog() is used to update the qlen
and backlog of the qdisc hierarchy. Its caller, cake_enqueue(), assumes
that the parent qdisc will enqueue the current packet. However, this
assumption breaks when cake_enqueue() returns NET_XMIT_CN: the parent
qdisc stops enqueuing current packet, leaving the tree qlen/backlog
accounting inconsistent. This mismatch can lead to a NULL dereference
(e.g., when the parent Qdisc is qfq_qdisc).
This patch computes the qlen/backlog delta in a more robust way by
observing the difference before and after the series of cake_drop()
calls, and then compensates the qdisc tree accounting if cake_enqueue()
returns NET_XMIT_CN.
To ensure correct compensation when ACK thinning is enabled, a new
variable is introduced to keep qlen unchanged. |
| In the Linux kernel, the following vulnerability has been resolved:
io_uring: fix regbuf vector size truncation
There is a report of io_estimate_bvec_size() truncating the calculated
number of segments that leads to corruption issues. Check it doesn't
overflow "int"s used later. Rough but simple, can be improved on top. |
| In the Linux kernel, the following vulnerability has been resolved:
gve: Implement gettimex64 with -EOPNOTSUPP
gve implemented a ptp_clock for sole use of do_aux_work at this time.
ptp_clock_gettime() and ptp_sys_offset() assume every ptp_clock has
implemented either gettimex64 or gettime64. Stub gettimex64 and return
-EOPNOTSUPP to prevent NULL dereferencing. |
| In the Linux kernel, the following vulnerability has been resolved:
nfsd: fix refcount leak in nfsd_set_fh_dentry()
nfsd exports a "pseudo root filesystem" which is used by NFSv4 to find
the various exported filesystems using LOOKUP requests from a known root
filehandle. NFSv3 uses the MOUNT protocol to find those exported
filesystems and so is not given access to the pseudo root filesystem.
If a v3 (or v2) client uses a filehandle from that filesystem,
nfsd_set_fh_dentry() will report an error, but still stores the export
in "struct svc_fh" even though it also drops the reference (exp_put()).
This means that when fh_put() is called an extra reference will be dropped
which can lead to use-after-free and possible denial of service.
Normal NFS usage will not provide a pseudo-root filehandle to a v3
client. This bug can only be triggered by the client synthesising an
incorrect filehandle.
To fix this we move the assignments to the svc_fh later, after all
possible error cases have been detected. |
| In the Linux kernel, the following vulnerability has been resolved:
usb: uas: fix urb unmapping issue when the uas device is remove during ongoing data transfer
When a UAS device is unplugged during data transfer, there is
a probability of a system panic occurring. The root cause is
an access to an invalid memory address during URB callback handling.
Specifically, this happens when the dma_direct_unmap_sg() function
is called within the usb_hcd_unmap_urb_for_dma() interface, but the
sg->dma_address field is 0 and the sg data structure has already been
freed.
The SCSI driver sends transfer commands by invoking uas_queuecommand_lck()
in uas.c, using the uas_submit_urbs() function to submit requests to USB.
Within the uas_submit_urbs() implementation, three URBs (sense_urb,
data_urb, and cmd_urb) are sequentially submitted. Device removal may
occur at any point during uas_submit_urbs execution, which may result
in URB submission failure. However, some URBs might have been successfully
submitted before the failure, and uas_submit_urbs will return the -ENODEV
error code in this case. The current error handling directly calls
scsi_done(). In the SCSI driver, this eventually triggers scsi_complete()
to invoke scsi_end_request() for releasing the sgtable. The successfully
submitted URBs, when being unlinked to giveback, call
usb_hcd_unmap_urb_for_dma() in hcd.c, leading to exceptions during sg
unmapping operations since the sg data structure has already been freed.
This patch modifies the error condition check in the uas_submit_urbs()
function. When a UAS device is removed but one or more URBs have already
been successfully submitted to USB, it avoids immediately invoking
scsi_done() and save the cmnd to devinfo->cmnd array. If the successfully
submitted URBs is completed before devinfo->resetting being set, then
the scsi_done() function will be called within uas_try_complete() after
all pending URB operations are finalized. Otherwise, the scsi_done()
function will be called within uas_zap_pending(), which is executed after
usb_kill_anchored_urbs().
The error handling only takes effect when uas_queuecommand_lck() calls
uas_submit_urbs() and returns the error value -ENODEV . In this case,
the device is disconnected, and the flow proceeds to uas_disconnect(),
where uas_zap_pending() is invoked to call uas_try_complete(). |
| In the Linux kernel, the following vulnerability has been resolved:
nbd: defer config put in recv_work
There is one uaf issue in recv_work when running NBD_CLEAR_SOCK and
NBD_CMD_RECONFIGURE:
nbd_genl_connect // conf_ref=2 (connect and recv_work A)
nbd_open // conf_ref=3
recv_work A done // conf_ref=2
NBD_CLEAR_SOCK // conf_ref=1
nbd_genl_reconfigure // conf_ref=2 (trigger recv_work B)
close nbd // conf_ref=1
recv_work B
config_put // conf_ref=0
atomic_dec(&config->recv_threads); -> UAF
Or only running NBD_CLEAR_SOCK:
nbd_genl_connect // conf_ref=2
nbd_open // conf_ref=3
NBD_CLEAR_SOCK // conf_ref=2
close nbd
nbd_release
config_put // conf_ref=1
recv_work
config_put // conf_ref=0
atomic_dec(&config->recv_threads); -> UAF
Commit 87aac3a80af5 ("nbd: call nbd_config_put() before notifying the
waiter") moved nbd_config_put() to run before waking up the waiter in
recv_work, in order to ensure that nbd_start_device_ioctl() would not
be woken up while nbd->task_recv was still uncleared.
However, in nbd_start_device_ioctl(), after being woken up it explicitly
calls flush_workqueue() to make sure all current works are finished.
Therefore, there is no need to move the config put ahead of the wakeup.
Move nbd_config_put() to the end of recv_work, so that the reference is
held for the whole lifetime of the worker thread. This makes sure the
config cannot be freed while recv_work is still running, even if clear
+ reconfigure interleave.
In addition, we don't need to worry about recv_work dropping the last
nbd_put (which causes deadlock):
path A (netlink with NBD_CFLAG_DESTROY_ON_DISCONNECT):
connect // nbd_refs=1 (trigger recv_work)
open nbd // nbd_refs=2
NBD_CLEAR_SOCK
close nbd
nbd_release
nbd_disconnect_and_put
flush_workqueue // recv_work done
nbd_config_put
nbd_put // nbd_refs=1
nbd_put // nbd_refs=0
queue_work
path B (netlink without NBD_CFLAG_DESTROY_ON_DISCONNECT):
connect // nbd_refs=2 (trigger recv_work)
open nbd // nbd_refs=3
NBD_CLEAR_SOCK // conf_refs=2
close nbd
nbd_release
nbd_config_put // conf_refs=1
nbd_put // nbd_refs=2
recv_work done // conf_refs=0, nbd_refs=1
rmmod // nbd_refs=0
Depends-on: e2daec488c57 ("nbd: Fix hungtask when nbd_config_put") |