| CVE |
Vendors |
Products |
Updated |
CVSS v3.1 |
| In the Linux kernel, the following vulnerability has been resolved:
net: stmmac: fix integer underflow in chain mode
The jumbo_frm() chain-mode implementation unconditionally computes
len = nopaged_len - bmax;
where nopaged_len = skb_headlen(skb) (linear bytes only) and bmax is
BUF_SIZE_8KiB or BUF_SIZE_2KiB. However, the caller stmmac_xmit()
decides to invoke jumbo_frm() based on skb->len (total length including
page fragments):
is_jumbo = stmmac_is_jumbo_frm(priv, skb->len, enh_desc);
When a packet has a small linear portion (nopaged_len <= bmax) but a
large total length due to page fragments (skb->len > bmax), the
subtraction wraps as an unsigned integer, producing a huge len value
(~0xFFFFxxxx). This causes the while (len != 0) loop to execute
hundreds of thousands of iterations, passing skb->data + bmax * i
pointers far beyond the skb buffer to dma_map_single(). On IOMMU-less
SoCs (the typical deployment for stmmac), this maps arbitrary kernel
memory to the DMA engine, constituting a kernel memory disclosure and
potential memory corruption from hardware.
Fix this by introducing a buf_len local variable clamped to
min(nopaged_len, bmax). Computing len = nopaged_len - buf_len is then
always safe: it is zero when the linear portion fits within a single
descriptor, causing the while (len != 0) loop to be skipped naturally,
and the fragment loop in stmmac_xmit() handles page fragments afterward. |
| In the Linux kernel, the following vulnerability has been resolved:
LoongArch: Fix missing NULL checks for kstrdup()
1. Replace "of_find_node_by_path("/")" with "of_root" to avoid multiple
calls to "of_node_put()".
2. Fix a potential kernel oops during early boot when memory allocation
fails while parsing CPU model from device tree. |
| In the Linux kernel, the following vulnerability has been resolved:
mm: filemap: fix nr_pages calculation overflow in filemap_map_pages()
When running stress-ng on my Arm64 machine with v7.0-rc3 kernel, I
encountered some very strange crash issues showing up as "Bad page state":
"
[ 734.496287] BUG: Bad page state in process stress-ng-env pfn:415735fb
[ 734.496427] page: refcount:0 mapcount:1 mapping:0000000000000000 index:0x4cf316 pfn:0x415735fb
[ 734.496434] flags: 0x57fffe000000800(owner_2|node=1|zone=2|lastcpupid=0x3ffff)
[ 734.496439] raw: 057fffe000000800 0000000000000000 dead000000000122 0000000000000000
[ 734.496440] raw: 00000000004cf316 0000000000000000 0000000000000000 0000000000000000
[ 734.496442] page dumped because: nonzero mapcount
"
After analyzing this page’s state, it is hard to understand why the
mapcount is not 0 while the refcount is 0, since this page is not where
the issue first occurred. By enabling the CONFIG_DEBUG_VM config, I can
reproduce the crash as well and captured the first warning where the issue
appears:
"
[ 734.469226] page: refcount:33 mapcount:0 mapping:00000000bef2d187 index:0x81a0 pfn:0x415735c0
[ 734.469304] head: order:5 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 734.469315] memcg:ffff000807a8ec00
[ 734.469320] aops:ext4_da_aops ino:100b6f dentry name(?):"stress-ng-mmaptorture-9397-0-2736200540"
[ 734.469335] flags: 0x57fffe400000069(locked|uptodate|lru|head|node=1|zone=2|lastcpupid=0x3ffff)
......
[ 734.469364] page dumped because: VM_WARN_ON_FOLIO((_Generic((page + nr_pages - 1),
const struct page *: (const struct folio *)_compound_head(page + nr_pages - 1), struct page *:
(struct folio *)_compound_head(page + nr_pages - 1))) != folio)
[ 734.469390] ------------[ cut here ]------------
[ 734.469393] WARNING: ./include/linux/rmap.h:351 at folio_add_file_rmap_ptes+0x3b8/0x468,
CPU#90: stress-ng-mlock/9430
[ 734.469551] folio_add_file_rmap_ptes+0x3b8/0x468 (P)
[ 734.469555] set_pte_range+0xd8/0x2f8
[ 734.469566] filemap_map_folio_range+0x190/0x400
[ 734.469579] filemap_map_pages+0x348/0x638
[ 734.469583] do_fault_around+0x140/0x198
......
[ 734.469640] el0t_64_sync+0x184/0x188
"
The code that triggers the warning is: "VM_WARN_ON_FOLIO(page_folio(page +
nr_pages - 1) != folio, folio)", which indicates that set_pte_range()
tried to map beyond the large folio’s size.
By adding more debug information, I found that 'nr_pages' had overflowed
in filemap_map_pages(), causing set_pte_range() to establish mappings for
a range exceeding the folio size, potentially corrupting fields of pages
that do not belong to this folio (e.g., page->_mapcount).
After above analysis, I think the possible race is as follows:
CPU 0 CPU 1
filemap_map_pages() ext4_setattr()
//get and lock folio with old inode->i_size
next_uptodate_folio()
.......
//shrink the inode->i_size
i_size_write(inode, attr->ia_size);
//calculate the end_pgoff with the new inode->i_size
file_end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE) - 1;
end_pgoff = min(end_pgoff, file_end);
......
//nr_pages can be overflowed, cause xas.xa_index > end_pgoff
end = folio_next_index(folio) - 1;
nr_pages = min(end, end_pgoff) - xas.xa_index + 1;
......
//map large folio
filemap_map_folio_range()
......
//truncate folios
truncate_pagecache(inode, inode->i_size);
To fix this issue, move the 'end_pgoff' calculation before
next_uptodate_folio(), so the retrieved folio stays consistent with the
file end to avoid
---truncated--- |
| In the Linux kernel, the following vulnerability has been resolved:
wifi: rt2x00usb: fix devres lifetime
USB drivers bind to USB interfaces and any device managed resources
should have their lifetime tied to the interface rather than parent USB
device. This avoids issues like memory leaks when drivers are unbound
without their devices being physically disconnected (e.g. on probe
deferral or configuration changes).
Fix the USB anchor lifetime so that it is released on driver unbind. |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm_user: fix info leak in build_report()
struct xfrm_user_report is a __u8 proto field followed by a struct
xfrm_selector which means there is three "empty" bytes of padding, but
the padding is never zeroed before copying to userspace. Fix that up by
zeroing the structure before setting individual member variables. |
| In the Linux kernel, the following vulnerability has been resolved:
net: rfkill: prevent unlimited numbers of rfkill events from being created
Userspace can create an unlimited number of rfkill events if the system
is so configured, while not consuming them from the rfkill file
descriptor, causing a potential out of memory situation. Prevent this
from bounding the number of pending rfkill events at a "large" number
(i.e. 1000) to prevent abuses like this. |
| In the Linux kernel, the following vulnerability has been resolved:
mptcp: fix slab-use-after-free in __inet_lookup_established
The ehash table lookups are lockless and rely on
SLAB_TYPESAFE_BY_RCU to guarantee socket memory stability
during RCU read-side critical sections. Both tcp_prot and
tcpv6_prot have their slab caches created with this flag
via proto_register().
However, MPTCP's mptcp_subflow_init() copies tcpv6_prot into
tcpv6_prot_override during inet_init() (fs_initcall, level 5),
before inet6_init() (module_init/device_initcall, level 6) has
called proto_register(&tcpv6_prot). At that point,
tcpv6_prot.slab is still NULL, so tcpv6_prot_override.slab
remains NULL permanently.
This causes MPTCP v6 subflow child sockets to be allocated via
kmalloc (falling into kmalloc-4k) instead of the TCPv6 slab
cache. The kmalloc-4k cache lacks SLAB_TYPESAFE_BY_RCU, so
when these sockets are freed without SOCK_RCU_FREE (which is
cleared for child sockets by design), the memory can be
immediately reused. Concurrent ehash lookups under
rcu_read_lock can then access freed memory, triggering a
slab-use-after-free in __inet_lookup_established.
Fix this by splitting the IPv6-specific initialization out of
mptcp_subflow_init() into a new mptcp_subflow_v6_init(), called
from mptcp_proto_v6_init() before protocol registration. This
ensures tcpv6_prot_override.slab correctly inherits the
SLAB_TYPESAFE_BY_RCU slab cache. |
| In the Linux kernel, the following vulnerability has been resolved:
seg6: separate dst_cache for input and output paths in seg6 lwtunnel
The seg6 lwtunnel uses a single dst_cache per encap route, shared
between seg6_input_core() and seg6_output_core(). These two paths
can perform the post-encap SID lookup in different routing contexts
(e.g., ip rules matching on the ingress interface, or VRF table
separation). Whichever path runs first populates the cache, and the
other reuses it blindly, bypassing its own lookup.
Fix this by splitting the cache into cache_input and cache_output,
so each path maintains its own cached dst independently. |
| In the Linux kernel, the following vulnerability has been resolved:
Input: uinput - fix circular locking dependency with ff-core
A lockdep circular locking dependency warning can be triggered
reproducibly when using a force-feedback gamepad with uinput (for
example, playing ELDEN RING under Wine with a Flydigi Vader 5
controller):
ff->mutex -> udev->mutex -> input_mutex -> dev->mutex -> ff->mutex
The cycle is caused by four lock acquisition paths:
1. ff upload: input_ff_upload() holds ff->mutex and calls
uinput_dev_upload_effect() -> uinput_request_submit() ->
uinput_request_send(), which acquires udev->mutex.
2. device create: uinput_ioctl_handler() holds udev->mutex and calls
uinput_create_device() -> input_register_device(), which acquires
input_mutex.
3. device register: input_register_device() holds input_mutex and
calls kbd_connect() -> input_register_handle(), which acquires
dev->mutex.
4. evdev release: evdev_release() calls input_flush_device() under
dev->mutex, which calls input_ff_flush() acquiring ff->mutex.
Fix this by introducing a new state_lock spinlock to protect
udev->state and udev->dev access in uinput_request_send() instead of
acquiring udev->mutex. The function only needs to atomically check
device state and queue an input event into the ring buffer via
uinput_dev_event() -- both operations are safe under a spinlock
(ktime_get_ts64() and wake_up_interruptible() do not sleep). This
breaks the ff->mutex -> udev->mutex link since a spinlock is a leaf in
the lock ordering and cannot form cycles with mutexes.
To keep state transitions visible to uinput_request_send(), protect
writes to udev->state in uinput_create_device() and
uinput_destroy_device() with the same state_lock spinlock.
Additionally, move init_completion(&request->done) from
uinput_request_send() to uinput_request_submit() before
uinput_request_reserve_slot(). Once the slot is allocated,
uinput_flush_requests() may call complete() on it at any time from
the destroy path, so the completion must be initialised before the
request becomes visible.
Lock ordering after the fix:
ff->mutex -> state_lock (spinlock, leaf)
udev->mutex -> state_lock (spinlock, leaf)
udev->mutex -> input_mutex -> dev->mutex -> ff->mutex (no back-edge) |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix incorrect return value after changing leaf in lookup_extent_data_ref()
After commit 1618aa3c2e01 ("btrfs: simplify return variables in
lookup_extent_data_ref()"), the err and ret variables were merged into
a single ret variable. However, when btrfs_next_leaf() returns 0
(success), ret is overwritten from -ENOENT to 0. If the first key in
the next leaf does not match (different objectid or type), the function
returns 0 instead of -ENOENT, making the caller believe the lookup
succeeded when it did not. This can lead to operations on the wrong
extent tree item, potentially causing extent tree corruption.
Fix this by returning -ENOENT directly when the key does not match,
instead of relying on the ret variable. |
| In the Linux kernel, the following vulnerability has been resolved:
netfilter: nft_ct: fix use-after-free in timeout object destroy
nft_ct_timeout_obj_destroy() frees the timeout object with kfree()
immediately after nf_ct_untimeout(), without waiting for an RCU grace
period. Concurrent packet processing on other CPUs may still hold
RCU-protected references to the timeout object obtained via
rcu_dereference() in nf_ct_timeout_data().
Add an rcu_head to struct nf_ct_timeout and use kfree_rcu() to defer
freeing until after an RCU grace period, matching the approach already
used in nfnetlink_cttimeout.c.
KASAN report:
BUG: KASAN: slab-use-after-free in nf_conntrack_tcp_packet+0x1381/0x29d0
Read of size 4 at addr ffff8881035fe19c by task exploit/80
Call Trace:
nf_conntrack_tcp_packet+0x1381/0x29d0
nf_conntrack_in+0x612/0x8b0
nf_hook_slow+0x70/0x100
__ip_local_out+0x1b2/0x210
tcp_sendmsg_locked+0x722/0x1580
__sys_sendto+0x2d8/0x320
Allocated by task 75:
nft_ct_timeout_obj_init+0xf6/0x290
nft_obj_init+0x107/0x1b0
nf_tables_newobj+0x680/0x9c0
nfnetlink_rcv_batch+0xc29/0xe00
Freed by task 26:
nft_obj_destroy+0x3f/0xa0
nf_tables_trans_destroy_work+0x51c/0x5c0
process_one_work+0x2c4/0x5a0 |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm: clear trailing padding in build_polexpire()
build_expire() clears the trailing padding bytes of struct
xfrm_user_expire after setting the hard field via memset_after(),
but the analogous function build_polexpire() does not do this for
struct xfrm_user_polexpire.
The padding bytes after the __u8 hard field are left
uninitialized from the heap allocation, and are then sent to
userspace via netlink multicast to XFRMNLGRP_EXPIRE listeners,
leaking kernel heap memory contents.
Add the missing memset_after() call, matching build_expire(). |
| In the Linux kernel, the following vulnerability has been resolved:
xfrm: hold dev ref until after transport_finish NF_HOOK
After async crypto completes, xfrm_input_resume() calls dev_put()
immediately on re-entry before the skb reaches transport_finish.
The skb->dev pointer is then used inside NF_HOOK and its okfn,
which can race with device teardown.
Remove the dev_put from the async resumption entry and instead
drop the reference after the NF_HOOK call in transport_finish,
using a saved device pointer since NF_HOOK may consume the skb.
This covers NF_DROP, NF_QUEUE and NF_STOLEN paths that skip
the okfn.
For non-transport exits (decaps, gro, drop) and secondary
async return points, release the reference inline when
async is set. |
| In the Linux kernel, the following vulnerability has been resolved:
btrfs: fix leak of kobject name for sub-group space_info
When create_space_info_sub_group() allocates elements of
space_info->sub_group[], kobject_init_and_add() is called for each
element via btrfs_sysfs_add_space_info_type(). However, when
check_removing_space_info() frees these elements, it does not call
btrfs_sysfs_remove_space_info() on them. As a result, kobject_put() is
not called and the associated kobj->name objects are leaked.
This memory leak is reproduced by running the blktests test case
zbd/009 on kernels built with CONFIG_DEBUG_KMEMLEAK. The kmemleak
feature reports the following error:
unreferenced object 0xffff888112877d40 (size 16):
comm "mount", pid 1244, jiffies 4294996972
hex dump (first 16 bytes):
64 61 74 61 2d 72 65 6c 6f 63 00 c4 c6 a7 cb 7f data-reloc......
backtrace (crc 53ffde4d):
__kmalloc_node_track_caller_noprof+0x619/0x870
kstrdup+0x42/0xc0
kobject_set_name_vargs+0x44/0x110
kobject_init_and_add+0xcf/0x150
btrfs_sysfs_add_space_info_type+0xfc/0x210 [btrfs]
create_space_info_sub_group.constprop.0+0xfb/0x1b0 [btrfs]
create_space_info+0x211/0x320 [btrfs]
btrfs_init_space_info+0x15a/0x1b0 [btrfs]
open_ctree+0x33c7/0x4a50 [btrfs]
btrfs_get_tree.cold+0x9f/0x1ee [btrfs]
vfs_get_tree+0x87/0x2f0
vfs_cmd_create+0xbd/0x280
__do_sys_fsconfig+0x3df/0x990
do_syscall_64+0x136/0x1540
entry_SYSCALL_64_after_hwframe+0x76/0x7e
To avoid the leak, call btrfs_sysfs_remove_space_info() instead of
kfree() for the elements. |
| In the Linux kernel, the following vulnerability has been resolved:
netfs: Fix kernel BUG in netfs_limit_iter() for ITER_KVEC iterators
When a process crashes and the kernel writes a core dump to a 9P
filesystem, __kernel_write() creates an ITER_KVEC iterator. This
iterator reaches netfs_limit_iter() via netfs_unbuffered_write(), which
only handles ITER_FOLIOQ, ITER_BVEC and ITER_XARRAY iterator types,
hitting the BUG() for any other type.
Fix this by adding netfs_limit_kvec() following the same pattern as
netfs_limit_bvec(), since both kvec and bvec are simple segment arrays
with pointer and length fields. Dispatch it from netfs_limit_iter() when
the iterator type is ITER_KVEC. |
| In the Linux kernel, the following vulnerability has been resolved:
dmaengine: idxd: Fix memory leak when a wq is reset
idxd_wq_disable_cleanup() which is called from the reset path for a
workqueue, sets the wq type to NONE, which for other parts of the
driver mean that the wq is empty (all its resources were released).
Only set the wq type to NONE after its resources are released. |
| In the Linux kernel, the following vulnerability has been resolved:
ksmbd: fix use-after-free and NULL deref in smb_grant_oplock()
smb_grant_oplock() has two issues in the oplock publication sequence:
1) opinfo is linked into ci->m_op_list (via opinfo_add) before
add_lease_global_list() is called. If add_lease_global_list()
fails (kmalloc returns NULL), the error path frees the opinfo
via __free_opinfo() while it is still linked in ci->m_op_list.
Concurrent m_op_list readers (opinfo_get_list, or direct iteration
in smb_break_all_levII_oplock) dereference the freed node.
2) opinfo->o_fp is assigned after add_lease_global_list() publishes
the opinfo on the global lease list. A concurrent
find_same_lease_key() can walk the lease list and dereference
opinfo->o_fp->f_ci while o_fp is still NULL.
Fix by restructuring the publication sequence to eliminate post-publish
failure:
- Set opinfo->o_fp before any list publication (fixes NULL deref).
- Preallocate lease_table via alloc_lease_table() before opinfo_add()
so add_lease_global_list() becomes infallible after publication.
- Keep the original m_op_list publication order (opinfo_add before
lease list) so concurrent opens via same_client_has_lease() and
opinfo_get_list() still see the in-flight grant.
- Use opinfo_put() instead of __free_opinfo() on err_out so that
the RCU-deferred free path is used.
This also requires splitting add_lease_global_list() to take a
preallocated lease_table and changing its return type from int to void,
since it can no longer fail. |
| In the Linux kernel, the following vulnerability has been resolved:
mm/damon/core: avoid use of half-online-committed context
One major usage of damon_call() is online DAMON parameters update. It is
done by calling damon_commit_ctx() inside the damon_call() callback
function. damon_commit_ctx() can fail for two reasons: 1) invalid
parameters and 2) internal memory allocation failures. In case of
failures, the damon_ctx that attempted to be updated (commit destination)
can be partially updated (or, corrupted from a perspective), and therefore
shouldn't be used anymore. The function only ensures the damon_ctx object
can safely deallocated using damon_destroy_ctx().
The API callers are, however, calling damon_commit_ctx() only after
asserting the parameters are valid, to avoid damon_commit_ctx() fails due
to invalid input parameters. But it can still theoretically fail if the
internal memory allocation fails. In the case, DAMON may run with the
partially updated damon_ctx. This can result in unexpected behaviors
including even NULL pointer dereference in case of damos_commit_dests()
failure [1]. Such allocation failure is arguably too small to fail, so
the real world impact would be rare. But, given the bad consequence, this
needs to be fixed.
Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
the damon_commit_ctx() failure on the damon_ctx object. For this,
introduce damon_ctx->maybe_corrupted field. damon_commit_ctx() sets it
when it is failed. kdamond_call() checks if the field is set after each
damon_call_control->fn() is executed. If it is set, ignore remaining
callback requests and return. All kdamond_call() callers including
kdamond_fn() also check the maybe_corrupted field right after
kdamond_call() invocations. If the field is set, break the kdamond_fn()
main loop so that DAMON sill doesn't use the context that might be
corrupted.
[sj@kernel.org: let kdamond_call() with cancel regardless of maybe_corrupted] |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: validate p_idx bounds in ext4_ext_correct_indexes
ext4_ext_correct_indexes() walks up the extent tree correcting
index entries when the first extent in a leaf is modified. Before
accessing path[k].p_idx->ei_block, there is no validation that
p_idx falls within the valid range of index entries for that
level.
If the on-disk extent header contains a corrupted or crafted
eh_entries value, p_idx can point past the end of the allocated
buffer, causing a slab-out-of-bounds read.
Fix this by validating path[k].p_idx against EXT_LAST_INDEX() at
both access sites: before the while loop and inside it. Return
-EFSCORRUPTED if the index pointer is out of range, consistent
with how other bounds violations are handled in the ext4 extent
tree code. |
| In the Linux kernel, the following vulnerability has been resolved:
ext4: convert inline data to extents when truncate exceeds inline size
Add a check in ext4_setattr() to convert files from inline data storage
to extent-based storage when truncate() grows the file size beyond the
inline capacity. This prevents the filesystem from entering an
inconsistent state where the inline data flag is set but the file size
exceeds what can be stored inline.
Without this fix, the following sequence causes a kernel BUG_ON():
1. Mount filesystem with inode that has inline flag set and small size
2. truncate(file, 50MB) - grows size but inline flag remains set
3. sendfile() attempts to write data
4. ext4_write_inline_data() hits BUG_ON(write_size > inline_capacity)
The crash occurs because ext4_write_inline_data() expects inline storage
to accommodate the write, but the actual inline capacity (~60 bytes for
i_block + ~96 bytes for xattrs) is far smaller than the file size and
write request.
The fix checks if the new size from setattr exceeds the inode's actual
inline capacity (EXT4_I(inode)->i_inline_size) and converts the file to
extent-based storage before proceeding with the size change.
This addresses the root cause by ensuring the inline data flag and file
size remain consistent during truncate operations. |