Skip to content

Commit

Permalink
hugetlbfs: fix hugetlb page migration/fault race causing SIGBUS
Browse files Browse the repository at this point in the history
Li Wang discovered that LTP/move_page12 V2 sometimes triggers SIGBUS in
the kernel-v5.2.3 testing.  This is caused by a race between hugetlb
page migration and page fault.

If a hugetlb page can not be allocated to satisfy a page fault, the task
is sent SIGBUS.  This is normal hugetlbfs behavior.  A hugetlb fault
mutex exists to prevent two tasks from trying to instantiate the same
page.  This protects against the situation where there is only one
hugetlb page, and both tasks would try to allocate.  Without the mutex,
one would fail and SIGBUS even though the other fault would be
successful.

There is a similar race between hugetlb page migration and fault.
Migration code will allocate a page for the target of the migration.  It
will then unmap the original page from all page tables.  It does this
unmap by first clearing the pte and then writing a migration entry.  The
page table lock is held for the duration of this clear and write
operation.  However, the beginnings of the hugetlb page fault code
optimistically checks the pte without taking the page table lock.  If
clear (as it can be during the migration unmap operation), a hugetlb
page allocation is attempted to satisfy the fault.  Note that the page
which will eventually satisfy this fault was already allocated by the
migration code.  However, the allocation within the fault path could
fail which would result in the task incorrectly being sent SIGBUS.

Ideally, we could take the hugetlb fault mutex in the migration code
when modifying the page tables.  However, locks must be taken in the
order of hugetlb fault mutex, page lock, page table lock.  This would
require significant rework of the migration code.  Instead, the issue is
addressed in the hugetlb fault code.  After failing to allocate a huge
page, take the page table lock and check for huge_pte_none before
returning an error.  This is the same check that must be made further in
the code even if page allocation is successful.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 290408d ("hugetlb: hugepage migration core")
Signed-off-by: Mike Kravetz <[email protected]>
Reported-by: Li Wang <[email protected]>
Tested-by: Li Wang <[email protected]>
Reviewed-by: Naoya Horiguchi <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: Cyril Hrubis <[email protected]>
Cc: Xishi Qiu <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
  • Loading branch information
mjkravetz authored and torvalds committed Aug 13, 2019
1 parent 28360f3 commit 4643d67
Showing 1 changed file with 19 additions and 0 deletions.
19 changes: 19 additions & 0 deletions mm/hugetlb.c
Original file line number Diff line number Diff line change
Expand Up @@ -3856,6 +3856,25 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,

page = alloc_huge_page(vma, haddr, 0);
if (IS_ERR(page)) {
/*
* Returning error will result in faulting task being
* sent SIGBUS. The hugetlb fault mutex prevents two
* tasks from racing to fault in the same page which
* could result in false unable to allocate errors.
* Page migration does not take the fault mutex, but
* does a clear then write of pte's under page table
* lock. Page fault code could race with migration,
* notice the clear pte and try to allocate a page
* here. Before returning error, get ptl and make
* sure there really is no pte entry.
*/
ptl = huge_pte_lock(h, mm, ptep);
if (!huge_pte_none(huge_ptep_get(ptep))) {
ret = 0;
spin_unlock(ptl);
goto out;
}
spin_unlock(ptl);
ret = vmf_error(PTR_ERR(page));
goto out;
}
Expand Down

0 comments on commit 4643d67

Please sign in to comment.