mm/hugetlb: make walk_hugetlb_range() safe to pmd unshare
Since walk_hugetlb_range() walks the pgtable, it needs the vma lock to
make sure the pgtable page will not be freed concurrently.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peter Xu <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Reviewed-by: John Hubbard <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: James Houghton <[email protected]>
Cc: Jann Horn <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Muchun Song <[email protected]>
Cc: Nadav Amit <[email protected]>
Cc: Rik van Riel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
xzpeter authored and akpm00 committed Jan 19, 2023
1 parent eefc7fa commit dd361e5
Showing 3 changed files with 26 additions and 2 deletions.
11 changes: 10 additions & 1 deletion include/linux/pagewalk.h
@@ -21,7 +21,16 @@ struct mm_walk;
  *			depth is -1 if not known, 0:PGD, 1:P4D, 2:PUD, 3:PMD.
  *			Any folded depths (where PTRS_PER_P?D is equal to 1)
  *			are skipped.
- * @hugetlb_entry:	if set, called for each hugetlb entry
+ * @hugetlb_entry:	if set, called for each hugetlb entry. This hook
+ *			function is called with the vma lock held, in order to
+ *			protect against a concurrent freeing of the pte_t* or
+ *			the ptl. In some cases, the hook function needs to drop
+ *			and retake the vma lock in order to avoid deadlocks
+ *			while calling other functions. In such cases the hook
+ *			function must either refrain from accessing the pte or
+ *			ptl after dropping the vma lock, or else revalidate
+ *			those items after re-acquiring the vma lock and before
+ *			accessing them.
  * @test_walk:		caller specific callback function to determine whether
  *			we walk over the current vma or not. Returning 0 means
  *			"do page table walk over the current vma", returning
15 changes: 14 additions & 1 deletion mm/hmm.c
@@ -492,8 +492,21 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask,
 	required_fault =
 		hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, cpu_flags);
 	if (required_fault) {
+		int ret;
+
 		spin_unlock(ptl);
-		return hmm_vma_fault(addr, end, required_fault, walk);
+		hugetlb_vma_unlock_read(vma);
+		/*
+		 * Avoid deadlock: drop the vma lock before calling
+		 * hmm_vma_fault(), which will itself potentially take and
+		 * drop the vma lock. This is also correct from a
+		 * protection point of view, because there is no further
+		 * use here of either pte or ptl after dropping the vma
+		 * lock.
+		 */
+		ret = hmm_vma_fault(addr, end, required_fault, walk);
+		hugetlb_vma_lock_read(vma);
+		return ret;
 	}

 	pfn = pte_pfn(entry) + ((start & ~hmask) >> PAGE_SHIFT);
2 changes: 2 additions & 0 deletions mm/pagewalk.c
@@ -302,6 +302,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;

+	hugetlb_vma_lock_read(vma);
 	do {
 		next = hugetlb_entry_end(h, addr, end);
 		pte = huge_pte_offset(walk->mm, addr & hmask, sz);
@@ -314,6 +315,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end,
 		if (err)
 			break;
 	} while (addr = next, addr != end);
+	hugetlb_vma_unlock_read(vma);

 	return err;
 }