Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic in stage 4 in BIOS boot #331

Closed
an-owl opened this issue Jan 21, 2023 · 6 comments · Fixed by #337
Closed

Panic in stage 4 in BIOS boot #331

an-owl opened this issue Jan 21, 2023 · 6 comments · Fixed by #337

Comments

@an-owl
Copy link

an-owl commented Jan 21, 2023

The bootloader panics on line 242 with failed to identity map fame PhysFrame[4kib](20a000): PageAlreadyMapped(PhysFrame[4Kib](0x20a000))

// identity-map context switch function, so that we don't get an immediate pagefault
// after switching the active page table
let context_switch_function = PhysAddr::new(context_switch as *const () as u64);
let context_switch_function_start_frame: PhysFrame =
PhysFrame::containing_address(context_switch_function);
for frame in PhysFrame::range_inclusive(
context_switch_function_start_frame,
context_switch_function_start_frame + 1,
) {
match unsafe {
kernel_page_table.identity_map(frame, PageTableFlags::PRESENT, frame_allocator)
} {
Ok(tlb) => tlb.flush(),
Err(err) => panic!("failed to identity map frame {:?}: {:?}", frame, err),
}
}

I checked the mapping for 0x20b000 and QEMU says its mapped to 0x20b000

A potental fix would be to just ignore the error with an extra match arm Err(MapToError::PageAlreadyMapped(frame)) => {}

@phil-opp
Copy link
Member

Thanks for reporting! Could you give us instructions how to reproduce this error?

@Freax13
Copy link
Member

Freax13 commented Jan 21, 2023

I suspect that this could be caused by a non-PIE kernel that maps memory to the same address we use for the context switch in the bootloader. The error occurs because we load the context switch function into the kernel's address space. We currently need to do this because the address space switch is not the last instruction of the context switch, so the context switch function needs to be mapped in the kernel's address space to complete.

Similar errors could probably be provoked by setting other addresses (e.g. frame buffer, boot info) to this address.

This might be fixed by #240.

A potental fix would be to just ignore the error with an extra match arm Err(MapToError::PageAlreadyMapped(frame)) => {}

No, this wouldn't work, the current implementation expects the context switch function to be mapped in both address spaces. Ignoring the error would mean that the context switch function would jump into rogue kernel memory after doing the address space switch.

@an-owl
Copy link
Author

an-owl commented Jan 21, 2023

After removing "-C","relocation-model=static" and "-C","link-arg=-no-pie" from rustflags the bootloader now boots correctly so it seems that is the likely cause.

A potental fix would be to just ignore the error with an extra match arm Err(MapToError::PageAlreadyMapped(frame)) > > => {}

No, this wouldn't work, the current implementation expects the context switch function to be mapped in both address spaces. Ignoring the error would mean that the context switch function would jump into rogue kernel memory after doing the address space switch.

I was supposed to put a match guard in there. Using gva2gpa on the page that threw the error and the ones before and after it all seem to be identity mapped already.

@phil-opp
Copy link
Member

After removing "-C","relocation-model=static" and "-C","link-arg=-no-pie" from rustflags the bootloader now boots correctly so it seems that is the likely cause.

I just tried to reproduce the error by adding these flags, but it still works fine for a small test kernel. Do you have the project where this happens online somewhere so that we can try to reproduce the error?

@an-owl
Copy link
Author

an-owl commented Jan 27, 2023

Do you have the project where this happens online somewhere so that we can try to reproduce the error?

Yep its here

@phil-opp
Copy link
Member

Thanks a lot! I created a PR to avoid that issue for now at #337.

In the long term, I agree that we need a more general fix for this, such as #240 (which I still haven't had time to review yet unfortunately).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants