Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM-compiled CheriBSD kernel experiences alignment fault: adjacent but unsuitably aligned 'sw' instructions merged? #309

Closed
rwatson opened this issue Mar 7, 2016 · 5 comments

Comments

@rwatson
Copy link
Member

rwatson commented Mar 7, 2016

When compiling a CHERI_DE4_USBROOT kernel for CHERI, I experience an alignment fault in which the p_sigqueue member of struct proc appears to have insufficient alignment for 64-bit pointers:

Time=   3049189856080 : ffffffff8032c3e0: 67bdffc0      daddiu  sp,sp,-64  DestReg <- 0x9800000000933240 {0}
Time=   3049189856081 : ffffffff8032c3e4: ffbf0038      sd      ra,56(sp)       []  Address 0x9800000000933278 <- 0xffffffff802c5358 {0}
Time=   3049189856082 : ffffffff8032c3e8: ffbe0030      sd      fp,48(sp)       []  Address 0x9800000000933270 <- 0x9800000000933280 {0}
Time=   3049189856083 : ffffffff8032c3ec: ffbc0028      sd      gp,40(sp)       []  Address 0x9800000000933268 <- 0xffffffff807ed0c0 {0}
Time=   3049189856084 : ffffffff8032c3f0: ffb20020      sd      s2,32(sp)       []  Address 0x9800000000933260 <- 0x0000000000000000 {0}
Time=   3049189856085 : ffffffff8032c3f4: ffb10018      sd      s1,24(sp)       []  Address 0x9800000000933258 <- 0x9800000001108a80 {0}
Time=   3049189856087 : ffffffff8032c3f8: ffb00010      sd      s0,16(sp)       []  Address 0x9800000000933250 <- 0x9800000001108a80 {0}
Time=   3049189856088 : ffffffff8032c3fc: 03a0f025      or      fp,sp,zr  DestReg <- 0x9800000000933240 {0}
Time=   3049189856125 : ffffffff8032c400: 00a08825      or      s1,a1,zr  DestReg <- 0x980000000110d000 {0}
Time=   3049189856130 : ffffffff8032c404: 00808025      or      s0,a0,zr  DestReg <- 0x9800000001108a80 {0}
Time=   3049189856131 : ffffffff8032c408: 3c01004c      lui     at,0x4c  DestReg <- 0x00000000004c0000 {0}
Time=   3049189856132 : ffffffff8032c40c: 0039082d      daddu   at,at,t9  DestReg <- 0xffffffff807ec3e0 {0}
Time=   3049189856133 : ffffffff8032c410: 64320ce0      daddiu  s2,at,3296  DestReg <- 0xffffffff807ed0c0 {0}
Time=   3049189856134 : ffffffff8032c414: 660401a8      daddiu  a0,s0,424  DestReg <- 0x9800000001108c28 {0}
Time=   3049189856211 : ffffffff8032c418: de5979e0      ld      t9,31200(s2)    []  DestReg <- 0xffffffff803154b0 from Address 0xffffffff807f4aa0 {0}
Time=   3049189856212 : ffffffff8032c41c: 02002825      or      a1,s0,zr  DestReg <- 0x9800000001108a80 {0}
Time=   3049189856214 : ffffffff8032c420: 0320f809      jalr    ra,t9 branch to 0xffffffff803154b0  DestReg <- 0xffffffff8032c428 {0}
Time=   3049189856215 : ffffffff8032c424: 0240e025      or      gp,s2,zr  DestReg <- 0xffffffff807ed0c0 {0}
Time=   3049189855243 : ffffffff803154b0: 67bdffe0      daddiu  sp,sp,-32  DestReg <- 0x9800000000933220 {0}
  CPI 7.3499999999999996
Time=   3049189856271 : ffffffff803154b4: ffbe0018      sd      fp,24(sp)       []  Address 0x9800000000933238 <- 0x9800000000933240 {0}
Time=   3049189856272 : ffffffff803154b8: 03a0f025      or      fp,sp,zr  DestReg <- 0x9800000000933220 {0}
Time=   3049189856273 : ffffffff803154bc: ac800000      sw      zr,0(a0)        []  Address 0x9800000001108c28 <- 0x0000000000000000 {0}
Time=   3049189856274 : ffffffff803154c0: ac800004      sw      zr,4(a0)        []  Address 0x9800000001108c2c <- 0x0000000000000000 {0}
Time=   3049189856275 : ffffffff803154c4: ac800008      sw      zr,8(a0)        []  Address 0x9800000001108c30 <- 0x0000000000000000 {0}
Time=   3049189856276 : ffffffff803154c8: ac80000c      sw      zr,12(a0)       []  Address 0x9800000001108c34 <- 0x0000000000000000 {0}
Time=   3049189856277 : ffffffff803154cc: ac800010      sw      zr,16(a0)       []  Address 0x9800000001108c38 <- 0x0000000000000000 {0}
  Exception Code:0x05(AddrErrStore) Time=   3049189856278 : ffffffff803154d0: fc800014  sd      zr,20(a0)       []  Address 0x9800000001108c3c <- 0x0000000000000000 {0}

Caller, proc_linkup(9):

nested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__  -Wmisvoid
proc_linkup(struct proc *p, struct thread *td)
{
ffffffff8032c3e0:       67bdffc0        daddiu  sp,sp,-64
ffffffff8032c3e4:       ffbf0038        sd      ra,56(sp)
ffffffff8032c3e8:       ffbe0030        sd      s8,48(sp)
ffffffff8032c3ec:       ffbc0028        sd      gp,40(sp)
ffffffff8032c3f0:       ffb20020        sd      s2,32(sp)
ffffffff8032c3f4:       ffb10018        sd      s1,24(sp)
ffffffff8032c3f8:       ffb00010        sd      s0,16(sp)
ffffffff8032c3fc:       03a0f025        move    s8,sp
ffffffff8032c400:       00a08825        move    s1,a1
ffffffff8032c404:       00808025        move    s0,a0
ffffffff8032c408:       3c01004c        lui     at,0x4c
ffffffff8032c40c:       0039082d        daddu   at,at,t9
ffffffff8032c410:       64320ce0        daddiu  s2,at,3296

        sigqueue_init(&p->p_sigqueue, p);
ffffffff8032c414:       660401a8        daddiu  a0,s0,424
ffffffff8032c418:       de5979e0        ld      t9,31200(s2)
ffffffff8032c41c:       02002825        move    a1,s0
ffffffff8032c420:       0320f809        jalr    t9
ffffffff8032c424:       0240e025        move    gp,s2

And callee, sigqueue_init(9):

void
sigqueue_init(sigqueue_t *list, struct proc *p)
{
ffffffff803154b0:       67bdffe0        daddiu  sp,sp,-32
ffffffff803154b4:       ffbe0018        sd      s8,24(sp)
ffffffff803154b8:       03a0f025        move    s8,sp
        SIGEMPTYSET(list->sq_signals);
ffffffff803154bc:       ac800000        sw      zero,0(a0)
ffffffff803154c0:       ac800004        sw      zero,4(a0)
ffffffff803154c4:       ac800008        sw      zero,8(a0)
ffffffff803154c8:       ac80000c        sw      zero,12(a0)
        SIGEMPTYSET(list->sq_kill);
ffffffff803154cc:       ac800010        sw      zero,16(a0)
ffffffff803154d0:       fc800014        sd      zero,20(a0)           <--- boom here
ffffffff803154d4:       ac80001c        sw      zero,28(a0)
        TAILQ_INIT(&list->sq_list);
ffffffff803154d8:       64810020        daddiu  at,a0,32
ffffffff803154dc:       fc800020        sd      zero,32(a0)
ffffffff803154e0:       fc810028        sd      at,40(a0)
        list->sq_proc = p;
ffffffff803154e4:       fc850030        sd      a1,48(a0)
        list->sq_flags = SQ_INIT;
ffffffff803154e8:       24010001        li      at,1
ffffffff803154ec:       ac810038        sw      at,56(a0)
}
ffffffff803154f0:       03c0e825        move    sp,s8
ffffffff803154f4:       dfbe0018        ld      s8,24(sp)
ffffffff803154f8:       03e00008        jr      ra
ffffffff803154fc:       67bd0020        daddiu  sp,sp,32

ffffffff80315500 <sigqueue_take>:
        return (signo);
}

The process pointer p ($a1 at 0xffffffff803154b0) is properly aligned with value 0x9800000001108a80. Its member p_sigqueue has

The p_sigqueue field has type sigqueue_t:

typedef struct sigqueue {
        sigset_t        sq_signals;     /* All pending signals. */
        sigset_t        sq_kill;        /* Legacy depth 1 queue. */ 
        TAILQ_HEAD(, ksiginfo)  sq_list;/* Queued signal info. */  
        struct proc     *sq_proc;
        int             sq_flags;
} sigqueue_t;

sigset_t is four uint32_ts in a row:

typedef struct __sigset {
        __uint32_t __bits[_SIG_WORDS];
} __sigset_t;

It appears that two of the adjacent (but not 64-bit aligned) 32-bit stores have been coalesced into a single store double?

The build command line is:

[rnw24@vica ~/obj/mips.mips64/home/rnw24/git/cheribsd/sys/CHERI_DE4_USBROOT]$ /home/rnw24/sdk/sdk/bin/clang -c -O -pipe  -g -nostdinc  -I. -I/home/rnw24/git/cheribsd/sys -I/home/rnw24/git/cheribsd/sys/contrib/libfdt -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h  -fno-pic -mno-abicalls -G0 -DKERNLOADADDR=0xffffffff80100000 -march=mips64 -mabi=64 -MD -MP -MF.depend.kern_sig.o -MTkern_sig.o -msoft-float -ffreestanding -fwrapv -gdwarf-2 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -D__printf__=__freebsd_kprintf__  -Wmissing-include-dirs -fdiagnostics-show-option  -Wno-unknown-pragmas  -Wno-error-tautological-compare -Wno-error-empty-body  -Wno-error-parentheses-equality -Wno-error-unused-function  -Wno-error-pointer-sign -Wno-error-shift-negative-value    -std=iso9899:1999 -Werror  /home/rnw24/git/cheribsd/sys/kern/kern_sig.c

Preprocessed kern_sig.c: https://gist.github.com/rwatson/269dc1d9621e6530af6c

LLVM version: 388f6926b8f9bb0557c65b74badb8a34734f13dc
Clang version: 473591c52d2160071616e8574dc80305abfdda52

@davidchisnall
Copy link
Member

This is expected, I believe. We have told LLVM that CHERI supports unaligned loads and stores, so it should emit the code assuming that this is the case and, if the CPU doesn't support it (e.g. if the values span cache lines) then the fault handler should fix it up.

@rwatson
Copy link
Member Author

rwatson commented Mar 8, 2016

That's tricky to do in kernel C code, as the exception handler is implemented in C!

More generally: something a bit odd is going on here -- the zeroing of the prior instance of the structure is not similarly getting coalesced into a larger store just a few instructions earlier, and the kernel boots and runs for a BERI MIPS configuration fine as well -- it's when we configure a CHERI kernel that it goes wrong using LLVM. Perhaps we're just unlucky as memory allocation / layout has changed slightly between the two.

@davidchisnall
Copy link
Member

Presumably when targeting BERI, we are telling the compiler that it's a generic MIPS III CPU, whereas when it's targeting CHERI we are telling the compiler that it's a CHERI CPU. The former is defined in LLVM not to support unaligned loads and stores, the latter to support them. Given that we have no clwr / clwr instructions, we end up generating very bad code if we don't expect that CHERI supports unaligned access (and there's no mechanism for saying that we only support unaligned access via some pointer types).

@davidchisnall
Copy link
Member

It is now possible for targets to support unaligned access for a subset of address spaces. Would it help if clang emitted lwl / lwr and friends for accesses via integer pointers, but assumed that all capability loads and stores always worked?

@arichardson arichardson transferred this issue from CTSRD-CHERI/llvm Mar 13, 2019
@arichardson
Copy link
Member

arichardson commented Feb 20, 2024

Seems to be fixed now, MIPS uses sdl/sdr and CHERI-MIPS two csw instructions.

https://cheri-compiler-explorer.cl.cam.ac.uk/z/j7KoK1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants