Sunday, April 30, 2023

Re: OpenBSD 7.2 on Oracle Cloud

Hi,

What QEMU version are you using? I cannot reproduce this with QEMU 7.2.
Can you try with a newer QEMU?

Cheers,
Stefan

On 25.04.23 at 14:53, Aaron Mason wrote:
>>>> Yeah I'm getting the same thing. Trying a build in QEMU and
>>>> transferring in to see if that helps. Will report back.
>>>>
>>>
>>> Ok, good news, it still crashes at the same spot, but this time I've
>>> got more data. Copying in tech@ - if I've forgotten anything let me
>>> know and I'll fire up a fresh instance.
>>>
>>> [REDACTED]
>>> vioscsi_req_done(e,ffff800000024a00,fffffd803f81c338,e,ffff800000024a00,ffff8000000d3228) at vioscsi_req_done+0x26
>>> [REDACTED]
>>
>> Ok, so based on the trace I got, I was able to trace the stop itself
>> back to line 299 of vioscsi.c (thank. you. random relink. And
>> anonymous CVS):
>>
>> 293 vioscsi_req_done(struct vioscsi_softc *sc, struct virtio_softc *vsc,
>> 294 struct vioscsi_req *vr)
>> 295 {
>> 296 struct scsi_xfer *xs = vr->vr_xs;
>> 297 DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);
>> 298
>> -->299 int isread = !!(xs->flags & SCSI_DATA_IN);
>> 300 bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
>> 301 offsetof(struct vioscsi_req, vr_req),
>> 302 sizeof(struct virtio_scsi_req_hdr),
>> 303 BUS_DMASYNC_POSTWRITE);
>>
>> Maybe if I follow the rabbit hole enough, I might find out what's
>> going wrong between the driver and OCI. I've got a day off tomorrow
>> (yay for war I guess), I'll give it a bash and see where we end up.
>>
>> --
>> Aaron Mason - Programmer, open source addict
>> I've taken my software vows - for beta or for worse
>
> I enabled debugging on the vioscsi driver, rebuilt the RAMDISK kernel
> with those drivers enabled, and got this:
>
> vioscsi0 at virtio1: qsize 128
> scsibus0 at vioscsi0: 255 targets
> vioscsi_req_get: 0xfffffd803f80d338
> vioscsi_scsi_cmd: enter
> vioscsi_scsi_cmd: polling...
> vioscsi_scsi_cmd: polling timeout
> vioscsi_scsi_cmd: done (timeout=0)
> vioscsi_scsi_cmd: enter
> vioscsi_scsi_cmd: polling...
> vioscsi_vq_done: enter
> vioscsi_vq_done: slot=127
> vioscsi_req_done: enter vr: 0xfffffd803f80d338 xs: 0xfffffd803f8a5e58
> vioscsi_req_done: done 0, 2, 0
> vioscsi_vq_done: slot=127
> vioscsi_req_done: enter vr: 0xfffffd803f80d338 xs: 0x0
> uvm_fault(0xffffffff813ec2e0, 0x8, 0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip ffffffff810e6190 cs 8 rflags 10286 cr2 8 cpl e
> rsp ffffffff81606670
> gsbase 0xffffffff813dfff0 kgsbase 0x0
> panic: trap type 6, code=0, pc=ffffffff810e6190
>
> That "xs: 0x0" bit feels like a clue. It should be trivial to pick up
> and handle, but what would be the correct way to handle that?
>
> If I have it return if "xs" is found to be NULL, it continues - the
> debugging suggests it goes through each possible target before
> finishing up. I don't know if that's correct, but it seems to continue
> booting after that even if my example didn't detect the drive with the
> kernel I built (I used the RAMDISK kernel and it was pretty stripped
> down).
>
> I'm about to attempt a -STABLE build (I've got 7.3 installed and thus
> can't yet build a snapshot, but I will do that if this test succeeds)
> - here's the patch that hopefully fixes the problem. (and hopefully
> gmail doesn't clobber the tabs)
>
> Index: sys/dev/pv/vioscsi.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pv/vioscsi.c,v
> retrieving revision 1.30
> diff -u -p -u -p -r1.30 vioscsi.c
> --- sys/dev/pv/vioscsi.c 16 Apr 2022 19:19:59 -0000 1.30
> +++ sys/dev/pv/vioscsi.c 25 Apr 2023 12:51:16 -0000
> @@ -296,6 +296,7 @@ vioscsi_req_done(struct vioscsi_softc *s
> struct scsi_xfer *xs = vr->vr_xs;
> DPRINTF("vioscsi_req_done: enter vr: %p xs: %p\n", vr, xs);
>
> + if (xs == NULL) return;
> int isread = !!(xs->flags & SCSI_DATA_IN);
> bus_dmamap_sync(vsc->sc_dmat, vr->vr_control,
> offsetof(struct vioscsi_req, vr_req),
>
>
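
For readers following the debug log above: the same slot (127) is reported twice, and the second time the request no longer has a transfer attached. The sketch below is a user-space simulation of that sequence, not the OpenBSD driver. Every name in it (sim_xfer, sim_req, sim_req_done, sim_vq_done, SIM_DATA_IN) is hypothetical, and it assumes the request's xs pointer is cleared once a completion finishes, which is what the "xs: 0x0" line suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's structures; not the real OpenBSD types. */
struct sim_xfer {
	int flags;
	int done;	/* set once the transfer has been completed */
};

struct sim_req {
	struct sim_xfer *vr_xs;	/* NULL when no transfer is attached */
};

#define SIM_DATA_IN 0x1

/*
 * Complete one request. Returns 1 if a transfer was completed,
 * 0 if the request had no transfer attached (a stale completion).
 */
int
sim_req_done(struct sim_req *vr)
{
	struct sim_xfer *xs;
	int isread;

	assert(vr != NULL);
	xs = vr->vr_xs;
	if (xs == NULL) {
		/* Same slot reported twice: nothing left to complete. */
		printf("sim_req_done: stale completion ignored\n");
		return 0;
	}

	isread = !!(xs->flags & SIM_DATA_IN);
	printf("sim_req_done: completing xs (isread=%d)\n", isread);

	/* Detach the transfer so a repeated report can be detected. */
	vr->vr_xs = NULL;
	xs->done = 1;
	return 1;
}

/* Simulate the device reporting the same slot `reports` times. */
int
sim_vq_done(struct sim_req *vr, int reports)
{
	int completed = 0;

	for (int i = 0; i < reports; i++)
		completed += sim_req_done(vr);
	return completed;
}
```

Calling sim_vq_done() with two reports for one request completes the transfer once and ignores the repeat instead of dereferencing NULL. A guard like this only hides the symptom; it does not explain why the slot is reported twice in the first place.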
