linux

mirror of https://github.com/raspberrypi/linux.git synced 2026-01-02 07:43:34 +00:00

Author	SHA1	Message	Date
Suwan Kim	37fafe6b61	virtio-blk: Fix WARN_ON_ONCE in virtio_queue_rq() If a request fails at virtio_queue_rqs(), it is inserted to requeue_list and passed to virtio_queue_rq(). Then blk_mq_start_request() can be called again at virtio_queue_rq() and trigger WARN_ON_ONCE like below trace because request state was already set to MQ_RQ_IN_FLIGHT in virtio_queue_rqs() despite the failure. [ 1.890468] ------------[ cut here ]------------ [ 1.890776] WARNING: CPU: 2 PID: 122 at block/blk-mq.c:1143 blk_mq_start_request+0x8a/0xe0 [ 1.891045] Modules linked in: [ 1.891250] CPU: 2 PID: 122 Comm: journal-offline Not tainted 5.19.0+ #44 [ 1.891504] Hardware name: ChromiumOS crosvm, BIOS 0 [ 1.891739] RIP: 0010:blk_mq_start_request+0x8a/0xe0 [ 1.891961] Code: 12 80 74 22 48 8b 4b 10 8b 89 64 01 00 00 8b 53 20 83 fa ff 75 08 ba 00 00 00 80 0b 53 24 c1 e1 10 09 d1 89 48 34 5b 41 5e c3 <0f> 0b eb b8 65 8b 05 2b 39 b6 7e 89 c0 48 0f a3 05 39 77 5b 01 0f [ 1.892443] RSP: 0018:ffffc900002777b0 EFLAGS: 00010202 [ 1.892673] RAX: 0000000000000000 RBX: ffff888004bc0000 RCX: 0000000000000000 [ 1.892952] RDX: 0000000000000000 RSI: ffff888003d7c200 RDI: ffff888004bc0000 [ 1.893228] RBP: 0000000000000000 R08: 0000000000000001 R09: ffff888004bc0100 [ 1.893506] R10: ffffffffffffffff R11: ffffffff8185ca10 R12: ffff888004bc0000 [ 1.893797] R13: ffffc90000277900 R14: ffff888004ab2340 R15: ffff888003d86e00 [ 1.894060] FS: 00007ffa143a4640(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 [ 1.894412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 1.894682] CR2: 00005648577d9088 CR3: 00000000053da004 CR4: 0000000000170ee0 [ 1.894953] Call Trace: [ 1.895139] <TASK> [ 1.895303] virtblk_prep_rq+0x1e5/0x280 [ 1.895509] virtio_queue_rq+0x5c/0x310 [ 1.895710] ? virtqueue_add_sgs+0x95/0xb0 [ 1.895905] ? _raw_spin_unlock_irqrestore+0x16/0x30 [ 1.896133] ? virtio_queue_rqs+0x340/0x390 [ 1.896453] ? sbitmap_get+0xfa/0x220 [ 1.896678] __blk_mq_issue_directly+0x41/0x180 [ 1.896906] blk_mq_plug_issue_direct+0xd8/0x2c0 [ 1.897115] blk_mq_flush_plug_list+0x115/0x180 [ 1.897342] blk_add_rq_to_plug+0x51/0x130 [ 1.897543] blk_mq_submit_bio+0x3a1/0x570 [ 1.897750] submit_bio_noacct_nocheck+0x418/0x520 [ 1.897985] ? submit_bio_noacct+0x1e/0x260 [ 1.897989] ext4_bio_write_page+0x222/0x420 [ 1.898000] mpage_process_page_bufs+0x178/0x1c0 [ 1.899451] mpage_prepare_extent_to_map+0x2d2/0x440 [ 1.899603] ext4_writepages+0x495/0x1020 [ 1.899733] do_writepages+0xcb/0x220 [ 1.899871] ? __seccomp_filter+0x171/0x7e0 [ 1.900006] file_write_and_wait_range+0xcd/0xf0 [ 1.900167] ext4_sync_file+0x72/0x320 [ 1.900308] __x64_sys_fsync+0x66/0xa0 [ 1.900449] do_syscall_64+0x31/0x50 [ 1.900595] entry_SYSCALL_64_after_hwframe+0x63/0xcd [ 1.900747] RIP: 0033:0x7ffa16ec96ea [ 1.900883] Code: b8 4a 00 00 00 0f 05 48 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 e3 02 f8 ff 8b 7c 24 0c 89 c2 b8 4a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 43 03 f8 ff 8b 44 24 [ 1.901302] RSP: 002b:00007ffa143a3ac0 EFLAGS: 00000293 ORIG_RAX: 000000000000004a [ 1.901499] RAX: ffffffffffffffda RBX: 0000560277ec6fe0 RCX: 00007ffa16ec96ea [ 1.901696] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000016 [ 1.901884] RBP: 0000560277ec5910 R08: 0000000000000000 R09: 00007ffa143a4640 [ 1.902082] R10: 00007ffa16e4d39e R11: 0000000000000293 R12: 00005602773f59e0 [ 1.902459] R13: 0000000000000000 R14: 00007fffbfc007ff R15: 00007ffa13ba4000 [ 1.902763] </TASK> [ 1.902877] ---[ end trace 0000000000000000 ]--- To avoid calling blk_mq_start_request() twice, This patch moves the execution of blk_mq_start_request() to the end of virtblk_prep_rq(). And instead of requeuing failed request to plug list in the error path of virtblk_add_req_batch(), it uses blk_mq_requeue_request() to change failed request state to MQ_RQ_IDLE. Then virtblk can safely handle the request on the next trial. Fixes: `0e9911fa76` ("virtio-blk: support mq_ops->queue_rqs()") Reported-by: Alexandre Courbot <acourbot@chromium.org> Tested-by: Alexandre Courbot <acourbot@chromium.org> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Message-Id: <20220830150153.12627-1-suwan.kim027@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com>	2022-09-27 18:30:49 -04:00
Linus Torvalds	7a53e17acc	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - A huge patchset supporting vq resize using the new vq reset capability - Features, fixes, and cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (88 commits) vdpa/mlx5: Fix possible uninitialized return value vdpa_sim_blk: add support for discard and write-zeroes vdpa_sim_blk: add support for VIRTIO_BLK_T_FLUSH vdpa_sim_blk: make vdpasim_blk_check_range usable by other requests vdpa_sim_blk: check if sector is 0 for commands other than read or write vdpa_sim: Implement suspend vdpa op vhost-vdpa: uAPI to suspend the device vhost-vdpa: introduce SUSPEND backend feature bit vdpa: Add suspend operation virtio-blk: Avoid use-after-free on suspend/resume virtio_vdpa: support the arg sizes of find_vqs() vhost-vdpa: Call ida_simple_remove() when failed vDPA: fix 'cast to restricted le16' warnings in vdpa.c vDPA: !FEATURES_OK should not block querying device config space vDPA/ifcvf: support userspace to query features and MQ of a management device vDPA/ifcvf: get_config_size should return a value no greater than dev implementation vhost scsi: Allow user to control num virtqueues vhost-scsi: Fix max number of virtqueues vdpa/mlx5: Support different address spaces for control and data vdpa/mlx5: Implement susupend virtqueue callback ...	2022-08-12 09:50:34 -07:00
Shigeru Yoshida	8d12ec1029	virtio-blk: Avoid use-after-free on suspend/resume hctx->user_data is set to vq in virtblk_init_hctx(). However, vq is freed on suspend and reallocated on resume. So, hctx->user_data is invalid after resume, and it will cause use-after-free accessing which will result in the kernel crash something like below: [ 22.428391] Call Trace: [ 22.428899] <TASK> [ 22.429339] virtqueue_add_split+0x3eb/0x620 [ 22.430035] ? __blk_mq_alloc_requests+0x17f/0x2d0 [ 22.430789] ? kvm_clock_get_cycles+0x14/0x30 [ 22.431496] virtqueue_add_sgs+0xad/0xd0 [ 22.432108] virtblk_add_req+0xe8/0x150 [ 22.432692] virtio_queue_rqs+0xeb/0x210 [ 22.433330] blk_mq_flush_plug_list+0x1b8/0x280 [ 22.434059] __blk_flush_plug+0xe1/0x140 [ 22.434853] blk_finish_plug+0x20/0x40 [ 22.435512] read_pages+0x20a/0x2e0 [ 22.436063] ? folio_add_lru+0x62/0xa0 [ 22.436652] page_cache_ra_unbounded+0x112/0x160 [ 22.437365] filemap_get_pages+0xe1/0x5b0 [ 22.437964] ? context_to_sid+0x70/0x100 [ 22.438580] ? sidtab_context_to_sid+0x32/0x400 [ 22.439979] filemap_read+0xcd/0x3d0 [ 22.440917] xfs_file_buffered_read+0x4a/0xc0 [ 22.441984] xfs_file_read_iter+0x65/0xd0 [ 22.442970] __kernel_read+0x160/0x2e0 [ 22.443921] bprm_execve+0x21b/0x640 [ 22.444809] do_execveat_common.isra.0+0x1a8/0x220 [ 22.446008] __x64_sys_execve+0x2d/0x40 [ 22.446920] do_syscall_64+0x37/0x90 [ 22.447773] entry_SYSCALL_64_after_hwframe+0x63/0xcd This patch fixes this issue by getting vq from vblk, and removes virtblk_init_hctx(). Fixes: `4e04005256` ("virtio-blk: support polling I/O") Cc: "Suwan Kim" <suwan.kim027@gmail.com> Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Message-Id: <20220810160948.959781-1-syoshida@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-08-11 04:30:59 -04:00
Christoph Hellwig	8b9ab62662	block: remove blk_cleanup_disk blk_cleanup_disk is nothing but a trivial wrapper for put_disk now, so remove it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-06-28 06:33:15 -06:00
Christoph Hellwig	6f8191fdf4	block: simplify disk shutdown Set the queue dying flag and call blk_mq_exit_queue from del_gendisk for all disks that do not have separately allocated queues, and thus remove the need to call blk_cleanup_queue for them. Rename blk_cleanup_disk to blk_mq_destroy_queue to make it clear that this function is intended only for separately allocated blk-mq queues. This saves an extra queue freeze for devices without a separately allocated queue. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20220619060552.1850436-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-06-28 06:30:26 -06:00
Suwan Kim	0e9911fa76	virtio-blk: support mq_ops->queue_rqs() This patch supports mq_ops->queue_rqs() hook. It has an advantage of batch submission to virtio-blk driver. It also helps polling I/O because polling uses batched completion of block layer. Batch submission in queue_rqs() can boost polling performance. In queue_rqs(), it iterates plug->mq_list, collects requests that belong to same HW queue until it encounters a request from other HW queue or sees the end of the list. Then, virtio-blk adds requests into virtqueue and kicks virtqueue to submit requests. If there is an error, it inserts error request to requeue_list and passes it to ordinary block layer path. For verification, I did fio test. (io_uring, randread, direct=1, bs=4K, iodepth=64 numjobs=N) I set 4 vcpu and 2 virtio-blk queues for VM and run fio test 5 times. It shows about 2% improvement. \| numjobs=2 \| numjobs=4 ----------------------------------------------------------- fio without queue_rqs() \| 291K IOPS \| 238K IOPS ----------------------------------------------------------- fio with queue_rqs() \| 295K IOPS \| 243K IOPS For polling I/O performance, I also did fio test as below. (io_uring, hipri, randread, direct=1, bs=512, iodepth=64 numjobs=4) I set 4 vcpu and 2 poll queues for VM. It shows about 2% improvement in polling I/O. \| IOPS \| avg latency ----------------------------------------------------------- fio poll without queue_rqs() \| 424K \| 613.05 usec ----------------------------------------------------------- fio poll with queue_rqs() \| 435K \| 601.01 usec Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Message-Id: <20220406153207.163134-3-suwan.kim027@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>	2022-05-31 12:44:23 -04:00
Suwan Kim	4e04005256	virtio-blk: support polling I/O This patch supports polling I/O via virtio-blk driver. Polling feature is enabled by module parameter "poll_queues" and it sets dedicated polling queues for virtio-blk. This patch improves the polling I/O throughput and latency. The virtio-blk driver doesn't not have a poll function and a poll queue and it has been operating in interrupt driven method even if the polling function is called in the upper layer. virtio-blk polling is implemented upon 'batched completion' of block layer. virtblk_poll() queues completed request to io_comp_batch->req_list and later, virtblk_complete_batch() calls unmap function and ends the requests in batch. virtio-blk reads the number of poll queues from module parameter "poll_queues". If VM sets queue parameter as below, ("num-queues=N" [QEMU property], "poll_queues=M" [module parameter]) It allocates N virtqueues to virtio_blk->vqs[N] and it uses [0..(N-M-1)] as default queues and [(N-M)..(N-1)] as poll queues. Unlike the default queues, the poll queues have no callback function. Regarding HW-SW queue mapping, the default queue mapping uses the existing method that condsiders MSI irq vector. But the poll queue doesn't have an irq, so it uses the regular blk-mq cpu mapping. For verifying the improvement, I did Fio polling I/O performance test with io_uring engine with the options below. (io_uring, hipri, randread, direct=1, bs=512, iodepth=64 numjobs=N) I set 4 vcpu and 4 virtio-blk queues - 2 default queues and 2 poll queues for VM. As a result, IOPS and average latency improved about 10%. Test result: - Fio io_uring poll without virtio-blk poll support -- numjobs=1 : IOPS = 339K, avg latency = 188.33us -- numjobs=2 : IOPS = 367K, avg latency = 347.33us -- numjobs=4 : IOPS = 383K, avg latency = 682.06us - Fio io_uring poll with virtio-blk poll support -- numjobs=1 : IOPS = 385K, avg latency = 165.94us -- numjobs=2 : IOPS = 408K, avg latency = 313.28us -- numjobs=4 : IOPS = 424K, avg latency = 613.05us Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Suwan Kim <suwan.kim027@gmail.com> Message-Id: <20220406153207.163134-2-suwan.kim027@gmail.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>	2022-05-31 12:44:23 -04:00
Christoph Hellwig	62952cc5bc	virtio_blk: fix the discard_granularity and discard_alignment queue limits The discard_alignment queue limit is named a bit misleading means the offset into the block device at which the discard granularity starts. On the other hand the discard_sector_alignment from the virtio 1.1 looks similar to what Linux uses as discard granularity (even if not very well described): "discard_sector_alignment can be used by OS when splitting a request based on alignment. " And at least qemu does set it to the discard granularity. So stop setting the discard_alignment and use the virtio discard_sector_alignment to set the discard granularity. Fixes: `1f23816b8e` ("virtio_blk: add discard and write zeroes support") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20220418045314.360785-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-05-03 10:38:50 -06:00
Christoph Hellwig	70200574cc	block: remove QUEUE_FLAG_DISCARD Just use a non-zero max_discard_sectors as an indicator for discard support, similar to what is done for write zeroes. The only places where needs special attention is the RAID5 driver, which must clear discard support for security reasons by default, even if the default stacking rules would allow for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd] Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390] Acked-by: Coly Li <colyli@suse.de> [bcache] Acked-by: David Sterba <dsterba@suse.com> [btrfs] Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-04-17 19:49:59 -06:00
Linus Torvalds	69d1dea852	Merge tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-block Pull block driver updates from Jens Axboe: - NVMe updates via Christoph: - add vectored-io support for user-passthrough (Kanchan Joshi) - add verbose error logging (Alan Adamson) - support buffered I/O on block devices in nvmet (Chaitanya Kulkarni) - central discovery controller support (Martin Belanger) - fix and extended the globally unique idenfier validation (Christoph) - move away from the deprecated IDA APIs (Sagi Grimberg) - misc code cleanup (Keith Busch, Max Gurtovoy, Qinghua Jin, Chaitanya Kulkarni) - add lockdep annotations for in-kernel sockets (Chris Leech) - use vmalloc for ANA log buffer (Hannes Reinecke) - kerneldoc fixes (Chaitanya Kulkarni) - cleanups (Guoqing Jiang, Chaitanya Kulkarni, Christoph) - warn about shared namespaces without multipathing (Christoph) - MD updates via Song with a set of cleanups (Christoph, Mariusz, Paul, Erik, Dirk) - loop cleanups and queue depth configuration (Chaitanya) - null_blk cleanups and fixes (Chaitanya) - Use descriptive init/exit names in virtio_blk (Randy) - Use bvec_kmap_local() in drivers (Christoph) - bcache fixes (Mingzhe) - xen blk-front persistent grant speedups (Juergen) - rnbd fix and cleanup (Gioh) - Misc fixes (Christophe, Colin) * tag 'for-5.18/drivers-2022-03-18' of git://git.kernel.dk/linux-block: (76 commits) virtio_blk: eliminate anonymous module_init & module_exit nvme: warn about shared namespaces without CONFIG_NVME_MULTIPATH nvme: remove nvme_alloc_request and nvme_alloc_request_qid nvme: cleanup how disk->disk_name is assigned nvmet: move the call to nvmet_ns_changed out of nvmet_ns_revalidate nvmet: use snprintf() with PAGE_SIZE in configfs nvmet: don't fold lines nvmet-rdma: fix kernel-doc warning for nvmet_rdma_device_removal nvmet-fc: fix kernel-doc warning for nvmet_fc_unregister_targetport nvmet-fc: fix kernel-doc warning for nvmet_fc_register_targetport nvme-tcp: lockdep: annotate in-kernel sockets nvme-tcp: don't fold the line nvme-tcp: don't initialize ret variable nvme-multipath: call bio_io_error in nvme_ns_head_submit_bio nvme-multipath: use vmalloc for ANA log buffer xen/blkfront: speed up purge_persistent_grants() raid5: initialize the stripe_head embeeded bios as needed raid5-cache: statically allocate the recovery ra bio raid5-cache: fully initialize flush_bio when needed raid5-ppl: fully initialize the bio in ppl_new_iounit ...	2022-03-21 17:16:01 -07:00
Linus Torvalds	616355cc81	Merge tag 'for-5.18/block-2022-03-18' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: - BFQ cleanups and fixes (Yu, Zhang, Yahu, Paolo) - blk-rq-qos completion fix (Tejun) - blk-cgroup merge fix (Tejun) - Add offline error return value to distinguish it from an IO error on the device (Song) - IO stats fixes (Zhang, Christoph) - blkcg refcount fixes (Ming, Yu) - Fix for indefinite dispatch loop softlockup (Shin'ichiro) - blk-mq hardware queue management improvements (Ming) - sbitmap dead code removal (Ming, John) - Plugging merge improvements (me) - Show blk-crypto capabilities in sysfs (Eric) - Multiple delayed queue run improvement (David) - Block throttling fixes (Ming) - Start deprecating auto module loading based on dev_t (Christoph) - bio allocation improvements (Christoph, Chaitanya) - Get rid of bio_devname (Christoph) - bio clone improvements (Christoph) - Block plugging improvements (Christoph) - Get rid of genhd.h header (Christoph) - Ensure drivers use appropriate flush helpers (Christoph) - Refcounting improvements (Christoph) - Queue initialization and teardown improvements (Ming, Christoph) - Misc fixes/improvements (Barry, Chaitanya, Colin, Dan, Jiapeng, Lukas, Nian, Yang, Eric, Chengming) * tag 'for-5.18/block-2022-03-18' of git://git.kernel.dk/linux-block: (127 commits) block: cancel all throttled bios in del_gendisk() block: let blkcg_gq grab request queue's refcnt block: avoid use-after-free on throttle data block: limit request dispatch loop duration block/bfq-iosched: Fix spelling mistake "tenative" -> "tentative" sr: simplify the local variable initialization in sr_block_open() block: don't merge across cgroup boundaries if blkcg is enabled block: fix rq-qos breakage from skipping rq_qos_done_bio() block: flush plug based on hardware and software queue order block: ensure plug merging checks the correct queue at least once block: move rq_qos_exit() into disk_release() block: do more work in elevator_exit block: move blk_exit_queue into disk_release block: move q_usage_counter release into blk_queue_release block: don't remove hctx debugfs dir from blk_mq_exit_queue block: move blkcg initialization/destroy into disk allocation/release handler sr: implement ->free_disk to simplify refcounting sd: implement ->free_disk to simplify refcounting sd: delay calling free_opal_dev sd: call sd_zbc_release_disk before releasing the scsi_device reference ...	2022-03-21 16:48:55 -07:00
Randy Dunlap	bcfe9b6cbb	virtio_blk: eliminate anonymous module_init & module_exit Eliminate anonymous module_init() and module_exit(), which can lead to confusion or ambiguity when reading System.map, crashes/oops/bugs, or an initcall_debug log. Give each of these init and exit functions unique driver-specific names to eliminate the anonymous names. Example 1: (System.map) ffffffff832fc78c t init ffffffff832fc79e t init ffffffff832fc8f8 t init Example 2: (initcall_debug log) calling init+0x0/0x12 @ 1 initcall init+0x0/0x12 returned 0 after 15 usecs calling init+0x0/0x60 @ 1 initcall init+0x0/0x60 returned 0 after 2 usecs calling init+0x0/0x9a @ 1 initcall init+0x0/0x9a returned 0 after 74 usecs Fixes: `e467cde238` ("Block driver using virtio.") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: virtualization@lists.linux-foundation.org Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-block@vger.kernel.org Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/20220316192010.19001-2-rdunlap@infradead.org Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-03-17 10:11:29 -06:00
Xie Yongji	e030759a1d	virtio-blk: Remove BUG_ON() in virtio_queue_rq() Currently we have a BUG_ON() to make sure the number of sg list does not exceed queue_max_segments() in virtio_queue_rq(). However, the block layer uses queue_max_discard_segments() instead of queue_max_segments() to limit the sg list for discard requests. So the BUG_ON() might be triggered if virtio-blk device reports a larger value for max discard segment than queue_max_segments(). To fix it, let's simply remove the BUG_ON() which has become unnecessary after commit 02746e26c39e("virtio-blk: avoid preallocating big SGL for data"). And the unused vblk->sg_elems can also be removed together. Fixes: `1f23816b8e` ("virtio_blk: add discard and write zeroes support") Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20220304100058.116-2-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 06:06:50 -05:00
Xie Yongji	dacc73ed0b	virtio-blk: Don't use MAX_DISCARD_SEGMENTS if max_discard_seg is zero Currently the value of max_discard_segment will be set to MAX_DISCARD_SEGMENTS (256) with no basis in hardware if device set 0 to max_discard_seg in configuration space. It's incorrect since the device might not be able to handle such large descriptors. To fix it, let's follow max_segments restrictions in this case. Fixes: `1f23816b8e` ("virtio_blk: add discard and write zeroes support") Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20220304100058.116-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-03-06 06:06:50 -05:00
Christoph Hellwig	24b45e6c25	virtio_blk: simplify refcounting Implement the ->free_disk method to free the virtio_blk structure only once the last gendisk reference goes away instead of keeping a local refcount. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20220215094514.3828912-6-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2022-02-16 19:44:24 -07:00
Linus Torvalds	3bf6a9e36e	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "virtio,vdpa,qemu_fw_cfg: features, cleanups, and fixes. - partial support for < MAX_ORDER - 1 granularity for virtio-mem - driver_override for vdpa - sysfs ABI documentation for vdpa - multiqueue config support for mlx5 vdpa - and misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (42 commits) vdpa/mlx5: Fix tracking of current number of VQs vdpa/mlx5: Fix is_index_valid() to refer to features vdpa: Protect vdpa reset with cf_mutex vdpa: Avoid taking cf_mutex lock on get status vdpa/vdpa_sim_net: Report max device capabilities vdpa: Use BIT_ULL for bit operations vdpa/vdpa_sim: Configure max supported virtqueues vdpa/mlx5: Report max device capabilities vdpa: Support reporting max device capabilities vdpa/mlx5: Restore cur_num_vqs in case of failure in change_num_qps() vdpa: Add support for returning device configuration information vdpa/mlx5: Support configuring max data virtqueue vdpa/mlx5: Fix config_attr_mask assignment vdpa: Allow to configure max data virtqueues vdpa: Read device configuration only if FEATURES_OK vdpa: Sync calls set/get config/status with cf_mutex vdpa/mlx5: Distribute RX virtqueues in RQT object vdpa: Provide interface to read driver features vdpa: clean up get_config_size ret value handling virtio_ring: mark ring unused on error ...	2022-01-18 10:05:48 +02:00
Michael S. Tsirkin	d9679d0013	virtio: wrap config->reset calls This will enable cleanups down the road. The idea is to disable cbs, then add "flush_queued_cbs" callback as a parameter, this way drivers can flush any work queued after callbacks have been disabled. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211013105226.20225-1-mst@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2022-01-14 18:50:52 -05:00
Christoph Hellwig	b84ba30b6c	block: remove the gendisk argument to blk_execute_rq Remove the gendisk aregument to blk_execute_rq and blk_execute_rq_nowait given that it is unused now. Also convert the boolean at_head parameter to actually use the bool type while touching the prototype. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20211126121802.2090656-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-11-29 06:41:29 -07:00
Christoph Hellwig	1ebe2e5f9d	block: remove GENHD_FL_EXT_DEVT All modern drivers can support extra partitions using the extended dev_t. In fact except for the ioctl method drivers never even see partitions in normal operation. So remove the GENHD_FL_EXT_DEVT and allow extra partitions for all block devices that do support partitions, and require those that do not support partitions to explicit disallow them using GENHD_FL_NO_PART. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20211122130625.1136848-12-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-11-29 06:38:35 -07:00
Ye Guojin	0466a39bd0	virtio-blk: modify the value type of num in virtio_queue_rq() This was found by coccicheck: ./drivers/block/virtio_blk.c, 334, 14-17, WARNING Unsigned expression compared with zero num < 0 Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Link: https://lore.kernel.org/r/20211117063955.160777-1-ye.guojin@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: `02746e26c3` ("virtio-blk: avoid preallocating big SGL for data") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-11-24 19:00:28 -05:00
Michael S. Tsirkin	2b17d9f848	Revert "virtio-blk: don't let virtio core to validate used length" This reverts commit `a40392edf1`. Attempts to validate length in the core did not work out. We'll drop them, so revert the dependent changes in drivers. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-24 18:47:17 -05:00
Linus Torvalds	43e1b12927	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: "vhost and virtio fixes and features: - Hardening work by Jason - vdpa driver for Alibaba ENI - Performance tweaks for virtio blk - virtio rng rework using an internal buffer - mac/mtu programming for mlx5 vdpa - Misc fixes, cleanups" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (45 commits) vdpa/mlx5: Forward only packets with allowed MAC address vdpa/mlx5: Support configuration of MAC vdpa/mlx5: Fix clearing of VIRTIO_NET_F_MAC feature bit vdpa_sim_net: Enable user to set mac address and mtu vdpa: Enable user to set mac and mtu of vdpa device vdpa: Use kernel coding style for structure comments vdpa: Introduce query of device config layout vdpa: Introduce and use vdpa device get, set config helpers virtio-scsi: don't let virtio core to validate used buffer length virtio-blk: don't let virtio core to validate used length virtio-net: don't let virtio core to validate used length virtio_ring: validate used buffer length virtio_blk: correct types for status handling virtio_blk: allow 0 as num_request_queues i2c: virtio: Add support for zero-length requests virtio-blk: fixup coccinelle warnings virtio_ring: fix typos in vring_desc_extra virtio-pci: harden INTX interrupts virtio_pci: harden MSI-X interrupts virtio_config: introduce a new .enable_cbs method ...	2021-11-03 15:00:39 -07:00
Linus Torvalds	71ae42629e	Merge tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block Pull QUEUE_FLAG_SCSI_PASSTHROUGH removal from Jens Axboe: "This contains a series leading to the removal of the QUEUE_FLAG_SCSI_PASSTHROUGH queue flag" * tag 'for-5.16/passthrough-flag-2021-10-29' of git://git.kernel.dk/linux-block: block: remove blk_{get,put}_request block: remove QUEUE_FLAG_SCSI_PASSTHROUGH block: remove the initialize_rq_fn blk_mq_ops method scsi: add a scsi_alloc_request helper bsg-lib: initialize the bsg_job in bsg_transport_sg_io_fn nfsd/blocklayout: use ->get_unique_id instead of sending SCSI commands sd: implement ->get_unique_id block: add a ->get_unique_id method	2021-11-01 10:12:44 -07:00
Jason Wang	a40392edf1	virtio-blk: don't let virtio core to validate used length We never tries to use used length, so the patch prevents the virtio core from validating used length. Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211027022107.14357-4-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:48 -04:00
Michael S. Tsirkin	f083937247	virtio_blk: correct types for status handling virtblk_setup_cmd returns blk_status_t in an int, callers then assign it back to a blk_status_t variable. blk_status_t is either u32 or (more typically) u8 so it works, but is inelegant and causes sparse warnings. Pass the status in blk_status_t in a consistent way. Reported-by: kernel test robot <lkp@intel.com> Fixes: b2c5221fd074 ("virtio-blk: avoid preallocating big SGL for data") Cc: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:48 -04:00
Michael S. Tsirkin	ead65f7695	virtio_blk: allow 0 as num_request_queues The default value is 0 meaning "no limit". However if 0 is specified on the command line it is instead silently converted to 1. Further, the value is already validated at point of use, there's no point in duplicating code validating the value when it is set. Simplify the code while making the behaviour more consistent by using plain module_param. Fixes: 1a662cf6cb9a ("virtio-blk: add num_request_queues module parameter") Cc: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 05:26:48 -04:00
Ye Guojin	f1aa12f535	virtio-blk: fixup coccinelle warnings coccicheck complains about the use of snprintf() in sysfs show functions: WARNING use scnprintf or sprintf Use sysfs_emit instead of scnprintf or sprintf makes more sense. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Ye Guojin <ye.guojin@zte.com.cn> Link: https://lore.kernel.org/r/20211021065111.1047824-1-ye.guojin@zte.com.cn Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>	2021-11-01 05:26:48 -04:00
Colin Ian King	63b4ffa4fa	virtio_blk: Fix spelling mistake: "advertisted" -> "advertised" There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20211025102240.22801-1-colin.i.king@gmail.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-11-01 05:26:48 -04:00
Jason Wang	6ae6ff6f6e	virtio-blk: validate num_queues during probe If an untrusted device neogitates BLK_F_MQ but advertises a zero num_queues, the driver may end up trying to allocating zero size buffers where ZERO_SIZE_PTR is returned which may pass the checking against the NULL. This will lead unexpected results. Fixing this by failing the probe in this case. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/20211019070152.8236-2-jasowang@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-11-01 05:26:48 -04:00
Max Gurtovoy	0989c41bed	virtio-blk: add num_request_queues module parameter Sometimes a user would like to control the amount of request queues to be created for a block device. For example, for limiting the memory footprint of virtio-blk devices. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210902204622.54354-1-mgurtovoy@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 04:29:47 -04:00
Max Gurtovoy	02746e26c3	virtio-blk: avoid preallocating big SGL for data No need to pre-allocate a big buffer for the IO SGL anymore. If a device has lots of deep queues, preallocation for the sg list can consume substantial amounts of memory. For HW virtio-blk device, nr_hw_queues can be 64 or 128 and each queue's depth might be 128. This means the resulting preallocation for the data SGLs is big. Switch to runtime allocation for SGL for lists longer than 2 entries. This is the approach used by NVMe drivers so it should be reasonable for virtio block as well. Runtime SGL allocation has always been the case for the legacy I/O path so this is nothing new. The preallocated small SGL depends on SG_CHAIN so if the ARCH doesn't support SG_CHAIN, use only runtime allocation for the SGL. Re-organize the setup of the IO request to fit the new sg chain mechanism. No performance degradation was seen (fio libaio engine with 16 jobs and 128 iodepth): IO size IOPs Rand Read (before/after) IOPs Rand Write (before/after) -------- --------------------------------- ---------------------------------- 512B 318K/316K 329K/325K 4KB 323K/321K 353K/349K 16KB 199K/208K 250K/275K 128KB 36K/36.1K 39.2K/41.7K Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Reviewed-by: Israel Rukshin <israelr@nvidia.com> Link: https://lore.kernel.org/r/20210901131434.31158-1-mgurtovoy@nvidia.com Reviewed-by: Feng Li <lifeng1519@gmail.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> # kconfig fixups Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-11-01 04:29:47 -04:00
Christoph Hellwig	0bf6d96cb8	block: remove blk_{get,put}_request These are now pointless wrappers around blk_mq_{alloc,free}_request, so remove them. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20211025070517.1548584-3-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-29 06:50:52 -06:00
Xie Yongji	57a13a5b81	virtio-blk: Use blk_validate_block_size() to validate block size The block layer can't support a block size larger than page size yet. And a block size that's too small or not a power of two won't work either. If a misconfigured device presents an invalid block size in configuration space, it will result in the kernel crash something like below: [ 506.154324] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 506.160416] RIP: 0010:create_empty_buffers+0x24/0x100 [ 506.174302] Call Trace: [ 506.174651] create_page_buffers+0x4d/0x60 [ 506.175207] block_read_full_page+0x50/0x380 [ 506.175798] ? __mod_lruvec_page_state+0x60/0xa0 [ 506.176412] ? __add_to_page_cache_locked+0x1b2/0x390 [ 506.177085] ? blkdev_direct_IO+0x4a0/0x4a0 [ 506.177644] ? scan_shadow_nodes+0x30/0x30 [ 506.178206] ? lru_cache_add+0x42/0x60 [ 506.178716] do_read_cache_page+0x695/0x740 [ 506.179278] ? read_part_sector+0xe0/0xe0 [ 506.179821] read_part_sector+0x36/0xe0 [ 506.180337] adfspart_check_ICS+0x32/0x320 [ 506.180890] ? snprintf+0x45/0x70 [ 506.181350] ? read_part_sector+0xe0/0xe0 [ 506.181906] bdev_disk_changed+0x229/0x5c0 [ 506.182483] blkdev_get_whole+0x6d/0x90 [ 506.183013] blkdev_get_by_dev+0x122/0x2d0 [ 506.183562] device_add_disk+0x39e/0x3c0 [ 506.184472] virtblk_probe+0x3f8/0x79b [virtio_blk] [ 506.185461] virtio_dev_probe+0x15e/0x1d0 [virtio] So let's use a block layer helper to validate the block size. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20211026144015.188-5-xieyongji@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-10-27 14:15:54 -06:00
Michael S. Tsirkin	ff63198850	Revert "virtio-blk: Add validation for block size in config space" It turns out that access to config space before completing the feature negotiation is broken for big endian guests at least with QEMU hosts up to 6.1 inclusive. This affects any device that accesses config space in the validate callback: at the moment that is virtio-net with VIRTIO_NET_F_MTU but since `82e89ea077` ("virtio-blk: Add validation for block size in config space") that also started affecting virtio-blk with VIRTIO_BLK_F_BLK_SIZE. Further, unlike VIRTIO_NET_F_MTU which is off by default on QEMU, VIRTIO_BLK_F_BLK_SIZE is on by default, which resulted in lots of people not being able to boot VMs on BE. The spec is very clear that what we are doing is legal so QEMU needs to be fixed, but given it's been broken for so many years and no one noticed, we need to give QEMU a bit more time before applying this. Further, this patch is incomplete (does not check blk size is a power of two) and it duplicates the logic from nbd. Revert for now, and we'll reapply a cleaner logic in the next release. Cc: stable@vger.kernel.org Fixes: `82e89ea077` ("virtio-blk: Add validation for block size in config space") Cc: Xie Yongji <xieyongji@bytedance.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-10-13 08:35:36 -04:00
Linus Torvalds	78e709522d	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - vduse driver ("vDPA Device in Userspace") supporting emulated virtio block devices - virtio-vsock support for end of record with SEQPACKET - vdpa: mac and mq support for ifcvf and mlx5 - vdpa: management netlink for ifcvf - virtio-i2c, gpio dt bindings - misc fixes and cleanups * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (39 commits) Documentation: Add documentation for VDUSE vduse: Introduce VDUSE - vDPA Device in Userspace vduse: Implement an MMU-based software IOTLB vdpa: Support transferring virtual addressing during DMA mapping vdpa: factor out vhost_vdpa_pa_map() and vhost_vdpa_pa_unmap() vdpa: Add an opaque pointer for vdpa_config_ops.dma_map() vhost-iotlb: Add an opaque pointer for vhost IOTLB vhost-vdpa: Handle the failure of vdpa_reset() vdpa: Add reset callback in vdpa_config_ops vdpa: Fix some coding style issues file: Export receive_fd() to modules eventfd: Export eventfd_wake_count to modules iova: Export alloc_iova_fast() and free_iova_fast() virtio-blk: remove unneeded "likely" statements virtio-balloon: Use virtio_find_vqs() helper vdpa: Make use of PFN_PHYS/PFN_UP/PFN_DOWN helper macro vsock_test: update message bounds test for MSG_EOR af_vsock: rename variables in receive loop virtio/vsock: support MSG_EOR bit processing vhost/vsock: support MSG_EOR bit processing ...	2021-09-11 14:48:42 -07:00
Max Gurtovoy	6105d1fe6f	virtio-blk: remove unneeded "likely" statements Usually we use "likely/unlikely" to optimize the fast path. Remove redundant "likely/unlikely" statements in the control path to simplify the code and make it easier to read. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> Link: https://lore.kernel.org/r/20210905085717.7427-1-mgurtovoy@nvidia.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-09-06 07:20:56 -04:00
Linus Torvalds	679369114e	Merge tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block Pull block updates from Jens Axboe: "Nothing major in here - lots of good cleanups and tech debt handling, which is also evident in the diffstats. In particular: - Add disk sequence numbers (Matteo) - Discard merge fix (Ming) - Relax disk zoned reporting restrictions (Niklas) - Bio error handling zoned leak fix (Pavel) - Start of proper add_disk() error handling (Luis, Christoph) - blk crypto fix (Eric) - Non-standard GPT location support (Dmitry) - IO priority improvements and cleanups (Damien)o - blk-throtl improvements (Chunguang) - diskstats_show() stack reduction (Abd-Alrhman) - Loop scheduler selection (Bart) - Switch block layer to use kmap_local_page() (Christoph) - Remove obsolete disk_name helper (Christoph) - block_device refcounting improvements (Christoph) - Ensure gendisk always has a request queue reference (Christoph) - Misc fixes/cleanups (Shaokun, Oliver, Guoqing)" * tag 'for-5.15/block-2021-08-30' of git://git.kernel.dk/linux-block: (129 commits) sg: pass the device name to blk_trace_setup block, bfq: cleanup the repeated declaration blk-crypto: fix check for too-large dun_bytes blk-zoned: allow BLKREPORTZONE without CAP_SYS_ADMIN blk-zoned: allow zone management send operations without CAP_SYS_ADMIN block: mark blkdev_fsync static block: refine the disk_live check in del_gendisk mmc: sdhci-tegra: Enable MMC_CAP2_ALT_GPT_TEGRA mmc: block: Support alternative_gpt_sector() operation partitions/efi: Support non-standard GPT location block: Add alternative_gpt_sector() operation bio: fix page leak bio_add_hw_page failure block: remove CONFIG_DEBUG_BLOCK_EXT_DEVT block: remove a pointless call to MINOR() in device_add_disk null_blk: add error handling support for add_disk() virtio_blk: add error handling support for add_disk() block: add error handling for device_add_disk / add_disk block: return errors from disk_alloc_events block: return errors from blk_integrity_add block: call blk_register_queue earlier in device_add_disk ...	2021-08-30 18:52:11 -07:00
Luis Chamberlain	dbb301f91f	virtio_blk: add error handling support for add_disk() We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Hannes Reinecke <hare@suse.de> Link: https://lore.kernel.org/r/20210818144542.19305-11-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-23 12:55:45 -06:00
Christoph Hellwig	358b348b91	virtio_blk: use bvec_virt Use bvec_virt instead of open coding it. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20210804095634.460779-9-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-08-16 10:50:33 -06:00
Xie Yongji	82e89ea077	virtio-blk: Add validation for block size in config space An untrusted device might presents an invalid block size in configuration space. This tries to add validation for it in the validate callback and clear the VIRTIO_BLK_F_BLK_SIZE feature bit if the value is out of the supported range. And we also double check the value in virtblk_probe() in case that it's changed after the validation. Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210809101609.148-1-xieyongji@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2021-08-11 06:44:24 -04:00
Stefan Hajnoczi	63947b3434	virtio-blk: limit seg_max to a safe value The struct virtio_blk_config seg_max value is read from the device and incremented by 2 to account for the request header and status byte descriptors added by the driver. In preparation for supporting untrusted virtio-blk devices, protect against integer overflow and limit the value to a safe maximum. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://lore.kernel.org/r/20210524154020.98195-1-stefanha@redhat.com Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-03 04:50:53 -04:00
Xie Yongji	b71ba22e7c	virtio-blk: Fix memory leak among suspend/resume procedure The vblk->vqs should be freed before we call init_vqs() in virtblk_restore(). Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Link: https://lore.kernel.org/r/20210517084332.280-1-xieyongji@bytedance.com Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-03 04:50:52 -04:00
Sohaib	4f118472d4	virtio_blk: cleanups: remove check obsoleted by CONFIG_LBDAF removal Prior to `72deb455b5` ("block: remove CONFIG_LBDAF"), it was optional if the 32-bit kernel support block device and/or file sizes larger than 2 TiB (considering the sector size is 512 bytes) But now sector_t and blkcnt_t are always 64-bit in size. Suggested-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Sohaib Mohammed <sohaib.amhmd@gmail.com> Link: https://lore.kernel.org/r/20210430103611.77345-1-sohaib.amhmd@gmail.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2021-07-03 04:50:49 -04:00
Christoph Hellwig	89a5f06565	virtio-blk: use blk_mq_alloc_disk Use the blk_mq_alloc_disk API to simplify the gendisk and request_queue allocation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Link: https://lore.kernel.org/r/20210602065345.355274-5-hch@lst.de Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-06-11 11:53:02 -06:00
Linus Torvalds	ffc1759676	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - new vdpa features to allow creation and deletion of new devices - virtio-blk support per-device queue depth - fixes, cleanups all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (31 commits) virtio-input: add multi-touch support virtio_mmio: fix one typo vdpa/mlx5: fix param validation in mlx5_vdpa_get_config() virtio_net: Fix fall-through warnings for Clang virtio_input: Prevent EV_MSC/MSC_TIMESTAMP loop storm for MT. virtio-blk: support per-device queue depth virtio_vdpa: don't warn when fail to disable vq virtio-pci: introduce modern device module virito-pci-modern: rename map_capability() to vp_modern_map_capability() virtio-pci-modern: introduce helper to get notification offset virtio-pci-modern: introduce helper for getting queue nums virtio-pci-modern: introduce helper for setting/geting queue size virtio-pci-modern: introduce helper to set/get queue_enable virtio-pci-modern: introduce vp_modern_queue_address() virtio-pci-modern: introduce vp_modern_set_queue_vector() virtio-pci-modern: introduce vp_modern_generation() virtio-pci-modern: introduce helpers for setting and getting features virtio-pci-modern: introduce helpers for setting and getting status virtio-pci-modern: introduce helper to set config vector virtio-pci-modern: introduce vp_modern_remove() ...	2021-02-25 12:21:08 -08:00
Joseph Qi	d1e9aa9c34	virtio-blk: support per-device queue depth module parameter 'virtblk_queue_depth' was firstly introduced for testing/benchmarking purposes described in commit `fc4324b459` ("virtio-blk: base queue-depth on virtqueue ringsize or module param"). And currently 'virtblk_queue_depth' is used as a saved value for the first probed device. Since we have different virtio-blk devices which have different capabilities, it requires that we support per-device queue depth instead of per-module. So defaultly use vq free elements if module parameter 'virtblk_queue_depth' is not set. Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1611307306-71067-1-git-send-email-joseph.qi@linux.alibaba.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-02-23 07:52:59 -05:00
Guoqing Jiang	684da7628d	block: remove unnecessary argument from blk_execute_rq We can remove 'q' from blk_execute_rq as well after the previous change in blk_execute_rq_nowait. And more importantly it never really was needed to start with given that we can trivial derive it from struct request. Cc: linux-scsi@vger.kernel.org Cc: virtualization@lists.linux-foundation.org Cc: linux-ide@vger.kernel.org Cc: linux-mmc@vger.kernel.org Cc: linux-nvme@lists.infradead.org Cc: linux-nfs@vger.kernel.org Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2021-01-24 21:52:39 -07:00
Christoph Hellwig	ddff331a14	virtio-blk: remove a spurious call to revalidate_disk_size revalidate_disk_size just updates the block device size from the disk size. Thus calling it from virtblk_update_cache_mode doesn't actually do anything. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-16 08:34:15 -07:00
Christoph Hellwig	449f4ec989	block: remove the update_bdev parameter to set_capacity_revalidate_and_notify The update_bdev argument is always set to true, so remove it. Also rename the function to the slighly less verbose set_capacity_and_notify, as propagating the disk size to the block device isn't really revalidation. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Petr Vorel <pvorel@suse.cz> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-11-16 08:34:14 -07:00
Christoph Hellwig	659e56ba86	block: add a new revalidate_disk_size helper revalidate_disk is a relative awkward helper for driver use, as it first calls an optional driver method and then updates the block device size, while most callers either don't need the method call at all, or want to keep state between the caller and the called method. Add a revalidate_disk_size helper that just performs the update of the block device size from the gendisk one, and switch all drivers that do not implement ->revalidate_disk to use the new helper instead of revalidate_disk() Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Acked-by: Song Liu <song@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2020-09-02 08:00:07 -06:00

1 2 3 4 5 ...

270 Commits