Module Uring

Io_uring is an asynchronous I/O API for Linux that uses ring buffers shared between the Linux kernel and userspace to provide an efficient mechanism to batch requests that can be handled asynchronously and in parallel. This module provides an OCaml interface to io_uring that aims to provide a thin type-safe layer for use in higher-level interfaces.

module Region : sig ... end

Region handles carving up a block of external memory into smaller chunks. This is currently just a slab allocator of a fixed size, on the basis that most IO operations operate on predictable chunks of memory. Since the block of memory in a region is contiguous, it can be used in Uring's fixed buffer model to map it into kernel space for more efficient IO.

type 'a t

'a t is a reference to an Io_uring structure.

type 'a job

A handle for a submitted job, which can be used to cancel it. If an operation returns None, this means that submission failed because the ring is full.

val create : ?polling_timeout:int -> queue_depth:int -> unit -> 'a t

create ~queue_depth will return a fresh Io_uring structure t. Initially, t has no fixed buffer. Use set_fixed_buffer if you want one.

  • parameter polling_timeout

    If given, use polling mode with the given idle timeout (in ms). This requires privileges.

val queue_depth : 'a t -> int

queue_depth t returns the total number of submission slots for the uring t.

val exit : 'a t -> unit

exit t will shut down the uring t. Any subsequent requests will fail.

  • raises Invalid_argument

    if there are any requests in progress
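
The lifecycle described above can be sketched as follows. This is a minimal illustration, assuming the program is linked against the uring package; ring_slots is a hypothetical helper name, and unit is an arbitrary choice of user-data type.

```ocaml
(* Assumes the uring opam package is linked, e.g. (libraries uring) in dune. *)
let ring_slots depth =
  (* The annotation fixes the user-data type; unit is an arbitrary choice. *)
  let t : unit Uring.t = Uring.create ~queue_depth:depth () in
  let slots = Uring.queue_depth t in
  (* Nothing is in flight here, so exit should not raise Invalid_argument. *)
  Uring.exit t;
  slots

let () = Printf.printf "submission slots: %d\n" (ring_slots 8)
```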

Fixed buffers

Each uring may have associated with it a fixed region of memory that is used for the "fixed buffer" mode of io_uring to avoid data copying between userspace and the kernel.

val set_fixed_buffer : 'a t -> Cstruct.buffer -> (unit, [> `ENOMEM ]) Stdlib.result

set_fixed_buffer t buf sets buf as the fixed buffer for t.

You will normally want to wrap this with Region.alloc or similar to divide the buffer into chunks.

If t already has a buffer set, the old one will be removed.

Returns `ENOMEM if insufficient kernel resources are available or the caller's RLIMIT_MEMLOCK resource limit would be exceeded.

  • raises Invalid_argument

    if there are any requests in progress

val buf : 'a t -> Cstruct.buffer

buf t is the fixed internal memory buffer associated with uring t using set_fixed_buffer, or a zero-length buffer if none is set.
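
As a rough sketch of registering a fixed buffer, the following allocates a 4096-byte Bigarray (an arbitrary size) and compares the size of buf t before and after. fixed_buffer_sizes is a hypothetical helper, and the example assumes the uring package is linked.

```ocaml
let fixed_buffer_sizes () =
  let t : unit Uring.t = Uring.create ~queue_depth:4 () in
  (* Per the documentation, buf t is zero-length until a buffer is set. *)
  let before = Bigarray.Array1.dim (Uring.buf t) in
  (* Cstruct.buffer is a char Bigarray with C layout. *)
  let block = Bigarray.(Array1.create char c_layout 4096) in
  let after =
    match Uring.set_fixed_buffer t block with
    | Ok () -> Some (Bigarray.Array1.dim (Uring.buf t))
    | Error `ENOMEM -> None
  in
  Uring.exit t;
  (before, after)

let () =
  match fixed_buffer_sizes () with
  | before, Some after -> Printf.printf "%d -> %d bytes\n" before after
  | _, None -> print_endline "ENOMEM: could not register the buffer"
```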

Queueing operations

val noop : 'a t -> 'a -> 'a job option

noop t d submits a no-op operation to uring t. The user data d will be returned by wait or peek upon completion.
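
A small sketch of the queue-then-collect pattern, using noop so that no file descriptors are involved. run_noop is a hypothetical helper. The completion constructors are written as Uring.Some and Uring.None to keep them apart from the standard option type.

```ocaml
let run_noop () =
  let t = Uring.create ~queue_depth:4 () in
  let outcome =
    match Uring.noop t "ping" with
    | None -> None (* the ring was full, nothing was queued *)
    | Some _job ->
      (* wait submits the queued request and blocks for a completion. *)
      (match Uring.wait t with
       | Uring.Some { result; data } -> Some (data, result)
       | Uring.None -> None)
  in
  Uring.exit t;
  outcome

let () =
  match run_noop () with
  | Some (data, result) -> Printf.printf "%s completed with %d\n" data result
  | None -> print_endline "no completion"
```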

Timeout

type clock =
  | Boottime
    (* CLOCK_BOOTTIME is a suspend-aware monotonic clock *)
  | Realtime
    (* CLOCK_REALTIME is a wallclock time clock that may be affected by discontinuous jumps *)

Represents different Linux clocks.

val timeout : ?absolute:bool -> 'a t -> clock -> int64 -> 'a -> 'a job option

timeout t clock ns d submits a timeout request to uring t.

absolute denotes how clock and ns relate to one another: if true, ns is an absolute time on clock; if false (the default), ns is a duration relative to the current time.

ns is the timeout time in nanoseconds.
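
A minimal sketch of a relative timeout, assuming a kernel recent enough to accept the Boottime clock. run_timeout is a hypothetical helper. With io_uring, an expired timeout normally completes with a negative errno (ETIME) rather than 0, so the example only reports the raw result.

```ocaml
let run_timeout () =
  let t = Uring.create ~queue_depth:4 () in
  let ten_ms = 10_000_000L (* nanoseconds *) in
  let outcome =
    match Uring.timeout t Uring.Boottime ten_ms () with
    | None -> None (* ring full *)
    | Some _job ->
      (match Uring.wait t with
       | Uring.Some { result; _ } -> Some result
       | Uring.None -> None)
  in
  Uring.exit t;
  outcome

let () =
  match run_timeout () with
  | Some result -> Printf.printf "timeout completed with %d\n" result
  | None -> print_endline "no completion"
```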

module type FLAGS = sig ... end
module Open_flags : sig ... end

Flags that can be passed to openat2.

module Resolve : sig ... end

Flags that can be passed to openat2 to control path resolution.

val openat2 : 'a t -> access:[ `R | `W | `RW ] -> flags:Open_flags.t -> perm:Unix.file_perm -> resolve:Resolve.t -> ?fd:Unix.file_descr -> string -> 'a -> 'a job option

openat2 t ~access ~flags ~perm ~resolve ~fd path d opens path, which is resolved relative to fd (or the current directory if fd is not given). The user data d will be returned by wait or peek upon completion.

  • parameter access

    controls whether the file is opened for reading, writing, or both

  • parameter flags

    are the usual open flags

  • parameter perm

    sets the access control bits for newly created files (subject to the process's umask)

  • parameter resolve

    controls how the pathname is resolved.
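
The following sketch opens a path that is expected not to exist and decodes the negative result with error_of_errno. It assumes that Open_flags.empty and Resolve.empty exist as "no flags" values; the contents of those modules are not shown on this page, so treat both names as assumptions. open_missing and the path are hypothetical.

```ocaml
let open_missing () =
  let t = Uring.create ~queue_depth:4 () in
  let queued =
    Uring.openat2 t ~access:`R
      ~flags:Uring.Open_flags.empty (* assumed: no extra open flags *)
      ~perm:0
      ~resolve:Uring.Resolve.empty (* assumed: default path resolution *)
      "/nonexistent/uring-example" ()
  in
  let outcome =
    match queued with
    | None -> None (* ring full *)
    | Some _job ->
      (match Uring.wait t with
       | Uring.Some { result; _ } when result < 0 ->
         (* A negative result is a negated errno value. *)
         Some (Error (Uring.error_of_errno result))
       | Uring.Some { result; _ } -> Some (Ok result)
       | Uring.None -> None)
  in
  Uring.exit t;
  outcome

let () =
  match open_missing () with
  | Some (Error e) -> print_endline (Unix.error_message e)
  | Some (Ok fd) -> Printf.printf "opened as descriptor %d\n" fd
  | None -> print_endline "no completion"
```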

module Linkat_flags : sig ... end
val linkat : 'a t -> ?old_dir_fd:Unix.file_descr -> ?new_dir_fd:Unix.file_descr -> flags:Linkat_flags.t -> old_path:string -> new_path:string -> 'a -> 'a job option

linkat t ~flags ~old_path ~new_path creates a new hard link.

If new_path already exists then it is not overwritten.

  • parameter old_dir_fd

    If provided and old_path is a relative path, it is interpreted relative to old_dir_fd.

  • parameter new_dir_fd

    If provided and new_path is a relative path, it is interpreted relative to new_dir_fd.

  • parameter old_path

    Path of the already-existing link.

  • parameter new_path

    Path for the newly created link.

val unlink : 'a t -> dir:bool -> ?fd:Unix.file_descr -> string -> 'a -> 'a job option

unlink t ~dir ~fd path removes the directory entry path, which is resolved relative to fd. If fd is not given, then the current working directory is used. If path is a symlink, the link is removed, not the target.

  • parameter dir

    If true, this acts like rmdir (only removing empty directories). If false, it acts like unlink (only removing non-directories).
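
A sketch of removing a regular file, with the call shape inferred from the description above (a labelled ~dir argument, an optional ~fd, then the path and user data). unlink_temp is a hypothetical helper, and the kernel must support the unlink operation.

```ocaml
let unlink_temp () =
  (* temp_file creates an empty file and returns its path. *)
  let path = Filename.temp_file "uring" ".tmp" in
  let t = Uring.create ~queue_depth:4 () in
  let result =
    (* ~dir:false requests unlink behaviour, not rmdir. *)
    match Uring.unlink t ~dir:false path () with
    | None -> None (* ring full *)
    | Some _job ->
      (match Uring.wait t with
       | Uring.Some { result; _ } -> Some result
       | Uring.None -> None)
  in
  Uring.exit t;
  (result, Sys.file_exists path)

let () =
  let result, still_there = unlink_temp () in
  Printf.printf "result: %s, file still exists: %b\n"
    (match result with Some r -> string_of_int r | None -> "none")
    still_there
```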

module Poll_mask : sig ... end
val poll_add : 'a t -> Unix.file_descr -> Poll_mask.t -> 'a -> 'a job option

poll_add t fd mask d will submit a poll(2) request to uring t. It completes and returns d when an event in mask is ready on fd.

type offset := Optint.Int63.t

For files, give the absolute offset, or use Optint.Int63.minus_one for the current position. For sockets, use an offset of Optint.Int63.zero (minus_one is not allowed here).

val read : 'a t -> file_offset:offset -> Unix.file_descr -> Cstruct.t -> 'a -> 'a job option

read t ~file_offset fd buf d will submit a read(2) request to uring t. It reads from absolute file_offset on the fd file descriptor and writes the results into the memory pointed to by buf. The user data d will be returned by wait or peek upon completion.

val write : 'a t -> file_offset:offset -> Unix.file_descr -> Cstruct.t -> 'a -> 'a job option

write t ~file_offset fd buf d will submit a write(2) request to uring t. It writes to absolute file_offset on the fd file descriptor from the memory pointed to by buf. The user data d will be returned by wait or peek upon completion.
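
A sketch of a write followed by a read on a temporary file, both at offset zero. It assumes the uring, cstruct, optint and unix libraries are linked; await, queued and roundtrip are hypothetical helpers. Short reads and writes are possible in general, which this small example does not retry.

```ocaml
(* Block until one completion arrives and return its result code. *)
let await t =
  match Uring.wait t with
  | Uring.Some { result; _ } -> result
  | Uring.None -> failwith "no completion"

(* Fail loudly if the submission queue was full. *)
let queued = function
  | Some _job -> ()
  | None -> failwith "submission queue full"

let roundtrip msg =
  let path = Filename.temp_file "uring" ".dat" in
  let fd = Unix.openfile path [ Unix.O_RDWR ] 0o600 in
  let t = Uring.create ~queue_depth:4 () in
  let zero = Optint.Int63.zero in
  queued (Uring.write t ~file_offset:zero fd (Cstruct.of_string msg) ());
  let written = await t in
  let inp = Cstruct.create (String.length msg) in
  queued (Uring.read t ~file_offset:zero fd inp ());
  let got = await t in
  Uring.exit t;
  Unix.close fd;
  Sys.remove path;
  (* Keep only the bytes the kernel reported as read. *)
  (written, Cstruct.to_string (Cstruct.sub inp 0 (max 0 got)))

let () =
  let written, text = roundtrip "hello uring" in
  Printf.printf "wrote %d bytes, read back %S\n" written text
```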

val iov_max : int

The maximum length of the list that can be passed to readv and similar.

val readv : 'a t -> file_offset:offset -> Unix.file_descr -> Cstruct.t list -> 'a -> 'a job option

readv t ~file_offset fd iov d will submit a readv(2) request to uring t. It reads from absolute file_offset on the fd file descriptor and writes the results into the memory pointed to by iov. The user data d will be returned by wait or peek upon completion.

Requires List.length iov <= Uring.iov_max.

val writev : 'a t -> file_offset:offset -> Unix.file_descr -> Cstruct.t list -> 'a -> 'a job option

writev t ~file_offset fd iov d will submit a writev(2) request to uring t. It writes to absolute file_offset on the fd file descriptor from the memory pointed to by iov. The user data d will be returned by wait or peek upon completion.

Requires List.length iov <= Uring.iov_max.
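
A gather write can be sketched as below: several buffers are written with one writev request, after checking the iov_max requirement. gather_write is a hypothetical helper, and the example assumes the uring, cstruct, optint and unix libraries are linked.

```ocaml
let gather_write parts =
  (* Enforce the documented limit before submitting. *)
  if List.length parts > Uring.iov_max then invalid_arg "too many buffers";
  let path = Filename.temp_file "uring" ".dat" in
  let fd = Unix.openfile path [ Unix.O_WRONLY ] 0o600 in
  let t = Uring.create ~queue_depth:4 () in
  let iov = List.map Cstruct.of_string parts in
  (match Uring.writev t ~file_offset:Optint.Int63.zero fd iov () with
   | None -> failwith "submission queue full"
   | Some _job -> ());
  let written =
    match Uring.wait t with
    | Uring.Some { result; _ } -> result
    | Uring.None -> failwith "no completion"
  in
  Uring.exit t;
  Unix.close fd;
  (* Read the file back with ordinary channels to inspect the result. *)
  let ic = open_in_bin path in
  let contents = really_input_string ic (in_channel_length ic) in
  close_in ic;
  Sys.remove path;
  (written, contents)

let () =
  let written, contents = gather_write [ "ab"; "cd" ] in
  Printf.printf "wrote %d bytes: %S\n" written contents
```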

val read_fixed : 'a t -> file_offset:offset -> Unix.file_descr -> off:int -> len:int -> 'a -> 'a job option

read_fixed t ~file_offset fd ~off ~len d will submit a read(2) request to uring t. It reads up to len bytes from absolute file_offset on the fd file descriptor and writes the results into the fixed memory buffer associated with uring t at offset off. The user data d will be returned by wait or peek upon completion.

val read_chunk : ?len:int -> 'a t -> file_offset:offset -> Unix.file_descr -> Region.chunk -> 'a -> 'a job option

read_chunk is like read_fixed, but gets the offset from chunk.

  • parameter len

    Restrict the read to the first len bytes of chunk.

val write_fixed : 'a t -> file_offset:offset -> Unix.file_descr -> off:int -> len:int -> 'a -> 'a job option

write_fixed t ~file_offset fd ~off ~len d will submit a write(2) request to uring t. It writes up to len bytes into absolute file_offset on the fd file descriptor from the fixed memory buffer associated with uring t at offset off. The user data d will be returned by wait or peek upon completion.
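
The fixed-buffer variants can be sketched as a roundtrip through a temporary file: the message is placed at offset 0 of the registered buffer, written out with write_fixed, then read back into offset 1024 with read_fixed. The buffer size, the offsets and the helper names (await, queued, fixed_roundtrip) are all illustrative assumptions.

```ocaml
let await t =
  match Uring.wait t with
  | Uring.Some { result; _ } -> result
  | Uring.None -> failwith "no completion"

let queued = function
  | Some _job -> ()
  | None -> failwith "submission queue full"

let fixed_roundtrip msg =
  let len = String.length msg in
  if len > 1024 then invalid_arg "message too long for this sketch";
  let path = Filename.temp_file "uring" ".dat" in
  let fd = Unix.openfile path [ Unix.O_RDWR ] 0o600 in
  let t = Uring.create ~queue_depth:4 () in
  let block = Bigarray.(Array1.create char c_layout 4096) in
  (match Uring.set_fixed_buffer t block with
   | Ok () -> ()
   | Error `ENOMEM -> failwith "ENOMEM");
  let zero = Optint.Int63.zero in
  (* Stage the outgoing bytes at offset 0 of the fixed buffer. *)
  Cstruct.blit_from_string msg 0 (Cstruct.of_bigarray block) 0 len;
  queued (Uring.write_fixed t ~file_offset:zero fd ~off:0 ~len ());
  let written = await t in
  (* Read the same bytes back into a different part of the buffer. *)
  queued (Uring.read_fixed t ~file_offset:zero fd ~off:1024 ~len ());
  let got = await t in
  Uring.exit t;
  Unix.close fd;
  Sys.remove path;
  let view = Cstruct.of_bigarray ~off:1024 ~len:(max 0 got) block in
  (written, Cstruct.to_string view)

let () =
  let written, text = fixed_roundtrip "fixed" in
  Printf.printf "wrote %d bytes, read back %S\n" written text
```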

val write_chunk : ?len:int -> 'a t -> file_offset:offset -> Unix.file_descr -> Region.chunk -> 'a -> 'a job option

write_chunk is like write_fixed, but gets the offset from chunk.

  • parameter len

    Restrict the write to the first len bytes of chunk.

val splice : 'a t -> src:Unix.file_descr -> dst:Unix.file_descr -> len:int -> 'a -> 'a job option

splice t ~src ~dst ~len d will submit a request to copy len bytes from src to dst. The operation returns the number of bytes transferred, or 0 for end-of-input. The result is EINVAL if the file descriptors don't support splicing.

module Statx : sig ... end
val statx : 'a t -> ?fd:Unix.file_descr -> mask:Statx.Mask.t -> string -> Statx.t -> Statx.Flags.t -> 'a -> 'a job option

statx t ?fd ~mask path stat flags d submits a request to retrieve the file status of path, which is resolved relative to fd (or the current directory if fd is not given). The result is stored in stat.

val connect : 'a t -> Unix.file_descr -> Unix.sockaddr -> 'a -> 'a job option

connect t fd addr d will submit a request to connect fd to addr.

module Sockaddr : sig ... end

Holder for the peer's address in accept.

val accept : 'a t -> Unix.file_descr -> Sockaddr.t -> 'a -> 'a job option

accept t fd addr d will submit a request to accept a new connection on fd. The new FD will be configured with SOCK_CLOEXEC. The remote address will be stored in addr.

val close : 'a t -> Unix.file_descr -> 'a -> 'a job option
val cancel : 'a t -> 'a job -> 'a -> 'a job option

cancel t job d submits a request to cancel job. The cancel job itself returns 0 on success, or ENOENT if job had already completed by the time the kernel processed the cancellation request.

  • raises Invalid_argument

    if the job has already been returned by e.g. wait.
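
A sketch of cancelling a pending job: a one-hour timeout is queued, then cancelled before it can fire, and both completions are collected. cancel_timer and the `Timer / `Cancel tags are hypothetical. The order in which the two completions arrive is not assumed, so the results are keyed by tag.

```ocaml
let cancel_timer () =
  let t = Uring.create ~queue_depth:4 () in
  let one_hour_ns = Int64.mul 3600L 1_000_000_000L in
  let next () =
    match Uring.wait t with
    | Uring.Some { result; data } -> (data, result)
    | Uring.None -> failwith "no completion"
  in
  let results =
    match Uring.timeout t Uring.Boottime one_hour_ns `Timer with
    | None -> failwith "submission queue full"
    | Some job ->
      (match Uring.cancel t job `Cancel with
       | None -> failwith "submission queue full"
       | Some _cancel_job ->
         (* Both the cancelled job and the cancel request complete. *)
         let first = next () in
         let second = next () in
         [ first; second ])
  in
  Uring.exit t;
  results

let () =
  List.iter
    (fun (tag, result) ->
      let name = match tag with `Timer -> "timer" | `Cancel -> "cancel" in
      Printf.printf "%s -> %d\n" name result)
    (cancel_timer ())
```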

module Msghdr : sig ... end
val send_msg : ?fds:Unix.file_descr list -> ?dst:Unix.sockaddr -> 'a t -> Unix.file_descr -> Cstruct.t list -> 'a -> 'a job option

send_msg t fd buffs d will submit a sendmsg(2) request. The Msghdr will be constructed from the FDs (fds), address (dst) and buffers (buffs).

Requires List.length buffs <= Uring.iov_max.

  • parameter dst

    Destination address.

  • parameter fds

    Extra file descriptors to attach to the message.

val recv_msg : 'a t -> Unix.file_descr -> Msghdr.t -> 'a -> 'a job option

recv_msg t fd msghdr d will submit a recvmsg(2) request. If the request is successful then the msghdr will contain the sender address and the data received.

val fsync : 'a t -> ?off:int64 -> ?len:int -> Unix.file_descr -> 'a -> 'a job option

fsync t ?off ?len fd d will submit an fsync(2) request, with the optional offset off and length len specifying the subset of the file to perform the synchronisation on.

val fdatasync : 'a t -> ?off:int64 -> ?len:int -> Unix.file_descr -> 'a -> 'a job option

fdatasync t ?off ?len fd d will submit an fdatasync(2) request, with the optional offset off and length len specifying the subset of the file to perform the synchronisation on.
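
A sketch of flushing a file after writing to it, omitting the optional off and len so that the whole file is synchronised. write_then_sync and the other helper names are hypothetical; the example assumes the uring, cstruct, optint and unix libraries are linked.

```ocaml
let await t =
  match Uring.wait t with
  | Uring.Some { result; _ } -> result
  | Uring.None -> failwith "no completion"

let queued = function
  | Some _job -> ()
  | None -> failwith "submission queue full"

let write_then_sync msg =
  let path = Filename.temp_file "uring" ".dat" in
  let fd = Unix.openfile path [ Unix.O_WRONLY ] 0o600 in
  let t = Uring.create ~queue_depth:4 () in
  let buf = Cstruct.of_string msg in
  queued (Uring.write t ~file_offset:Optint.Int63.zero fd buf ());
  (* Collect the write before queueing the fsync, so the two are ordered. *)
  let written = await t in
  queued (Uring.fsync t fd ());
  let synced = await t in
  Uring.exit t;
  Unix.close fd;
  Sys.remove path;
  (written, synced)

let () =
  let written, synced = write_then_sync "durable" in
  Printf.printf "wrote %d bytes, fsync result %d\n" written synced
```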

Probing

You can check which operations are supported by the running kernel.

module Op : sig ... end
type probe
val get_probe : _ t -> probe
val op_supported : probe -> Op.t -> bool
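
A minimal probing sketch. The contents of Op are not shown on this page, so Uring.Op.nop is an assumption here (the expectation being that Op exposes one value per kernel opcode); nop_supported is a hypothetical helper.

```ocaml
let nop_supported () =
  let t : unit Uring.t = Uring.create ~queue_depth:1 () in
  let probe = Uring.get_probe t in
  (* Uring.Op.nop is assumed to name the no-op opcode. *)
  let supported = Uring.op_supported probe Uring.Op.nop in
  Uring.exit t;
  supported

let () = Printf.printf "no-op supported: %b\n" (nop_supported ())
```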

Submitting operations

val submit : 'a t -> int

submit t will submit all the outstanding queued requests on uring t to the kernel. Their results can subsequently be retrieved using wait or peek.

type 'a completion_option =
  | None
  | Some of {
      result : int;
      data : 'a;
    }

The type of results of calling wait and peek. None denotes that either there were no completions in the queue or an interrupt / timeout occurred. Some contains both the user data attached to the completed request and the integer syscall result.

val wait : ?timeout:float -> 'a t -> 'a completion_option

wait ?timeout t blocks until an outstanding event completes on uring t, either indefinitely (the default) or for at most timeout seconds. This calls submit automatically.

val get_cqe_nonblocking : 'a t -> 'a completion_option

get_cqe_nonblocking t returns the next completion entry from the uring t. It is like wait except that it returns None instead of blocking.
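
To illustrate the difference between the two calls, this sketch polls an idle ring: get_cqe_nonblocking should return None immediately, while wait with a short timeout should return None once the timeout expires. idle_poll is a hypothetical helper and the 10 ms timeout is arbitrary.

```ocaml
let idle_poll () =
  let t : unit Uring.t = Uring.create ~queue_depth:4 () in
  (* Nothing has been queued, so there is nothing to collect. *)
  let nonblocking_empty =
    match Uring.get_cqe_nonblocking t with
    | Uring.None -> true
    | Uring.Some _ -> false
  in
  (* With a timeout, wait gives up instead of blocking indefinitely. *)
  let timed_out =
    match Uring.wait ~timeout:0.01 t with
    | Uring.None -> true
    | Uring.Some _ -> false
  in
  Uring.exit t;
  (nonblocking_empty, timed_out)

let () =
  let nonblocking_empty, timed_out = idle_poll () in
  Printf.printf "nonblocking empty: %b, wait timed out: %b\n"
    nonblocking_empty timed_out
```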

val peek : 'a t -> 'a completion_option
  • deprecated Renamed to Uring.get_cqe_nonblocking
val register_eventfd : 'a t -> Unix.file_descr -> unit

register_eventfd t fd will register an eventfd with the uring t. See the documentation for io_uring_register_eventfd.

val error_of_errno : int -> Unix.error

error_of_errno e converts the error code abs e to a Unix error type.
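
A short sketch of turning a completion result into a readable message. describe_result is a hypothetical helper; it treats non-negative results as success and passes negative ones to error_of_errno.

```ocaml
let describe_result result =
  if result >= 0 then Printf.sprintf "ok (%d)" result
  else
    (* error_of_errno takes the absolute value, so the sign is irrelevant. *)
    Unix.error_message (Uring.error_of_errno result)

let () =
  print_endline (describe_result 5);
  print_endline (describe_result (-2))
```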

val active_ops : _ t -> int

active_ops t returns the number of operations added to the ring (whether submitted or not) for which the completion event has not yet been collected.
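
The accounting can be sketched by queueing a few no-ops, submitting them explicitly, and checking active_ops before and after the completions are collected. count_active is a hypothetical helper, and n is assumed to be no larger than the queue depth of 8 used here.

```ocaml
let count_active n =
  let t = Uring.create ~queue_depth:8 () in
  for i = 1 to n do
    match Uring.noop t i with
    | Some _job -> ()
    | None -> failwith "submission queue full"
  done;
  (* Queued but not yet submitted operations already count as active. *)
  let queued = Uring.active_ops t in
  let submitted = Uring.submit t in
  for _i = 1 to n do
    match Uring.wait t with
    | Uring.Some _ -> ()
    | Uring.None -> failwith "no completion"
  done;
  (* Every completion has been collected, so nothing remains active. *)
  let remaining = Uring.active_ops t in
  Uring.exit t;
  (queued, submitted, remaining)

let () =
  let queued, submitted, remaining = count_active 3 in
  Printf.printf "queued %d, submit returned %d, remaining %d\n"
    queued submitted remaining
```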

val sqe_ready : _ t -> int

sqe_ready t is the number of unconsumed (if SQPOLL) or unsubmitted entries in the SQ ring.

module Stats : sig ... end
val get_debug_stats : _ t -> Stats.t

get_debug_stats t collects some metrics about the internal state of t.

module Private : sig ... end