Sub-device Interface

Pad-level Formats
Media Bus Formats


This is an experimental interface and may change in the future.

The complex nature of V4L2 devices, where hardware is often made of several integrated circuits that need to interact with each other in a controlled way, leads to complex V4L2 drivers. The drivers usually reflect the hardware model in software, and model the different hardware components as software blocks called sub-devices.

V4L2 sub-devices are usually kernel-only objects. If the V4L2 driver implements the media device API, they will automatically inherit from media entities. Applications will be able to enumerate the sub-devices and discover the hardware topology using the media entities, pads and links enumeration API.

In addition to making sub-devices discoverable, drivers can also choose to make them directly configurable by applications. When both the sub-device driver and the V4L2 device driver support this, sub-devices will feature a character device node on which ioctls can be called to

  • query, read and write sub-device controls;

  • subscribe and unsubscribe to events and retrieve them;

  • negotiate image formats on individual pads.

Sub-device character device nodes, conventionally named /dev/v4l-subdev*, use major number 81.


Most V4L2 controls are implemented by sub-device hardware. Drivers usually merge all controls and expose them through video device nodes. Applications can control all sub-devices through a single interface.

Complex devices sometimes implement the same control in different pieces of hardware. This situation is common in embedded platforms, where both sensors and image processing hardware implement identical functions, such as contrast adjustment, white balance or faulty pixels correction. As the V4L2 controls API doesn't support several identical controls in a single device, all but one of the identical controls are hidden.

Applications can access those hidden controls through the sub-device node with the V4L2 control API described in the section called “User Controls”. The ioctls behave identically as when issued on V4L2 device nodes, with the exception that they deal only with controls implemented in the sub-device.

Depending on the driver, those controls might also be exposed through one (or several) V4L2 device nodes.


V4L2 sub-devices can notify applications of events as described in the section called “Event Interface”. The API behaves identically as when used on V4L2 device nodes, with the exception that it only deals with events generated by the sub-device. Depending on the driver, those events might also be reported on one (or several) V4L2 device nodes.

Pad-level Formats


Pad-level formats are only applicable to very complex devices that need to expose low-level format configuration to user space. Generic V4L2 applications do not need to use the API described in this section.


For the purpose of this section, the term format means the combination of media bus data format, frame width and frame height.

Image formats are typically negotiated on video capture and output devices using the format and selection ioctls. The driver is responsible for configuring every block in the video pipeline according to the requested format at the pipeline input and/or output.

For complex devices, such as often found in embedded systems, identical image sizes at the output of a pipeline can be achieved using different hardware configurations. One such example is shown on Figure 4.4, “Image Format Negotiation on Pipelines”, where image scaling can be performed on both the video sensor and the host image processing hardware.

Figure 4.4. Image Format Negotiation on Pipelines

High quality and high speed pipeline configuration

The sensor scaler is usually of less quality than the host scaler, but scaling on the sensor is required to achieve higher frame rates. Depending on the use case (quality vs. speed), the pipeline must be configured differently. Applications need to configure the formats at every point in the pipeline explicitly.

Drivers that implement the media API can expose pad-level image format configuration to applications. When they do, applications can use the VIDIOC_SUBDEV_G_FMT and VIDIOC_SUBDEV_S_FMT ioctls to negotiate formats on a per-pad basis.

Applications are responsible for configuring coherent parameters on the whole pipeline and making sure that connected pads have compatible formats. The pipeline is checked for format mismatches at VIDIOC_STREAMON time, and an EPIPE error code is returned if the configuration is invalid.

Pad-level image format configuration support can be tested by calling the VIDIOC_SUBDEV_G_FMT ioctl on pad 0. If the driver returns an EINVAL error code, pad-level format configuration is not supported by the sub-device.

Format Negotiation

Acceptable formats on pads can (and usually do) depend on a number of external parameters, such as formats on other pads, active links, or even controls. Finding a combination of formats on all pads in a video pipeline, acceptable to both application and driver, can't rely on formats enumeration only. A format negotiation mechanism is required.

Central to the format negotiation mechanism are the get/set format operations. When called with the which argument set to V4L2_SUBDEV_FORMAT_TRY, the VIDIOC_SUBDEV_G_FMT and VIDIOC_SUBDEV_S_FMT ioctls operate on a set of format parameters that are not connected to the hardware configuration. Modifying those 'try' formats leaves the device state untouched (this applies both to the software state stored in the driver and to the hardware state stored in the device itself).

While not kept as part of the device state, try formats are stored in the sub-device file handles. A VIDIOC_SUBDEV_G_FMT call will return the last try format set on the same sub-device file handle. Several applications querying the same sub-device at the same time will thus not interact with each other.

To find out whether a particular format is supported by the device, applications use the VIDIOC_SUBDEV_S_FMT ioctl. Drivers verify and, if needed, change the requested format based on device requirements and return the possibly modified value. Applications can then choose to try a different format or accept the returned value and continue.

Formats returned by the driver during a negotiation iteration are guaranteed to be supported by the device. In particular, drivers guarantee that a returned format will not be further changed if passed to a VIDIOC_SUBDEV_S_FMT call as-is (as long as external parameters, such as formats on other pads or links' configuration, are not changed).

Drivers automatically propagate formats inside sub-devices. When a try or active format is set on a pad, corresponding formats on other pads of the same sub-device can be modified by the driver. Drivers are free to modify formats as required by the device. However, they should comply with the following rules when possible:

  • Formats should be propagated from sink pads to source pads. Modifying a format on a source pad should not modify the format on any sink pad.

  • Sub-devices that scale frames using variable scaling factors should reset the scale factors to default values when sink pad formats are modified. If the 1:1 scaling ratio is supported, this means that source pad formats should be reset to the sink pad formats.

Formats are not propagated across links, as that would involve propagating them from one sub-device file handle to another. Applications must then take care to configure both ends of every link explicitly with compatible formats. Identical formats on the two ends of a link are guaranteed to be compatible. Drivers are free to accept different formats matching device requirements as being compatible.

Table 4.18, “Sample Pipeline Configuration” shows a sample configuration sequence for the pipeline described in Figure 4.4, “Image Format Negotiation on Pipelines” (table columns list entity names and pad numbers).

Table 4.18. Sample Pipeline Configuration

                           Sensor/0    Frontend/0  Frontend/1  Scaler/0    Scaler/1
Initial state              2048x1536   -           -           -           -
Configure frontend input   2048x1536   2048x1536   2046x1534   -           -
Configure scaler input     2048x1536   2048x1536   2046x1534   2046x1534   2046x1534
Configure scaler output    2048x1536   2048x1536   2046x1534   2046x1534   1280x960

  1. Initial state. The sensor output is set to its native 3MP resolution. Resolutions on the host frontend and scaler input and output pads are undefined.

  2. The application configures the frontend input pad resolution to 2048x1536. The driver propagates the format to the frontend output pad. Note that the propagated output format can differ from the input format, as it does in this case, since the hardware might need to crop pixels (for instance when converting a Bayer filter pattern to RGB or YUV).

  3. The application configures the scaler input pad resolution to 2046x1534 to match the frontend output resolution. The driver propagates the format to the scaler output pad.

  4. The application configures the scaler output pad resolution to 1280x960.

When satisfied with the try results, applications can set the active formats by setting the which argument to V4L2_SUBDEV_FORMAT_ACTIVE. Active formats are changed exactly as try formats by drivers. To avoid modifying the hardware state during format negotiation, applications should negotiate try formats first and then modify the active settings using the try formats returned during the last negotiation iteration. This guarantees that the active format will be applied as-is by the driver without being modified.

Selections: cropping, scaling and composition

Many sub-devices support cropping frames on their input or output pads (or possibly even on both). Cropping is used to select the area of interest in an image, typically on an image sensor or a video decoder. It can also be used as part of digital zoom implementations to select the area of the image that will be scaled up.

Crop settings are defined by a crop rectangle and represented in a struct v4l2_rect by the coordinates of the top left corner and the rectangle size. Both the coordinates and sizes are expressed in pixels.

A scaling operation changes the size of the image by scaling it to new dimensions; some sub-devices support it. The scaled size (width and height) is represented by a struct v4l2_rect. In the case of scaling, top and left will always be zero. Scaling is configured using the V4L2_SUBDEV_SEL_COMPOSE_ACTIVE selection target on the sink pad of the subdev. The scaling is performed relative to the width and height of the crop rectangle on the subdev's sink pad.

As for pad formats, drivers store try and active rectangles for the selection targets of ACTIVE type (see Table 4.19, “V4L2 subdev selection targets”).

On sink pads, cropping is applied relative to the current pad format. The pad format represents the image size as received by the sub-device from the previous block in the pipeline, and the crop rectangle represents the sub-image that will be transmitted further inside the sub-device for processing.

On source pads, cropping is similar to sink pads, with the exception that the source size from which the cropping is performed is the COMPOSE rectangle on the sink pad. On both sink and source pads, the crop rectangle must be entirely contained inside the source image size for the crop operation to be valid.

Drivers should always use the rectangle closest to the one the user requests, on all selection targets, unless specifically told otherwise (see Table 4.20, “V4L2 subdev selection flags”).

Types of selection targets

ACTIVE targets

ACTIVE targets reflect the actual hardware configuration at any point in time.

BOUNDS targets

The BOUNDS target is the smallest rectangle that contains all valid ACTIVE rectangles. It may not be possible to set the ACTIVE rectangle as large as the BOUNDS rectangle, however.

Order of configuration and format propagation

Inside subdevs, the order of image processing steps will always be from the sink pad towards the source pad. This is also reflected in the order in which the configuration must be performed by the user: the changes made will be propagated to any subsequent stages. If this behaviour is not desired, the user must set the V4L2_SUBDEV_SEL_FLAG_KEEP_CONFIG flag. The coordinates of a step always refer to the active size of the previous step. The exception to this rule is the source compose rectangle, which refers to the sink compose bounds rectangle (if it is supported by the hardware).

  1. Sink pad format. The user configures the sink pad format. This format defines the parameters of the image the entity receives through the pad for further processing.
  2. Sink pad active crop selection. The sink pad crop defines the crop performed on the sink pad format.
  3. Sink pad active compose selection. The size of the sink pad compose rectangle defines the scaling ratio compared to the size of the sink pad crop rectangle. The location of the compose rectangle specifies the location of the active sink compose rectangle in the sink compose bounds rectangle.
  4. Source pad active crop selection. Crop on the source pad defines the crop performed on the image in the sink compose bounds rectangle.
  5. Source pad format. The source pad format defines the output pixel format of the subdev, as well as the other parameters with the exception of the image width and height. Width and height are defined by the size of the source pad active crop selection.

Accessing any of the above rectangles not supported by the subdev will return EINVAL. Any rectangle that refers to an unsupported previous rectangle will instead refer to the previous supported rectangle. For example, if sink crop is not supported, the compose selection will refer to the sink pad format dimensions instead.

Figure 4.5. Image processing in subdevs: simple crop example

In the above example, the subdev supports cropping on its sink pad. To configure it, the user sets the media bus format on the subdev's sink pad. Now the active crop rectangle can be set on the sink pad; the location and size of this rectangle reflect the location and size of the rectangle to be cropped from the sink format. The size of the sink crop rectangle will also be the size of the format of the subdev's source pad.

Figure 4.6. Image processing in subdevs: scaling with multiple sources

In this example, the subdev is capable of first cropping, then scaling, and finally cropping for two source pads individually from the resulting scaled image. The location of the scaled image in the cropped image is ignored by the sink compose target. Each source crop rectangle refers to the sink scaling rectangle, independently cropping an area from it at the location specified by that source crop rectangle.

Figure 4.7. Image processing in subdevs: scaling and composition with multiple sinks and sources

The subdev driver supports two sink pads and two source pads. The images from both of the sink pads are individually cropped, then scaled and further composed on the composition bounds rectangle. From that, two independent streams are cropped and sent out of the subdev from the source pads.

Media Bus Formats

Table 4.22. struct v4l2_mbus_framefmt

__u32  width        Image width, in pixels.
__u32  height       Image height, in pixels.
__u32  code         Format code, from enum v4l2_mbus_pixelcode.
__u32  field        Field order, from enum v4l2_field. See the section called “Field Order” for details.
__u32  colorspace   Image colorspace, from enum v4l2_colorspace. See the section called “Colorspaces” for details.
__u32  reserved[7]  Reserved for future extensions. Applications and drivers must set the array to zero.

Media Bus Pixel Codes

The media bus pixel codes describe image formats as flowing over physical busses (both between separate physical components and inside SoC devices). This should not be confused with the V4L2 pixel formats that describe, using four character codes, image formats as stored in memory.

While there is a relationship between image formats on busses and image formats in memory (a raw Bayer image won't be magically converted to JPEG just by storing it to memory), there is no one-to-one correspondence between them.

Packed RGB Formats

Those formats transfer pixel data as red, green and blue components. The format code is made of the following information.

  • The red, green and blue components order code, as encoded in a pixel sample. Possible values are RGB and BGR.

  • The number of bits per component, for each component. The values can be different for all components. Common values are 555 and 565.

  • The number of bus samples per pixel. Pixels that are wider than the bus width must be transferred in multiple samples. Common values are 1 and 2.

  • The bus width.

  • For formats where the total number of bits per pixel is smaller than the number of bus samples per pixel times the bus width, a padding value stating whether the bytes are padded in their high order bits (PADHI) or low order bits (PADLO).

  • For formats where the number of bus samples per pixel is larger than 1, an endianness value stating if the pixel is transferred MSB first (BE) or LSB first (LE).

For instance, a format where pixels are encoded as 5-bits red, 5-bits green and 5-bit blue values padded on the high bit, transferred as 2 8-bit samples per pixel with the most significant bits (padding, red and half of the green value) transferred first will be named V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE.
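The bit layout in that example can be checked with a few lines of arithmetic. The following sketch (illustration only, not part of any kernel API) packs one RGB555 pixel into the two 8-bit bus samples of V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE:

```c
#include <stdint.h>

/* Pack 5-bit r, g, b values into the two 8-bit bus samples of
 * V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE: the padding bit, red and the top
 * half of green go out first (BE), then the rest of green and blue. */
void pack_rgb555_2x8_padhi_be(uint8_t r, uint8_t g, uint8_t b,
                              uint8_t out[2])
{
    /* bit 15 is the PADHI padding bit and stays zero */
    uint16_t v = (uint16_t)(((r & 0x1f) << 10) | ((g & 0x1f) << 5) | (b & 0x1f));

    out[0] = v >> 8;     /* 0 r4 r3 r2 r1 r0 g4 g3 */
    out[1] = v & 0xff;   /* g2 g1 g0 b4 b3 b2 b1 b0 */
}
```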

The following table lists existing packed RGB formats.

Table 4.23. RGB formats

Identifier                          Code    Data organization
V4L2_MBUS_FMT_RGB444_2X8_PADHI_BE   0x1001  0 0 0 0 r3 r2 r1 r0
V4L2_MBUS_FMT_RGB444_2X8_PADHI_LE   0x1002  g3 g2 g1 g0 b3 b2 b1 b0
V4L2_MBUS_FMT_RGB555_2X8_PADHI_BE   0x1003  0 r4 r3 r2 r1 r0 g4 g3
V4L2_MBUS_FMT_RGB555_2X8_PADHI_LE   0x1004  g2 g1 g0 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_BGR565_2X8_BE         0x1005  b4 b3 b2 b1 b0 g5 g4 g3
V4L2_MBUS_FMT_BGR565_2X8_LE         0x1006  g2 g1 g0 r4 r3 r2 r1 r0
V4L2_MBUS_FMT_RGB565_2X8_BE         0x1007  r4 r3 r2 r1 r0 g5 g4 g3
V4L2_MBUS_FMT_RGB565_2X8_LE         0x1008  g2 g1 g0 b4 b3 b2 b1 b0

Bayer Formats

Those formats transfer pixel data as red, green and blue components. The format code is made of the following information.

  • The red, green and blue components order code, as encoded in a pixel sample. The possible values are shown in Figure 4.8, “Bayer Patterns”.

  • The number of bits per pixel component. All components are transferred on the same number of bits. Common values are 8, 10 and 12.

  • If the pixel components are DPCM-compressed, a mention of the DPCM compression and the number of bits per compressed pixel component.

  • The number of bus samples per pixel. Pixels that are wider than the bus width must be transferred in multiple samples. Common values are 1 and 2.

  • The bus width.

  • For formats where the total number of bits per pixel is smaller than the number of bus samples per pixel times the bus width, a padding value stating whether the bytes are padded in their high order bits (PADHI) or low order bits (PADLO).

  • For formats where the number of bus samples per pixel is larger than 1, an endianness value stating if the pixel is transferred MSB first (BE) or LSB first (LE).

For instance, a format with uncompressed 10-bit Bayer components arranged in a red, green, green, blue pattern transferred as 2 8-bit samples per pixel with the least significant bits transferred first will be named V4L2_MBUS_FMT_SRGGB10_2X8_PADHI_LE.

Figure 4.8. Bayer Patterns

Bayer filter color patterns

The following table lists existing packed Bayer formats. The data organization is given as an example for the first pixel only.

Table 4.24. Bayer Formats

Identifier                           Code    Data organization
V4L2_MBUS_FMT_SBGGR8_1X8             0x3001  ---- b7 b6 b5 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_SGBRG8_1X8             0x3013  ---- g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SGRBG8_1X8             0x3002  ---- g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SRGGB8_1X8             0x3014  ---- r7 r6 r5 r4 r3 r2 r1 r0
V4L2_MBUS_FMT_SBGGR10_DPCM8_1X8      0x300b  ---- b7 b6 b5 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_SGBRG10_DPCM8_1X8      0x300c  ---- g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SGRBG10_DPCM8_1X8      0x3009  ---- g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SRGGB10_DPCM8_1X8      0x300d  ---- r7 r6 r5 r4 r3 r2 r1 r0
V4L2_MBUS_FMT_SBGGR10_2X8_PADHI_BE   0x3003  ---- 0 0 0 0 0 0 b9 b8
V4L2_MBUS_FMT_SBGGR10_2X8_PADHI_LE   0x3004  ---- b7 b6 b5 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_SBGGR10_2X8_PADLO_BE   0x3005  ---- b9 b8 b7 b6 b5 b4 b3 b2
V4L2_MBUS_FMT_SBGGR10_2X8_PADLO_LE   0x3006  ---- b1 b0 0 0 0 0 0 0
V4L2_MBUS_FMT_SBGGR10_1X10           0x3007  -- b9 b8 b7 b6 b5 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_SGBRG10_1X10           0x300e  -- g9 g8 g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SGRBG10_1X10           0x300a  -- g9 g8 g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SRGGB10_1X10           0x300f  -- r9 r8 r7 r6 r5 r4 r3 r2 r1 r0
V4L2_MBUS_FMT_SBGGR12_1X12           0x3008  b11 b10 b9 b8 b7 b6 b5 b4 b3 b2 b1 b0
V4L2_MBUS_FMT_SGBRG12_1X12           0x3010  g11 g10 g9 g8 g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SGRBG12_1X12           0x3011  g11 g10 g9 g8 g7 g6 g5 g4 g3 g2 g1 g0
V4L2_MBUS_FMT_SRGGB12_1X12           0x3012  r11 r10 r9 r8 r7 r6 r5 r4 r3 r2 r1 r0

Packed YUV Formats

Those data formats transfer pixel data as (possibly downsampled) Y, U and V components. The format code is made of the following information.

  • The Y, U and V components order code, as transferred on the bus. Possible values are YUYV, UYVY, YVYU and VYUY.

  • The number of bits per pixel component. All components are transferred on the same number of bits. Common values are 8, 10 and 12.

  • The number of bus samples per pixel. Pixels that are wider than the bus width must be transferred in multiple samples. Common values are 1, 1.5 (encoded as 1_5) and 2.

  • The bus width. When the bus width is larger than the number of bits per pixel component, several components are packed in a single bus sample. The components are ordered as specified by the order code, with components on the left of the code transferred in the high order bits. Common values are 8 and 16.

For instance, a format where pixels are encoded as 8-bit YUV values downsampled to 4:2:2 and transferred as 2 8-bit bus samples per pixel in the U, Y, V, Y order will be named V4L2_MBUS_FMT_UYVY8_2X8.
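To make the sample ordering concrete, the sketch below lays out one pair of horizontally adjacent pixels for V4L2_MBUS_FMT_UYVY8_2X8 (illustration only; the helper is not part of any API):

```c
#include <stdint.h>

/* Lay out two horizontally adjacent pixels of V4L2_MBUS_FMT_UYVY8_2X8:
 * 4:2:2 subsampling shares one U and one V value between the pixel
 * pair, and the samples go out in U, Y, V, Y order, one 8-bit bus
 * sample each. */
void pack_uyvy8_2x8(uint8_t y0, uint8_t y1, uint8_t u, uint8_t v,
                    uint8_t out[4])
{
    out[0] = u;    /* chroma shared by both pixels */
    out[1] = y0;   /* luma of the first pixel */
    out[2] = v;    /* chroma shared by both pixels */
    out[3] = y1;   /* luma of the second pixel */
}
```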

The following table lists existing packed YUV formats.

Table 4.25. YUV Formats

Identifier                    Code    Data organization
V4L2_MBUS_FMT_Y8_1X8          0x2001  ------------ y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_UYVY8_1_5X8     0x2002  ------------ u7 u6 u5 u4 u3 u2 u1 u0
V4L2_MBUS_FMT_VYUY8_1_5X8     0x2003  ------------ v7 v6 v5 v4 v3 v2 v1 v0
V4L2_MBUS_FMT_YUYV8_1_5X8     0x2004  ------------ y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_YVYU8_1_5X8     0x2005  ------------ y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_UYVY8_2X8       0x2006  ------------ u7 u6 u5 u4 u3 u2 u1 u0
V4L2_MBUS_FMT_VYUY8_2X8       0x2007  ------------ v7 v6 v5 v4 v3 v2 v1 v0
V4L2_MBUS_FMT_YUYV8_2X8       0x2008  ------------ y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_YVYU8_2X8       0x2009  ------------ y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_Y10_1X10        0x200a  ---------- y9 y8 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_YUYV10_2X10     0x200b  ---------- y9 y8 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_YVYU10_2X10     0x200c  ---------- y9 y8 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_Y12_1X12        0x2013  -------- y11 y10 y9 y8 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_UYVY8_1X16      0x200f  ---- u7 u6 u5 u4 u3 u2 u1 u0 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_VYUY8_1X16      0x2010  ---- v7 v6 v5 v4 v3 v2 v1 v0 y7 y6 y5 y4 y3 y2 y1 y0
V4L2_MBUS_FMT_YUYV8_1X16      0x2011  ---- y7 y6 y5 y4 y3 y2 y1 y0 u7 u6 u5 u4 u3 u2 u1 u0
V4L2_MBUS_FMT_YVYU8_1X16      0x2012  ---- y7 y6 y5 y4 y3 y2 y1 y0 v7 v6 v5 v4 v3 v2 v1 v0
V4L2_MBUS_FMT_YUYV10_1X20     0x200d  y9 y8 y7 y6 y5 y4 y3 y2 y1 y0 u9 u8 u7 u6 u5 u4 u3 u2 u1 u0
V4L2_MBUS_FMT_YVYU10_1X20     0x200e  y9 y8 y7 y6 y5 y4 y3 y2 y1 y0 v9 v8 v7 v6 v5 v4 v3 v2 v1 v0

JPEG Compressed Formats

Those data formats consist of an ordered sequence of 8-bit bytes obtained from the JPEG compression process. In addition to the _JPEG postfix, the format code is made of the following information.

  • The number of bus samples per entropy encoded byte.

  • The bus width.

For instance, for a JPEG baseline process and an 8-bit bus width the format will be named V4L2_MBUS_FMT_JPEG_1X8.

The following table lists existing JPEG compressed formats.

Table 4.26. JPEG Formats

Identifier              Code    Remarks
V4L2_MBUS_FMT_JPEG_1X8  0x4001  Besides its usage for the parallel bus, this format is recommended for transmission of JPEG data over the MIPI CSI bus using the User Defined 8-bit Data types.