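FFmpeg filters analysis (concepts and APIs related to FFmpeg filters)

I. Overview

FFmpeg provides a framework that applies filters to audio and video in a pipeline fashion. It ships with a large number of built-in filters, and you can write your own if those do not meet your needs.

This article walks through the basic concepts and APIs behind FFmpeg filters, as preparation for later analysis of specific built-in filters.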

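II. Filter graphs and filters in brief

1. What are filter graphs and filters

A filter graph is a directed graph made of connected filters. In the source code, the AVFilterGraph class defines the filter graph. The rest of this article writes it as Filter graph.

On the surface, a filter is a node of a Filter graph. In the source code, the AVFilter class defines a filter, and at run time an AVFilterContext instance represents a filter instance inside a Filter graph.

Typically a filter has zero or more inputs and zero or more outputs, with at least one of either. The output of one filter can be connected to the input of another, and several connected filters form a filter chain. Inputs and outputs here refer specifically to how video frames or audio samples flow through the Filter graph, not to the processing flow of FFmpeg as a whole. In the source code, the AVFilterPad class defines which inputs and outputs a filter has, and AVFilterInOut instances describe the still-unconnected inputs and outputs of a Filter graph while its description is being parsed. Inside a Filter graph, AVFilterLink connects filters together.

As of FFmpeg 4.3.1 there are 460 filters. You can count them by searching allfilters.c for "extern AVFilter", or list the filters available in your build with the command ffmpeg -filters.

The source files that define filters live in the libavfilter directory. One source file defines one or more filters. For example, buffersrc.c contains both the buffer and abuffer filters, which are Sources for video and audio respectively (a Source is a special kind of filter, see below).

Use the command ffmpeg -h filter=<name> to see the help for a specific filter.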

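2. Filter categories

Filters can be roughly classified in two ways.

By the number of inputs and outputs (Pads) they divide into Sources, Sinks and Filters. A Source has no input and 1 output, a Sink has 1 input and no output, and a Filter has 1 or more inputs and 1 or more outputs. Source and Sink are really just special filters. To tell them apart, this article sometimes calls filters other than Sources and Sinks ordinary filters.

By the media type they handle they divide into three categories: Audio, Video and Multimedia. Audio filters process audio, Video filters process video, and Multimedia filters either take both video and audio or apply equally to either media type. In addition, OpenCL Video and VAAPI Video are split out as two sub-categories of video-only filters.

Also, Multimedia has only Sources and ordinary filters, no Sinks; OpenCL Video has only ordinary filters; VAAPI Video has just one ordinary filter, which operates on VAAPI hardware video frames.

In practice the picture is less tidy. An ordinary filter does not have to transform data: volumedetect (af_volumedetect.c) measures the mean (root mean square) volume, the maximum volume and a volume histogram and prints them to the log, without changing the frames it hands to the next node. Moreover, if the input list is NULL (the inputs field of AVFilter) and AVFILTER_FLAG_DYNAMIC_INPUTS is set, the filter has a dynamic number of inputs, as with amerge and mix; if the output list is NULL (the outputs field of AVFilter) and AVFILTER_FLAG_DYNAMIC_OUTPUTS is set, the filter has a dynamic number of outputs, as with split and asplit; and if both lists are NULL and both flags are set, both sides are dynamic, as with streamselect, astreamselect and concat. Strictly speaking, then, Sources, Sinks and Filters cannot be told apart simply by counting static inputs and outputs; that rule only covers the common case.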
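The classification rule above, including the dynamic-pad exception, can be sketched in a few lines of C. This is only an illustration: SketchPad, SketchFilter, the SKETCH_FLAG_* constants and sketch_classify are hypothetical stand-ins invented for this example, not the real libavfilter types or flag values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; NOT the real libavfilter types. */
#define SKETCH_FLAG_DYNAMIC_INPUTS  (1 << 0)
#define SKETCH_FLAG_DYNAMIC_OUTPUTS (1 << 1)

typedef struct SketchPad {
    const char *name;            /* a NULL name terminates the list */
} SketchPad;

typedef struct SketchFilter {
    const char      *name;
    const SketchPad *inputs;     /* NULL: no static inputs  */
    const SketchPad *outputs;    /* NULL: no static outputs */
    int              flags;      /* combination of SKETCH_FLAG_* */
} SketchFilter;

/* Returns "source", "sink" or "filter". A side counts as present when it
 * has a static pad list or when its pads are created dynamically. */
const char *sketch_classify(const SketchFilter *f)
{
    assert(f != NULL);

    int has_in  = f->inputs  != NULL ||
                  (f->flags & SKETCH_FLAG_DYNAMIC_INPUTS)  != 0;
    int has_out = f->outputs != NULL ||
                  (f->flags & SKETCH_FLAG_DYNAMIC_OUTPUTS) != 0;

    if (!has_in && has_out)
        return "source";
    if (has_in && !has_out)
        return "sink";
    return "filter";
}
```

With this rule, a definition whose inputs list is NULL but which carries the dynamic-inputs flag is still reported as an ordinary filter rather than as a Source, which is the point the paragraph above makes about amerge-style filters.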

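III. Filter-related classes

Although FFmpeg is implemented in C, it borrows some OOP ideas. This article simply calls these C structs classes. The main filter-related classes are listed below.

Class	Description
AVFilterGraph	The Filter graph.
AVFilter	The definition of a filter. It is a definition class. Its instances carry callbacks such as the one that handles commands (process_command, see below).
AVFilterPad	The definition of a filter input or output. It is a definition class. The filter_frame pointer of an instance points to the actual filtering logic.
AVFilterContext	An instance of a filter. An AVFilter instance describes a filter in isolation, while an AVFilterContext instance is a filter placed in a Filter graph.
AVFilterLink	Connects filters by pairing a Pad of one AVFilterContext instance with a Pad of another.
AVFilterInOut	A helper class used when parsing a filter description string. During parsing, the labels written in `[]` are turned into AVFilterInOut nodes. It is a linked list.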

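IV. Filter definition

1. The AVFilter class

AVFilter can be viewed as an abstract class or interface.

It defines a const AVClass *priv_class; field that describes the actual type. The type that priv_class points to does not directly "inherit" from AVFilter; it is more like metadata about the fields of a "subclass", while the real subclass exists only virtually. For example, the filter ff_af_volume is a variable of type AVFilter; you can picture it as a variable of a virtual subtype named AVFilterVolume, whose fields are defined in VolumeContext.

It defines an int priv_size; field giving the size of the subclass structure, used to allocate memory at initialization. For the filter ff_af_volume this is the size of the VolumeContext structure.

It defines three function pointer fields, init, init_opaque and init_dict, which resemble constructors in object-oriented languages: init acts as the default constructor, and init_opaque and init_dict are constructor overloads.

It defines a function pointer field uninit, which resembles a destructor in object-oriented languages.

In addition, the function that preinit points to is called right after memory for the AVFilterContext is allocated, to prepare for frame synchronization (framesync).

Here is a partial definition of AVFilter: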

// File: libavfilter/avfilter.h
/**
 * Filter definition. This defines the pads a filter contains, and all the
 * callback functions used to interact with the filter.
 */
typedef struct AVFilter {
    /**
     * A class for the private data, used to declare filter private AVOptions.
     * This field is NULL for filters that do not declare any options.
     *
     * If this field is non-NULL, the first member of the filter private data
     * must be a pointer to AVClass, which will be set by libavfilter generic
     * code to this class.
     */
    const AVClass *priv_class;

    /*****************************************************************
     * All fields below this line are not part of the public API. They
     * may not be used outside of libavfilter and can be changed and
     * removed at will.
     * New public fields should be added right above.
     *****************************************************************
     */

    /**
     * Filter pre-initialization function
     *
     * This callback will be called immediately after the filter context is
     * allocated, to allow allocating and initing sub-objects.
     *
     * If this callback is not NULL, the uninit callback will be called on
     * allocation failure.
     *
     * @return 0 on success,
     *         AVERROR code on failure (but the code will be
     *           dropped and treated as ENOMEM by the calling code)
     */
    int (*preinit)(AVFilterContext *ctx);

    /**
     * Filter initialization function.
     *
     * This callback will be called only once during the filter lifetime, after
     * all the options have been set, but before links between filters are
     * established and format negotiation is done.
     *
     * Basic filter initialization should be done here. Filters with dynamic
     * inputs and/or outputs should create those inputs/outputs here based on
     * provided options. No more changes to this filter's inputs/outputs can be
     * done after this callback.
     *
     * This callback must not assume that the filter links exist or frame
     * parameters are known.
     *
     * @ref AVFilter.uninit "uninit" is guaranteed to be called even if
     * initialization fails, so this callback does not have to clean up on
     * failure.
     *
     * @return 0 on success, a negative AVERROR on failure
     */
    int (*init)(AVFilterContext *ctx);

    /**
     * Should be set instead of @ref AVFilter.init "init" by the filters that
     * want to pass a dictionary of AVOptions to nested contexts that are
     * allocated during init.
     *
     * On return, the options dict should be freed and replaced with one that
     * contains all the options which could not be processed by this filter (or
     * with NULL if all the options were processed).
     *
     * Otherwise the semantics is the same as for @ref AVFilter.init "init".
     */
    int (*init_dict)(AVFilterContext *ctx, AVDictionary **options);

    /**
     * Filter uninitialization function.
     *
     * Called only once right before the filter is freed. Should deallocate any
     * memory held by the filter, release any buffer references, etc. It does
     * not need to deallocate the AVFilterContext.priv memory itself.
     *
     * This callback may be called even if @ref AVFilter.init "init" was not
     * called or failed, so it must be prepared to handle such a situation.
     */
    void (*uninit)(AVFilterContext *ctx);

    int priv_size;      ///< size of private data to allocate for the filter

    /**
     * Filter initialization function, alternative to the init()
     * callback. Args contains the user-supplied parameters, opaque is
     * used for providing binary data.
     */
    int (*init_opaque)(AVFilterContext *ctx, void *opaque);
} AVFilter;

All of the function pointers above may be NULL. For example, the abuffer filter implements only init and uninit among them:

// File: libavfilter/buffersrc.c
AVFilter ff_asrc_abuffer = {
    .name          = "abuffer",
    .description   = NULL_IF_CONFIG_SMALL("Buffer audio frames, and make them accessible to the filterchain."),
    .priv_size     = sizeof(BufferSourceContext),
    .query_formats = query_formats,

    .init      = init_audio,
    .uninit    = uninit,

    .inputs    = NULL,
    .outputs   = avfilter_asrc_abuffer_outputs,
    .priv_class = &abuffer_class,
};

The abuffersink filter, by contrast, implements init_opaque, query_formats and activate but neither init nor uninit:

// File: libavfilter/buffersink.c
AVFilter ff_asink_abuffer = {
    .name        = "abuffersink",
    .description = NULL_IF_CONFIG_SMALL("Buffer audio frames, and make them available to the end of the filter graph."),
    .priv_class  = &abuffersink_class,
    .priv_size   = sizeof(BufferSinkContext),
    .init_opaque = asink_init,

    .query_formats = asink_query_formats,
    .activate    = activate,
    .inputs      = avfilter_asink_abuffer_inputs,
    .outputs     = NULL,
};

The acopy filter implements none of the functions above:

// File: libavfilter/af_acopy.c
AVFilter ff_af_acopy = {
    .name          = "acopy",
    .description   = NULL_IF_CONFIG_SMALL("Copy the input audio unchanged to the output."),
    .inputs        = acopy_inputs,
    .outputs       = acopy_outputs,
};

The ff_asrc_abuffer and ff_asink_abuffer above can be seen as instances of virtual subclasses that implement the AVFilter interface, whereas ff_af_acopy is a plain AVFilter instance.
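To make the "definition plus private data" idea concrete, here is a minimal sketch of how a shared definition with nullable callbacks and a priv_size can drive instance creation. All names (SketchFilterDef, SketchContext, sketch_create, sketch_destroy, SketchVolumePriv) are hypothetical and invented for this example; the real allocation code in libavfilter does considerably more.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; the real AVFilter / AVFilterContext are richer. */
typedef struct SketchContext SketchContext;

typedef struct SketchFilterDef {
    const char *name;
    size_t      priv_size;                 /* bytes of private data per instance */
    int       (*init)(SketchContext *);    /* "constructor", may be NULL */
    void      (*uninit)(SketchContext *);  /* "destructor", may be NULL  */
} SketchFilterDef;

struct SketchContext {
    const SketchFilterDef *filter;  /* shared, read-only definition        */
    void                  *priv;    /* per-instance "subclass" fields      */
};

/* Run uninit if the definition has one, then release the instance. */
void sketch_destroy(SketchContext *ctx)
{
    if (!ctx)
        return;
    if (ctx->filter->uninit)
        ctx->filter->uninit(ctx);
    free(ctx->priv);
    free(ctx);
}

/* Allocate an instance, zero its private data, then run init if present. */
SketchContext *sketch_create(const SketchFilterDef *def)
{
    assert(def != NULL);

    SketchContext *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    ctx->filter = def;

    if (def->priv_size) {
        ctx->priv = calloc(1, def->priv_size);
        if (!ctx->priv) {
            free(ctx);
            return NULL;
        }
    }
    if (def->init && def->init(ctx) < 0) {
        sketch_destroy(ctx);   /* uninit still runs when init fails */
        return NULL;
    }
    return ctx;
}

/* A definition with private data and an init callback ... */
typedef struct SketchVolumePriv { double gain; } SketchVolumePriv;

int sketch_volume_init(SketchContext *ctx)
{
    ((SketchVolumePriv *)ctx->priv)->gain = 1.0;
    return 0;
}

const SketchFilterDef sketch_volume = {
    "volume", sizeof(SketchVolumePriv), sketch_volume_init, NULL
};

/* ... and one with no private data and no callbacks at all. */
const SketchFilterDef sketch_copy = { "copy", 0, NULL, NULL };
```

Calling sketch_create(&sketch_volume) yields a context whose priv block holds a gain of 1.0, while sketch_create(&sketch_copy) yields a context with a NULL priv pointer, mirroring the difference between filters that declare private data and those that do not.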

Next, the other fields of AVFilter:

// File: libavfilter/avfilter.h
typedef struct AVFilter {
    // Filter name
    /**
     * Filter name. Must be non-NULL and unique among filters.
     */
    const char *name;

    // Filter description
    /**
     * A description of the filter. May be NULL.
     *
     * You should use the NULL_IF_CONFIG_SMALL() macro to define it.
     */
    const char *description;

    // List of inputs
    /**
     * List of inputs, terminated by a zeroed element.
     *
     * NULL if there are no (static) inputs. Instances of filters with
     * AVFILTER_FLAG_DYNAMIC_INPUTS set may have more inputs than present in
     * this list.
     */
    const AVFilterPad *inputs;

    // List of outputs
    /**
     * List of outputs, terminated by a zeroed element.
     *
     * NULL if there are no (static) outputs. Instances of filters with
     * AVFILTER_FLAG_DYNAMIC_OUTPUTS set may have more outputs than present in
     * this list.
     */
    const AVFilterPad *outputs;

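    // AVFILTER_FLAG_* values, combined as bit flags.
    // AVFILTER_FLAG_DYNAMIC_INPUTS: the number of inputs is dynamic, i.e. not determined by inputs alone.
    // AVFILTER_FLAG_DYNAMIC_OUTPUTS: the number of outputs is dynamic, i.e. not determined by outputs alone.
    // AVFILTER_FLAG_SLICE_THREADS: the filter supports splitting a frame into slices that are processed in parallel by multiple threads.
    // AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC: some filters support the enable expression, which switches the filter on and off along the timeline. When enable evaluates to false, the frame is passed to the next filter without calling the filter's filter_frame function.
    // AVFILTER_FLAG_SUPPORT_TIMELINE_INTERNAL: similar to AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC. The difference is that filter_frame is still called and makes the decision itself, based on the value of AVFilterContext->is_disabled.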
    /**
     * A combination of AVFILTER_FLAG_*
     */
    int flags;

    // For FFmpeg internal use.
    int flags_internal; ///< Additional flags for avfilter internal use only.

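    // A Filter graph is built from the input and output Pads of filters; the `AVFilter *next` field of AVFilter has nothing to do with that. It exists only to register the filters, see the `av_filter_init_next` function in `allfilters.c`.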
    /**
     * Used by the filter registration system. Must not be touched by any other
     * code.
     */
    struct AVFilter *next;
} AVFilter;

Finally, the remaining function pointers of AVFilter:

// File: libavfilter/avfilter.h
typedef struct AVFilter {
    /**
     * Query formats supported by the filter on its inputs and outputs.
     *
     * This callback is called after the filter is initialized (so the inputs
     * and outputs are fixed), shortly before the format negotiation. This
     * callback may be called more than once.
     *
     * This callback must set AVFilterLink.out_formats on every input link and
     * AVFilterLink.in_formats on every output link to a list of pixel/sample
     * formats that the filter supports on that link. For audio links, this
     * filter must also set @ref AVFilterLink.in_samplerates "in_samplerates" /
     * @ref AVFilterLink.out_samplerates "out_samplerates" and
     * @ref AVFilterLink.in_channel_layouts "in_channel_layouts" /
     * @ref AVFilterLink.out_channel_layouts "out_channel_layouts" analogously.
     *
     * This callback may be NULL for filters with one input, in which case
     * libavfilter assumes that it supports all input formats and preserves
     * them on output.
     *
     * @return zero on success, a negative value corresponding to an
     * AVERROR code otherwise
     */
    int (*query_formats)(AVFilterContext *);

    /**
     * Make the filter instance process a command.
     *
     * @param cmd    the command to process, for handling simplicity all commands must be alphanumeric only
     * @param arg    the argument for the command
     * @param res    a buffer with size res_size where the filter(s) can return a response. This must not change when the command is not supported.
     * @param flags  if AVFILTER_CMD_FLAG_FAST is set and the command would be
     *               time consuming then a filter should treat it like an unsupported command
     *
     * @returns >=0 on success otherwise an error code.
     *          AVERROR(ENOSYS) on unsupported commands
     */
    int (*process_command)(AVFilterContext *, const char *cmd, const char *arg, char *res, int res_len, int flags);

    /**
     * Filter activation function.
     *
     * Called when any processing is needed from the filter, instead of any
     * filter_frame and request_frame on pads.
     *
     * The function must examine inlinks and outlinks and perform a single
     * step of processing. If there is nothing to do, the function must do
     * nothing and not return an error. If more steps are or may be
     * possible, it must use ff_filter_set_ready() to schedule another
     * activation.
     */
    int (*activate)(AVFilterContext *ctx);
} AVFilter;

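activate is called whenever the filter needs to do any processing, and takes the place of the filter_frame and request_frame callbacks of the Pads. If a filter does not define it, the ff_filter_activate_default function is used instead.
query_formats reports the media formats supported on the input and output Pads.
process_command handles commands. Note that it does not process data; a command typically sets some parameter. The data itself is processed by the function that the filter_frame pointer of an AVFilterPad points to. That function is normally defined in the filter's own source file, and the filter_frame pointer of the AVFilterPad is set to it.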
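The split between commands and data can be illustrated with a small stand-alone sketch. SketchVolume, sketch_process_command and sketch_filter_frame are hypothetical names made up for this example, and their signatures are simplified compared with the real callbacks; the -ENOSYS and -EINVAL return values assume a platform where FFmpeg's AVERROR(e) is the negated errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical private state of a volume-like filter. */
typedef struct SketchVolume {
    double gain;
} SketchVolume;

/* Command handler: changes a parameter at run time, touches no samples.
 * Leaves res untouched when the command is not supported. */
int sketch_process_command(SketchVolume *s, const char *cmd, const char *arg,
                           char *res, int res_len)
{
    assert(s != NULL && cmd != NULL);

    if (strcmp(cmd, "volume") == 0) {
        char *end;
        double g;

        if (!arg)
            return -EINVAL;
        g = strtod(arg, &end);
        if (end == arg || *end != '\0' || g < 0.0)
            return -EINVAL;

        s->gain = g;
        if (res && res_len > 0)
            snprintf(res, (size_t)res_len, "volume=%g", g);
        return 0;
    }
    return -ENOSYS;   /* unsupported command */
}

/* Data path: this is where samples are actually processed. */
void sketch_filter_frame(const SketchVolume *s, float *samples, int nb_samples)
{
    assert(s != NULL && samples != NULL);
    for (int i = 0; i < nb_samples; i++)
        samples[i] *= (float)s->gain;
}
```

A caller would first send the command ("volume", "0.5") and only afterwards see its effect when the next buffer goes through sketch_filter_frame, which is the same ordering the text describes for process_command and filter_frame.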

2. The AVFilterPad class

AVFilterPad describes the inputs and outputs of a filter. For example, the output of the buffer filter is described as:

// File: libavfilter/buffersrc.c
static const AVFilterPad avfilter_vsrc_buffer_outputs[] = {
    {
        .name          = "default",
        .type          = AVMEDIA_TYPE_VIDEO,
        .request_frame = request_frame,
        .poll_frame    = poll_frame,
        .config_props  = config_props,
    },
    { NULL }
};

AVFilter ff_vsrc_buffer = {
    // other fields
    .outputs   = avfilter_vsrc_buffer_outputs,
    // other fields
};

// File: libavfilter/internal.h
/**
 * A filter pad used for either input or output.
 */
struct AVFilterPad {
    // Pad name
    /**
     * Pad name. The name is unique among inputs and among outputs, but an
     * input may have the same name as an output. This may be NULL if this
     * pad has no need to ever be referenced by name.
     */
    const char *name;

    // Pad type: video, audio, etc.
    /**
     * AVFilterPad type.
     */
    enum AVMediaType type;

    // For video: obtains the video frame buffer before the filter is applied.
    /**
     * Callback function to get a video buffer. If NULL, the filter system will
     * use ff_default_get_video_buffer().
     *
     * Input video pads only.
     */
    AVFrame *(*get_video_buffer)(AVFilterLink *link, int w, int h);

    // For audio: obtains the audio frame buffer before the filter is applied.
    /**
     * Callback function to get an audio buffer. If NULL, the filter system will
     * use ff_default_get_audio_buffer().
     *
     * Input audio pads only.
     */
    AVFrame *(*get_audio_buffer)(AVFilterLink *link, int nb_samples);

    // Applies the filter.
    /**
     * Filtering callback. This is where a filter receives a frame with
     * audio/video data and should do its processing.
     *
     * Input pads only.
     *
     * @return >= 0 on success, a negative AVERROR on error. This function
     * must ensure that frame is properly unreferenced on error if it
     * hasn't been passed on to another filter.
     */
    int (*filter_frame)(AVFilterLink *link, AVFrame *frame);

    // Queries the number of frames that are available.
    // Currently only the buffer and abuffer filters provide it.
    /**
     * Frame poll callback. This returns the number of immediately available
     * samples. It should return a positive value if the next request_frame()
     * is guaranteed to return one frame (with no delay).
     *
     * Defaults to just calling the source poll_frame() method.
     *
     * Output pads only.
     */
    int (*poll_frame)(AVFilterLink *link);

    // Requests a frame so the filter has data to process.
    /**
     * Frame request callback. A call to this should result in some progress
     * towards producing output over the given link. This should return zero
     * on success, and another value on error.
     *
     * Output pads only.
     */
    int (*request_frame)(AVFilterLink *link);

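    // Configures link properties such as the video width and height. Note that this does not include the format, which filters negotiate through the query_formats callback before this callback is called.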
    /**
     * Link configuration callback.
     *
     * For output pads, this should set the link properties such as
     * width/height. This should NOT set the format property - that is
     * negotiated between filters by the filter system using the
     * query_formats() callback before this function is called.
     *
     * For input pads, this should check the properties of the link, and update
     * the filter's internal state as necessary.
     *
     * For both input and output filters, this should return zero on success,
     * and another value on error.
     */
    int (*config_props)(AVFilterLink *link);

    // Whether a FIFO queue is needed. If true, a fifo or afifo filter is inserted in front of this filter.
    /**
     * The filter expects a fifo to be inserted on its input link,
     * typically because it has a delay.
     *
     * input pads only.
     */
    int needs_fifo;

    // Whether the AVFrame must be writable. If the AVFrame is not writable, its data is copied into a new AVFrame.
    /**
     * The filter expects writable frames from its input link,
     * duplicating data buffers if needed.
     *
     * input pads only.
     */
    int needs_writable;
};
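Pad lists such as avfilter_vsrc_buffer_outputs are plain arrays that end with a zeroed element, so counting pads or finding one by name is a simple scan. The sketch below shows that scan on a hypothetical SketchPad type; sketch_pad_count and sketch_pad_index are names invented here, and the sketch assumes, for simplicity, that a NULL name marks the terminating element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; NOT the real libavfilter types. */
typedef enum SketchMediaType {
    SKETCH_VIDEO,
    SKETCH_AUDIO
} SketchMediaType;

typedef struct SketchPad {
    const char     *name;   /* NULL name marks the terminating element */
    SketchMediaType type;
} SketchPad;

/* Number of pads before the terminator; 0 for a NULL list. */
int sketch_pad_count(const SketchPad *pads)
{
    int n = 0;

    if (!pads)
        return 0;
    while (pads[n].name)
        n++;
    return n;
}

/* Index of the pad called name, or -1 if there is no such pad. */
int sketch_pad_index(const SketchPad *pads, const char *name)
{
    assert(name != NULL);

    for (int i = 0; pads && pads[i].name; i++)
        if (strcmp(pads[i].name, name) == 0)
            return i;
    return -1;
}
```

For a two-input list such as { "main", "overlay", terminator }, sketch_pad_count returns 2 and sketch_pad_index finds "overlay" at index 1; the index is what a link later uses to refer to the pad.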

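V. Creating and freeing a Filter graph

1. Creating a Filter graph

Call the avfilter_graph_alloc function to create a Filter graph. Besides the AVFilterGraph structure itself, it also allocates the internal field, of type AVFilterGraphInternal, that the graph uses privately. It then sets the initial values of the AVOption set. This is similar to a default constructor in object-oriented languages.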

// File: libavfilter/avfiltergraph.c
AVFilterGraph *avfilter_graph_alloc(void)
{
    AVFilterGraph *ret = av_mallocz(sizeof(*ret));
    if (!ret)
        return NULL;

    ret->internal = av_mallocz(sizeof(*ret->internal));
    if (!ret->internal) {
        av_freep(&ret);
        return NULL;
    }

    ret->av_class = &filtergraph_class;
    av_opt_set_defaults(ret);
    ff_framequeue_global_init(&ret->internal->frame_queues);

    return ret;
}

2. Freeing a Filter graph

When you are done with a Filter graph, call the avfilter_graph_free function to release its memory.

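VI. Creating and freeing a filter (AVFilterContext)

An AVFilterContext instance is a filter inside a Filter graph. It holds a pointer to an AVFilter instance, which of course differs from filter to filter. For the overlay filter, for example, the AVFilter pointer of the AVFilterContext points to the ff_vf_overlay object.

1. Creating a filter

Call avfilter_graph_create_filter to create a filter.

2. Freeing a filter

Freeing the Filter graph also frees its filters, so there is no need to free them by hand.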

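VII. Parsing a Filter graph description

AVFilterInOut is a helper class used when parsing a filter description string. During parsing, the labels written in [] are turned into AVFilterInOut nodes. It is a linked list.
When using the FFmpeg filter API (as in the official filtering_video example), you also create AVFilterInOut nodes by hand to connect the Source and Sink to the other filters.
AVFilterLink connects filters by pairing a Pad of one filter with a Pad of another. Filters are connected through AVFilterLink objects, not through AVFilterContext or AVFilter objects.

1. The AVFilterInOut class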

// File: libavfilter/avfilter.h
/**
 * A linked-list of the inputs/outputs of the filter chain.
 *
 * This is mainly useful for avfilter_graph_parse() / avfilter_graph_parse2(),
 * where it is used to communicate open (unlinked) inputs and outputs from and
 * to the caller.
 * This struct specifies, per each not connected pad contained in the graph, the
 * filter context and the pad index required for establishing a link.
 */
typedef struct AVFilterInOut {
    /** unique name for this input/output in the list */
    char *name;

    /** filter context associated to this input/output */
    AVFilterContext *filter_ctx;

    /** index of the filt_ctx pad to use for linking */
    int pad_idx;

    /** next input/input in the list, NULL if this is the last */
    struct AVFilterInOut *next;
} AVFilterInOut;
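The linked-list shape of AVFilterInOut is easy to mimic. The following sketch builds, searches and frees a list of labelled endpoints using a hypothetical SketchInOut type that keeps only the name, pad index and next pointer; the helper names are invented for this example. For the real struct, libavfilter provides avfilter_inout_alloc() and avfilter_inout_free().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced version of the struct above (no filter context). */
typedef struct SketchInOut {
    char               *name;     /* label, e.g. "in" or "out" */
    int                 pad_idx;  /* pad index used when linking */
    struct SketchInOut *next;     /* NULL for the last node */
} SketchInOut;

/* Append a labelled endpoint to the end of the list.
 * Returns 0 on success, -1 if memory could not be allocated. */
int sketch_inout_append(SketchInOut **list, const char *name, int pad_idx)
{
    assert(list != NULL && name != NULL);

    SketchInOut *node = calloc(1, sizeof(*node));
    if (!node)
        return -1;

    size_t len = strlen(name) + 1;
    node->name = malloc(len);
    if (!node->name) {
        free(node);
        return -1;
    }
    memcpy(node->name, name, len);
    node->pad_idx = pad_idx;

    while (*list)                 /* walk to the tail */
        list = &(*list)->next;
    *list = node;
    return 0;
}

/* Find the endpoint with the given label, or NULL. */
SketchInOut *sketch_inout_find(SketchInOut *list, const char *name)
{
    assert(name != NULL);
    for (; list; list = list->next)
        if (strcmp(list->name, name) == 0)
            return list;
    return NULL;
}

/* Free every node and reset the caller's pointer to NULL. */
void sketch_inout_free(SketchInOut **list)
{
    assert(list != NULL);
    while (*list) {
        SketchInOut *next = (*list)->next;
        free((*list)->name);
        free(*list);
        *list = next;
    }
}
```

Appending "in" and then "out" gives a two-node list in that order; a parser can then look up a label with sketch_inout_find to decide which open pad a `[label]` in the description string refers to.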

2. The AVFilterLink class

// File: libavfilter/avfilter.h
/**
 * A link between two filters. This contains pointers to the source and
 * destination filters between which this link exists, and the indexes of
 * the pads involved. In addition, this link also contains the parameters
 * which have been negotiated and agreed upon between the filter, such as
 * image dimensions, format, etc.
 *
 * Applications must not normally access the link structure directly.
 * Use the buffersrc and buffersink API instead.
 * In the future, access to the header may be reserved for filters
 * implementation.
 */
struct AVFilterLink {
    AVFilterContext *src;       ///< source filter
    AVFilterPad *srcpad;        ///< output pad on the source filter

    AVFilterContext *dst;       ///< dest filter
    AVFilterPad *dstpad;        ///< input pad on the dest filter

    enum AVMediaType type;      ///< filter media type

    // ...Others
};
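As a rough illustration of what a link records, the sketch below connects two nodes through a link object and refuses the connection when a pad is already taken or the media types differ. SketchNode, SketchLink and sketch_link are hypothetical and heavily simplified (each node has exactly one input pad and one output pad); in libavfilter the corresponding entry point is avfilter_link(), whose checks and error codes are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; NOT the real libavfilter types. */
typedef enum SketchMediaType {
    SKETCH_VIDEO,
    SKETCH_AUDIO
} SketchMediaType;

typedef struct SketchLink SketchLink;

/* Simplification: every node has one input pad and one output pad. */
typedef struct SketchNode {
    const char     *name;
    SketchMediaType in_type;
    SketchMediaType out_type;
    SketchLink     *inlink;    /* link feeding the input pad, or NULL */
    SketchLink     *outlink;   /* link fed by the output pad, or NULL */
} SketchNode;

struct SketchLink {
    SketchNode     *src;       /* source node */
    SketchNode     *dst;       /* destination node */
    SketchMediaType type;      /* media type carried by the link */
};

/* Connect src's output pad to dst's input pad through link.
 * Returns 0 on success, -1 if a pad is already linked or types differ. */
int sketch_link(SketchLink *link, SketchNode *src, SketchNode *dst)
{
    assert(link != NULL && src != NULL && dst != NULL);

    if (src->outlink || dst->inlink)
        return -1;                      /* pad already in use */
    if (src->out_type != dst->in_type)
        return -1;                      /* media type mismatch */

    link->src  = src;
    link->dst  = dst;
    link->type = src->out_type;
    src->outlink = link;
    dst->inlink  = link;
    return 0;
}
```

The design point it mirrors is that neither node points at the other directly: both sides only hold the link, and the link holds both ends plus the agreed media type.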