Analysis of the video output and scaling mechanisms in FFmpeg's ffplay

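Video output architecture overview

ffplay uses SDL (Simple DirectMedia Layer) to provide consistent media playback across operating systems. This cross-platform library exposes a single interface to graphics, audio, and input devices on Windows, Linux, macOS, and other platforms.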

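Initialization sequence

Inside the main function, initialization proceeds in the following order:

SDL initialization: enable the video, audio, and timer subsystems
Window creation: create a window at the default resolution
Renderer setup: request hardware acceleration and vertical sync (V-Sync)
Media stream opening: start the decoding threads
Event loop entry: handle user input and playback control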

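int main(int argc, char **argv) {
    // Initialize the SDL video, audio, and timer subsystems
    if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER)) {
        av_log(NULL, AV_LOG_FATAL, "SDL initialization failed: %s\n", SDL_GetError());
        exit(1);
    }

    // Create the window, then a hardware-accelerated, V-Synced renderer
    window = SDL_CreateWindow("ffplay", SDL_WINDOWPOS_UNDEFINED,
                              SDL_WINDOWPOS_UNDEFINED, 640, 480, 0);
    if (window)
        renderer = SDL_CreateRenderer(window, -1,
                                      SDL_RENDERER_ACCELERATED | SDL_RENDERER_PRESENTVSYNC);
    if (!window || !renderer) {
        av_log(NULL, AV_LOG_FATAL, "Failed to create window or renderer: %s\n", SDL_GetError());
        exit(1);
    }

    // Open the media stream; this starts the media processing threads
    is = stream_open(input_filename, file_iformat);
    if (!is) {
        av_log(NULL, AV_LOG_FATAL, "Failed to open the media stream\n");
        exit(1);
    }
    event_loop(is); // event loop: user input and playback control
    return 0;
}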

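Display area calculation

To display each video frame at the right size, the `calculate_display_area` function (named `calculate_display_rect` in ffplay's own source) takes the SAR (Sample Aspect Ratio) into account and computes the largest centered rectangle that fits the window while preserving the display aspect ratio.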

static void calculate_display_area(SDL_Rect *area, int x_offset, int y_offset,
                                   int max_w, int max_h, int pic_w, int pic_h, AVRational sar)
{
    AVRational display_ratio = sar;
    int64_t target_h, target_w;

    // Fall back to 1:1 when the SAR is unknown or invalid (zero or negative)
    if (av_cmp_q(display_ratio, av_make_q(0, 1)) <= 0)
        display_ratio = av_make_q(1, 1);

    // Display aspect ratio = sample aspect ratio × frame width / height
    display_ratio = av_mul_q(display_ratio, av_make_q(pic_w, pic_h));

    // Fit to the full height first; "& ~1" forces an even size
    target_h = max_h;
    target_w = av_rescale(target_h, display_ratio.num, display_ratio.den) & ~1;

    // If that is too wide, fit to the full width instead
    if (target_w > max_w) {
        target_w = max_w;
        target_h = av_rescale(target_w, display_ratio.den, display_ratio.num) & ~1;
    }

    // Center the rectangle inside the available area
    area->x = x_offset + (max_w - target_w) / 2;
    area->y = y_offset + (max_h - target_h) / 2;
    area->w = FFMAX((int)target_w, 1);
    area->h = FFMAX((int)target_h, 1);
}
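
To make the arithmetic concrete, here is a minimal standalone sketch of the same fitting logic using plain integers instead of AVRational and SDL_Rect. The names `Rect`, `rescale_round`, and `fit_display_rect` are made up for this illustration and are not part of ffplay; `rescale_round` only approximates what `av_rescale` does for positive inputs.

```c
#include <assert.h>
#include <stdint.h>

/* Plain-integer stand-in for SDL_Rect (hypothetical type). */
typedef struct { int x, y, w, h; } Rect;

/* a * b / c rounded to nearest, for positive inputs only. */
static int64_t rescale_round(int64_t a, int64_t b, int64_t c)
{
    return (a * b + c / 2) / c;
}

/* Largest centered rectangle inside max_w x max_h that keeps the
 * display aspect ratio of a pic_w x pic_h frame with the given SAR. */
static Rect fit_display_rect(int max_w, int max_h, int pic_w, int pic_h,
                             int sar_num, int sar_den)
{
    assert(max_w > 0 && max_h > 0 && pic_w > 0 && pic_h > 0);

    /* Treat a missing or invalid SAR as square pixels. */
    if (sar_num <= 0 || sar_den <= 0) {
        sar_num = 1;
        sar_den = 1;
    }

    /* Display aspect ratio = SAR * width / height, kept as a fraction. */
    int64_t dar_num = (int64_t)sar_num * pic_w;
    int64_t dar_den = (int64_t)sar_den * pic_h;

    /* Fit to the full height first, forcing an even width. */
    int64_t h = max_h;
    int64_t w = rescale_round(h, dar_num, dar_den) & ~(int64_t)1;

    /* If that is too wide, fit to the full width instead. */
    if (w > max_w) {
        w = max_w;
        h = rescale_round(w, dar_den, dar_num) & ~(int64_t)1;
    }

    Rect r;
    r.x = (int)((max_w - w) / 2);
    r.y = (int)((max_h - h) / 2);
    r.w = w > 1 ? (int)w : 1;
    r.h = h > 1 ? (int)h : 1;
    return r;
}
```

For a 720x576 frame with SAR 16:15 in a 1280x720 window, the display ratio works out to 4:3, so the result is a 960x720 rectangle at x = 160, y = 0 (pillarboxed). A 1920x1080 frame with no SAR in a 640x480 window gives 640x360 at x = 0, y = 60 (letterboxed).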

Video rendering flow

The main rendering path is:

event_loop → refresh_wait_event → video_refresh → video_render_frame → update_video_surface

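Frame synchronization

The `video_refresh` function decides what to do next based on the current frame's presentation time:

Compute the duration of the previous frame
Compute the delay relative to the real-time clock
Decide whether to skip frames when playback has fallen too far behind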

static double compute_frame_duration(FrameQueue *q, Frame *current, Frame *next)
{
    if (current->serial == next->serial) {
        double duration = next->pts - current->pts;
        if (!isnan(duration) && duration > 0 && duration < MAX_DURATION)
            return duration;
        else
            return current->duration; // implausible PTS difference: fall back to the frame's nominal duration
    }
    return 0.0;
}
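
The three outcomes of this function are easier to see in isolation. The following is a simplified sketch with a stand-in `FrameInfo` struct (hypothetical, not ffplay's `Frame`) and an assumed 10-second upper bound for `MAX_FRAME_DURATION`; the real threshold depends on the player's configuration.

```c
#include <assert.h>
#include <math.h>

#define MAX_FRAME_DURATION 10.0 /* assumed upper bound, in seconds */

/* Minimal stand-in for the fields the duration logic needs (hypothetical). */
typedef struct {
    double pts;      /* presentation timestamp in seconds, NAN if unknown */
    double duration; /* nominal duration derived from the frame rate */
    int serial;      /* changes when a seek starts a new packet sequence */
} FrameInfo;

static double frame_duration(const FrameInfo *cur, const FrameInfo *next)
{
    assert(cur && next);

    /* Frames from different sequences (e.g. across a seek) are unrelated. */
    if (cur->serial != next->serial)
        return 0.0;

    double d = next->pts - cur->pts;

    /* Unknown, non-positive, or implausibly large gaps: use the nominal value. */
    if (isnan(d) || d <= 0.0 || d > MAX_FRAME_DURATION)
        return cur->duration;

    return d;
}
```

With two frames of the same serial at PTS 0.5 and 0.75, the function returns 0.25. If the next PTS is unknown or goes backwards it returns the current frame's nominal duration, and across a serial change it returns 0.0.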

Conditional frame dropping

When the CPU is overloaded or the streams drift out of sync, some frames can be skipped so that playback keeps up.

if (frame_queue_remaining(&is->pictq) > 1) {
    Frame *future = frame_queue_peek_next(&is->pictq);
    double upcoming_duration = compute_frame_duration(&is->pictq, vp, future);
    
    if (!is->single_step_mode &&
        (framedrop_enabled || get_sync_strategy(is) != VIDEO_MASTER_SYNC) &&
        current_time > is->frame_display_time + upcoming_duration) {
        
        is->dropped_frames_count++;
        frame_queue_advance(&is->pictq); // discard the late frame and move to the next one
        goto restart_processing;
    }
}
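
The drop condition above can be read as a pure predicate. Here is a small sketch of it, mirroring the snippet rather than ffplay's exact source; `SyncMode` and `should_drop_frame` are names invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Which clock playback is slaved to (hypothetical enum for this sketch). */
typedef enum { SYNC_AUDIO_MASTER, SYNC_VIDEO_MASTER, SYNC_EXTERNAL_CLOCK } SyncMode;

/* Returns true when the current frame is already too late to show:
 * the wall clock has passed the point where the NEXT frame is due. */
static bool should_drop_frame(bool single_step, bool framedrop_enabled, SyncMode sync,
                              double now, double frame_timer, double duration)
{
    assert(duration >= 0.0);

    /* Never drop while stepping frame by frame. */
    if (single_step)
        return false;

    /* When video is the master clock there is nothing to catch up to,
     * so only drop if the user explicitly asked for it. */
    if (!framedrop_enabled && sync == SYNC_VIDEO_MASTER)
        return false;

    return now > frame_timer + duration;
}
```

For example, with audio as the master clock, a frame scheduled at 10.00 s with a 0.04 s duration is dropped if the clock already reads 10.10 s, but kept if it reads 10.02 s.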

Texture update and rendering

How AVFrame data is copied into an SDL texture depends on whether the frame's pixel format has a direct SDL equivalent.

static int update_video_surface(SDL_Texture **surface, AVFrame *frame,
                                struct SwsContext **conversion_ctx)
{
    Uint32 sdl_format;
    SDL_BlendMode blend_mode;

    get_equivalent_sdl_format(frame->format, &sdl_format, &blend_mode);

    // With no direct SDL equivalent, allocate an ARGB8888 texture and convert into it
    if (realloc_surface(surface,
                        sdl_format == SDL_PIXELFORMAT_UNKNOWN ? SDL_PIXELFORMAT_ARGB8888 : sdl_format,
                        frame->width, frame->height, blend_mode) < 0)
        return -1;

    switch (sdl_format) {
        case SDL_PIXELFORMAT_IYUV:
            // Planar YUV 4:2:0: upload the three planes directly
            SDL_UpdateYUVTexture(*surface, NULL,
                frame->data[0], frame->linesize[0],
                frame->data[1], frame->linesize[1],
                frame->data[2], frame->linesize[2]);
            break;

        case SDL_PIXELFORMAT_UNKNOWN: {
            // Convert to BGRA with libswscale, writing straight into the locked texture
            uint8_t *raw_pixels[4];
            int line_pitches[4];

            *conversion_ctx = sws_getCachedContext(*conversion_ctx,
                frame->width, frame->height, frame->format,
                frame->width, frame->height, AV_PIX_FMT_BGRA, SWS_FAST_BILINEAR,
                NULL, NULL, NULL);
            if (!*conversion_ctx)
                return -1;

            if (!SDL_LockTexture(*surface, NULL, (void**)raw_pixels, line_pitches)) {
                sws_scale(*conversion_ctx, (const uint8_t* const*)frame->data,
                          frame->linesize, 0, frame->height, raw_pixels, line_pitches);
                SDL_UnlockTexture(*surface);
            }
            break;
        }

        default:
            SDL_UpdateTexture(*surface, NULL, frame->data[0], frame->linesize[0]);
            break;
    }
    return 0;
}

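Image scaling and format conversion

libswscale can convert the pixel format and change the resolution in a single pass. Its main API functions are:

sws_getContext(): create a conversion context
sws_getCachedContext(): reuse an existing context, or recreate it if the parameters changed
sws_scale(): perform the actual pixel data conversion
sws_freeContext(): release the context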

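Performance comparison: scaling algorithms

Algorithm	Throughput (FPS)	Quality notes
SWS_FAST_BILINEAR	~228	Balanced quality, suitable for real-time processing
SWS_POINT	~427	Fast, but can produce staircase (aliasing) artifacts
SWS_BICUBIC	~80	Sharper output, suited to high-resolution targets
SWS_LANCZOS	~70	Preserves fine detail at a high computational cost

In practice, weigh quality against performance for the use case at hand. For real-time streaming, for example, SWS_FAST_BILINEAR is usually a reasonable choice.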
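
To see why point sampling is cheap but blocky while interpolation is smoother and costlier, here is a toy one-dimensional comparison. It is only an illustration of the two ideas on a single row of 8-bit samples, not how libswscale implements SWS_POINT or SWS_FAST_BILINEAR; both function names are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nearest-neighbour: each output sample copies exactly one input sample. */
static void scale_row_nearest(const uint8_t *src, size_t src_len,
                              uint8_t *dst, size_t dst_len)
{
    assert(src_len > 0 && dst_len > 0);
    for (size_t i = 0; i < dst_len; i++)
        dst[i] = src[i * src_len / dst_len];
}

/* Linear interpolation: each output sample blends its two nearest inputs. */
static void scale_row_linear(const uint8_t *src, size_t src_len,
                             uint8_t *dst, size_t dst_len)
{
    assert(src_len > 0 && dst_len > 0);
    for (size_t i = 0; i < dst_len; i++) {
        /* Map the output index to a source position so the endpoints align. */
        double pos = dst_len > 1
                   ? (double)i * (double)(src_len - 1) / (double)(dst_len - 1)
                   : 0.0;
        size_t left = (size_t)pos;
        size_t right = left + 1 < src_len ? left + 1 : left;
        double frac = pos - (double)left;

        /* Weighted average, rounded to the nearest integer. */
        dst[i] = (uint8_t)(src[left] * (1.0 - frac) + src[right] * frac + 0.5);
    }
}
```

Upscaling the two samples {0, 90} to four, the nearest-neighbour version produces 0, 0, 90, 90 (a hard step), while the linear version produces 0, 30, 60, 90 (a ramp) at the price of a multiply-and-add per sample.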

Tags: ffplay SDL libswscale video rendering frame synchronization

Posted May 24 at 04:20

괴물 클럽