Vue 3 Real-Time Document OCR Viewer: LightOnOCR-2-1B Integration Guide

1. System Overview and Goals

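In document-heavy work, immediate text extraction is key to productivity. Conventional OCR services make the user upload a file and then wait for the result. This system instead builds a real-time pipeline that streams recognition results from the moment a document is uploaded.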

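LightOnOCR-2-1B is a lightweight end-to-end OCR model with 1 billion parameters. It delivers competitive document-recognition performance at a lower inference cost than larger models. Combined with Vue 3's reactivity system, it lets the whole document-processing workflow be driven smoothly from the web browser.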

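Core values:

  • Streaming results: output is displayed incrementally, page by page
  • Collaboration support: annotations and edit history are synchronized in real time
  • Edge-device readiness: low resource requirements suit mobile environments
  • Lower operating cost: a lightweight model keeps infrastructure load small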

2. Architecture Design

2.1 Data Flow Design

The data flow between client and server is designed as follows:

Client responsibilities

  • Client-side chunked upload of document files
  • Receiving server push messages over WebSocket
  • Canvas-based viewport rendering and text overlay
  • Optimistic updates of user annotation data

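Server responsibilities

  • Batch inference scheduling for the LightOnOCR-2-1B model
  • Upload chunk assembly and per-page queue management
  • State sharing across multiple instances using Redis Pub/Sub
  • Structured storage and incremental transmission of recognition results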
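The push messages in this flow can be modeled as a small typed envelope. The sketch below is illustrative only: the `topic`/`data` fields and the `stage-update` topic follow the client code later in this guide, while the `page-result` topic and its fields are assumptions made for the example.

```typescript
// Hypothetical envelope shapes; `page-result` and its fields are assumptions.
interface StageUpdateMessage {
  topic: 'stage-update'
  data: { stage: string; percent: number }
}
interface PageResultMessage {
  topic: 'page-result'
  data: { pageIndex: number; text: string }
}
type ServerMessage = StageUpdateMessage | PageResultMessage

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// Returns a typed message, or null when the frame is malformed or the topic is unknown.
function parseServerMessage(raw: string): ServerMessage | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return null
  }
  if (!isRecord(parsed)) return null
  const data = parsed.data
  if (!isRecord(data)) return null

  if (parsed.topic === 'stage-update') {
    const stage = data.stage
    const percent = data.percent
    if (typeof stage === 'string' && typeof percent === 'number') {
      return { topic: 'stage-update', data: { stage, percent } }
    }
  }
  if (parsed.topic === 'page-result') {
    const pageIndex = data.pageIndex
    const text = data.text
    if (typeof pageIndex === 'number' && typeof text === 'string') {
      return { topic: 'page-result', data: { pageIndex, text } }
    }
  }
  return null
}
```

Validating at the boundary like this means the rest of the client can rely on the union type instead of casting `unknown` payloads.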

2.2 Technology Stack

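| Area | Technology | Purpose |
|---|---|---|
| Frontend | Vue 3.4 + Composition API | UI state management and component composition |
| Frontend | TypeScript 5.x | Type safety |
| Frontend | Native WebSocket | Real-time server-client communication |
| Frontend | OffscreenCanvas | Background rendering worker |
| Backend | Sanic / Uvicorn | High-performance asynchronous server |
| Backend | python-socketio | Socket.IO protocol support |
| Backend | LightOnOCR-2-1B (ONNX Runtime) | Inference engine |
| Backend | Celery + Redis | Distributed task queue |

Note that a native WebSocket client cannot connect directly to a Socket.IO endpoint, so the python-socketio entry applies only where a Socket.IO client is used. The client code in this guide uses plain WebSocket.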

3. Core Module Implementation

3.1 Document Upload and Progress Tracking

A component that visualizes progress while a large file is uploaded:

<!-- FileDropZone.vue -->
<template>
  <div 
    class="drop-zone"
    :class="{ 'drag-over': isDragging }"
    @dragenter.prevent="isDragging = true"
    @dragover.prevent
    @dragleave.prevent="isDragging = false"
    @drop.prevent="onFileDrop"
    @click="triggerFileInput"
  >
    <input 
      ref="fileInput"
      type="file" 
      hidden 
      accept="application/pdf,image/*"
      @change="onInputChange"
    />
    
    <div v-if="!activeTransfer" class="placeholder">
      <span>Drop a document here or click to select one</span>
    </div>
    
    <div v-else class="transfer-status">
      <div class="filename">{{ activeTransfer.filename }}</div>
      <div class="progress-track">
        <div 
          class="progress-fill" 
          :style="{ width: activeTransfer.percentage + '%' }"
        ></div>
      </div>
      <span class="percentage">{{ activeTransfer.percentage }}%</span>
    </div>
  </div>
</template>

<script setup lang="ts">
import { ref } from 'vue'

interface TransferInfo {
  filename: string
  percentage: number
  bytesSent: number
  totalBytes: number
}

const emit = defineEmits<{ (e: 'uploaded', file: File, id: string): void }>()
const fileInput = ref<HTMLInputElement | null>(null)
const isDragging = ref(false)
const activeTransfer = ref<TransferInfo | null>(null)

const triggerFileInput = () => fileInput.value?.click()

const onFileDrop = (evt: DragEvent) => {
  isDragging.value = false
  const dropped = evt.dataTransfer?.files[0]
  if (dropped) initiateUpload(dropped)
}

const onInputChange = (evt: Event) => {
  const selected = (evt.target as HTMLInputElement).files?.[0]
  if (selected) initiateUpload(selected)
}

const initiateUpload = async (file: File) => {
  const chunkSize = 256 * 1024 // 256 KB per chunk
  const docId = crypto.randomUUID()
  
  activeTransfer.value = {
    filename: file.name,
    percentage: 0,
    bytesSent: 0,
    totalBytes: file.size
  }

  for (let offset = 0; offset < file.size; offset += chunkSize) {
    const slice = file.slice(offset, Math.min(offset + chunkSize, file.size))
    await transmitChunk(docId, slice, offset, file.size)
    
    activeTransfer.value.bytesSent = Math.min(offset + chunkSize, file.size)
    activeTransfer.value.percentage = Math.round(
      (activeTransfer.value.bytesSent / file.size) * 100
    )
  }
  
  emit('uploaded', file, docId)
  activeTransfer.value = null
}

const transmitChunk = async (
  docId: string, 
  blob: Blob, 
  position: number, 
  total: number
) => {
  // Placeholder: a real implementation would send the blob with fetch or XMLHttpRequest
  console.log(`Chunk at ${position}/${total} for ${docId}`)
}
</script>
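The chunking arithmetic in `initiateUpload` can be checked in isolation. The following is a minimal sketch, not part of the component: `planChunks` and the `ChunkPlan` shape are hypothetical helpers, and the `Content-Range` style string is only one possible way a real `transmitChunk` might describe each chunk to a server.

```typescript
// Hypothetical helper: describes every chunk a file of `totalBytes` would be split into.
interface ChunkPlan {
  start: number        // inclusive byte offset
  end: number          // exclusive byte offset
  contentRange: string // e.g. "bytes 0-262143/614400" (assumed server convention)
  percentAfter: number // progress shown once this chunk has been sent
}

function planChunks(totalBytes: number, chunkSize: number): ChunkPlan[] {
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  const plans: ChunkPlan[] = []
  for (let start = 0; start < totalBytes; start += chunkSize) {
    const end = Math.min(start + chunkSize, totalBytes)
    plans.push({
      start,
      end,
      contentRange: `bytes ${start}-${end - 1}/${totalBytes}`,
      percentAfter: Math.round((end / totalBytes) * 100)
    })
  }
  return plans
}

// A 600 KB file with 256 KB chunks yields three chunks: 43%, 85%, 100%.
const examplePlan = planChunks(600 * 1024, 256 * 1024)
```

A zero-byte file produces an empty plan, which matches the loop in the component: no chunk is sent and the `uploaded` event fires immediately.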

3.2 Real-Time Communication Layer

A socket manager with built-in reconnection logic and message buffering:

// realtime/channel.ts
type MessageHandler = (payload: unknown) => void

interface QueuedMessage {
  topic: string
  payload: unknown
  timestamp: number
}

export class RealtimeChannel {
  private ws: WebSocket | null = null
  private handlers = new Map<string, Set<MessageHandler>>()
  private outboundQueue: QueuedMessage[] = []
  private reconnectTimer: ReturnType<typeof setTimeout> | null = null
  private attemptCount = 0
  private readonly maxRetries = 7
  private readonly baseDelay = 500

  constructor(private endpoint: string) {}

  establish(documentToken: string) {
    const url = `${this.endpoint}/stream/${documentToken}`
    this.ws = new WebSocket(url)
    
    this.ws.onopen = () => {
      this.attemptCount = 0
      this.flushQueue()
    }
    
    this.ws.onmessage = (ev) => {
      try {
        const envelope = JSON.parse(ev.data)
        this.dispatch(envelope.topic, envelope.data)
      } catch {
        console.warn('Malformed message received')
      }
    }
    
    this.ws.onclose = () => this.scheduleReconnect(documentToken)
    this.ws.onerror = () => this.ws?.close()
  }

  subscribe(topic: string, handler: MessageHandler) {
    if (!this.handlers.has(topic)) {
      this.handlers.set(topic, new Set())
    }
    this.handlers.get(topic)!.add(handler)
    
    return () => {
      this.handlers.get(topic)?.delete(handler)
    }
  }

  // `urgent` is kept for API compatibility but currently has no effect:
  // every message is sent immediately when the socket is open and queued otherwise.
  publish(topic: string, payload: unknown, urgent = false) {
    const message: QueuedMessage = { topic, payload, timestamp: Date.now() }
    
    if (this.ws?.readyState === WebSocket.OPEN) {
      this.ws.send(JSON.stringify(message))
    } else {
      this.outboundQueue.push(message)
    }
  }

  private flushQueue() {
    while (this.outboundQueue.length > 0) {
      const msg = this.outboundQueue.shift()!
      this.ws?.send(JSON.stringify(msg))
    }
  }

  private dispatch(topic: string, data: unknown) {
    this.handlers.get(topic)?.forEach(fn => {
      try { fn(data) } catch (err) { console.error(err) }
    })
  }

  private scheduleReconnect(token: string) {
    if (this.attemptCount >= this.maxRetries) return
    
    const jitter = Math.random() * 300
    const delay = Math.min(
      this.baseDelay * Math.pow(2, this.attemptCount) + jitter,
      30000
    )
    
    this.reconnectTimer = setTimeout(() => {
      this.attemptCount++
      this.establish(token)
    }, delay)
  }

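  terminate() {
    if (this.reconnectTimer) clearTimeout(this.reconnectTimer)
    this.reconnectTimer = null
    if (this.ws) {
      // Detach handlers first; otherwise close() fires onclose and schedules a reconnect
      this.ws.onclose = null
      this.ws.onerror = null
      this.ws.close()
      this.ws = null
    }
    this.handlers.clear()
    this.outboundQueue = []
  }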
}
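To make the reconnection schedule concrete, the delay formula from `scheduleReconnect` can be pulled out as a pure function. This is a sketch with the random jitter passed in as a parameter (an assumption made so the output is reproducible); the defaults mirror `baseDelay = 500` and the 30000 ms cap used in the class.

```typescript
// Same formula as scheduleReconnect, with jitter injected instead of Math.random().
function reconnectDelay(
  attempt: number,
  baseDelayMs = 500,
  capMs = 30000,
  jitterMs = 0
): number {
  return Math.min(baseDelayMs * Math.pow(2, attempt) + jitterMs, capMs)
}

// Delays for attempts 0..6 (maxRetries = 7), ignoring jitter:
// 500, 1000, 2000, 4000, 8000, 16000, 30000 (the last one is 32000 capped to 30000)
const delaysWithoutJitter = Array.from({ length: 7 }, (_, attempt) =>
  reconnectDelay(attempt)
)
```

Without jitter the channel waits a little over a minute in total (61500 ms) before giving up, and a successful `onopen` resets the attempt counter so the schedule starts over.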

3.3 Virtualized Document Viewer

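A renderer that mounts only the pages near the viewport, keeping DOM size and memory use stable even for documents with thousands of pages: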

<!-- VirtualDocumentViewer.vue -->
<template>
  <div 
    ref="scroller"
    class="doc-scroller"
    @scroll.passive="onScroll"
  >
    <div :style="{ height: estimatedTotalHeight + 'px' }" class="sizer"></div>
    
    <div 
      v-for="page in renderedPages" 
      :key="page.index"
      class="page-wrapper"
      :style="computePageStyle(page)"
    >
      <canvas :ref="el => bindCanvas(el, page.index)" />
      <div class="text-overlay" v-html="page.ocrHtml" />
    </div>
  </div>
</template>

<script setup lang="ts">
import { ref, computed, onMounted } from 'vue'

interface PageViewport {
  index: number
  naturalWidth: number
  naturalHeight: number
  scale: number
  ocrHtml: string
}

const props = defineProps<{
  pageCount: number
  pageSource: (idx: number) => Promise<PageViewport>
}>()

const scroller = ref<HTMLElement | null>(null)
const scrollY = ref(0)
const containerHeight = ref(800)
const averagePageHeight = 1100
const overscan = 2

const estimatedTotalHeight = computed(() => 
  props.pageCount * averagePageHeight
)

const visibleRange = computed(() => {
  const start = Math.floor(scrollY.value / averagePageHeight)
  const visibleCount = Math.ceil(containerHeight.value / averagePageHeight)
  return {
    start: Math.max(0, start - overscan),
    end: Math.min(props.pageCount, start + visibleCount + overscan)
  }
})

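// The Map itself is not reactive, so a version counter is bumped whenever a
// page finishes loading; reading it below makes `renderedPages` recompute.
const pageCache = new Map<number, PageViewport>()
const pendingPages = new Set<number>()
const cacheVersion = ref(0)

const renderedPages = computed(() => {
  void cacheVersion.value // register the dependency on cache updates
  const { start, end } = visibleRange.value
  const result: PageViewport[] = []
  
  for (let i = start; i < end; i++) {
    const cached = pageCache.get(i)
    if (cached) {
      result.push(cached)
    } else if (!pendingPages.has(i)) {
      pendingPages.add(i) // avoid duplicate requests for the same page
      props.pageSource(i)
        .then(vp => {
          pageCache.set(i, vp)
          cacheVersion.value++
        })
        .catch(err => console.error(`Failed to load page ${i}`, err))
        .finally(() => pendingPages.delete(i))
    }
  }
  return result
})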

const computePageStyle = (page: PageViewport) => ({
  position: 'absolute' as const,
  top: `${page.index * averagePageHeight}px`,
  width: '100%',
  height: `${page.naturalHeight * page.scale}px`
})

const bindCanvas = (el: unknown, index: number) => {
  if (!(el instanceof HTMLCanvasElement)) return
  // Initialize the 2D context and draw the page image here
}

const onScroll = () => {
  scrollY.value = scroller.value?.scrollTop ?? 0
}

onMounted(() => {
  const ro = new ResizeObserver(entries => {
    containerHeight.value = entries[0].contentRect.height
  })
  if (scroller.value) ro.observe(scroller.value)
})
</script>
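The windowing arithmetic behind `visibleRange` is easier to verify as a standalone function. The sketch below restates the same calculation with explicit parameters; the function name and the `PageRange` type are introduced here for illustration only.

```typescript
interface PageRange {
  start: number // first page index to render (inclusive)
  end: number   // one past the last page index to render (exclusive)
}

// Mirrors the visibleRange computed property, using a fixed estimated page height.
function computeVisibleRange(
  scrollTop: number,
  viewportHeight: number,
  pageHeight: number,
  overscan: number,
  pageCount: number
): PageRange {
  const first = Math.floor(scrollTop / pageHeight)
  const visibleCount = Math.ceil(viewportHeight / pageHeight)
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(pageCount, first + visibleCount + overscan)
  }
}

// Scrolled 5500px into a 1000-page document with an 800px viewport:
// first = 5, visibleCount = 1, so pages 3..7 are rendered.
const midDocument = computeVisibleRange(5500, 800, 1100, 2, 1000)
```

Because `visibleCount` is derived from the viewport height alone, a viewport that straddles a page boundary is covered by the overscan rather than by `visibleCount`, which is one reason to keep `overscan` at 1 or more.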

4. Interactive Annotation System

4.1 Annotation Data Structure

// annotations/schema.ts
export const ANNOTATION_TYPES = ['highlight', 'strikethrough', 'inline-note', 'region-crop'] as const
export type AnnotationKind = typeof ANNOTATION_TYPES[number]

export interface BoundingBox {
  pageIndex: number
  leftPx: number
  topPx: number
  rightPx: number
  bottomPx: number
}

export interface AnnotationRecord {
  annotationId: string
  kind: AnnotationKind
  bounds: BoundingBox
  creator: string
  createdAt: string // ISO 8601
  modifiedAt: string
  resolved: boolean
  payload: HighlightPayload | NotePayload | CropPayload
}

export interface HighlightPayload {
  colorHex: string
  opacity: number
}

export interface NotePayload {
  markdownContent: string
  replyTo: string | null
}

export interface CropPayload {
  extractedImageUrl: string
  originalBounds: BoundingBox
}
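In the schema above, `kind` and `payload` are independent fields, so nothing stops a record from pairing `inline-note` with a highlight payload. One possible refinement is a discriminated union that ties each kind to its payload. This is a sketch of that idea, not the schema used in this guide; the mapping of kinds to payloads is an assumption inferred from the names, and only two kinds are shown.

```typescript
interface HighlightPayload {
  colorHex: string
  opacity: number
}
interface NotePayload {
  markdownContent: string
  replyTo: string | null
}

// Hypothetical refinement: the payload type is determined by `kind`.
type TypedAnnotation =
  | { kind: 'highlight'; payload: HighlightPayload }
  | { kind: 'inline-note'; payload: NotePayload }

// Switching on `kind` narrows `payload` without any casts.
function describeAnnotation(a: TypedAnnotation): string {
  switch (a.kind) {
    case 'highlight':
      return `highlight ${a.payload.colorHex} @ ${a.payload.opacity}`
    case 'inline-note':
      return `note: ${a.payload.markdownContent}`
  }
}
```

With this shape, rendering code can branch on `kind` and the compiler rejects mismatched kind and payload pairs at the point where the record is created.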

4.2 Canvas-Based Interaction

<!-- AnnotationSurface.vue -->
<template>
  <canvas
    ref="surface"
    class="annotation-surface"
    @pointerdown="handlePointerStart"
    @pointermove="handlePointerMove"
    @pointerup="handlePointerEnd"
  />
</template>

<script setup lang="ts">
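import { ref, onMounted } from 'vue'
// Import path is an assumption based on the schema file name used above
import type { AnnotationRecord, BoundingBox } from './annotations/schema'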

const surface = ref<HTMLCanvasElement | null>(null)
const activeTool = ref<'select' | 'highlight' | 'note'>('select')
const draftShape = ref<BoundingBox | null>(null)
const committedShapes = ref<AnnotationRecord[]>([])

let isDragging = false
let dragOrigin = { x: 0, y: 0 }

const handlePointerStart = (ev: PointerEvent) => {
  if (activeTool.value === 'select') return
  
  const rect = surface.value!.getBoundingClientRect()
  isDragging = true
  dragOrigin = {
    x: ev.clientX - rect.left,
    y: ev.clientY - rect.top
  }
  draftShape.value = {
    pageIndex: 0, // placeholder: should be the index of the page currently in view
    leftPx: dragOrigin.x,
    topPx: dragOrigin.y,
    rightPx: dragOrigin.x,
    bottomPx: dragOrigin.y
  }
}

const handlePointerMove = (ev: PointerEvent) => {
  if (!isDragging || !draftShape.value) return
  
  const rect = surface.value!.getBoundingClientRect()
  const currentX = ev.clientX - rect.left
  const currentY = ev.clientY - rect.top
  
  draftShape.value = {
    ...draftShape.value,
    leftPx: Math.min(dragOrigin.x, currentX),
    topPx: Math.min(dragOrigin.y, currentY),
    rightPx: Math.max(dragOrigin.x, currentX),
    bottomPx: Math.max(dragOrigin.y, currentY)
  }
  
  requestAnimationFrame(renderOverlay)
}

const handlePointerEnd = () => {
  if (!draftShape.value) return
  
  // Finalize the annotation; sending it to the server would also happen here
  const finalized: AnnotationRecord = {
    annotationId: crypto.randomUUID(),
    kind: activeTool.value === 'highlight' ? 'highlight' : 'inline-note',
    bounds: draftShape.value,
    creator: 'current-user',
    createdAt: new Date().toISOString(),
    modifiedAt: new Date().toISOString(),
    resolved: false,
    payload: { colorHex: '#FFD700', opacity: 0.4 }
  }
  
  committedShapes.value.push(finalized)
  draftShape.value = null
  isDragging = false
  renderOverlay()
}

const renderOverlay = () => {
  const ctx = surface.value!.getContext('2d')!
  ctx.clearRect(0, 0, surface.value!.width, surface.value!.height)
  
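  // Draw committed annotations
  for (const ann of committedShapes.value) {
    // Only payloads that carry highlight styling are drawn as filled rectangles
    if (!('colorHex' in ann.payload)) continue
    ctx.fillStyle = ann.payload.colorHex
    ctx.globalAlpha = ann.payload.opacity
    ctx.fillRect(
      ann.bounds.leftPx,
      ann.bounds.topPx,
      ann.bounds.rightPx - ann.bounds.leftPx,
      ann.bounds.bottomPx - ann.bounds.topPx
    )
  }
  ctx.globalAlpha = 1 // reset so the draft outline below is drawn fully opaque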
  
  // Draw the region currently being dragged
  if (draftShape.value) {
    ctx.strokeStyle = '#0066FF'
    ctx.lineWidth = 2
    ctx.setLineDash([4, 4])
    ctx.strokeRect(
      draftShape.value.leftPx,
      draftShape.value.topPx,
      draftShape.value.rightPx - draftShape.value.leftPx,
      draftShape.value.bottomPx - draftShape.value.topPx
    )
    ctx.setLineDash([])
  }
}

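onMounted(() => {
  // Size the canvas backing store to match the device pixel ratio
  const canvas = surface.value
  if (!canvas) return
  const dpr = window.devicePixelRatio || 1
  const rect = canvas.getBoundingClientRect()
  canvas.width = Math.round(rect.width * dpr)
  canvas.height = Math.round(rect.height * dpr)
  // Scale once so drawing code can keep using CSS pixel coordinates
  canvas.getContext('2d')?.scale(dpr, dpr)
})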
</script>
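The pointer handlers above normalize a drag into a rectangle, and the `select` tool would need the inverse operation: finding which committed rectangle sits under the pointer. The sketch below shows both as pure functions. `DragBox`, `boxFromDrag`, and `hitTest` are hypothetical names, and the hit-test behavior (topmost, meaning last drawn, wins) is an assumption rather than something the component defines.

```typescript
interface DragBox {
  leftPx: number
  topPx: number
  rightPx: number
  bottomPx: number
}

// Normalizes a drag so that left <= right and top <= bottom,
// regardless of the direction the pointer moved.
function boxFromDrag(originX: number, originY: number, currentX: number, currentY: number): DragBox {
  return {
    leftPx: Math.min(originX, currentX),
    topPx: Math.min(originY, currentY),
    rightPx: Math.max(originX, currentX),
    bottomPx: Math.max(originY, currentY)
  }
}

// Returns the index of the topmost box containing the point, or -1 if none does.
// Boxes later in the array are drawn later, so they are checked first.
function hitTest(boxes: DragBox[], x: number, y: number): number {
  for (let i = boxes.length - 1; i >= 0; i--) {
    const b = boxes[i]
    if (x >= b.leftPx && x <= b.rightPx && y >= b.topPx && y <= b.bottomPx) return i
  }
  return -1
}
```

Both functions work in the same CSS pixel coordinates the handlers compute from `getBoundingClientRect()`, so they could be called directly from `handlePointerStart`.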

5. Performance Optimization Techniques

5.1 Image Decoding Pipeline

// workers/image-decoder.worker.ts
self.onmessage = async (ev) => {
  const { imageUrl, targetWidth, requestId } = ev.data
  
  const bitmap = await createImageBitmap(
    await fetch(imageUrl).then(r => r.blob()),
    { resizeWidth: targetWidth, resizeQuality: 'medium' }
  )
  
  const offscreen = new OffscreenCanvas(bitmap.width, bitmap.height)
  const ctx = offscreen.getContext('2d')!
  ctx.drawImage(bitmap, 0, 0)
  
  const imageData = ctx.getImageData(0, 0, bitmap.width, bitmap.height)
  
  self.postMessage({ 
    requestId, 
    imageData, 
    width: bitmap.width, 
    height: bitmap.height 
  }, [imageData.data.buffer])
}

5.2 Relieving Memory Pressure

// utils/resource-lru.ts
export class ImageLRU {
  private cache = new Map<string, ImageBitmap>()
  private maxEntries: number
  
  constructor(capacity: number) {
    this.maxEntries = capacity
  }

  get(key: string): ImageBitmap | undefined {
    const hit = this.cache.get(key)
    if (hit) {
      this.cache.delete(key)
      this.cache.set(key, hit) // re-insert so the entry becomes most recently used
    }
    return hit
  }

  put(key: string, value: ImageBitmap) {
    if (this.cache.has(key)) {
      this.cache.get(key)!.close()
      this.cache.delete(key)
    }
    
    while (this.cache.size >= this.maxEntries) {
      const oldest = this.cache.keys().next().value
      if (oldest === undefined) break // cache is empty (e.g. capacity of 0)
      this.cache.get(oldest)?.close()
      this.cache.delete(oldest)
    }
    
    this.cache.set(key, value)
  }

  evictPage(pageIndex: number) {
    const prefix = `page-${pageIndex}-`
    for (const [k, v] of this.cache.entries()) {
      if (k.startsWith(prefix)) {
        v.close()
        this.cache.delete(k)
      }
    }
  }
}
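The eviction order of this cache is easy to get wrong, so here is a small trace. Since `ImageBitmap` is only available in the browser, the sketch uses a generic variant with the same get/put logic over any object that has a `close()` method; `ClosableLRU` and the fake resources are stand-ins introduced for the example.

```typescript
interface Closable {
  close(): void
}

// Generic variant of the LRU above: Map insertion order doubles as recency order.
class ClosableLRU<T extends Closable> {
  private cache = new Map<string, T>()
  private maxEntries: number

  constructor(capacity: number) {
    this.maxEntries = capacity
  }

  get(key: string): T | undefined {
    const hit = this.cache.get(key)
    if (hit) {
      this.cache.delete(key)
      this.cache.set(key, hit) // move to the most recently used position
    }
    return hit
  }

  put(key: string, value: T): void {
    const existing = this.cache.get(key)
    if (existing) {
      if (existing !== value) existing.close()
      this.cache.delete(key)
    }
    while (this.cache.size >= this.maxEntries) {
      const oldest = this.cache.keys().next().value
      if (oldest === undefined) break
      this.cache.get(oldest)?.close()
      this.cache.delete(oldest)
    }
    this.cache.set(key, value)
  }

  keys(): string[] {
    return Array.from(this.cache.keys())
  }
}

// Trace with capacity 2: touching page 0 makes page 1 the eviction candidate.
const closedIds: string[] = []
const makeFake = (id: string): Closable => ({ close: () => { closedIds.push(id) } })

const lru = new ClosableLRU<Closable>(2)
lru.put('page-0-full', makeFake('p0'))
lru.put('page-1-full', makeFake('p1'))
lru.get('page-0-full')                 // page 0 becomes most recently used
lru.put('page-2-full', makeFake('p2')) // evicts and closes page 1
```

After the trace, the cache holds pages 0 and 2, and only the page 1 resource has been closed.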

6. Operational Considerations

6.1 Failure Recovery Pattern

// resilience/circuit-breaker.ts
type CircuitState = 'closed' | 'open' | 'half-open'

export class OcrCircuitBreaker {
  private state: CircuitState = 'closed'
  private failures = 0
  private nextAttempt = 0
  private readonly threshold = 5
  private readonly timeout = 30000

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() < this.nextAttempt) {
        throw new Error('Circuit breaker is OPEN')
      }
      this.state = 'half-open'
    }

    try {
      const result = await operation()
      this.onSuccess()
      return result
    } catch (err) {
      this.onFailure()
      throw err
    }
  }

  private onSuccess() {
    this.failures = 0
    this.state = 'closed'
  }

  private onFailure() {
    this.failures++
    if (this.failures >= this.threshold) {
      this.state = 'open'
      this.nextAttempt = Date.now() + this.timeout
    }
  }
}
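The state transitions of the breaker can be traced without running any network calls by writing them as pure functions over a snapshot. This is an alternative formulation for illustration, not a replacement for the class: `admit` and `settle` are hypothetical names, and the clock is passed in explicitly so the trace is deterministic. The constants mirror `threshold = 5` and `timeout = 30000`.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open'

interface BreakerSnapshot {
  state: BreakerState
  failures: number
  nextAttempt: number
}

const FAILURE_THRESHOLD = 5
const OPEN_TIMEOUT_MS = 30000

// Decides whether a call may proceed. Returns null when the circuit is open
// and the cool-down has not elapsed; otherwise returns the state to run under.
function admit(s: BreakerSnapshot, now: number): BreakerSnapshot | null {
  if (s.state !== 'open') return s
  if (now < s.nextAttempt) return null
  return { ...s, state: 'half-open' }
}

// Applies the outcome of a call, mirroring onSuccess and onFailure.
function settle(s: BreakerSnapshot, ok: boolean, now: number): BreakerSnapshot {
  if (ok) return { ...s, state: 'closed', failures: 0 }
  const failures = s.failures + 1
  if (failures >= FAILURE_THRESHOLD) {
    return { state: 'open', failures, nextAttempt: now + OPEN_TIMEOUT_MS }
  }
  return { ...s, failures }
}
```

Five consecutive failures at time 1000 open the circuit until 31000. A call at 31000 runs in `half-open`: success closes the circuit, while a single failure reopens it for another 30 seconds because the failure count was never reset.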

6.2 Performance Metrics Collection

// telemetry/ocr-metrics.ts
export class OcrMetricsCollector {
  private samples: Map<string, number[]> = new Map()

  record(stage: 'upload' | 'preprocess' | 'inference' | 'render', ms: number) {
    const existing = this.samples.get(stage) ?? []
    existing.push(ms)
    if (existing.length > 100) existing.shift()
    this.samples.set(stage, existing)
  }

  percentile(stage: string, p: number): number {
    const sorted = [...(this.samples.get(stage) ?? [])].sort((a, b) => a - b)
    if (sorted.length === 0) return 0
    const idx = Math.floor((p / 100) * sorted.length)
    return sorted[Math.min(idx, sorted.length - 1)]
  }

  report(): Record<string, { p50: number; p95: number; p99: number }> {
    const report: Record<string, { p50: number; p95: number; p99: number }> = {}
    for (const stage of this.samples.keys()) {
      report[stage] = {
        p50: this.percentile(stage, 50),
        p95: this.percentile(stage, 95),
        p99: this.percentile(stage, 99)
      }
    }
    return report
  }
}
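The index arithmetic in `percentile` is worth a worked example, because it picks the upper of the two middle values for even-sized samples. The sketch below copies the same calculation into a standalone function (`percentileOf` is a name introduced here) so specific inputs can be checked.

```typescript
// Same index rule as OcrMetricsCollector.percentile:
// idx = floor(p / 100 * n), clamped to the last element.
function percentileOf(samples: number[], p: number): number {
  const sorted = samples.slice().sort((a, b) => a - b)
  if (sorted.length === 0) return 0
  const idx = Math.floor((p / 100) * sorted.length)
  return sorted[Math.min(idx, sorted.length - 1)]
}

// Four latency samples in ms, deliberately unsorted.
const latencies = [40, 10, 30, 20]
const median = percentileOf(latencies, 50) // idx = floor(0.5 * 4) = 2, sorted[2] = 30
const p95 = percentileOf(latencies, 95)    // idx = floor(3.8) = 3, sorted[3] = 40
```

With only four samples, p50 is 30 rather than the interpolated 25, and p95 and p99 both collapse to the maximum. The 100-sample window kept by `record` is what makes the higher percentiles meaningful.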

7. Integration Example: Academic Paper Pipeline

<!-- AcademicPaperPipeline.vue -->
<template>
  <div class="pipeline">
    <FileDropZone @uploaded="onDocumentReady" />
    
    <div v-if="pipelineState === 'analyzing'" class="stage-indicator">
      <span>{{ currentStage }}</span>
      <progress :value="progressValue" max="100" />
    </div>
    
    <div v-if="extractedStructure" class="structured-output">
      <header>
        <h1>{{ extractedStructure.title }}</h1>
        <p class="authors">{{ extractedStructure.authors.join('; ') }}</p>
      </header>
      
      <section class="abstract">
        <h2>Abstract</h2>
        <p>{{ extractedStructure.abstract }}</p>
      </section>
      
      <nav class="section-nav">
        <a 
          v-for="sec in extractedStructure.sections" 
          :key="sec.heading"
          @click="scrollToSection(sec.heading)"
        >
          {{ sec.heading }}
        </a>
      </nav>
      
      <article 
        v-for="sec in extractedStructure.sections" 
        :key="sec.heading"
        :id="sec.heading"
      >
        <h3>{{ sec.heading }}</h3>
        <div v-html="renderMarkdown(sec.body)" />
      </article>
      
      <footer class="references">
        <h2>References</h2>
        <ol>
          <li v-for="cite in extractedStructure.citations" :key="cite.id">
            <span v-html="formatCitation(cite)" />
          </li>
        </ol>
      </footer>
    </div>
  </div>
</template>

<script setup lang="ts">
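import { ref } from 'vue'
// Import paths below are assumptions based on the file names used in this guide
import FileDropZone from './FileDropZone.vue'
import { RealtimeChannel } from './realtime/channel'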

interface PaperStructure {
  title: string
  authors: string[]
  abstract: string
  sections: Array<{ heading: string; body: string }>
  citations: Array<{ id: string; raw: string }>
}

const pipelineState = ref<'idle' | 'analyzing' | 'complete' | 'failed'>('idle')
const currentStage = ref('')
const progressValue = ref(0)
const extractedStructure = ref<PaperStructure | null>(null)

const onDocumentReady = async (_file: File, docId: string) => {
  pipelineState.value = 'analyzing'
  
  const channel = new RealtimeChannel('wss://api.example.com')
  channel.establish(docId)
  
  channel.subscribe('stage-update', (data) => {
    const { stage, percent } = data as { stage: string; percent: number }
    currentStage.value = stage
    progressValue.value = percent
  })
  
  channel.subscribe('structure-ready', (data) => {
    extractedStructure.value = data as PaperStructure
    pipelineState.value = 'complete'
    channel.terminate()
  })
}

const scrollToSection = (heading: string) => {
  document.getElementById(heading)?.scrollIntoView({ behavior: 'smooth' })
}

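const renderMarkdown = (raw: string) => {
  // Call a Markdown parser here. The result is bound with v-html, so it must be
  // sanitized first: OCR output is untrusted input.
  return raw
}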

const formatCitation = (ref: { id: string; raw: string }) => {
  return ref.raw
}
</script>
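The `structure-ready` handler above casts the incoming payload with `as PaperStructure`, which trusts the server completely. A runtime type guard is one way to check the shape before rendering. The following is a minimal sketch under the assumption that the payload matches the `PaperStructure` interface field for field; `isPaperStructure` and its helpers are hypothetical names.

```typescript
interface PaperStructure {
  title: string
  authors: string[]
  abstract: string
  sections: Array<{ heading: string; body: string }>
  citations: Array<{ id: string; raw: string }>
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null
}

// True when `value` is an object whose listed keys all hold strings.
function hasStringFields(value: unknown, keys: string[]): boolean {
  return isRecord(value) && keys.every(k => typeof value[k] === 'string')
}

function isPaperStructure(value: unknown): value is PaperStructure {
  if (!isRecord(value)) return false
  const authors = value.authors
  const sections = value.sections
  const citations = value.citations
  return (
    typeof value.title === 'string' &&
    typeof value.abstract === 'string' &&
    Array.isArray(authors) && authors.every(a => typeof a === 'string') &&
    Array.isArray(sections) && sections.every(s => hasStringFields(s, ['heading', 'body'])) &&
    Array.isArray(citations) && citations.every(c => hasStringFields(c, ['id', 'raw']))
  )
}
```

In the handler, a failed check could set `pipelineState` to `'failed'` instead of rendering a partially populated structure.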

Tags: vue3 LightOnOCR-2-1B websocket Canvas API virtual scroll

Posted on June 11 at 16:19