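1. System Overview and Goals
Immediate text extraction is central to productivity in document work. Conventional OCR services make the user upload a file and then wait for the result; this system instead builds a real-time pipeline that streams recognition results from the moment a document is uploaded.
LightOnOCR-2-1B is a lightweight end-to-end OCR model with 1 billion parameters that delivers competitive document-recognition performance at a lower inference cost than larger models. Combined with Vue 3's reactivity system, it lets the document-processing workflow run smoothly inside the web browser.
Core values:
- Streaming results: pages are displayed progressively as they are recognized
- Collaboration support: annotations and edit history are synchronized in real time
- Edge device readiness: low resource requirements, optimized for mobile environments
- Lower operating cost: a lightweight model minimizes infrastructure load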
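2. Architecture Design
2.1 Data Flow Design
The data flow between client and server is designed as follows:
Client responsibilities
- Client-side chunked upload of document files
- Receiving server push messages over WebSocket
- Canvas-based viewport rendering and text overlay
- Optimistic updates of user annotation data
Server responsibilities
- Batch inference scheduling for the LightOnOCR-2-1B model
- Upload chunk assembly and per-page queue management
- State sharing across multiple instances via Redis Pub/Sub
- Structured storage and incremental delivery of recognition results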
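The chunk-assembly responsibility can be illustrated with a small bookkeeping sketch. The server stack in this article is Python, so the TypeScript below is only a language-neutral illustration of the logic; `ChunkAssembler` and `ChunkInfo` are hypothetical names, and the sketch tracks byte ranges only, not the bytes themselves.

```typescript
// Hypothetical helper: records which byte ranges of an upload have arrived.
interface ChunkInfo {
  offset: number
  length: number
}

class ChunkAssembler {
  private received = new Map<number, number>() // offset -> length
  private totalBytes: number

  constructor(totalBytes: number) {
    this.totalBytes = totalBytes
  }

  // Record a chunk; a re-sent chunk overwrites the earlier entry, so retries are idempotent.
  add(chunk: ChunkInfo): void {
    if (chunk.offset < 0 || chunk.offset + chunk.length > this.totalBytes) {
      throw new RangeError(`chunk out of bounds at offset ${chunk.offset}`)
    }
    this.received.set(chunk.offset, chunk.length)
  }

  // Complete when the sorted chunks cover [0, totalBytes) without a gap.
  isComplete(): boolean {
    let cursor = 0
    const offsets = [...this.received.keys()].sort((a, b) => a - b)
    for (const offset of offsets) {
      if (offset > cursor) return false // gap before this chunk
      cursor = Math.max(cursor, offset + (this.received.get(offset) ?? 0))
    }
    return cursor === this.totalBytes
  }
}
```

Because completeness is derived from coverage rather than from a chunk counter, chunks may arrive out of order or more than once without corrupting the result.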
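2.2 Technology Stack
| Area | Technology | Purpose |
|---|---|---|
| Frontend | Vue 3.4 + Composition API | UI state management and component composition |
| Frontend | TypeScript 5.x | Type safety |
| Frontend | Native WebSocket | Real-time server-client communication |
| Frontend | OffscreenCanvas | Background rendering worker |
| Backend | Sanic / Uvicorn | High-performance asynchronous server |
| Backend | python-socketio | Socket.IO protocol support (needs a Socket.IO client; a native WebSocket client requires a plain WebSocket endpoint) |
| Backend | LightOnOCR-2-1B (ONNX Runtime) | Inference engine |
| Backend | Celery + Redis | Distributed task queue |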
3. Core Module Implementation
3.1 Document Upload and Progress Tracking
A component that visualizes progress while a large file is uploaded:
<!-- FileDropZone.vue -->
<template>
<div
class="drop-zone"
:class="{ 'drag-over': isDragging }"
@dragenter.prevent="isDragging = true"
@dragover.prevent
@dragleave.prevent="isDragging = false"
@drop.prevent="onFileDrop"
@click="triggerFileInput"
>
<input
ref="fileInput"
type="file"
hidden
accept="application/pdf,image/*"
@change="onInputChange"
/>
<div v-if="!activeTransfer" class="placeholder">
<span>Drop a document here or click to select one</span>
</div>
<div v-else class="transfer-status">
<div class="filename">{{ activeTransfer.filename }}</div>
<div class="progress-track">
<div
class="progress-fill"
:style="{ width: activeTransfer.percentage + '%' }"
></div>
</div>
<span class="percentage">{{ activeTransfer.percentage }}%</span>
</div>
</div>
</template>
<script setup lang="ts">
import { ref } from 'vue'
interface TransferInfo {
filename: string
percentage: number
bytesSent: number
totalBytes: number
}
const emit = defineEmits<{ (e: 'uploaded', file: File, id: string): void }>()
const fileInput = ref<HTMLInputElement | null>(null)
const isDragging = ref(false)
const activeTransfer = ref<TransferInfo | null>(null)
const triggerFileInput = () => fileInput.value?.click()
const onFileDrop = (evt: DragEvent) => {
isDragging.value = false
const dropped = evt.dataTransfer?.files[0]
if (dropped) initiateUpload(dropped)
}
const onInputChange = (evt: Event) => {
const selected = (evt.target as HTMLInputElement).files?.[0]
if (selected) initiateUpload(selected)
}
const initiateUpload = async (file: File) => {
const chunkSize = 256 * 1024 // 256 KB chunks
const docId = crypto.randomUUID()
activeTransfer.value = {
filename: file.name,
percentage: 0,
bytesSent: 0,
totalBytes: file.size
}
// Read the reactive proxy back so progress updates stay reactive
const transfer = activeTransfer.value
try {
for (let offset = 0; offset < file.size; offset += chunkSize) {
const end = Math.min(offset + chunkSize, file.size)
await transmitChunk(docId, file.slice(offset, end), offset, file.size)
transfer.bytesSent = end
transfer.percentage = Math.round((end / file.size) * 100)
}
emit('uploaded', file, docId)
} finally {
// Clear the progress UI whether the upload finished or a chunk failed
activeTransfer.value = null
}
}
const transmitChunk = async (
docId: string,
blob: Blob,
position: number,
total: number
) => {
// In a real implementation, send the chunk with fetch or XMLHttpRequest
console.log(`Chunk at ${position}/${total} for ${docId}`)
}
</script>
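The `transmitChunk` stub above only logs. One possible way to fill it in is sketched below, assuming a hypothetical `PUT /api/uploads/{docId}/chunks` endpoint that accepts a `Content-Range` header; the endpoint, the `FetchLike` type, and the extra `doFetch` and `maxAttempts` parameters are illustrative assumptions, not part of the original design.

```typescript
// Narrow view of fetch so the sketch can run with a stand-in implementation.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: Blob }
) => Promise<{ ok: boolean; status: number }>

// Build a Content-Range value; the end position is inclusive.
// Expects length > 0, which holds for every chunk the upload loop produces.
function contentRange(offset: number, length: number, total: number): string {
  return `bytes ${offset}-${offset + length - 1}/${total}`
}

async function transmitChunk(
  docId: string,
  blob: Blob,
  position: number,
  total: number,
  doFetch: FetchLike,
  maxAttempts = 3
): Promise<void> {
  const url = `/api/uploads/${encodeURIComponent(docId)}/chunks` // hypothetical endpoint
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const res = await doFetch(url, {
      method: 'PUT',
      headers: { 'Content-Range': contentRange(position, blob.size, total) },
      body: blob
    })
    if (res.ok) return
    // Retries cover non-2xx responses only; network errors thrown by doFetch propagate.
    if (attempt === maxAttempts) {
      throw new Error(`chunk at ${position} failed with status ${res.status}`)
    }
  }
}
```

Injecting the fetch function keeps the retry logic testable without a network; in the browser the global `fetch` should be passable as `doFetch`.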
3.2 Real-Time Communication Layer
A socket manager with built-in reconnection logic and message buffering:
// realtime/channel.ts
type MessageHandler = (payload: unknown) => void
interface QueuedMessage {
topic: string
payload: unknown
timestamp: number
}
export class RealtimeChannel {
private ws: WebSocket | null = null
private handlers = new Map<string, Set<MessageHandler>>()
private outboundQueue: QueuedMessage[] = []
private reconnectTimer: ReturnType<typeof setTimeout> | null = null
private attemptCount = 0
private readonly maxRetries = 7
private readonly baseDelay = 500
constructor(private endpoint: string) {}
establish(documentToken: string) {
const url = `${this.endpoint}/stream/${documentToken}`
this.ws = new WebSocket(url)
this.ws.onopen = () => {
this.attemptCount = 0
this.flushQueue()
}
this.ws.onmessage = (ev) => {
try {
const envelope = JSON.parse(ev.data)
this.dispatch(envelope.topic, envelope.data)
} catch {
console.warn('Malformed message received')
}
}
this.ws.onclose = () => this.scheduleReconnect(documentToken)
this.ws.onerror = () => this.ws?.close()
}
subscribe(topic: string, handler: MessageHandler) {
if (!this.handlers.has(topic)) {
this.handlers.set(topic, new Set())
}
this.handlers.get(topic)!.add(handler)
return () => {
this.handlers.get(topic)?.delete(handler)
}
}
publish(topic: string, payload: unknown, urgent = false) {
const message: QueuedMessage = { topic, payload, timestamp: Date.now() }
if (this.ws?.readyState === WebSocket.OPEN) {
this.ws.send(JSON.stringify(message))
return
}
// Not connected: buffer until the next successful connection.
// Assumption: urgent messages jump to the front of the queue so they are flushed first
// (the original handled urgent and normal messages identically).
if (urgent) {
this.outboundQueue.unshift(message)
} else {
this.outboundQueue.push(message)
}
}
private flushQueue() {
while (this.outboundQueue.length > 0) {
const msg = this.outboundQueue.shift()!
this.ws?.send(JSON.stringify(msg))
}
}
private dispatch(topic: string, data: unknown) {
this.handlers.get(topic)?.forEach(fn => {
try { fn(data) } catch (err) { console.error(err) }
})
}
private scheduleReconnect(token: string) {
if (this.attemptCount >= this.maxRetries) return
const jitter = Math.random() * 300
const delay = Math.min(
this.baseDelay * Math.pow(2, this.attemptCount) + jitter,
30000
)
this.reconnectTimer = setTimeout(() => {
this.attemptCount++
this.establish(token)
}, delay)
}
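terminate() {
if (this.reconnectTimer) clearTimeout(this.reconnectTimer)
this.reconnectTimer = null
if (this.ws) {
// Detach onclose first; otherwise closing would schedule a reconnect
this.ws.onclose = null
this.ws.close()
this.ws = null
}
this.handlers.clear()
this.outboundQueue = []
}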
}
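To make the reconnection schedule concrete, the delay formula from `scheduleReconnect` can be pulled out into a pure function. This is a sketch for illustration; `reconnectDelay` is a hypothetical name, and the jitter is made injectable only so the schedule can be printed deterministically.

```typescript
// Same formula as scheduleReconnect: exponential backoff plus jitter, capped at maxDelay.
function reconnectDelay(
  attempt: number,
  baseDelay = 500,
  maxDelay = 30000,
  jitter = Math.random() * 300
): number {
  return Math.min(baseDelay * Math.pow(2, attempt) + jitter, maxDelay)
}

// Delays for the 7 attempts allowed by maxRetries, with jitter fixed at 0.
const schedule = Array.from({ length: 7 }, (_, attempt) =>
  reconnectDelay(attempt, 500, 30000, 0)
)
// schedule: [500, 1000, 2000, 4000, 8000, 16000, 30000]

const totalWaitMs = schedule.reduce((sum, ms) => sum + ms, 0)
// totalWaitMs: 61500
```

With these defaults the channel gives up after roughly a minute of waiting, not counting jitter or the time each connection attempt takes to fail.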
3.3 Virtualized Document Viewer
A renderer that keeps memory usage roughly constant even for documents with thousands of pages:
<!-- VirtualDocumentViewer.vue -->
<template>
<div
ref="scroller"
class="doc-scroller"
@scroll.passive="onScroll"
>
<div :style="{ height: estimatedTotalHeight + 'px' }" class="sizer"></div>
<div
v-for="page in renderedPages"
:key="page.index"
class="page-wrapper"
:style="computePageStyle(page)"
>
<canvas :ref="el => bindCanvas(el, page.index)" />
<div class="text-overlay" v-html="page.ocrHtml" />
</div>
</div>
</template>
<script setup lang="ts">
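import { ref, reactive, computed, watch, onMounted } from 'vue'
interface PageViewport {
index: number
naturalWidth: number
naturalHeight: number
scale: number
ocrHtml: string
}
const props = defineProps<{
pageCount: number
pageSource: (idx: number) => Promise<PageViewport>
}>()
const scroller = ref<HTMLElement | null>(null)
const scrollY = ref(0)
const containerHeight = ref(800)
const averagePageHeight = 1100
const overscan = 2
const estimatedTotalHeight = computed(() =>
props.pageCount * averagePageHeight
)
const visibleRange = computed(() => {
const start = Math.floor(scrollY.value / averagePageHeight)
const visibleCount = Math.ceil(containerHeight.value / averagePageHeight)
return {
start: Math.max(0, start - overscan),
end: Math.min(props.pageCount, start + visibleCount + overscan)
}
})
// Reactive cache so renderedPages re-evaluates when a page finishes loading
const pageCache = reactive(new Map<number, PageViewport>())
const inFlight = new Set<number>()
// Load pages entering the window and evict pages leaving it, outside the computed
watch(visibleRange, ({ start, end }) => {
for (const idx of [...pageCache.keys()]) {
if (idx < start || idx >= end) pageCache.delete(idx)
}
for (let i = start; i < end; i++) {
if (pageCache.has(i) || inFlight.has(i)) continue
inFlight.add(i)
props.pageSource(i)
.then(vp => pageCache.set(i, vp))
.catch(err => console.error(`Failed to load page ${i}`, err))
.finally(() => inFlight.delete(i))
}
}, { immediate: true })
const renderedPages = computed(() => {
const { start, end } = visibleRange.value
const result: PageViewport[] = []
for (let i = start; i < end; i++) {
const page = pageCache.get(i)
if (page) result.push(page)
}
return result
})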
const computePageStyle = (page: PageViewport) => ({
position: 'absolute' as const,
top: `${page.index * averagePageHeight}px`,
width: '100%',
height: `${page.naturalHeight * page.scale}px`
})
const bindCanvas = (el: unknown, index: number) => {
if (!(el instanceof HTMLCanvasElement)) return
// Initialize the 2D canvas context and draw the page image
}
const onScroll = () => {
scrollY.value = scroller.value?.scrollTop ?? 0
}
onMounted(() => {
const ro = new ResizeObserver(entries => {
containerHeight.value = entries[0].contentRect.height
})
if (scroller.value) ro.observe(scroller.value)
})
</script>
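The windowing arithmetic in `visibleRange` is easier to check as a standalone function. The sketch below mirrors that computation; `computeVisibleRange` and `PageRange` are hypothetical names introduced for the example.

```typescript
interface PageRange {
  start: number // first mounted page index
  end: number   // one past the last mounted page index
}

function computeVisibleRange(
  scrollY: number,
  containerHeight: number,
  pageHeight: number,
  overscan: number,
  pageCount: number
): PageRange {
  const first = Math.floor(scrollY / pageHeight)
  const visibleCount = Math.ceil(containerHeight / pageHeight)
  return {
    start: Math.max(0, first - overscan),
    end: Math.min(pageCount, first + visibleCount + overscan)
  }
}

// 5,000-page document, 800 px viewport, scrolled to the top of page index 5
const range = computeVisibleRange(5500, 800, 1100, 2, 5000)
// range: { start: 3, end: 8 }
```

Only five pages are mounted here regardless of document length, which is what keeps DOM size bounded. Because the estimate uses a fixed average page height, the overscan also covers the case where the viewport straddles two pages.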
4. Interactive Annotation System
4.1 Annotation Data Structure
// annotations/schema.ts
export const ANNOTATION_TYPES = ['highlight', 'strikethrough', 'inline-note', 'region-crop'] as const
export type AnnotationKind = typeof ANNOTATION_TYPES[number]
export interface BoundingBox {
pageIndex: number
leftPx: number
topPx: number
rightPx: number
bottomPx: number
}
export interface AnnotationRecord {
annotationId: string
kind: AnnotationKind
bounds: BoundingBox
creator: string
createdAt: string // ISO 8601
modifiedAt: string
resolved: boolean
payload: HighlightPayload | NotePayload | CropPayload
}
interface HighlightPayload {
colorHex: string
opacity: number
}
interface NotePayload {
markdownContent: string
replyTo: string | null
}
interface CropPayload {
extractedImageUrl: string
originalBounds: BoundingBox
}
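Since `payload` is a union without a discriminant field, consumers need a way to tell the variants apart. One option, sketched here under the assumption that each payload type keeps a unique property name, is a set of type guards; the payload interfaces are repeated (with `CropPayload` trimmed) only so the example is self-contained.

```typescript
interface HighlightPayload { colorHex: string; opacity: number }
interface NotePayload { markdownContent: string; replyTo: string | null }
interface CropPayload { extractedImageUrl: string }
type AnnotationPayload = HighlightPayload | NotePayload | CropPayload

// Each guard keys on a property that only one variant declares.
const isHighlight = (p: AnnotationPayload): p is HighlightPayload => 'colorHex' in p
const isNote = (p: AnnotationPayload): p is NotePayload => 'markdownContent' in p

// Hypothetical consumer: after both guards fail, p is narrowed to CropPayload.
function summarize(p: AnnotationPayload): string {
  if (isHighlight(p)) return `highlight ${p.colorHex} @ ${p.opacity}`
  if (isNote(p)) return `note (${p.markdownContent.length} chars)`
  return `crop ${p.extractedImageUrl}`
}
```

An explicit `kind` field on each payload would make the narrowing more robust than property probing, at the cost of changing the schema.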
4.2 Canvas-Based Interaction
<!-- AnnotationSurface.vue -->
<template>
<canvas
ref="surface"
class="annotation-surface"
@pointerdown="handlePointerStart"
@pointermove="handlePointerMove"
@pointerup="handlePointerEnd"
/>
</template>
<script setup lang="ts">
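import { ref, onMounted } from 'vue'
// Assumed path: adjust to wherever annotations/schema.ts lives in the project
import type { AnnotationRecord, BoundingBox } from '../annotations/schema'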
const surface = ref<HTMLCanvasElement | null>(null)
const activeTool = ref<'select' | 'highlight' | 'note'>('select')
const draftShape = ref<BoundingBox | null>(null)
const committedShapes = ref<AnnotationRecord[]>([])
let isDragging = false
let dragOrigin = { x: 0, y: 0 }
const handlePointerStart = (ev: PointerEvent) => {
if (activeTool.value === 'select') return
const rect = surface.value!.getBoundingClientRect()
isDragging = true
dragOrigin = {
x: ev.clientX - rect.left,
y: ev.clientY - rect.top
}
draftShape.value = {
pageIndex: 0, // index of the page currently in view
leftPx: dragOrigin.x,
topPx: dragOrigin.y,
rightPx: dragOrigin.x,
bottomPx: dragOrigin.y
}
}
const handlePointerMove = (ev: PointerEvent) => {
if (!isDragging || !draftShape.value) return
const rect = surface.value!.getBoundingClientRect()
const currentX = ev.clientX - rect.left
const currentY = ev.clientY - rect.top
draftShape.value = {
...draftShape.value,
leftPx: Math.min(dragOrigin.x, currentX),
topPx: Math.min(dragOrigin.y, currentY),
rightPx: Math.max(dragOrigin.x, currentX),
bottomPx: Math.max(dragOrigin.y, currentY)
}
requestAnimationFrame(renderOverlay)
}
const handlePointerEnd = () => {
isDragging = false
if (!draftShape.value) return
// Finalize the annotation; sending it to the server is not shown here
const now = new Date().toISOString()
const finalized: AnnotationRecord = {
annotationId: crypto.randomUUID(),
kind: activeTool.value === 'highlight' ? 'highlight' : 'inline-note',
bounds: draftShape.value,
creator: 'current-user',
createdAt: now,
modifiedAt: now,
resolved: false,
payload: { colorHex: '#FFD700', opacity: 0.4 }
}
committedShapes.value.push(finalized)
draftShape.value = null
renderOverlay()
}
const renderOverlay = () => {
const canvas = surface.value
const ctx = canvas?.getContext('2d')
if (!canvas || !ctx) return
ctx.clearRect(0, 0, canvas.width, canvas.height)
// Draw committed annotations
for (const ann of committedShapes.value) {
if (!('colorHex' in ann.payload)) continue // only highlight-style payloads carry a fill
ctx.fillStyle = ann.payload.colorHex
ctx.globalAlpha = ann.payload.opacity
ctx.fillRect(
ann.bounds.leftPx,
ann.bounds.topPx,
ann.bounds.rightPx - ann.bounds.leftPx,
ann.bounds.bottomPx - ann.bounds.topPx
)
}
ctx.globalAlpha = 1 // reset so the draft outline is drawn fully opaque
// Region currently being drawn
if (draftShape.value) {
ctx.strokeStyle = '#0066FF'
ctx.lineWidth = 2
ctx.setLineDash([4, 4])
ctx.strokeRect(
draftShape.value.leftPx,
draftShape.value.topPx,
draftShape.value.rightPx - draftShape.value.leftPx,
draftShape.value.bottomPx - draftShape.value.topPx
)
ctx.setLineDash([])
}
}
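onMounted(() => {
// Match the canvas backing store to the device pixel ratio (CSS size comes from the stylesheet)
const canvas = surface.value
if (!canvas) return
const dpr = window.devicePixelRatio || 1
const rect = canvas.getBoundingClientRect()
canvas.width = Math.round(rect.width * dpr)
canvas.height = Math.round(rect.height * dpr)
// Scale once so drawing code can keep using CSS pixel coordinates
canvas.getContext('2d')?.scale(dpr, dpr)
})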
</script>
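The `select` tool above returns early without doing anything. A natural building block for it is a hit test that finds the annotation under the pointer. The following is a minimal sketch; `findAnnotationAt`, `Box`, and `Hit` are hypothetical names, and coordinates are assumed to be in the same CSS pixel space used by `handlePointerStart`.

```typescript
interface Box {
  pageIndex: number
  leftPx: number
  topPx: number
  rightPx: number
  bottomPx: number
}

interface Hit {
  annotationId: string
  bounds: Box
}

// Return the topmost (most recently added) annotation on the page that contains the point.
function findAnnotationAt<T extends Hit>(
  annotations: readonly T[],
  pageIndex: number,
  x: number,
  y: number
): T | undefined {
  for (let i = annotations.length - 1; i >= 0; i--) {
    const { bounds } = annotations[i]
    const inside =
      bounds.pageIndex === pageIndex &&
      x >= bounds.leftPx && x <= bounds.rightPx &&
      y >= bounds.topPx && y <= bounds.bottomPx
    if (inside) return annotations[i]
  }
  return undefined
}
```

Iterating from the end matches the paint order in `renderOverlay`, so the annotation drawn last (and therefore visually on top) wins when boxes overlap.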
5. Performance Optimization Techniques
5.1 Image Decoding Pipeline
// workers/image-decoder.worker.ts
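self.onmessage = async (ev: MessageEvent) => {
const { imageUrl, targetWidth, requestId } = ev.data
const response = await fetch(imageUrl)
if (!response.ok) {
// Assumed error shape: report the failure instead of leaving the request pending
self.postMessage({ requestId, error: `HTTP ${response.status}` })
return
}
// Decode and downscale off the main thread
const bitmap = await createImageBitmap(
await response.blob(),
{ resizeWidth: targetWidth, resizeQuality: 'medium' }
)
const { width, height } = bitmap
const offscreen = new OffscreenCanvas(width, height)
const ctx = offscreen.getContext('2d')!
ctx.drawImage(bitmap, 0, 0)
bitmap.close() // release the decoded bitmap once it has been drawn
const imageData = ctx.getImageData(0, 0, width, height)
// Transfer the pixel buffer instead of copying it
self.postMessage({ requestId, imageData, width, height }, [imageData.data.buffer])
}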
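On the main thread, replies from the worker have to be matched to the requests that caused them, which is what the `requestId` field is for. The sketch below shows one way to do that with a pending-promise map; `DecoderClient`, `WorkerLike`, and `DecodeResult` are hypothetical names, and `WorkerLike` is a deliberately narrow stand-in for `Worker` so the example can run without a browser.

```typescript
// Minimal surface of a Worker that the client relies on.
interface WorkerLike {
  postMessage(message: unknown): void
  onmessage: ((ev: { data: any }) => void) | null
}

interface DecodeResult {
  requestId: number
  width: number
  height: number
  imageData?: unknown
}

class DecoderClient {
  private nextId = 1
  private pending = new Map<number, (result: DecodeResult) => void>()
  private worker: WorkerLike

  constructor(worker: WorkerLike) {
    this.worker = worker
    this.worker.onmessage = (ev) => {
      const resolve = this.pending.get(ev.data.requestId)
      if (!resolve) return // unknown or already-settled request
      this.pending.delete(ev.data.requestId)
      resolve(ev.data as DecodeResult)
    }
  }

  // Each call gets a fresh requestId, so replies may arrive in any order.
  decode(imageUrl: string, targetWidth: number): Promise<DecodeResult> {
    const requestId = this.nextId++
    return new Promise((resolve) => {
      this.pending.set(requestId, resolve)
      this.worker.postMessage({ imageUrl, targetWidth, requestId })
    })
  }
}
```

A real `Worker` may need a thin adapter or a cast to satisfy `WorkerLike`, and a production version would also reject on error replies and time out stale requests.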
5.2 Relieving Memory Pressure
// utils/resource-lru.ts
export class ImageLRU {
private cache = new Map<string, ImageBitmap>()
private maxEntries: number
constructor(capacity: number) {
this.maxEntries = capacity
}
get(key: string): ImageBitmap | undefined {
const hit = this.cache.get(key)
if (hit) {
this.cache.delete(key)
this.cache.set(key, hit) // move to the most-recently-used position
}
return hit
}
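put(key: string, value: ImageBitmap) {
const previous = this.cache.get(key)
if (previous) {
if (previous !== value) previous.close() // never close the bitmap being stored
this.cache.delete(key)
}
// Evict least-recently-used entries; the size guard avoids looping on an empty cache
while (this.cache.size > 0 && this.cache.size >= this.maxEntries) {
const oldest = this.cache.keys().next().value as string
this.cache.get(oldest)?.close()
this.cache.delete(oldest)
}
this.cache.set(key, value)
}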
evictPage(pageIndex: number) {
const prefix = `page-${pageIndex}-`
for (const [k, v] of this.cache.entries()) {
if (k.startsWith(prefix)) {
v.close()
this.cache.delete(k)
}
}
}
}
6. Operational Considerations
6.1 Failure Recovery Pattern
// resilience/circuit-breaker.ts
type CircuitState = 'closed' | 'open' | 'half-open'
export class OcrCircuitBreaker {
private state: CircuitState = 'closed'
private failures = 0
private nextAttempt = 0
private readonly threshold = 5
private readonly timeout = 30000
async execute<T>(operation: () => Promise<T>): Promise<T> {
if (this.state === 'open') {
if (Date.now() < this.nextAttempt) {
throw new Error('Circuit breaker is OPEN')
}
this.state = 'half-open'
}
try {
const result = await operation()
this.onSuccess()
return result
} catch (err) {
this.onFailure()
throw err
}
}
private onSuccess() {
this.failures = 0
this.state = 'closed'
}
private onFailure() {
this.failures++
if (this.failures >= this.threshold) {
this.state = 'open'
this.nextAttempt = Date.now() + this.timeout
}
}
}
6.2 Collecting Performance Metrics
// telemetry/ocr-metrics.ts
export class OcrMetricsCollector {
private samples: Map<string, number[]> = new Map()
record(stage: 'upload' | 'preprocess' | 'inference' | 'render', ms: number) {
const existing = this.samples.get(stage) ?? []
existing.push(ms)
if (existing.length > 100) existing.shift()
this.samples.set(stage, existing)
}
percentile(stage: string, p: number): number {
const sorted = [...(this.samples.get(stage) ?? [])].sort((a, b) => a - b)
if (sorted.length === 0) return 0
// Nearest-rank method: the smallest sample with at least p% of samples at or below it
const rank = Math.ceil((p / 100) * sorted.length)
return sorted[Math.min(Math.max(rank, 1), sorted.length) - 1]
}
report(): Record<string, { p50: number; p95: number; p99: number }> {
const report: Record<string, { p50: number; p95: number; p99: number }> = {}
for (const stage of this.samples.keys()) {
report[stage] = {
p50: this.percentile(stage, 50),
p95: this.percentile(stage, 95),
p99: this.percentile(stage, 99)
}
}
return report
}
}
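The collector only stores durations; something still has to measure them. A small wrapper like the one below could time any asynchronous stage and report it. This is a sketch: `measure`, `Stage`, and `Recorder` are hypothetical names, and the recorder is passed as a callback so the example does not depend on the class above.

```typescript
type Stage = 'upload' | 'preprocess' | 'inference' | 'render'
type Recorder = (stage: Stage, ms: number) => void

// Run an async operation and report its wall-clock duration, even when it throws.
async function measure<T>(
  record: Recorder,
  stage: Stage,
  operation: () => Promise<T>
): Promise<T> {
  const startedAt = performance.now()
  try {
    return await operation()
  } finally {
    record(stage, performance.now() - startedAt)
  }
}
```

With an `OcrMetricsCollector` instance named `metrics`, a call might look like `measure((s, ms) => metrics.record(s, ms), 'inference', () => runOcr(page))`, where `runOcr` stands for whatever function performs the request. Recording in `finally` means failed operations are timed too, which keeps slow failures visible in the percentiles.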
7. Integration Example: Academic Paper Pipeline
<!-- AcademicPaperPipeline.vue -->
<template>
<div class="pipeline">
<FileDropZone @uploaded="onDocumentReady" />
<div v-if="pipelineState === 'analyzing'" class="stage-indicator">
<span>{{ currentStage }}</span>
<progress :value="progressValue" max="100" />
</div>
<div v-if="extractedStructure" class="structured-output">
<header>
<h1>{{ extractedStructure.title }}</h1>
<p class="authors">{{ extractedStructure.authors.join('; ') }}</p>
</header>
<section class="abstract">
<h2>Abstract</h2>
<p>{{ extractedStructure.abstract }}</p>
</section>
<nav class="section-nav">
<a
v-for="sec in extractedStructure.sections"
:key="sec.heading"
@click="scrollToSection(sec.heading)"
>
{{ sec.heading }}
</a>
</nav>
<article
v-for="sec in extractedStructure.sections"
:key="sec.heading"
:id="sec.heading"
>
<h3>{{ sec.heading }}</h3>
<div v-html="renderMarkdown(sec.body)" />
</article>
<footer class="references">
<h2>References</h2>
<ol>
<li v-for="citation in extractedStructure.citations" :key="citation.id">
<span v-html="formatCitation(citation)" />
</li>
</ol>
</footer>
</div>
</div>
</template>
<script setup lang="ts">
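import { ref } from 'vue'
// Assumed paths: adjust to the project's actual layout
import FileDropZone from './FileDropZone.vue'
import { RealtimeChannel } from '../realtime/channel'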
interface PaperStructure {
title: string
authors: string[]
abstract: string
sections: Array<{ heading: string; body: string }>
citations: Array<{ id: string; raw: string }>
}
const pipelineState = ref<'idle' | 'analyzing' | 'complete' | 'failed'>('idle')
const currentStage = ref('')
const progressValue = ref(0)
const extractedStructure = ref<PaperStructure | null>(null)
const onDocumentReady = async (_file: File, docId: string) => {
pipelineState.value = 'analyzing'
const channel = new RealtimeChannel('wss://api.example.com')
channel.establish(docId)
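channel.subscribe('stage-update', (data) => {
// Narrow the untyped payload once instead of casting to any per field
const update = data as { stage: string; percent: number }
currentStage.value = update.stage
progressValue.value = update.percent
})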
channel.subscribe('structure-ready', (data) => {
extractedStructure.value = data as PaperStructure
pipelineState.value = 'complete'
channel.terminate()
})
}
const scrollToSection = (heading: string) => {
document.getElementById(heading)?.scrollIntoView({ behavior: 'smooth' })
}
const renderMarkdown = (raw: string) => {
// Call a markdown parser here; the result is injected with v-html, so it must be sanitized
return raw
}
const formatCitation = (ref: { id: string; raw: string }) => {
return ref.raw
}
</script>
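Because `renderMarkdown` and `formatCitation` currently return OCR output unchanged and the template injects it with `v-html`, any markup recognized in a scanned page would be interpreted by the browser. Until a real markdown parser and sanitizer are wired in, a conservative placeholder is to escape the text. The sketch below assumes plain-text rendering is acceptable in the interim; `escapeHtml` is a hypothetical helper.

```typescript
const HTML_ESCAPES: Record<string, string> = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
}

// Replace the five characters that can start markup or break out of an attribute.
function escapeHtml(text: string): string {
  return text.replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch] ?? ch)
}

// Placeholder renderers: safe to pass to v-html, but no markdown formatting yet.
const renderMarkdown = (raw: string): string => escapeHtml(raw)
const formatCitation = (citation: { id: string; raw: string }): string =>
  escapeHtml(citation.raw)
```

Once a markdown library is introduced, its HTML output would still need to go through a sanitizer before reaching `v-html`; escaping the input alone is only sufficient while the output is plain text.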