Java HashMap 핵심 소스 코드 직접 구현하기

이전 글에서는 LinkedList의 핵심 소스 코드를 직접 구현해 보았습니다. 이번에는 Java HashMap의 핵심 소스 코드를 직접 구현해 보겠습니다. HashMap의 원리를 먼저 살펴보겠습니다.

HashMap은 이름에서 알 수 있듯이 hash와 map의 조합입니다. map은 매핑이라는 의미이고, HashMap은 hash를 활용하여 키-값 쌍을 저장하는 자료구조입니다. 구체적인 원리를 살펴보겠습니다.

HashMap 사용 예시

// 데이터 저장
CustomHashMap<String, String> map = new CustomHashMap<>();
map.put("name", "tom");

// 데이터 조회
System.out.println(map.get("name")); // tom 출력

사용 방법은 매우 간단합니다.

HashMap 원리 분석

Object 클래스에는 hashCode() 메서드가 있어 객체의 해시 값을 반환합니다. 이 값을 메모리 주소라고 이해해도 됩니다.

핵심 포인트 1: hashCode() 메서드는 숫자를 반환합니다.

핵심 포인트 2: HashMap 내부에서는 배열을 사용하여 데이터를 저장합니다.

데이터 저장 원리

배열의 크기가 7이라고 가정해보겠습니다.

배열의 인덱스 범위는 [0, 6]입니다.
"name"의 hashCode() 값을 구합니다.
해당 값을 7로 나머지 연산하면 결과 범위는 [0, 6]이 되어 배열 인덱스와 일치합니다.
예를 들어 "name".hashCode() % 7이 2라면, 값 "tom"은 배열의 2번 인덱스에 저장됩니다.

데이터 조회 원리

조회할 키의 hashCode() 값을 구합니다.
배열 크기로 나머지 연산을 수행합니다 (저장할 때와 동일한 위치가 됩니다).
해당 인덱스에서 값을 가져옵니다.

해결해야 할 문제점

해시 충돌 문제:
서로 다른 키가 같은 배열 인덱스에 매핑될 수 있습니다. 예를 들어 9 % 7 == 2, 16 % 7 == 2 모두 2가 됩니다. 이 문제는 연결 리스트로 해결합니다.

배열 용량 초과 문제:
배열이 가득 찰 경우 ArrayList와 마찬가지로扩容이 필요합니다.

해시 값 분포 문제:
직접 hashCode()를 사용하면 해시 충돌이 발생할 확률이 높습니다. JDK HashMap에서는 추가적인 해시 연산을 수행합니다.

결론적으로, HashMap은 배열과 연결 리스트의 조합 구조를 가집니다. 충돌이 발생하면 새 요소를链表의 앞에 추가합니다.

HashMap 핵심 소스 구현

이제 원리를 이해했으니 최소한의 코드로 HashMap을 구현해 보겠습니다. 클래스 이름은 CustomHashMap으로 하고, 배열에 저장할 요소는 Node 클래스로 정의합니다.

// 배열에 저장될 노드 클래스
public static class Node<K, V> {
    K key;          // 키 저장
    V value;        // 값 저장
    int hash;       // 키의 해시 값
    
    // 해시 충돌 시 연결 리스트로 처리
    // 새 요소를链表의 맨 앞에 추가하고, 기존 요소는 next로 연결
    Node<K, V> next;

    public Node(K key, V value, int hash, Node<K, V> next) {
        this.key = key;
        this.value = value;
        this.hash = hash;
        this.next = next;
    }
}

CustomHashMap 클래스의 필드 선언은 다음과 같습니다.

public class CustomHashMap<K, V> {
    // 기본 배열 크기
    private static final int DEFAULT_INITIAL_CAPACITY = 16;

    // 기본 부하율
    // 요소 개수가 배열 크기의 75%에 도달하면扩容 시작
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // 데이터를 저장할 배열
    private Node[] table;

    // 배열에 저장된 요소 개수
    private int size;
}

생성자 구현입니다.

public CustomHashMap() {
    table = new Node[DEFAULT_INITIAL_CAPACITY];
    size = 0;
}

이제 데이터를 저장하는 put() 메서드를 구현합니다.

/**
 * 키-값 쌍 저장
 * 동일 키로多次 저장 시后来的 값이 이전 값을 덮어쓰고 이전 값 반환
 * 예: put("name", "tom"), put("name", "lilei") → "tom" 반환, "lilei" 저장
 */
public V put(K key, V value) {
    // 간단한 구현을 위해 null 키는 지원하지 않음
    if (key == null) {
        throw new RuntimeException("key cannot be null");
    }

    // hashCode 값을 한 번 더 연산하여 해시 값 계산
    int hash = hash(key.hashCode());

    // 해시 값을 배열 크기로 매핑하여 저장 위치 결정
    int index = indexFor(hash, table.length);

    // 해당 위치에 기존 요소가 있는지 확인
    Node<K, V> current = table[index];
    while (current != null) {
        // 해시 값과 키 비교
        if (current.hash == hash && (key == current.key || key.equals(current.key))) {
            V oldValue = current.value;
            current.value = value;
            return oldValue;
        }
        current = current.next;
    }

    // 동일한 키가 없으면 새 노드 추가
    Node<K, V> next = table[index];
    table[index] = new Node<>(key, value, hash, next);

    // 부하율 초과 시扩容 수행
    if (size++ >= (table.length * DEFAULT_LOAD_FACTOR)) {
        resize();
    }

    return null;
}

hash() 함수와 indexFor() 함수의 구현입니다.

// JDK HashMap의 해시 함수 구현
static int hash(int h) {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

// 해시 값을 배열 인덱스로 변환
static int indexFor(int h, int length) {
    // 가장 이해하기 쉬운 방법: length로 나머지 연산
    // 결과 범위는 [0, length - 1]
    h = h > 0 ? h : -h; // 음수 방지
    return h % length;
}

扩容을 수행하는 resize() 메서드입니다.

// 배열 용량 확장
// 요소 개수가 table.length * 0.75 초과 시扩容
// 새 배열 크기는 기존 크기의 2배
private void resize() {
    int newCapacity = table.length * 2;
    Node[] newTable = new Node[newCapacity];

    Node[] oldTable = table;

    // 기존 배열의 모든 요소를 새 배열로 재매핑
    for (int i = 0; i < oldTable.length; i++) {
        Node<K, V> e = oldTable[i];
        oldTable[i] = null;

        // 연결 리스트인 경우 모든 노드 처리
        while (e != null) {
            Node<K, V> next = e.next;

            // 새 배열에서 인덱스 계산
            int newIndex = indexFor(e.hash, newCapacity);

            // 새 위치에 연결
            e.next = newTable[newIndex];
            newTable[newIndex] = e;

            e = next;
        }
    }

    table = newTable;
}

데이터를 조회하는 get() 메서드 구현입니다.

// 키로 값 조회
public V get(K key) {
    if (key == null) {
        throw new RuntimeException("key cannot be null");
    }

    // 키의 해시 값 계산
    int hash = hash(key.hashCode());

    // 배열 위치 계산
    int index = indexFor(hash, table.length);

    // 연결 리스트 순회하며 키 탐색
    Node<K, V> e = table[index];
    while (e != null) {
        if (hash == e.hash && (key == e.key || key.equals(e.key))) {
            return e.value;
        }
        e = e.next;
    }

    return null;
}

CustomHashMap 전체 소스 코드

public class CustomHashMap<K, V> {
    private static final int DEFAULT_INITIAL_CAPACITY = 16;
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private Node[] table;
    private int size;

    public static class Node<K, V> {
        K key;
        V value;
        int hash;
        Node<K, V> next;

        public Node(K key, V value, int hash, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.hash = hash;
            this.next = next;
        }
    }

    public CustomHashMap() {
        table = new Node[DEFAULT_INITIAL_CAPACITY];
        size = 0;
    }

    public V get(K key) {
        if (key == null) {
            throw new RuntimeException("key cannot be null");
        }

        int hash = hash(key.hashCode());
        int index = indexFor(hash, table.length);

        Node<K, V> e = table[index];
        while (e != null) {
            if (hash == e.hash && (key == e.key || key.equals(e.key))) {
                return e.value;
            }
            e = e.next;
        }

        return null;
    }

    public V put(K key, V value) {
        if (key == null) {
            throw new RuntimeException("key cannot be null");
        }

        int hash = hash(key.hashCode());
        int index = indexFor(hash, table.length);

        Node<K, V> e = table[index];
        while (e != null) {
            if (e.hash == hash && (key == e.key || key.equals(e.key))) {
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
            e = e.next;
        }

        Node<K, V> next = table[index];
        table[index] = new Node<>(key, value, hash, next);

        if (size++ >= (table.length * DEFAULT_LOAD_FACTOR)) {
            resize();
        }

        return null;
    }

    private void resize() {
        int newCapacity = table.length * 2;
        Node[] newTable = new Node[newCapacity];

        Node[] src = table;

        for (int j = 0; j < src.length; j++) {
            Node<K, V> e = src[j];
            src[j] = null;

            while (e != null) {
                Node<K, V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            }
        }

        table = newTable;
    }

    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    static int indexFor(int h, int length) {
        h = h > 0 ? h : -h;
        return h % length;
    }
}

테스트 코드

public static void main(String[] args) {
    CustomHashMap<String, String> map = new CustomHashMap<>();
    map.put("name", "tom");
    map.put("age", "23");
    map.put("address", "beijing");
    String oldValue = map.put("address", "shanghai");

    System.out.println(map.get("name"));
    System.out.println(map.get("age"));

    System.out.println("Old Value = " + oldValue);
    System.out.println("New Value = " + map.get("address"));
}

실행 결과

tom
23
Old Value = beijing
New Value = shanghai

위에서 구현한 CustomHashMap으로 기본적인 HashMap 기능을 사용할 수 있습니다. 다만 remove, clear, containsKey, 반복자 등의 기능은 구현하지 않았습니다. 관심 있는 독자분들은 직접 구현해 보시는 것을 추천합니다.

태그: java HashMap hash-table data-structure implementation

6월 22일 19:56에 게시됨

괴물 클럽

Java HashMap 핵심 소스 코드 직접 구현하기