联系方式

您当前位置:首页 >> C/C++编程C/C++编程

日期:2024-12-08 09:23

CSE2425, C programming lab, course 2020-2021

Final assignment: Hash map

1 Introduction

In this final assignment you will implement a hash map

1

. A hash map is a data

structure that associates a key with a value (a chunk of data). Most hash maps

are implemented as an array of so-called buckets. A hash function translates

a given key (e.g., a name) to an index in the array, where the corresponding

bucket is stored.

Below we will specify the data structures that you have to provide, and the

functions that you have to implement. This assignment includes two bonus

functions that can raise your score from pass (C) to good (B) to excellent (A).

2 Testing

The first part of the assignment consist of implementing a test set for the hash

map. We have created a number of incorrect hash map implementations. The

goal is to create a test set on which these incorrect implementations fail. When

you have finished creating this test set, you can use this test set to test your own

implementation by copy&pasting it into the my tests of the Hashmap assignment

in Weblab.

3 Hash map structure

Define a type HashMap, which represents the hash map data structure.

Note: Use typedef such that a HashMap structure can be used without using

the struct keyword, i.e. the following construction should be possible:

HashMap *hm;

4 Creating a hash map

1. Implement a function create_hashmap that returns a pointer to the newly

constructed HashMap structure and has parameter

ˆ key_space, a size_t

2

that represents the number of buckets in the hash

map.

1http://en.wikipedia.org/wiki/Hashmap

2http://en.wikipedia.org/wiki/Size_t

1CSE2425, C programming lab, course 2020-2021

This function should allocate enough memory to fit key_space buckets, and the

allocated memory should be zeroed (i.e., NULLed).

2. A hash function maps a string (i.e. an array of chars ending with a null

character) to an index, so it returns a unsigned int. The parameter of a hash

function is simply a

ˆ key, a null-terminated string of characters.

As the hash map can only hold up to key_space buckets, using the hash function

–for example to lookup a mapping– requires some care; apply modulo key_space

to the result such that the value will be in the available bucket range.

3. A default hash function named hash should be implemented. This function

should sum all ASCII values of the characters of the key.

For example:

char *key = "AC";

unsigned int h = hash(key);

=> h = 132

5 Inserting data

Implement a function insert_data that has parameters

ˆ hm, a pointer to a hash map;

ˆ key, a null-terminated string of characters;

ˆ data, a void pointer to the source data;

ˆ resolve_collision, a ResolveCollisionCallback (see below).

The function should store the data pointer and a copy of the key in the bucket

that can be found by applying the hash function on the key. In case of a

collision, i.e. when there already is data with the same key in the hash map, the

resolve_collision function should be called with the the previously stored

data and data as arguments and the returned void pointer should be stored in

the bucket instead.

ResolveCollisionCallback, a pointer to a function that returns a void pointer

and has two parameters:

ˆ old_data, a void pointer to the previously stored data;

ˆ new_data, a void pointer to the data that is being newly inserted.

The function should determine what data is stored in the has map in case of a

key collision by returning the void pointer to the data that is to be stored.

2CSE2425, C programming lab, course 2020-2021

6 Retrieving data

Implement a function get_data that has parameters

ˆ hm, a pointer to a hash map;

ˆ key, a null-terminated string of characters.

The function should return the data pointer (a void pointer) in the hash map

that is associated with the key. If the key is not present in the hash map, NULL

should be returned.

7 Iterator

Implement a function iterate that has parameters

ˆ hm, a pointer to a hash map;

ˆ callback, a pointer to a function that returns nothing (i.e. void) and has

two parameters:

– key, a null-terminated string of characters;

– data, a void pointer to the data.

This function should iterate over the entire hash map. For each data element

it finds, the callback function should be called with the two members of the

element.

8 Removing data

Implement a function remove_data that has parameters

ˆ hm, a pointer to a hash map;

ˆ key, a null-terminated string of characters.

ˆ destroy_data, a DestroyDataCallback (see below).

This function should remove the element in the hash map that is associated with

the given key. If the destroy_data parameter is non-NULL it should be called

with the data pointer of the element as argument. If the key is not present, the

hash map should remain untouched. As the remove_data function cannot fail,

its return type is void.

DestroyDataCallback, a pointer to to a function that returns nothing (i.e.

void) and has one parameter:

ˆ data, a void pointer.

The function should clean up the data (e.g. free allocated memory).

3CSE2425, C programming lab, course 2020-2021

9 Deleting a hash map

Implement a function delete_hashmap that has parameters

ˆ hm, a pointer to the hash map that is to be deleted;

ˆ destroy_data, a DestroyDataCallback (see 8).

The function should deallocate all memory that was allocated by the hash map.

If the destroy_data parameter is non-NULL it should be called for every data

element that is stored in the hash map with the data pointer of the element as

argument.

10 Bonus: New hash function

Implement a function set_hash_function that has parameters

ˆ hm, a pointer to a hash map;

ˆ hash_function, a pointer to a hash function that returns a unsigned int

and a single parameter:

– key, a null-terminated string of characters.

This function should set hash_function as the new hash function of the hash

map hm. Changing the hash function means that a particular key may now be

hashed to different bucket than it was with the previous hash function. The

hash map must be updated (rehashed) to reflect this so that all data in the

hash map can still be retrieved with their corresponding keys.

11 Bonus: Counting Words

Implement a function count_words that has parameters

ˆ stream, a pointer to a FILE.

This function should count the number of times each word in the stream occurs

using the hash map you implemented. A word is defined as a sequence of one or

more alphanumeric characters (case sensitive). You may use fscanf

3

to read a

particular set of characters from a stream but other solutions are also accepted.

The data stored in the hash map should be properly allocated and deallocated,

do not simply store an integer that is cast to a pointer type. The return type

of the function is void.

3http://en.cppreference.com/w/c/io/fscanf

4CSE2425, C programming lab, course 2020-2021

Given the input:

foo bar_, foo!

bar "baz".

foo?

The program should write the following to the standard output:

bar: 2

baz: 1

foo: 3

The order in which the output is printed is not important.

12 Submission

The assignment should be implemented on Weblab.

ˆ All test code should be located in the Testing assignment.

ˆ All hash map code should be located in the Hashmap assignment.

ˆ Put all the word count source code inside the Wordcount assignment;

ˆ If you have implemented the first bonus exercise, add the following macro

to your Hashamp submission:

#define NEW_HASH

ˆ Do not include a main function. (We will use our own test driver, just like

the example test provided.)

Submissions violating the above requirements will be automatically rejected by

the Weblab system.

5


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:821613408 微信:horysk8 电子信箱:[email protected]
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:horysk8