Data structure

There have been 1 completed talk and 1 topic suggestion tagged with data structure.

Related Tags

Completed Talks

Bloom Filters and Other Probabilistic Data Structures

Delivered by Luthfi Mawarid on Friday March 3, 2017

With the advent of big data, the ability to process large volumes of data is becoming increasingly important. For instance, when dealing with large data sets, we may want to perform simple operations such as counting the number of unique elements or checking whether or not an element is present in the set. While there are deterministic data structures, such as hash tables, that can perform these quickly, the sheer size of the data involved makes their use largely impractical and unscalable.Instead, we may want to trade-off some accuracy in our answers in exchange for greater space efficiency and ease of parallelization. For this, we introduce the concept of probabilistic data structures.

In this talk, we will mainly focus on Bloom filters, which are commonly used to test set membership and speed up data access. We will explore its main use cases, its implementation details, and the mathematics behind it. If time permits, I will also talk about the count min-sketch, used for frequency counting, and/or the HyperLogLog counter, used for cardinality estimation.

This talk will assume basic knowledge of probability.

Talk Suggestions

Purely Functional Data Structures

Persistent (or purely function) data structures are those whose operations do not mutate the existing data structure, but rather create a new data structure that might share some state. These data structures are useful because there is no need to copy memory; rather the data structure can be shared among multiple objects. Techniques for making these data structures fast, memory-efficient, or real-time are an active area of research in functional programming.

Possible reference materials for this topic include

Quick links: Google search, arXiv.org search, propose to present a talk

computer science data structure functional programming