# Probability

There have been 5 completed talks and 4 topic suggestions tagged with **probability**.

## Related Tags

- algorithm
- statistics
- bloom filter
- space efficiency
- data structure
- information theory
- applied mathematics
- markov chain

## Completed Talks

### Markov Chain Monte Carlo

Delivered by Jacob Jackson on Wednesday October 4, 2017

The talk will introduce Markov chain Monte Carlo methods as a means of sampling from a distribution. The Metropolis-Hastings algorithm will be discussed as well as applications of Markov chain Monte Carlo for Bayesian inference and optimization.

Will expect familiarity with basic probability theory, especially conditional probability.

The slides for this talk are available at Jacob Jackson’s website.

### Metric embeddings and dimensionality reduction

Delivered by Frieda Rong on Friday March 31, 2017

In this talk, we consider embeddings which preserve the pairwise distances of a set of points. It is often useful to find mappings from one high dimensional space to a lower dimensional space that preserve the geometry of the points. One source of applications is in streaming large amounts of data, for which storage is costly and/or impractical. However, the study of such embeddings has also inspired developments in the design of approximation algorithms and compressed sensing.

At the crux of the talk is the remarkable Johnson-Lindenstrauss lemma. This fundamental result shows that for Euclidean spaces, it is possible to achieve significant dimensionality reduction of a set of points while approximately preserving the pairwise distances. An elementary proof will be given, along with subsequent speed improvements with sparse projections and an interesting use of the Fourier transform. We will also discuss applications of the lemma to the fields mentioned above.

### Bloom Filters and Other Probabilistic Data Structures

Delivered by Luthfi Mawarid on Friday March 3, 2017

With the advent of big data, the ability to process large volumes of data is becoming increasingly important. For instance, when dealing with large data sets, we may want to perform simple operations such as counting the number of unique elements or checking whether or not an element is present in the set. While there are deterministic data structures, such as hash tables, that can perform these quickly, the sheer size of the data involved makes their use largely impractical and unscalable.Instead, we may want to trade-off some accuracy in our answers in exchange for greater space efficiency and ease of parallelization. For this, we introduce the concept of probabilistic data structures.

In this talk, we will mainly focus on Bloom filters, which are commonly used to test set membership and speed up data access. We will explore its main use cases, its implementation details, and the mathematics behind it. If time permits, I will also talk about the count min-sketch, used for frequency counting, and/or the HyperLogLog counter, used for cardinality estimation.

This talk will assume basic knowledge of probability.

### Random Graphs and Complex Networks

Delivered by Frieda Rong on Friday November 11, 2016

From the graph of Facebook friendships to the neurons inside your brain, networks are all around us. We’ll go over some surprising *connections* in network theory and see some of the following:

### Information Theory

Delivered by Sidhant Saraogi on Friday October 14, 2016

I will try to provide a brief introduction to Information Theory working towards motivating Shannon's Source Coding Theorem. We will use rather simple examples (for e.g. Repetition Codes) to explain the idea of noisy channels and similarly simple examples to explain the idea behind the theorem and eventually try to prove it for a rather specific example. (if we have the time !)

## Talk Suggestions

### Entropy in Mathematics and Information Theory

Possible reference materials for this topic include

Quick links: Google search, arXiv.org search, propose to present a talk

information theory probability statistics

### Gillespie Algorithm

The Gillespie Algorithm for stochastic equations is used heavily in a number of fields of applied mathematics, in particular computational systems biology.

Possible reference materials for this topic include

Quick links: Google search, arXiv.org search, propose to present a talk

algorithm applied mathematics biology computational biology computational mathematics computer science differential equation probability stochastic equation

### Optimal Stopping Theory and the Secretary Problem

Possible reference materials for this topic include

Quick links: Google search, arXiv.org search, propose to present a talk

decision theory probability statistics

### Random Walks and Self-Avoiding Walks

Possible reference materials for this topic include

Quick links: Google search, arXiv.org search, propose to present a talk