2014 Schedule

  • 05/06/2014 @ 11:00 room BC 420
    Spatial Computing and the Triggered Instruction Control Paradigm
    Joel EMER, Intel in Hudson

    The historical improvements in the performance of general-purpose processors have long provided opportunities for application innovation. Word processing, spreadsheets, desktop publishing, networking and various game genres are just some of the many applications that have arisen because of the increasing capabilities and the versatility of general-purpose processors. Key to these innovations is the fact that general-purpose processors do not predefine the applications that they are going to run.

    Currently, the capabilities of individual general-purpose processors are encountering challenges, such as power limits and diminishing returns in exploiting instruction-level parallelism. As a consequence, a variety of approaches are being employed to address this situation, including adding myriad dedicated accelerators. Unfortunately, while this improves performance, it sacrifices generality. More specifically, the time, difficulty and cost of special-purpose design preclude dedicated logic from serving as a viable avenue for application innovation.

    There has recently been progress in addressing this dilemma between programmability and higher performance via an interesting middle ground between fully general-purpose computing and dedicated logic. Specifically, spatial computing, where the computation is mapped spatially onto an array of small programmable processing elements, addresses many of the cost-related liabilities of dedicated logic and is increasingly being applied to general computation problems. While field-programmable gate arrays (FPGAs) are the best-known spatial computing platform, there are also a number of coarse-grained variants.

    In this talk, we will examine the range of spatial computing alternatives and explore in more depth the concept of triggered instructions, a novel control paradigm for arrays of processing elements (PEs) aimed at exploiting spatial parallelism. Triggered instructions completely eliminate the program counter and allow programs to transition concisely between states without explicit branch instructions. They also allow efficient reactivity to inter-PE communication traffic. The approach provides a unified mechanism to avoid over-serialized execution, essentially achieving the effect of techniques such as dynamic instruction reordering and multithreading, which each require distinct hardware mechanisms in a traditional sequential architecture.
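
    As a very rough flavour of the control paradigm (a toy software model, not the actual hardware or ISA; all names below are invented for illustration), the sketch shows a processing element whose instructions fire whenever their trigger predicate over local predicate bits and input-channel occupancy holds, with no program counter anywhere:

      # Toy model of a triggered-instruction PE (illustration only, not the real ISA).
      # Each instruction is a (trigger, action) pair; any instruction whose trigger
      # holds over the predicate bits and input channels may fire -- no program counter.
      def run_pe(in_a, in_b, steps=100):
          preds = {"have_a": False, "have_b": False}
          regs = {}
          out = []

          def t_read_a(): return not preds["have_a"] and bool(in_a)
          def a_read_a(): regs["a"] = in_a.pop(0); preds["have_a"] = True

          def t_read_b(): return not preds["have_b"] and bool(in_b)
          def a_read_b(): regs["b"] = in_b.pop(0); preds["have_b"] = True

          def t_add(): return preds["have_a"] and preds["have_b"]
          def a_add():
              out.append(regs["a"] + regs["b"])
              preds["have_a"] = preds["have_b"] = False

          instructions = [(t_read_a, a_read_a), (t_read_b, a_read_b), (t_add, a_add)]
          for _ in range(steps):
              ready = [act for trig, act in instructions if trig()]
              if not ready:          # nothing triggered: the PE simply waits
                  break
              ready[0]()             # the scheduler fires one ready instruction
          return out

      print(run_pe([1, 2, 3], [10, 20, 30]))  # -> [11, 22, 33]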

    Organized by Babak Falsafi
  • 10/06/2014 @ 10:00 room BC 420
    System Support for Handling Transiency in Data Centers and Cloud Systems
    Prashant SHENOY, University of Massachusetts

    Modern distributed applications are built on the implicit assumption that the underlying data center servers will be stable and normally available, barring occasional faults. In many emerging scenarios, however, data centers and clouds only provide transient, rather than continuous, availability of their servers. Transiency in modern distributed systems arises in many contexts, such as green data centers powered by intermittent renewable sources, cloud platforms that provide lower-cost spot server instances which can be revoked from their users, or smart data centers that voluntarily curtail their energy usage when signaled by the smart electric grid.

    In this talk, I will argue for treating  transiency as a first-class design concern when  building modern distributed systems and applications.  I will present bounded virtual machine  migration as a general mechanism for handling transiency in a broad class of systems and discuss how such a mechanism can be employed to design higher-level techniques to handle transiency in data centers and cloud systems. I will end with some preliminary results that demonstrate the benefits and feasibility of our approach.

    Organized by Babak Falsafi
  • 10/06/2014 @ 11:15 room BC 420
    Interactive Theorem Proving: The Isabelle Experience
    Tobias NIPKOW, Technical University of Munich

    The dream of machines that prove mathematical theorems automatically goes back to the beginnings of computer science. In the meantime it has become a reality, although we had to replace "automatic" with "interactive" to make it work. Modern Interactive Theorem Provers (or Proof Assistants) are like programming environments, except that they help us write proofs instead of programs. Crucially, they check the proofs for logical correctness and (try to) bridge gaps in the argument by constructing missing subproofs automatically.
    This talk will present an overview of the field from the perspective of the Isabelle proof assistant. I will talk about a range of applications, from the proof of the Kepler Conjecture, via the analysis of programming languages and operating systems, to the role of proof assistants in teaching. A demo of the Isabelle system will help the audience to get a feeling for what is involved when trying to convince a machine that some theorem is true.
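
    As a small taste of what a machine-checked statement and proof look like, here is a minimal example written in the Lean proof assistant (chosen purely for illustration; Isabelle's syntax differs):

      -- a tiny machine-checked fact: reversing a list twice gives the list back
      theorem rev_rev {α : Type} (xs : List α) : xs.reverse.reverse = xs := by
        simp    -- the simplifier closes the goal using the library lemma List.reverse_reverse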

    Organized by Viktor Kuncak
  • 10/06/2014 @ 15:15 room BC 420
    From Software to Circuits: Open Source High-Level Synthesis for FPGA-based Processor/Accelerator Systems
    Jason ANDERSON, University of Toronto

    In this talk, we will describe a high-level synthesis tool, called LegUp, being developed at the University of Toronto. LegUp accepts a standard C program as input and automatically compiles the program to a hybrid architecture containing an FPGA-based MIPS soft processor and custom hardware accelerators that communicate through a standard bus interface. Results show that the tool produces hardware solutions of comparable quality to a commercial high-level synthesis tool. LegUp, along with a set of benchmark C programs, is open source and freely downloadable (www.legup.org), providing a powerful platform that can be leveraged for new research on a wide range of high-level synthesis topics. The tool has been downloaded by over 1000 groups from around the world since its initial release in March 2011. The talk will give an overview of LegUp's current capabilities, as well as the research directions currently underway.

    Organized by Paolo Ienne
  • 11/06/2014 @ 14:00 room BC 420
    Interactive Machine Learning via Adaptive Submodularity
    Andreas KRAUSE, ETH Zürich

    How can people and machines cooperate to gain insight and discover useful information from complex data sets?  A central challenge lies in optimizing the interaction, which leads to difficult sequential decision problems under uncertainty.  In this talk, I will introduce the new concept of adaptive submodularity, generalizing the classical notion of submodular set functions to adaptive policies. We prove that if a problem satisfies this property, a simple adaptive greedy algorithm is guaranteed to be competitive with the optimal policy.
    The concept allows us to recover, generalize and extend existing results in diverse domains including active learning, resource allocation and social network analysis. I will show results on several real-world applications, ranging from interactive content search, through image categorization in citizen science, to biodiversity monitoring via conservation drones.
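
    For intuition, here is a minimal Python sketch of an adaptive greedy policy of the kind described above (the interfaces and names are invented for illustration, not taken from the talk): at each step it selects the item with the largest expected marginal gain given the observations made so far, observes the outcome, and repeats.

      # Minimal sketch of adaptive greedy (illustrative interfaces, not the speaker's code):
      # repeatedly pick the item with the highest conditional expected marginal benefit.
      def adaptive_greedy(items, expected_gain, observe, budget):
          observations = {}                       # item -> observed outcome so far
          for _ in range(budget):
              remaining = [x for x in items if x not in observations]
              if not remaining:
                  break
              # expected_gain(x, observations): expected marginal gain of x,
              # conditioned on everything observed so far
              best = max(remaining, key=lambda x: expected_gain(x, observations))
              observations[best] = observe(best)  # query the item and record its outcome
          return observations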

    Organized by Rüdiger Urbanke
  • 13/06/2014 @ 17:00 room BC 420
    A retrospection of social media analytics
    Nick KOUDAS, University of Toronto

    This talk will provide an overview of the work that we have been conducting over the last ten years on collecting, storing and analyzing social media data. I will present an overview of the BlogScope system, an early social media analytics platform; Grapevine, a social news system; and Peckalytics. In each case, we will outline the main challenges and the technology we developed to address them. These include both algorithmic challenges and system and design optimizations that enabled us to support real-time analysis at scale.

    Organized by Matthias Grossglauser
  • 16/06/2014 @ 16:30 room BC 420
    Social Learning in Decision-Making Groups
    Vivek GOYAL, Boston University

    People have always been influenced by the opinions of their acquaintances. Increasingly, through recommendations and ratings provided on all sorts of goods and services, people are also influenced by the opinions of people who are not even acquaintances. This ubiquity of opinion sharing has intensified interest in the concept of herding (or informational cascades) introduced in 1992. While agents in most previous works have only individualistic goals, this talk focuses on social influence among agents in two collaborative settings.

    We consider agents that perform Bayesian binary hypothesis testing and, in addition to their private signals, observe the decisions of earlier-acting agents. In the first setting, each decision has its own corresponding Bayes risk. Each agent affects the minimum possible Bayes risk for subsequent agents, so an agent may have a mixed objective including her own Bayes risk and the Bayes risks of subsequent agents; we demonstrate the tension she faces between being informative to other agents and being right in her own decisions, and we show that she is more informative to others when she is open-minded. In the second setting, opinions are aggregated by voting, and all agents aim to minimize the Bayes risk of the team's decision. We show that social learning is futile when the agents observe conditionally independent and identically distributed private signals (but not merely conditionally independent private signals) or when the agents require unanimity to make a decision. Our experiments with human subjects suggest that when the opinions of people with equal qualities of information are aggregated by voting, the ballots should be secret. They have also raised questions about rationality and trust.
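
    For reference, the Bayes risk referred to above is the expected cost of a decision rule; for a binary test with decision \hat{H} it can be written (one common cost convention, used here only for illustration) as

      R = c_{10} \, P(H_0) \, P(\hat{H} = H_1 \mid H_0) + c_{01} \, P(H_1) \, P(\hat{H} = H_0 \mid H_1),

    where c_{10} and c_{01} are the costs assigned to false alarms and missed detections, respectively.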

    Organized by Martin Vetterli
  • 18/06/2014 @ 14:00 room BC 420
    State-of-the-art on Cryptanalysis of the SHA Family of Hash Functions
    Yu SASAKI, NTT - JAPAN

    SHA-2 is a family of hash functions which is internationally standardized and widely used all over the world. NIST has determined to continue using SHA-2 even after the release of the new hash function standard SHA-3. Against this background, the security analysis of SHA-2 deserves much attention. In this talk, the history of the cryptanalysis of SHA-2 is reviewed with respect to the following points:

    1. Dedicated cryptanalysis of SHA-2, such as collision attacks, preimage attacks, and various types of distinguishers.
    2. Generic attacks on the class of hash functions that includes the design of SHA-2.
    3. Dedicated cryptanalysis of SHA-2 in keyed usage, such as MAC.

    Organized by Serge Vaudenay
  • 19/06/2014 @ 14:00 room BC 420
    Bitcoin contracts --- digital economy without lawyers?
    Stefan DZIEMBOWSKI, University of Warsaw

    Bitcoin is a digital currency system introduced in 2008 by an anonymous developer using the pseudonym "Satoshi Nakamoto". Despite its mysterious origins, Bitcoin became the first cryptographic currency to be widely adopted --- as of May 2014 the Bitcoin capitalization is over 5 billion euros. Bitcoin owes its popularity mostly to the fact that it has no central authority, the transaction fees are very low, and the amount of coins in circulation is restricted, which in particular means that nobody can "print" money to generate inflation. The financial transactions between the participants are published on a public ledger maintained jointly by the users of the system.

    One of the very interesting, but slightly less known, features of Bitcoin is the fact that it allows for more complicated "transactions" than simple money transfers between the participants: very informally, in Bitcoin it is possible to "deposit" some amount of money in such a way that it can be claimed only under certain conditions. These conditions are written in the form of "Bitcoin scripts" and in particular may involve timing constraints. This property allows the creation of so-called "contracts", where a number of mutually-distrusting parties engage in a Bitcoin-based protocol to jointly perform some task. The security of the protocol is guaranteed purely by the properties of Bitcoin, and no additional trust assumptions are needed. This Bitcoin feature can have several applications in the digital economy, such as assurance contracts, escrow and dispute mediation, rapid micropayments, and multiparty lotteries.
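
    To give a flavour of such a conditional deposit, here is a toy Python sketch of the spending rule of a hash- and time-locked deposit (illustrative pseudocode only, not actual Bitcoin Script; all names are invented):

      import hashlib, time

      # Toy spending rule for a conditional deposit (illustration only, NOT Bitcoin Script):
      # the recipient can claim the coins by revealing a secret whose hash was fixed when
      # the deposit was made, while the depositor can reclaim them after a timeout.
      def can_spend(deposit, spender, secret=None, now=None):
          if now is None:
              now = time.time()
          if spender == deposit["recipient"] and secret is not None:
              return hashlib.sha256(secret).hexdigest() == deposit["secret_hash"]
          if spender == deposit["depositor"] and now >= deposit["timeout"]:
              return True
          return False

      deposit = {"depositor": "alice", "recipient": "bob",
                 "secret_hash": hashlib.sha256(b"open sesame").hexdigest(),
                 "timeout": 1_700_000_000}
      assert can_spend(deposit, "bob", secret=b"open sesame", now=0)   # claim with the secret
      assert can_spend(deposit, "alice", now=2_000_000_000)            # refund after the timeout
      assert not can_spend(deposit, "alice", now=0)                    # no early refund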

    In this talk I will give a short introduction to this area, present some recent results, and highlight future research directions.

    Organized by Serge Vaudenay
  • 19/06/2014 @ 16:30 room BC 420
    Inferring application performance regardless of data completeness
    Alessandra SALA, Bell Labs Ireland

    Modern communication networks, such as online social networks and call networks, give us a unique opportunity to observe, analyze and better understand human behavior. In the telecommunications industry, user data is considered a precious source of information for deriving novel insights to drive the design of future communication platforms. Unfortunately, in the presence of factors such as increasing privacy awareness, restrictions on application programming interfaces (APIs) and constrained sampling strategies, analyzing complete datasets is often unrealistic. For instance, partial network views are essentially the default in telco analytics, as customers typically have frequent contacts with customers of other providers, which naturally cannot be observed; similarly, accurately inferring user activity is the Holy Grail of mobile advertisement and targeted service offerings, because privacy restrictions usually do not allow the logging of complete URLs.

    This talk discusses the potential and risks of mining partial data through the analysis of two specific use cases. In the first use case, we unveil the hidden effects in the evaluation of marketing campaigns in social networks when the spread of information is estimated from a partial view of the network. The proposed methodology is able to quantify the error introduced by network partiality based on a theoretical oracle scenario and to correct for the introduced error to a large extent. In the second use case, we show an approach to mine mobile web traces from heavily truncated URLs and infer user activities with high accuracy. Truncated URLs are stripped of information such as location or purchased products, to mask possibly sensitive end-user data. Furthermore, URLs derived from real web traces are highly noisy because they are dominated by unintentional web traffic such as advertisements, web analytics or third-party scripts. We have developed a statistical model to separate representative URLs characterizing the user activities from unintentional web traffic and demonstrated that our approach classifies user activities with 92% accuracy.

    Organized by Matthias Grossglauser
  • 20/06/2014 @ 10:00 room BC 420
    Saving DVFS: Software Decoupled Access-Execute
    Stefanos KAXIRAS, Uppsala University, Sweden

    The end of Dennard scaling is expected to shrink the range of DVFS in future nodes, limiting the energy savings of this technique. This work evaluates how much we can increase the effectiveness of DVFS by using a software decoupled access-execute approach. Decoupling the data access from execution allows us to apply optimal voltage-frequency selection for each phase and therefore improve energy efficiency over standard coupled execution.

    The underlying insight of our work is that by decoupling access and execute we can take advantage of the memory-bound nature of the access phase and the compute-bound nature of the execute phase to optimize power efficiency, while maintaining good performance. To demonstrate this we built a task-based parallel execution infrastructure consisting of a runtime system to orchestrate the execution, compiler infrastructure to automatically convert programs to Decoupled Access-Execute form, and a modeling infrastructure based on hardware measurements to simulate zero-latency, per-core DVFS.

    Based on real hardware measurements we project that the combination of decoupled access-execute compiled programs and DVFS has the potential to improve EDP by 25% without hurting performance. On memory-bound applications we significantly improve performance due to increased MLP in the access phase and ILP in the execute phase. Furthermore we demonstrate that our method can achieve high performance both in the presence and absence of a hardware prefetcher. We are now expanding our work to target serial programs using a combination of compiler techniques and hardware features.
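
    As a very rough illustration of the decoupling idea (a Python toy, not the compiler transformation used in this work): the memory-bound access phase gathers all operands first, and the compute-bound execute phase then works only on the gathered data, so each phase could in principle run at its own voltage-frequency point.

      # Toy decoupled access-execute split (illustration only): the access phase is
      # memory-bound (many independent loads), the execute phase is compute-bound.
      def coupled(indices, table):
          return [table[i] * table[i] + 1 for i in indices]

      def access_phase(indices, table):
          return [table[i] for i in indices]       # gather operands (memory-bound)

      def execute_phase(operands):
          return [x * x + 1 for x in operands]     # pure computation (compute-bound)

      indices = [3, 0, 2]
      table = [10, 11, 12, 13]
      assert coupled(indices, table) == execute_phase(access_phase(indices, table))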

    Organized by Babak Falsafi
  • 20/06/2014 @ 14:00 room BC 420
    Entity-Centric Data Management
    Philippe CUDRE-MAUROUX, University of Fribourg

    Until recently, structured (e.g., relational) and unstructured (e.g., textual) data were managed very differently: Structured data was queried declaratively using languages such as SQL, while unstructured data was searched using boolean queries over inverted indices. Today, we witness the rapid emergence of entity-centric techniques to bridge the gap between different types of content and manage both unstructured and structured data more effectively. I will start this talk by giving a few examples of entity-centric data management. I will then describe two recent systems that were built in my lab and revolve around entity-centric data management techniques: ZenCrowd, a socio-technical platform that automatically connects HTML pages to semi-structured entities, and TripleProv, a scalable, efficient, and provenance-enabled back-end to manage graphs of entities.

    Organized by Christoph Koch
  • 20/06/2014 @ 15:15 room BC 420
    Microfacet BRDF models
    Lionel SIMONOT, Université de Poitiers, France

    The volume and surface scattering of a material is usually described at a macroscopic scale by its bidirectional reflectance distribution function (BRDF), which specifies how much light is reflected for each pair of incoming and outgoing light directions. It is widely used in remote sensing, materials science, lighting simulation and computer graphics.
    Microfacet BRDF models are based on a geometrical description of the material surface as a collection of microfacets whose dimensions are much greater than the wavelength. The resulting BRDF depends on the microfacet slope distribution, the shadowing-masking function and the BRDF of the individual facets. In the original Cook-Torrance model, and in most cases even today, each facet is assumed to reflect light only in the specular direction. The third factor of the model is then given by the Fresnel relation, which depends on the material's refractive index.
    We will present the generalization of microfacet models to any radiometric response of the individual facets. Some crucial points (radiometric proof of the model, choice of the distribution and of the shadowing-masking function, numerical calculations…) will be discussed. We then propose to consider facets consisting of a flat interface on a Lambertian background; the Oren-Nayar model (a distribution of strictly Lambertian facets) and the Cook-Torrance model (a distribution of strictly specular facets) appear as special cases.
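
    For reference, the standard Cook-Torrance-style microfacet BRDF combines the three factors mentioned above; with purely specular facets it takes the form (up to notation conventions)

      f_r(\omega_i, \omega_o) = \frac{D(\mathbf{h}) \, G(\omega_i, \omega_o) \, F(\mathbf{h} \cdot \omega_i)}{4 \cos\theta_i \cos\theta_o},

    where D is the microfacet slope (normal) distribution, G the shadowing-masking function, F the Fresnel reflectance of a single facet, and \mathbf{h} the half-vector between the incoming and outgoing directions.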

    Organized by Roger Hersch