Titles and abstracts of the invited talks

Abstract: T cells orchestrate immune responses by binding peptides derived from pathogen proteins displayed on the surface of infected cells. Host cells also display peptide fragments from the host's own proteins. Incorrectly identifying peptides derived from the body's own proteome as pathogenic can result in autoimmune disease. To minimize autoreactivity, randomly generated T cell receptor (TCR) sequences undergo positive and negative selection processes in the thymus, leading to a selected T cell repertoire that is both diverse and self-tolerant. A model of thymic selection as an extreme value process provides insights into the amino acid compositions of selected TCRs and into TCR/peptide interactions. However, thymic selection is imperfect, and autoreactive T cells exist in healthy individuals. To understand how autoimmunity is nevertheless avoided, without loss of responsiveness to pathogens, we suggest that collective decisions by the T cell population, rather than by individual T cells, determine the immune response. The theory is qualitatively consistent with experimental data and yields a criterion for thymic selection to be adequate for suppressing autoimmunity.

Abstract: In this presentation I review stochastic optimal control theory as a deterministic control of densities. Using the Pontryagin Maximum Principle, the corresponding Hamilton equations for the evolution of the state and the co-state are derived. Classical stochastic optimal control problems are linear in the density (both dynamics and cost) and as a result the evolution of the co-state is independent of the state, and is in fact the Bellman equation. We consider the non-linear generalization. In particular, we show that for a particular cost the Hamilton equations are equivalent to the Schrödinger equation. We thus formulate quantum mechanics as a control problem on densities. We illustrate the difference between classical and quantum control in the case of linear quadratic control problems. We discuss whether and how the optimal control solution can be obtained by sampling, providing an efficient computational method for this new class of control problems.

Abstract: The need to approximate a high-dimensional vector by a linear combination of a small number, K, of column vectors selected from a fixed matrix arises in many contexts of information science, such as compressed sensing, sparse linear regression, and data compression. Such a task is sometimes referred to as the "sparse approximation problem (SAP)". Despite the simplicity of its statement, SAP is highly nontrivial to solve. In this talk, we attempt to characterize the difficulty of solving SAP by drawing the entropy landscape versus the residual sum of squares (RSS). A replica-based analysis of a synthetic model of sparse linear regression indicates that the entropy landscape of SAP is rather smooth and that one can nearly minimize RSS by rapid simulated annealing, although replica symmetry breaking makes it difficult to find the exact minimum-RSS solution. The utility of cross validation for determining an appropriate K value that maximizes the prediction ability is also discussed.
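
As a rough illustration of the kind of search this abstract refers to, the following Python sketch runs simulated annealing over sets of K columns, using the RSS of the least-squares fit as the energy. It is not the speaker's implementation: the function names (rss, anneal_support), the single-column swap move, the geometric cooling schedule, and all default parameter values are assumptions chosen for illustration.

```python
import numpy as np


def rss(A, y, support):
    """RSS of the least-squares fit of y on the columns of A listed in support."""
    cols = A[:, sorted(support)]
    coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
    resid = y - cols @ coef
    return float(resid @ resid)


def anneal_support(A, y, K, n_steps=2000, t_start=1.0, t_end=1e-3, seed=0):
    """Search for a K-column support with small RSS (illustrative sketch only)."""
    n_cols = A.shape[1]
    if not 0 < K < n_cols:
        raise ValueError("K must satisfy 0 < K < number of columns")
    rng = np.random.default_rng(seed)

    # Start from a random support of size K.
    support = set(rng.choice(n_cols, size=K, replace=False).tolist())
    current = rss(A, y, support)
    best_support, best = set(support), current

    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(n_steps - 1, 1))

        # Propose swapping one selected column for one unselected column.
        out_col = int(rng.choice(sorted(support)))
        unused = [j for j in range(n_cols) if j not in support]
        in_col = int(rng.choice(unused))
        proposal = (support - {out_col}) | {in_col}
        cand = rss(A, y, proposal)

        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cand <= current or rng.random() < np.exp(-(cand - current) / temp):
            support, current = proposal, cand
            if current < best:
                best_support, best = set(support), current

    return sorted(best_support), best
```

Calling anneal_support(A, y, K) on a measurement matrix A and target vector y returns the best support visited and its RSS. The search is a heuristic, so the result is not guaranteed to be the global minimum, and the RSS scale determines which temperatures are sensible.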

Abstract: Algorithms map input spaces to output spaces where inputs are possibly affected by fluctuations. Besides run time and memory consumption, an algorithm might be characterized by its sensitivity to the signal in the input and its robustness to input fluctuations. The achievable precision of an algorithm, i.e., the attainable resolution in output space, is determined by its capability to extract predictive information in the input relative to its output. I will present an information theoretic framework for algorithm analysis where an algorithm is characterized as computational evolution of a posterior distribution on the output space. Algorithms are ranked according to the information content of their output distributions. The method allows us to investigate complex data analysis pipelines as they occur in computational neuroscience and neurology as well as in molecular biology. I will demonstrate this design concept for algorithm validation with a statistical analysis of diffusion tensor imaging data and functional MRI data.

Abstract: Physical systems with many degrees of freedom can often be effectively described and understood in terms of a small number of macro-variables. Constructing such coarse-grained representations relies on detailed models of system dynamics that are nonexistent for most biological data. I will introduce an approach for coarse-graining arbitrary high-dimensional data based on a hierarchical decomposition of multivariate information. "Local" interactions are discovered and represented as modules that are then progressively combined to capture global dependencies. For gene expression data, coarse-grained states represent a robust computational phenotype that reflects diverse biological processes and can be used to predict patient-specific cancer survival rates. Other example applications include detection of novel multivariate biomarkers for Alzheimer's disease and improved recovery of fMRI modes.

Abstract: The Expectation Propagation (EP) approach to approximate inference typically provides highly accurate results for expectations and free energies in probabilistic inference. Unfortunately, the application of this method to large systems becomes computationally inefficient. In this talk we will investigate possibilities of simplifying EP in the thermodynamic limit by applying the concept of self-averaging. Using random matrix theory, we obtain a Thouless-Anderson-Palmer (TAP) limit of EP with couplings that are not simply independent random variables but have (weak) dependencies. We develop a theoretical approach for solving TAP equations with such coupling matrices for Ising systems. Finally, we discuss extensions of our approach to dynamical systems. Joint work with Burak Cakmak and Ole Winther.

Abstract: Collective changes in biological groups require all individuals in the group to go through a behavioral change of state. Sometimes these changes are triggered by external perturbations, as in evasi

Abstract: Spatial coupling originated as an engineering construction in the field of error correcting codes for communications. This construction takes an underlying system and enlarges it by coupling many copies of the system along a spatial chain. This is done in such a way that first-order phase transition thresholds remain unchanged but metastable states disappear, which has desirable algorithmic consequences. More recently, spatial coupling has been used as a proof technique to derive lower bounds for thresholds of (uncoupled) random constraint satisfaction problems, and also to prove predictions of replica formulas for free energies and mutual information in (uncoupled) Bayesian inference problems. The talk will review this set of ideas.

Abstract: Evolution is simple if adaptive mutations appear one at a time. However, in large microbial populations many mutations arise simultaneously, resulting in complex dynamics of competing variants. I will discuss recent insights into universal properties of such rapidly adapting populations and compare model predictions to whole genome deep sequencing data of HIV-1 populations at many consecutive time points. Genetic diversity data can further be used to infer the fitness of individuals in a population sample and to predict successful genotypes. We validate these predictions using historical influenza virus sequence data. Successful predictions of the composition of future influenza virus populations could guide strain selection for seasonal influenza vaccines.

Riccardo Zecchina

Abstract:

Abstract: For about a decade my colleagues and I have been trying to use maximum entropy methods to build models for the patterns of activity in networks of neurons, taking advantage of emerging experimental data. Much of our effort has been focused on populations of neurons in the retina, and one might worry that this example is very far from generic, since the correlations in the network are shaped so strongly by the visual stimulus. I’ll describe our first effort to analyze experiments in the hippocampus, where we find that simple pairwise maximum entropy models are surprisingly successful in giving a detailed, quantitative description of higher order structure in the collective activity of the network. Importantly, these models work much better than a model of independent “place cells,” and our results provide hints that the network is encoding more than just place. In a different direction, we are trying to develop a renormalization group approach in which the role of momentum shells is played by eigenmodes of the correlation matrix (PCA meets RG). I will describe some of the analytical development, and show how this can be used to look at real data, going back to the large data sets in the retina, where we have preliminary evidence that the distribution of activity is controlled by a nontrivial fixed point.

Abstract: The evolution of seasonal human influenza viruses is a process far from equilibrium: it involves continuous adaptive changes in viral proteins, which occur in response to host immune challenge. Inferring the genetic and antigenic basis of such processes is an important non-equilibrium inverse problem. In this talk, we show that the adaptive process of influenza is carried by adaptive evolutionary sectors in its surface protein hemagglutinin, which are classes of amino acid sites coupled by broad fitness interactions. We infer these sectors and the underlying fitness model from an asymmetric correlation matrix of mutations on a phylogenetic tree. At the genetic level, the influenza sectors contain known antigenic protein sites, as well as other sites that can be associated with compensatory mutations. Phenotypically, they can be associated with quantitative traits linked to the evolution of viral antigenicity. Based on this example, we discuss the statistical mechanics of fitness interactions in adaptive evolution.

Abstract:

Abstract: We examine the inference of non-coding forces, such as selective pressure from the innate immune system or the energetics of amino acid usage, on genome evolution. Recently, we quantified such problems in the language of statistical physics, where competition between selective and entropic forces drives evolution. As a result of the computational gains of this approach, we have been able to examine whole-transcriptome data more rapidly. In several tumors, one finds the appearance of unusual RNA, from endogenous non-coding RNA, with features similar to viral RNA. We show that a key set of such RNAs, abundantly transcribed in tumors but rarely in normal tissue, is immunogenic, likely activating the innate immune system in the tumor microenvironment, and we speculate on their role in tumor evolution.

Abstract: