Recognition of pathogens relies on families of proteins showing great diversity. Here we construct maximum entropy models of the sequence repertoire, building on recent experiments that provide a nearly exhaustive sampling of the IgM sequences in zebrafish. These models are based solely on pairwise correlations between residue positions but correctly capture the higher order statistical properties of the repertoire. By exploiting the interpretation of these models as statistical physics problems, we make several predictions for the collective properties of the sequence ensemble: The distribution of sequences obeys Zipf’s law, the repertoire decomposes into several clusters, and there is a massive restriction of diversity because of the correlations. These predictions are completely inconsistent with models in which amino acid substitutions are made independently at each site and are in good agreement with the data. Our results suggest that antibody diversity is not limited by the sequences encoded in the genome and may reflect rapid adaptation to antigenic challenges. This approach should be applicable to the study of the global properties of other protein families.
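One of the predictions above, that the distribution of sequences obeys Zipf's law, can be checked on any table of sequence counts by fitting the slope of log(frequency) against log(rank); a slope close to -1 is the Zipf signature. The following is only a minimal sketch of that check, not the analysis used in the work: the function name `zipf_slope` and the synthetic `toy_counts` are assumptions made for illustration, and no real repertoire data are used.

```python
import math


def zipf_slope(counts):
    """Least-squares slope of log(frequency) versus log(rank).

    `counts` is any iterable of positive abundances (hypothetical input,
    e.g. how often each distinct sequence was observed).
    """
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive counts")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic abundances proportional to 1/rank (illustrative only).
toy_counts = [1200.0 / rank for rank in range(1, 51)]
slope = zipf_slope(toy_counts)
```

A plain least-squares fit is the simplest choice here; on real, noisy counts a more careful estimator would probably be preferable.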
Many of the biological networks inside cells can be thought of as transmitting information from their inputs (e.g., the concentrations of proteins or other signaling molecules) to their outputs (e.g., the expression levels of various genes). On the molecular level, the relatively small concentrations of the relevant molecules and the intrinsic randomness of chemical reactions provide sources of noise that set physical limits on this information transmission. Given these
limits, not all networks perform equally well, and maximizing information transmission provides an optimization principle from which we might hope to derive the properties of real regulatory networks. I will discuss the properties of specific small networks that can transmit the maximum information. Concretely, I will show how the form of molecular noise drives predictions not just of the qualitative network topology but also the quantitative parameters for the
input/output relations at the nodes of the network. In an attempt to link these general theoretical considerations to real biological systems, I will illustrate the predictions on the example of transmission of positional information in the early development of the fly embryo. Lastly, I will discuss different approaches for extending a stochastic molecular-level description to larger regulatory systems.
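The quantity being maximized in this abstract is the mutual information between a network's input and output. As a rough illustration only, the sketch below computes it in bits for a discrete input distribution and a noisy input/output table; the function name `mutual_information_bits` and the toy channels are assumptions, not the models discussed in the talk.

```python
import math


def mutual_information_bits(p_input, p_output_given_input):
    """Mutual information I(input; output) in bits.

    p_input[i] is the probability of input state i, and
    p_output_given_input[i][j] is the probability of output j given input i.
    """
    n_out = len(p_output_given_input[0])
    # Marginal output distribution.
    p_output = [0.0] * n_out
    for px, row in zip(p_input, p_output_given_input):
        for j, pyx in enumerate(row):
            p_output[j] += px * pyx
    info = 0.0
    for px, row in zip(p_input, p_output_given_input):
        for j, pyx in enumerate(row):
            if px > 0 and pyx > 0:
                info += px * pyx * math.log2(pyx / p_output[j])
    return info


uniform = [0.5, 0.5]
noiseless = mutual_information_bits(uniform, [[1.0, 0.0], [0.0, 1.0]])
noisy = mutual_information_bits(uniform, [[0.9, 0.1], [0.1, 0.9]])
useless = mutual_information_bits(uniform, [[0.5, 0.5], [0.5, 0.5]])
```

With two equally likely inputs, the noiseless table transmits one full bit, the table with a 10% error rate transmits less, and the table whose output ignores the input transmits nothing.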
1) P.-G. de Gennes, Chemotaxis: the role of internal delays. When exposed to certain chemoattractants, bacteria like Escherichia coli move up the concentration gradient ∇c with a velocity κ∇c. Microscopically, E. coli moves at constant speed when its flagellum is rotating counter-clockwise (ccw) and tumbles when the rotation is clockwise (cw). The lifetime of a ccw interval, τ+, is a function of the concentration c(t′) experienced at earlier times. The corresponding response function was measured long ago by Berg and co-workers. We present here a detailed description of the motion taking place during one ccw interval. This gives an explicit formula relating the chemotactic coefficient κ to the response function; the formula has some surprising features.
2) A. Celani and M. Vergassola, Bacterial strategies for chemotaxis response. Regular environmental conditions allow for the evolution of specifically adapted responses, whereas complex environments usually lead to conflicting requirements upon the organism’s response. A relevant instance of these issues is bacterial chemotaxis, where the evolutionary and functional reasons for the experimentally observed response to chemoattractants remain a riddle. Sensing and motility requirements are in fact optimized by different responses, which strongly depend on the chemoattractant environmental profiles. It is not clear then how those conflicting requirements quantitatively combine and compromise in shaping the chemotaxis response. Here we show that the experimental bacterial response corresponds to the maximin strategy that ensures the highest minimum uptake of chemoattractants for any profile of concentration. We show that the maximin response is the unique one that always outcompetes motile but nonchemotactic bacteria. The maximin strategy is adapted to the variable environments experienced by bacteria, and we explicitly show its emergence in simulations of bacterial populations in a chemostat. Finally, we recast the contrast of evolution in regular vs. complex environments in terms of minimax vs. maximin game-theoretical strategies. Our results are generally relevant to biological optimization principles and provide a systematic possibility to get around the need to know precisely the statistics of environmental fluctuations.
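The maximin idea in this abstract can be illustrated on a small table: for each candidate response, take its worst uptake over all environments, then pick the response whose worst case is highest. The sketch below is an illustration only; the function `maximin_choice`, the response names, and all numbers in `toy_uptake` are hypothetical and are not taken from the paper.

```python
def maximin_choice(payoffs):
    """Return (response, guaranteed_payoff) under the maximin rule.

    `payoffs` maps each response name to a list of payoffs, one per
    environment (hypothetical numbers, same environment order in each list).
    """
    worst_case = {name: min(values) for name, values in payoffs.items()}
    best = max(worst_case, key=worst_case.get)
    return best, worst_case[best]


# Illustrative uptake table: rows are responses, columns are environments.
toy_uptake = {
    "specialist_A": [9.0, 1.0, 2.0],
    "specialist_B": [2.0, 8.0, 1.0],
    "generalist": [4.0, 4.0, 3.0],
}
choice, guaranteed = maximin_choice(toy_uptake)
```

In this toy table each specialist does best in one environment but poorly elsewhere, so the rule selects the generalist, whose uptake never falls below 3.0.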
All animals express microRNA genes, which in turn regulate the expression of the protein repertoire of the organism. These thousands of small RNA genes add another layer of recursive complexity to a seemingly impenetrable molecular circuitry. An important component of microRNA biology has been to quantitate the global scope and extent of their control of gene expression. Analysis methods have concentrated on predicting complementary sequence matches between the small RNA and potential target genes. Although some rules have emerged, functional gene target prediction still remains an unsolved problem. Recent
work illustrates that system-level properties of the cell, e.g. limiting protein machinery and target gene abundance, are determinants of how strongly a gene is regulated by microRNAs or siRNAs. Our overall goal is to develop a mathematical, predictive theory of the gene regulatory programs that control the behaviour of cells.
(*) The seminar will take place at the Centro per le Biotecnologie Molecolari (CBM), via Nizza 52, Torino
Chris Sander, Memorial Sloan Kettering Cancer Center (MSKCC)
We present a novel method for deriving network models from molecular profiles of perturbed cellular systems. The network models aim to predict quantitative outcomes of combinatorial perturbations, such as drug pair treatments or multiple genetic alterations.
Mathematically, we represent the system by a set of nodes (molecular concentrations or cellular processes), a perturbation vector, and an interaction matrix. After perturbation, the system evolves in time according to differential equations with built-in non-linearity, similar to Hopfield networks, capable of representing epistasis and saturation effects. For a particular set of experiments, we derive the interaction matrix by minimizing a composite error function, aiming at accuracy of prediction and simplicity of network structure. To evaluate the predictive potential of the method, we performed drug pair treatment experiments in a human breast cancer cell line (MCF7) with observation of phospho-proteins and cell cycle markers. The best derived network model rediscovered known interactions and contained interesting predictions.
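To make the description above concrete, here is a minimal sketch of one possible model of this kind. The abstract does not give the equations, so everything specific is an assumption: the form dx_i/dt = beta_i * tanh(sum_j w_ij x_j + u_i) - alpha_i x_i, the forward-Euler integration, the function name `simulate`, and the two-node toy network.

```python
import math


def simulate(weights, perturbation, alpha, beta, dt=0.01, steps=5000):
    """Integrate a Hopfield-like network with forward Euler steps.

    Assumed form (not taken from the abstract):
        dx_i/dt = beta_i * tanh(sum_j weights[i][j] * x_j + u_i) - alpha_i * x_i
    weights[i][j] is the effect of node j on node i, and u is the
    perturbation vector. All nodes start at 0 (unperturbed reference state).
    """
    n = len(perturbation)
    x = [0.0] * n
    for _ in range(steps):
        drive = [
            sum(weights[i][j] * x[j] for j in range(n)) + perturbation[i]
            for i in range(n)
        ]
        x = [
            x[i] + dt * (beta[i] * math.tanh(drive[i]) - alpha[i] * x[i])
            for i in range(n)
        ]
    return x


# Toy network: node 0 is perturbed directly and inhibits node 1.
toy_weights = [[0.0, 0.0], [-1.0, 0.0]]
ones = [1.0, 1.0]
single = simulate(toy_weights, [1.0, 0.0], ones, ones)
double = simulate(toy_weights, [2.0, 0.0], ones, ones)
```

Doubling the perturbation on node 0 less than doubles its steady-state response, which is the kind of saturation effect the non-linearity is meant to capture. Fitting the interaction matrix to data by minimizing an error function is not attempted here.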
Possible applications of the combinatorial perturbation approach include the discovery of regulatory interactions, the design of targeted combination therapies, and the engineering of molecular biological networks.
* Seminar at Aula Magna, Politecnico di Torino main building