Representation Ensembling for Synergistic Lifelong Learning with Quasilinear Complexity
Joshua T. Vogelstein, Jayanta Dey, Hayden S. Helm, Will LeVine, Ronak D. Mehta, and
11 more authors
In lifelong learning, data are used to improve performance not only on the current task, but also on previously encountered and as-yet-unencountered tasks. In contrast, classical machine learning, which starts from a blank slate (tabula rasa), uses data only for the single task at hand. While typical transfer learning algorithms can improve performance on future tasks, their performance on prior tasks degrades upon learning new tasks (called forgetting). Many recent approaches for continual or lifelong learning have attempted to maintain performance on old tasks while learning new ones. But striving merely to avoid forgetting sets the goal unnecessarily low. The goal of lifelong learning should be not only to improve performance on future tasks (forward transfer) but also to improve performance on past tasks (backward transfer) with any new data. Our key insight is that we can synergistically ensemble representations that were learned independently on disparate tasks to enable both forward and backward transfer. This generalizes ensembling independently learned representations (as in decision forests) and complements ensembling dependent representations (as in gradient boosted trees). Moreover, we ensemble representations in quasilinear space and time. We demonstrate this insight with two algorithms: representation ensembles of (1) trees and (2) networks. Both algorithms demonstrate forward and backward transfer in a variety of simulated and benchmark data scenarios spanning tabular, image, spoken, and adversarial tasks, including CIFAR-100, 5-dataset, Split Mini-Imagenet, Food1k, and the spoken digit dataset. This is in stark contrast to the reference algorithms we compared to, most of which failed to transfer forward, backward, or both, even though many of them require quadratic space or time.
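The sketch below illustrates one way to read the ensembling idea stated in the abstract: each task independently learns its own representation (here, a scikit-learn random forest whose leaf indices serve as the representation), and per-task voters map every representation to every task's labels, so predictions for a given task average over all representations. The class and method names (`RepresentationEnsemble`, `add_task`, `predict`) are illustrative assumptions, not the authors' actual algorithms or API, and the backward-transfer voter update is only noted, not implemented.

```python
# Illustrative sketch of representation ensembling across tasks.
# Assumes scikit-learn forests as representers; not the authors' implementation.
import numpy as np
from sklearn.ensemble import RandomForestClassifier


class RepresentationEnsemble:
    def __init__(self):
        self.representers = {}  # task_id -> forest trained only on that task's data
        self.voters = {}        # (rep_id, task_id) -> voter from rep_id's
                                # representation to task_id's labels

    def _representation(self, rep_id, X):
        # Use the forest's leaf indices as the learned representation of X.
        return self.representers[rep_id].apply(X)

    def add_task(self, task_id, X, y):
        # 1) Learn a new representation independently on the new task only.
        self.representers[task_id] = RandomForestClassifier(
            n_estimators=100).fit(X, y)
        # 2) Train a voter from every representation (old and new) to the new
        #    task's labels, so old representations help the new task
        #    (forward transfer).
        for rep_id in self.representers:
            voter = RandomForestClassifier(n_estimators=10).fit(
                self._representation(rep_id, X), y)
            self.voters[(rep_id, task_id)] = voter
        # In the full setting, voters from the new representation back to the
        # old tasks would also be updated (backward transfer); that requires
        # access to (or a summary of) the old tasks' data and is omitted here.

    def predict(self, task_id, X):
        # Ensemble step: average class posteriors across the voters of all
        # representations that have been paired with the queried task.
        probs = [self.voters[(rep_id, task_id)].predict_proba(
                     self._representation(rep_id, X))
                 for rep_id in self.representers
                 if (rep_id, task_id) in self.voters]
        classes = self.voters[(task_id, task_id)].classes_
        return classes[np.argmax(np.mean(probs, axis=0), axis=1)]
```

Because each task adds one representer and at most one voter per existing task, growth in both storage and computation stays modest as tasks accumulate, which is in the spirit of the quasilinear complexity claim, though the exact accounting belongs to the paper itself.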