Signaling-based neural networks for cellular computation

Christian Cuba Samaniego (ccubasam@g.ucla.edu), Andrew Moorman (amoorman@mit.edu), Giulia Giordano (giulia.giordano@unitn.it), Elisa Franco (efranco@seas.ucla.edu)

Department of Biological Engineering, Massachusetts Institute of Technology, Massachusetts 02139, USA; Department of Industrial Engineering, University of Trento, 38123 Povo (TN), Italy; Department of Mechanical and Aerospace Engineering, University of California Los Angeles, California 90095, USA

2020

Abstract— Cellular signaling pathways are responsible for the decision making that sustains life. Most signaling pathways include post-translational modification cycles that process multiple inputs and are tightly interconnected. Here we consider a model for phosphorylation/dephosphorylation cycles, and we show that under some assumptions they can operate as molecular neurons, or perceptrons, that generate sigmoidal-like activation functions by processing sums of inputs with positive and negative weights. We carry out a steady-state and structural stability analysis for single molecular perceptrons as well as for their feedforward interconnections, concluding that interconnected phosphorylation/dephosphorylation cycles may work as multi-layer biomolecular neural networks (BNNs) with the capacity to perform a variety of computations. As an application, we design signaling networks that behave as linear and non-linear classifiers.

I. INTRODUCTION

Living cells sense, process, and respond to a multitude of inputs from the environment, collectively using this information to make complex, life-sustaining decisions. To integrate and process several inputs at once, cells rely on signaling pathways that regulate downstream pathways depending on the input patterns they detect [1]. An important building block of signaling pathways is the protein post-translational modification cycle, of which the best known example is phosphorylation [2]. The rich dynamics and steady-state responses of these signaling pathways have been amply studied and are well understood when they are taken as single-input, single-output isolated systems [3]. However, these pathways not only respond to multiple inputs, but are also tightly interconnected, making it difficult to clearly identify their input-output function and their full potential for signal processing and computation.

In this manuscript, we consider a model for multi-input, single-output phosphorylation/dephosphorylation cycles and describe their ability to operate as molecular perceptrons. Building on previous work by us and others that proposed a strategy to build biomolecular neural networks (BNNs) using molecular sequestration [4], here we demonstrate that, under some assumptions, these natural signaling pathways can operate as multi-input perceptrons as long as their input-output behavior has a tunable threshold. Further, these molecular perceptrons can process several inputs, assigning them positive and negative weights by tuning catalytic rate parameters; the presence of both positive and negative weights is essential for complex classification. We report a structural steady-state and stability analysis of the multi-input phosphorylation/dephosphorylation cycle model considered as a candidate perceptron, as well as a stability analysis for BNNs resulting from the feedforward interconnection of multiple perceptrons. With simulations, we show that by tuning the weights suitably, and by designing the depth of the BNN, it is possible to build linear and non-linear classifiers.

There is a long history of theoretical and experimental work devoted to discovering the computational power of chemical and biological networks under the lens of neural networks. For example, a chemical implementation of neural networks based on reversible reactions was proposed in [5]; networks in living cells were examined in [6], showing that they present features not found in conventional computer-based neural networks; chemical Boltzmann machines have been described in [7]. It has been proposed that transcriptional networks in vitro and in vivo may operate as neural networks by tuning various reaction rate parameters [8], [9]. Nucleic acids have been engineered for experimental demonstrations of molecular neural networks and classifiers in vitro [10], [11], in cell extracts [12], and in cells [13], [11].

Although in practice any biological network may be tuned to work as a neural network [6], the main challenge is to translate an abstract design into an implementation with predictable behavior. A long-standing difficulty, for example, is that of implementing negative weights in a biological neural network that can only admit positive variables (concentrations). In previous work, we demonstrated that molecular sequestration reactions can theoretically implement positive and negative weights, because sequestration performs a subtraction operation between its inputs [4]. Because a salient feature of molecular sequestration is that it generates an ultrasensitive response with a tunable threshold, we reason that other networks with a tunable threshold may also implement molecular neural networks with positive and negative weights. This rationale motivates the present study: we show that when post-translational modification cycles, such as phosphorylation/dephosphorylation, present an ultrasensitive response with a tunable threshold, they can operate as perceptron nodes with positive and negative weights. After providing basic definitions in Section II, in Section III we report a mathematical analysis of phosphorylation/dephosphorylation cycles as perceptrons. In Section IV we describe the design of single-layer linear classifiers, Section V describes multi-layer BNNs, and finally Section VI provides simulation examples of non-linear classifiers.

II. BACKGROUND: ARTIFICIAL NEURAL NETWORKS AND PERCEPTRONS

Artificial neural networks (ANNs), also called perceptron networks, take inspiration from the architecture of the human brain. ANNs consist of multiple, interconnected artificial neurons, or perceptrons, which represent the nodes of the network. Each perceptron performs a different computation of the same functional form, whose outcome depends on the perceptron's inputs and parameters. We adopt the following model (Fig. 1A) for a single perceptron that takes n inputs:

f( Σ_{i=1}^{n} wi xi ),   xi, wi ∈ R,

where f is a non-linear activation function, xi are the perceptron inputs, i = 1, ..., n, and wi is the parameter, or weight, associated with the i-th input. By connecting multiple perceptrons, each with potentially distinct inputs and weights, ANNs achieve the emergent ability to carry out complex computations far more sophisticated than those realizable by an individual perceptron [14]. In the next sections, we discuss how a well-known class of (post-translational) signaling networks can be viewed as molecular perceptrons.
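The perceptron model above can be sketched numerically. A minimal example, assuming a sigmoid activation for f and arbitrary illustrative weights (the activation choice and all values here are our own assumptions, not from the paper):

```python
import numpy as np

def perceptron(x, w, f=lambda s: 1.0 / (1.0 + np.exp(-s))):
    """Single perceptron: apply the activation f to the weighted sum of inputs."""
    return f(np.dot(w, x))

# Two inputs, one positive and one negative weight (illustrative values).
x = np.array([1.0, 2.0])
w = np.array([3.0, -1.0])
y = perceptron(x, w)   # f(3*1 - 1*2) = sigmoid(1) ≈ 0.73
```

The sigmoid keeps the output in (0, 1), mirroring the bounded, sigmoidal-like activation the molecular implementation below produces.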

III. PHOSPHORYLATION-DEPHOSPHORYLATION CYCLES CAN OPERATE LIKE BIOMOLECULAR PERCEPTRONS

Here we argue that post-translational modification cycles such as phosphorylation-dephosphorylation cycles can operate like biomolecular perceptrons (Fig. 1). Let us denote chemical species with uppercase letters, and their concentrations with the corresponding lowercase letters, so that species X has concentration x.

In a phosphorylation-dephosphorylation cycle (scheme in Fig. 1B), a target protein P∗ (inactive state, unphosphorylated) is converted into P (active state, phosphorylated) in the presence of a kinase Z, which adds a phosphate group to P∗ so as to produce P. A phosphatase Y can reverse this process by removing the phosphate group from protein P, thus producing P∗.

Fig. 1. A) Perceptron. B) Biomolecular perceptron. C) Input-output steady-state responses (activation and inhibition).

We model this process through the following biochemical reactions. The kinase Z, the phosphatase Y and the target protein P ∗ are produced at a constant rate, respectively u, v and θ:

∅ --u--> Z,   ∅ --v--> Y,   ∅ --θ--> P∗.

All species are subject to first order degradation with rate constant δ:

Z, Y, P, P∗, C1, C2 --δ--> ∅.

Z and P∗ bind with rate constant a1 to form an intermediate complex C1, which dissociates with rate constant d1; the complex C1 produces P and releases Z with rate parameter k1:

Z + P∗ <==>[a1, d1] C1 --k1--> P + Z.

Analogously, Y and P bind with rate constant a2 to form an intermediate complex C2, which dissociates with rate constant d2; in addition, the complex C2 can produce P∗ and release Y with rate parameter k2:

Y + P <==>[a2, d2] C2 --k2--> P∗ + Y.

Assuming that the reaction kinetics follow the law of mass action, the dynamic evolution of the species concentrations can be expressed by the following system of Ordinary Differential Equations (ODEs):

z˙ = u − a1 z p∗ + (d1 + k1) c1 − δ z,      (1)
p˙∗ = θ − a1 z p∗ + d1 c1 + k2 c2 − δ p∗,   (2)
c˙1 = a1 z p∗ − (d1 + k1) c1 − δ c1,        (3)
y˙ = v − a2 y p + (d2 + k2) c2 − δ y,       (4)
p˙ = −a2 y p + d2 c2 + k1 c1 − δ p,         (5)
c˙2 = a2 y p − (d2 + k2) c2 − δ c2.         (6)
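As a numerical sanity check, the mass-action model (1)–(6) can be integrated directly. A sketch using scipy, with illustrative (not fitted) parameter values chosen by us:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative parameter values (our own assumptions, not from the paper).
u, v, theta, delta = 0.5, 0.3, 1.0, 0.1
a1, d1, k1 = 10.0, 1.0, 5.0
a2, d2, k2 = 10.0, 1.0, 5.0

def rhs(t, s):
    z, pstar, c1, y, p, c2 = s
    return [
        u - a1 * z * pstar + (d1 + k1) * c1 - delta * z,            # z'
        theta - a1 * z * pstar + d1 * c1 + k2 * c2 - delta * pstar, # p*'
        a1 * z * pstar - (d1 + k1) * c1 - delta * c1,               # c1'
        v - a2 * y * p + (d2 + k2) * c2 - delta * y,                # y'
        -a2 * y * p + d2 * c2 + k1 * c1 - delta * p,                # p'
        a2 * y * p - (d2 + k2) * c2 - delta * c2,                   # c2'
    ]

sol = solve_ivp(rhs, (0.0, 200.0), [0.0] * 6, method="LSODA",
                rtol=1e-8, atol=1e-10)
z, pstar, c1, y, p, c2 = sol.y[:, -1]
# The totals z+c1, y+c2, p*+p+c1+c2 relax to u/δ, v/δ, θ/δ respectively.
print(z + c1, y + c2, pstar + p + c1 + c2)
```

Summing the appropriate equations shows that the totals obey linear dynamics, so the printed values can be checked against u/δ, v/δ and θ/δ.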

We can show that the system is positive, which is consistent with the fact that the state variables are concentrations of chemical species, hence cannot take negative values.

Proposition 1: The system (1)–(6) is positive: given a positive initial condition, all the state variables have nonnegative values at all future times.

Proof: The result follows by noticing that, for each state variable x ∈ {z, p∗, c1, y, p, c2}, x˙ ≥ 0 when x = 0, hence the variable cannot become negative.

A. Stability analysis without production and degradation

If we neglect production and degradation reactions (u = θ = v = 0 and δ = 0), assuming that they compensate for each other, then the system simplifies to:

z˙ = −a1 z p∗ + (d1 + k1) c1,     (7)
p˙∗ = −a1 z p∗ + d1 c1 + k2 c2,   (8)
c˙1 = a1 z p∗ − (d1 + k1) c1,     (9)
y˙ = −a2 y p + (d2 + k2) c2,      (10)
p˙ = −a2 y p + d2 c2 + k1 c1,     (11)
c˙2 = a2 y p − (d2 + k2) c2,      (12)

where the sums z + c1, y + c2 and p∗ + p + c1 + c2 are constant quantities; hence the system can be associated with a reduced-order system with three differential equations and three conservation laws. Since each species is involved in at least one conservation law, this is a conservative chemical reaction network. The steady-state value and its global stability can only be assessed within the stoichiometric compatibility class associated with an assigned value of the conserved quantities.

Theorem 1: The system (7)–(12) admits a unique steady state within each stoichiometric compatibility class, and such a steady state is structurally globally asymptotically stable.

Proof: If we set z = ztot − c1 and y = ytot − c2, and neglect the dynamic equations for z and y, from (7)–(12) we get the reduced-order system

p˙∗ = −a1 p∗ (ztot − c1) + d1 c1 + k2 c2,
c˙1 = a1 p∗ (ztot − c1) − (d1 + k1) c1,
p˙ = −a2 p (ytot − c2) + d2 c2 + k1 c1,
c˙2 = a2 p (ytot − c2) − (d2 + k2) c2,

where p∗ + p + c1 + c2 is constant. The Jacobian, with respect to the state (p∗, c1, p, c2), is

     [ −α1    β1 + d1              0      k2              ]
Jr = [  α1   −(β1 + d1 + k1)       0      0               ]
     [  0     k1                  −α2     β2 + d2         ]
     [  0     0                    α2    −(β2 + d2 + k2)  ]

where α1 = a1(ztot − c¯1), α2 = a2(ytot − c¯2), β1 = a1 p¯∗ and β2 = a2 p¯. Jr is singular and weakly column diagonally dominant, with strictly negative diagonal entries and nonnegative off-diagonal entries. Therefore, the system admits the 1-norm

‖[p∗ − p¯∗, c1 − c¯1, p − p¯, c2 − c¯2]⊤‖1 =: ‖x‖1

as a weak Lyapunov function that certifies the structural marginal stability of its steady state (p¯∗, c¯1, p¯, c¯2) [15], [16, Section 4.5.5], regardless of the values of the system parameters. The only eigenvalue on the imaginary axis is λ = 0, associated with the left eigenvector [1, 1, 1, 1], which identifies an invariant subspace for the system. In this invariant subspace, the system admits a polyhedral Lyapunov function whose unit ball is given by the intersection between the shifted diamond centred at the steady state (p¯∗, c¯1, p¯, c¯2) and the hyperplane orthogonal to the vector [1, 1, 1, 1] and passing through (p¯∗, c¯1, p¯, c¯2). Since the system Jacobian is structurally nonsingular within the stoichiometric compatibility class, this Lyapunov function guarantees that the system equilibrium is structurally asymptotically stable. Note that, since the original (non-linearized) system can be rewritten as x˙ = A(x)x, where A is related to the integral of the system Jacobian [15], [17], the system admits the same Lyapunov function at all points, hence a global (not only local) stability analysis can be conducted. In particular, in view of the structural nonsingularity of the Jacobian, the results in [17] also guarantee the uniqueness of the steady state and its structural global asymptotic stability within the stoichiometric compatibility class. See also [18], where the system (7)–(12) is shown to be structurally attractive.

It is worth stressing that Theorem 1 does not rely on the mass-action-kinetics assumption: the same result would hold true for any functional form (and parameter values) of the reaction kinetics, as long as the reaction rates are monotonic functions of the reagent concentrations.

B. Quasi-steady-state approximation

In the presence of production and degradation reactions, the quantities ztot = z + c1, ytot = y + c2, and ptot = p∗ + p + c1 + c2 are not constant. From Equations (1)–(6), it follows that these quantities evolve over time as

z˙tot = u − δ ztot,     (13)
y˙tot = v − δ ytot,     (14)
p˙tot = θ − δ ptot.     (15)

Therefore, at the steady state, z¯ + c¯1 = z¯tot = u/δ, while y¯ + c¯2 = y¯tot = v/δ and p¯∗ + p¯ + c¯1 + c¯2 = p¯tot = θ/δ.

Let us adopt a commonly accepted assumption in enzyme kinetics (cf. Michaelis-Menten kinetics) and consider that the concentrations of the intermediate complexes, c1 and c2, do not change on the time scale of product formation, thus reaching a quasi steady state:

c˙1 = a1(ztot − c1) p∗ − d1 c1 − k1 c1 − δ c1 ≈ 0,     (16)
c˙2 = a2(ytot − c2) p − d2 c2 − k2 c2 − δ c2 ≈ 0.      (17)

Therefore

c1 ≈ p∗/(p∗ + K1) ztot,  with  K1 = (d1 + k1 + δ)/a1,     (18)
c2 ≈ p/(p + K2) ytot,    with  K2 = (d2 + k2 + δ)/a2.     (19)

If we plug the expressions of c1 and c2 into the equation for p˙, we obtain

p˙ = −a2 p K2/(p + K2) ytot + d2 p/(p + K2) ytot + k1 p∗/(p∗ + K1) ztot − δ p,

which can be rewritten as

p˙ = −(k2 + δ) p/(p + K2) ytot + k1 p∗/(p∗ + K1) ztot − δ p.    (20)

Similarly, the equation for p˙∗ can be rewritten as

p˙∗ = θ − (k1 + δ) p∗/(p∗ + K1) ztot + k2 p/(p + K2) ytot − δ p∗.    (21)

C. Steady-state analysis

To perform the steady-state analysis of the phosphorylation-dephosphorylation network, we make the following simplifying assumptions, which are typically verified in practice. Phosphorylation and dephosphorylation are assumed to be much faster than degradation: k1, k2 ≫ δ. Also, the total amount of the target protein is assumed to be much larger than the total amounts of the enzymes: ptot ≫ ztot, ytot. Then, since c1 < ztot ≪ ptot and c2 < ytot ≪ ptot, we can approximate p∗ + p = ptot − c1 − c2 ≈ ptot. Then, by considering k1 + δ ≈ k1 and k2 + δ ≈ k2 (since k1, k2 ≫ δ), we can rewrite equations (20) and (21) as

p˙ = −k2 p/(p + K2) ytot + k1 p∗/(p∗ + K1) ztot − δ p      (22)

and

p˙∗ = θ − k1 p∗/(p∗ + K1) ztot + k2 p/(p + K2) ytot − δ p∗.    (23)

Under these approximations, p + p∗ = ptot and consistently p˙ + p˙∗ = p˙tot.

Also, under the above approximations, by rearranging equation (22) at steady state (i.e., p˙ = 0) and neglecting the term δp, we can write the ratio

r = (k1 z¯tot)/(k2 y¯tot) ≈ [ (p¯/p¯tot) (Kˆ1 + 1 − p¯/p¯tot) ] / [ (Kˆ2 + p¯/p¯tot) (1 − p¯/p¯tot) ],    (24)

where Kˆ1 = K1/p¯tot and Kˆ2 = K2/p¯tot.

In addition, to simplify our analysis, let us assume that Kˆ = Kˆ1 = Kˆ2. When the steady-state amount of active protein is larger than half the total amount, p¯ > p¯tot/2, the ratio is larger than one: r > 1. Conversely, when p¯ < p¯tot/2, the ratio is smaller than one: r < 1. Finally, when p¯ = p¯tot/2, we have r = 1 (the inputs k1 z¯tot and k2 y¯tot are approximately the same in magnitude). In short,

p¯/p¯tot ≥ 1/2 if r ≥ 1,
p¯/p¯tot < 1/2 if r < 1.

We can rewrite this condition as a function of z¯tot and y¯tot as follows:

p¯/p¯tot ≥ 1/2 if k1 z¯tot − k2 y¯tot ≥ 0,
p¯/p¯tot < 1/2 if k1 z¯tot − k2 y¯tot < 0.    (25)

Since at steady state z¯tot = u/δ and y¯tot = v/δ, we can rewrite the condition as a function of the inputs u and v:

p¯/p¯tot ≥ 1/2 if k1 u − k2 v ≥ 0,
p¯/p¯tot < 1/2 if k1 u − k2 v < 0.    (26)

This mechanism allows us to implement both positive and negative weights, thanks to the fact that the input u appears in the condition with a positive sign, while the input v appears with a negative sign. Then, defining the input u = Σ_{i=1}^{m} wi∗ xi, associated with a positive sign, and the input v = Σ_{j=1}^{n−m} wj∗ xj, associated with a negative sign, the effective input (whose sign determines whether the value of p¯ is above or below the threshold p¯tot/2) becomes

Σ_{q=1}^{n} wq xq = Σ_{i=1}^{m} (k1 wi∗) xi + Σ_{j=1}^{n−m} (−k2 wj∗) xj,

which corresponds to the weighted sum of inputs with both positive and negative coefficients. The coefficients wk∗ can be interpreted biologically as production rates of the enzymes Z and Y. Then, the steady-state expression for multiple inputs can be rewritten as

p¯/p¯tot ≥ 1/2 if Σ_{q=1}^{n} wq xq ≥ 0,
p¯/p¯tot < 1/2 if Σ_{q=1}^{n} wq xq < 0.

D. Stability analysis

In this section, we show the structural stability of the reduced-order model including equations (13), (14) and (15), as well as equations (22) and (23). Since p + p∗ = ptot, we can consider a system with only four equations:

z˙tot = u − δ ztot,     (27)
y˙tot = v − δ ytot,     (28)
p˙tot = θ − δ ptot,     (29)
p˙ = −k2 p/(p + K2) ytot + k1 (ptot − p)/(ptot − p + K1) ztot − δ p.    (30)
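The thresholding behavior can be probed numerically on the reduced model: for given inputs u and v, the steady state p¯ of equation (30) solves a scalar equation that a root finder handles directly. A sketch with illustrative parameter values (our own assumptions, not from the paper):

```python
from scipy.optimize import brentq

# Illustrative parameters (assumptions): fast catalysis, small Michaelis constants.
k1 = k2 = 100.0
K1 = K2 = 0.05
delta, theta = 1.0, 1.0   # so ptot = theta/delta = 1

def p_steady(u, v):
    """Equilibrium p of (30) with ztot = u/delta, ytot = v/delta, ptot = theta/delta."""
    ztot, ytot, ptot = u / delta, v / delta, theta / delta
    if ztot == 0.0:
        return 0.0  # no kinase: p relaxes to zero
    g = lambda p: (k1 * (ptot - p) / (ptot - p + K1) * ztot
                   - k2 * p / (p + K2) * ytot - delta * p)
    return brentq(g, 0.0, ptot)   # g(0) > 0 and g(ptot) < 0: unique root

# p_bar crosses ptot/2 when k1*u - k2*v changes sign:
print(p_steady(1.0, 0.5))  # k1*u > k2*v: high steady state
print(p_steady(0.5, 1.0))  # k1*u < k2*v: low steady state
```

The bracket for `brentq` reflects the monotone intersection argument used in the proof of Theorem 2 below: the right-hand side of (30) is positive at p = 0 and negative at p = ptot.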

Theorem 2: The system (27)–(30) admits a unique steady state, which is structurally globally asymptotically stable.

Proof: We start by observing that the system trajectories are asymptotically bounded in the open set S = {0 < ztot < zMAX, 0 < ytot < yMAX, 0 < ptot < pMAX, 0 < p < ptot}, hence at least one steady state must exist in S. In particular, it can be seen that z¯tot = u/δ, y¯tot = v/δ and p¯tot = θ/δ, while p¯ ∈ (0, p¯tot) solves the equilibrium condition

g(p) := k2 p/(p + K2) y¯tot + δ p = k1 (p¯tot − p)/(p¯tot − p + K1) z¯tot =: h(p).

In the interval [0, p¯tot], g(p) is a continuous strictly increasing function with g(0) = 0 and g(p¯tot) > 0, while h(p) is a continuous strictly decreasing function with h(0) > 0 and h(p¯tot) = 0, hence an intersection must exist in view of continuity, and it is unique in view of monotonicity.

Structural local asymptotic stability of the equilibrium can be proven by noticing that the Jacobian matrix, with respect to the state (ztot, ytot, ptot, p), is

    [ −δ    0    0    0       ]
J = [  0   −δ    0    0       ]    (31)
    [  0    0   −δ    0       ]
    [  f1  −f2   f3  −δ − f4  ]

where the quantities f1 = k1 (ptot − p)/(ptot − p + K1), f2 = k2 p/(p + K2), f3 = k1 K1/(ptot − p + K1)^2 ztot and f4 = k1 K1/(ptot − p + K1)^2 ztot + k2 K2/(p + K2)^2 ytot are positive, and J is a lower triangular matrix whose eigenvalues are its negative diagonal entries. Structural asymptotic stability can also be guaranteed by the existence of a polyhedral Lyapunov function [15]: the similarity transformation T J T^{−1}, with T = diag(1, 1, 1, ρ), turns J into a column diagonally dominant matrix with strictly negative diagonal entries, if ρ is chosen small enough; then, the weighted 1-norm ‖x‖1,T = ‖T x‖1, with x = [ztot − z¯tot, ytot − y¯tot, ptot − p¯tot, p − p¯]⊤, is a structural Lyapunov function for the system [16, Proposition 4.58]. Finally, since the Jacobian is structurally nonsingular, the results in [17] guarantee the structural global asymptotic stability of the equilibrium.

Theorem 2 would also hold for any kinetics (not necessarily mass-action or Michaelis-Menten), as long as monotonicity with respect to the species concentrations can be assumed.

Remark 1: Since all the variables of system (27)–(30) are asymptotically bounded in an open set, as shown in the proof of Theorem 2, the following fundamental result from degree theory can also be applied to prove uniqueness of the equilibrium.

Theorem 3: [19], [20] Assume that the system x˙ = f(x), with f : R^n → R^n sufficiently regular, has solutions that are globally uniformly asymptotically bounded in an open set S and admits N < ∞ equilibrium points x¯(i), i = 1, ..., N, each contained in S and such that the determinant of the system Jacobian matrix evaluated at x¯(i) is nonzero: det(Jf(x¯(i))) ≠ 0 for all i. Then, Σ_{i=1}^{N} sign[det(−Jf(x¯(i)))] = 1.

For the Jacobian in Equation (31), det(−J) > 0 structurally; therefore, the equilibrium must be unique.

IV. TUNABLE MOLECULAR LINEAR CLASSIFIERS

In Section III, we have shown that a reaction network based on post-translational modification by phosphorylation and dephosphorylation may work as a tunable thresholding mechanism (see Fig. 1C). In particular, the threshold can be adjusted by the production of kinase and phosphatase species by positive and negative inputs, respectively, per equation (26). Altogether, this circuit can be viewed as a molecular perceptron with a sigmoidal activation function that incorporates positively- and negatively-weighted inputs.

In this section, we consider two case studies to show how the phosphorylation-dephosphorylation cycle, and by extension any other post-translational modification cycle, may be engineered to build bio-molecular linear classifiers in the presence of multiple inputs, and we discuss how adjusting the weights associated with kinetic rates can predictably program the linear classification.

A. A network with two positive weights and one negative weight

We start with the design of a linear classifier with three inputs, as shown in Fig. 2A. We consider two inputs X1 and X2, which produce the kinase Z with rates w1 and w2, as well as a constant production rate w0 for the phosphatase Y. All other chemical reactions are taken as in Section III, as shown in Fig. 2B. Recall that, in our scheme, input species that produce Z are considered positively weighted, while input species that produce Y are negatively weighted. Thus, this reaction network represents a single perceptron with two positive inputs and a single, fixed negative input, or bias.

Using analogous assumptions as in Section III to simplify the model leads to the ODE system:

z˙tot = w1 x1 + w2 x2 − δ ztot,
y˙tot = w0 − δ ytot,
p˙tot = θ − δ ptot,
p˙ = k1 (ptot − p)/(ptot − p + K1) ztot − k2 p/(p + K2) ytot.

At the steady state, z¯tot = (w1/δ) x1 + (w2/δ) x2, y¯tot = w0/δ, and p¯tot = θ/δ. We can find p¯ as a function of the system inputs by following the same procedure as in Section III, which eventually yields (25) with

k1 z¯tot − k2 y¯tot = (w1 k1/δ) x1 + (w2 k1/δ) x2 − (w0 k2/δ).

The terms inside the parentheses are the effective weights of the system. Fig. 2C-D shows how the linear classification of the system changes when we change the production rates w1 and w2: varying w1 rescales the x-axis, varying w2 rescales the y-axis, and varying w0 rigidly shifts the white region (not shown here).
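The resulting decision rule reduces to a sign test on the effective weighted sum, which can be evaluated on a grid of inputs to trace the boundary; the weight and rate values below are illustrative assumptions:

```python
import numpy as np

k1 = k2 = 100.0
delta = 1.0

def classify(x1, x2, w1, w2, w0):
    """Sign of k1*ztot - k2*ytot for the two-positive/one-negative perceptron,
    with ztot = (w1*x1 + w2*x2)/delta and ytot = w0/delta."""
    return k1 * (w1 * x1 + w2 * x2) / delta - k2 * w0 / delta >= 0.0

# Decision map on a grid: the boundary is the line w1*x1 + w2*x2 = (k2/k1)*w0.
x1, x2 = np.meshgrid(np.linspace(0, 1, 101), np.linspace(0, 1, 101))
region = classify(x1, x2, w1=1.0, w2=1.0, w0=1.0)
# Doubling w1 halves the boundary intercept on the x1-axis (axis rescaling).
```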

B. A network with one positive weight and two negative weights

As a second case study, we consider the design of a linear perceptron with one positive and two negative weights, as shown in Fig. 3A (top). Hence, this reaction network features a single input X2, which produces Z with rate w2, and two inputs which produce Y (a constant production w0 and an input X1 associated with rate w1), as shown in Fig. 3A (bottom). This leads to the ODE system:

z˙tot = w2 x2 − δ ztot,     (32)
y˙tot = w0 + w1 x1 − δ ytot,     (33)
p˙tot = θ − δ ptot,     (34)
p˙ = k1 (ptot − p)/(ptot − p + K1) ztot − k2 p/(p + K2) ytot.     (35)

At steady state, z¯tot = (w2/δ) x2, y¯tot = w0/δ + (w1/δ) x1, and p¯tot = θ/δ, while p¯ is expressed by (25) with

k1 z¯tot − k2 y¯tot = (w2 k1/δ) x2 − (w0 k2/δ) − (w1 k2/δ) x1.

Again, varying the production rates of the system has an equivalent effect on the network’s linear classification: w1 and w2 rescale the x- and y-axes, respectively (Fig. 3B-C) and w0 rigidly shifts the white region (not shown here).

In this example, we can observe the importance of incorporating negatively-weighted input values: Changing the sign of w1 from positive to negative reorients the decision boundary and affords this perceptron greater representation power than the previous example. Specifically, a linear classifier with only positive weights (excluding the bias term), like the perceptron in Fig. 2, is incapable of producing the classification in Fig. 3 without negative weights. In the following sections, we will also demonstrate how tuning positive and negative weights in molecular networks can enable us to engineer more complex decision boundaries when multiple nodes are interconnected.

V. MULTI-LAYER BIOMOLECULAR NEURAL NETWORKS

Having shown how post-translational modification cycles can operate as a single biomolecular perceptron, we now study networks obtained by cascading many such perceptrons into a multi-layer system. In the following, we denote as Biomolecular Neural Network (BNN) a network including multiple perceptrons that are organized into layers linked only by feedforward connections (cf. Fig. 4). Each perceptron in a layer processes input species xi,j (i ∈ {1, ..., W}, j ∈ {1, ..., D}) to produce output species yi,j, where W is the width and D is the depth of the network.

BNNs including layered perceptrons based on molecular sequestration have been studied in [4]. Here, we analyze the stability of the equilibria of a BNN composed of layered phosphorylation-dephosphorylation modules.

Proposition 2: Consider a BNN with depth D and width wj for each layer j (hence, W = Σ_j wj). Every equilibrium point is locally asymptotically stable.

Proof: The general Jacobian matrix for a feedforward BNN takes the lower block-triangular form

        [ L1                        ]
JBNN =  [ C1,2   L2                 ]
        [  ⋮             ⋱          ]
        [ C1,D   ⋯   CD−1,D    LD   ]

where
• matrix Cj,k, with j, k ∈ {1, ..., D} and j < k, describes how the variables in layer j affect the dynamics of the variables in layer k;
• matrix Lj represents the autonomous dynamics of layer j and has the block-diagonal form Lj = blockdiag{P1,j, P2,j, ..., Pmj,j}, where Pi,j denotes the Jacobian matrix of the biomolecular perceptron in position i, j.

Since the Jacobian JBNN is a block-triangular matrix, its characteristic polynomial ψBNN(s) is the product of the characteristic polynomials of its diagonal blocks. Also, since each diagonal block Lj is block-diagonal, its characteristic polynomial is in turn the product of the characteristic polynomials ψi,j(s) of the perceptrons Pi,j included in the corresponding layer. Therefore,

ψBNN(s) = Π_{j=1}^{D} Π_{i=1}^{mj} ψi,j(s).

Since the generic phosphorylation-dephosphorylation perceptron is structurally asymptotically stable, as shown in Theorem 2, all the eigenvalues of the matrix JBNN have negative real part. Therefore, the BNN is asymptotically stable around any equilibrium.
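The spectral fact used in the proof (the eigenvalues of a block lower-triangular matrix are those of its diagonal blocks, so feedforward coupling cannot destabilize a cascade of stable layers) can be illustrated with a toy numerical example; the block values below are chosen arbitrarily:

```python
import numpy as np

# Two stable "layer" blocks and an arbitrary feedforward coupling block.
L1 = np.array([[-1.0, 0.5], [0.0, -2.0]])
L2 = np.array([[-3.0, 1.0], [0.2, -4.0]])
C12 = np.array([[5.0, -7.0], [2.0, 3.0]])   # coupling, layer 1 -> layer 2

J = np.block([[L1, np.zeros((2, 2))],
              [C12, L2]])

eig_J = np.sort_complex(np.linalg.eigvals(J))
eig_blocks = np.sort_complex(np.concatenate([np.linalg.eigvals(L1),
                                             np.linalg.eigvals(L2)]))
# The coupling block does not affect the spectrum of the cascade:
print(np.allclose(eig_J, eig_blocks))   # True
print((eig_J.real < 0).all())           # True: the cascade is stable
```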

VI. MOLECULAR NON-LINEAR CLASSIFIERS

In the previous section we demonstrated the stability of any feedforward BNN that includes a cascade of perceptrons based on post-translational modification cycles. In this section, we use three simulation examples to illustrate how feedforward BNNs may be used to build non-linear classifiers, highlighting their versatility and computational power.

Fig. 4. Biomolecular Neural Network. A) The schematic diagram for a BNN of depth D = 4 with two input species, X1 and X2, six biomolecular perceptrons in circles, and one output species O1. Black arrows represent feedforward connections between neighboring layers, and red lines feedforward connections spanning more than one layer. B) A simple biological implementation of each node based on phosphorylation reactions.

A. XNOR and XOR bio-molecular networks

Exclusive logical or (XOR) classification is a quintessential problem in the field of machine learning. It is the only basic logical operation which is not linearly separable, and it has consequently motivated the use of multi-layer networks: unlike single perceptrons, networks with two or more layers can perform non-linear classifications (e.g., XOR), provided appropriate weights.

For example, by adjusting its parameter values, the two-layer, three-node network shown in Fig. 5A can implement either XNOR or XOR logical functions. To conserve space, we do not show the detailed chemical reactions for this network, but describe the system with the following ODEs, obtained by following the same steps as in the previous sections:

z˙1tot = w1,1 x1 + w1,2 x2 − δ z1tot,
y˙1tot = w1,0 − δ y1tot,
z˙2tot = w2,0 − δ z2tot,
y˙2tot = w2,1 x1 + w2,2 x2 − δ y2tot,
z˙3tot = w3,1 p1 + w3,2 p2 − δ z3tot,
y˙3tot = w3,0 − δ y3tot,
p˙itot = θ − δ pitot,
p˙i = k1 (pitot − pi)/(pitot − pi + K) zitot − k2 pi/(pi + K) yitot,

with i = 1, 2, 3.

Notably, the network topology above may carry out either classification by only modifying the bias weights. In Fig. 5, we display the intermediate transfer functions given by perceptrons p1 and p2 (Fig. 5B and C, left and middle) for each non-linear classifier. To obtain an XNOR classification, we require p1 and p2 to have oppositely signed weights applied to x1 and x2 (positive weights for p1 and negative for p2) and a larger magnitude of bias for w1,0 than w2,0 (see Fig. 5B). Under these conditions, the entire network represents an XNOR function: the output of p3 is maximal when x1 and x2 both take on low or high values. In contrast, if we decrease the magnitude of bias w1,0 and increase biases w2,0 and w3,0, the same network topology can perform an XOR operation, which has the opposite output (see Fig. 5C).

This case study illustrates an important general property of BNNs: by changing the weights of each node, the classification boundary, represented by the white area of each plot, can be adjusted in a predictable way to represent highly dissimilar decision-making functions. Biologically, such changes are tantamount to adjustments of the rate parameters of the network's constituent reactions; no modifications to its underlying topology are required.

B. More complex decision-making

Although the previous classifier is able to deliver a non-linear function of its inputs, its network topology is incapable of creating a closed decision boundary. To achieve this form of classification, we add a perceptron to the first layer of the network, creating a four-node network (Fig. 6). We then optimize the weights of p1, p2 and p3 to enclose a central region of the classification space, as shown in Fig. 6B. Only where their maximal values (i.e., blue regions) overlap is the threshold value of the fourth node exceeded. This "activated" region forms a closed boundary where the output of the entire network is high. Moreover, by modifying the weights of each node it is possible to shift, scale, and rotate this closed boundary and enclose alternative regions (not shown here). The ODEs describing this network are given below:

z˙1tot = w1,1 x1 + w1,2 x2 − δ z1tot,     (44)
y˙1tot = w1,0 − δ y1tot,     (45)
z˙2tot = w2,1 x1 + w2,0 − δ z2tot,     (46)
y˙2tot = w2,2 x2 − δ y2tot,     (47)
z˙3tot = w3,0 + w3,2 x2 − δ z3tot,     (48)
y˙3tot = w3,1 x1 − δ y3tot,     (49)
z˙4tot = w4,1 p1 + w4,2 p2 + w4,3 p3 − δ z4tot,     (50)
y˙4tot = w4,0 − δ y4tot,     (51)
p˙itot = θ − δ pitot,     (52)
p˙i = k1 (pitot − pi)/(pitot − pi + K) zitot − k2 pi/(pi + K) yitot,     (53)

with i = 1, ..., 4.

VII. DISCUSSION

We have discussed how, under some assumptions, a broad class of biomolecular signaling networks has the capacity to operate as molecular perceptrons, due to a sigmoidal input-output behavior with a tunable threshold that can integrate additively the contributions of multiple inputs. In particular, we have shown that the perceptron model arising from this type of network includes inputs that are weighted both positively and negatively. A great advantage of realizing and tuning negative weights is that they make it possible to optimize networks for classification. We illustrated this concept by simulating linear and non-linear classifiers based on phosphorylation/dephosphorylation cycles. This manuscript extends to signaling networks our earlier work, which demonstrated how molecular sequestration motifs have the capacity to implement biomolecular neural networks [4]. While here we focus exclusively on demonstrating the computational and classification power of signaling networks, our goal for future work is to build on this approach to examine the contribution of cross-talk to the decision boundary of post-translational modification cycles.

Our analysis complements theoretical and experimental efforts toward building complex molecular networks such as chemical Boltzmann machines [7] and DNA-based classifiers [11], [13]. As molecular neural networks become relevant for applications that include in vitro diagnostics [12] and in vivo cellular classifiers [13], we expect that methods to harness the computational power and implementation options of BNNs will make it possible to expand the ways we can engineer living cells for advanced decision making.

VIII. ACKNOWLEDGMENTS

EF and CCS are sponsored by NSF/BBSRC award 2020039. The content of the information does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

REFERENCES

[1] B. E. Housden and N. Perrimon, "Spatial and temporal organization of signaling pathways," Trends in Biochemical Sciences, vol. 39, no. 10, pp. 457-464, 2014.
[2] G. Manning, D. B. Whyte, R. Martinez, T. Hunter, and S. Sudarsanam, "The protein kinase complement of the human genome," Science, vol. 298, no. 5600, pp. 1912-1934, 2002.
[3] N. I. Markevich, J. B. Hoek, and B. N. Kholodenko, "Signaling switches and bistability arising from multisite phosphorylation in protein kinase cascades," The Journal of Cell Biology, vol. 164, no. 3, pp. 353-359, 2004.
[4] A. Moorman, C. C. Samaniego, C. Maley, and R. Weiss, "A dynamical biomolecular neural network," in 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019, pp. 1797-1802.
[5] A. Hjelmfelt, E. D. Weinberger, and J. Ross, "Chemical implementation of neural networks and Turing machines," Proceedings of the National Academy of Sciences, vol. 88, no. 24, pp. 10983-10987, 1991.
[6] D. Bray, "Protein molecules as computational elements in living cells," Nature, vol. 376, no. 6538, pp. 307-312, 1995.
[7] W. Poole, A. Ortiz-Munoz, A. Behera, N. S. Jones, T. E. Ouldridge, E. Winfree, and M. Gopalkrishnan, "Chemical Boltzmann machines," in International Conference on DNA-Based Computers. Springer, 2017, pp. 210-231.
[8] N. E. Buchler, U. Gerland, and T. Hwa, "On schemes of combinatorial transcription logic," Proceedings of the National Academy of Sciences, vol. 100, no. 9, pp. 5136-5141, 2003.
[9] J. Kim, J. Hopfield, and E. Winfree, "Neural network computation by in vitro transcriptional circuits," in Advances in Neural Information Processing Systems, 2005, pp. 681-688.
[10] L. Qian, E. Winfree, and J. Bruck, "Neural network computation with DNA strand displacement cascades," Nature, vol. 475, no. 7356, pp. 368-372, 2011.
[11] K. M. Cherry and L. Qian, "Scaling up molecular pattern recognition with DNA-based winner-take-all neural networks," Nature, vol. 559, no. 7714, pp. 370-376, 2018.
[12] R. Lopez, R. Wang, and G. Seelig, "A molecular multi-gene classifier for disease diagnostics," Nature Chemistry, vol. 10, no. 7, pp. 746-754, 2018.
[13] Z. Xie, L. Wroblewska, L. Prochazka, R. Weiss, and Y. Benenson, "Multi-input RNAi-based logic circuit for identification of specific cancer cells," Science, vol. 333, no. 6047, pp. 1307-1311, 2011.
[14] G. Cybenko, "Approximation by superpositions of a sigmoidal function," Mathematics of Control, Signals and Systems, vol. 2, no. 4, pp. 303-314, 1989.
[15] F. Blanchini and G. Giordano, "Piecewise-linear Lyapunov functions for structural stability of biochemical networks," Automatica, vol. 50, no. 10, pp. 2482-2493, 2014.
[16] F. Blanchini and S. Miani, Set-Theoretic Methods in Control, 2nd ed. Springer, 2015.
[17] F. Blanchini and G. Giordano, "Polyhedral Lyapunov functions structurally ensure global asymptotic stability of dynamical networks iff the Jacobian is non-singular," Automatica, vol. 86, pp. 183-191, 2017.
[18] M. Ali Al-Radhawi, D. Angeli, and E. D. Sontag, "A computational framework for a Lyapunov-enabled analysis of biochemical reaction networks," PLOS Computational Biology, vol. 16, no. 2, pp. 1-37, 2020.
[19] J. Hofbauer, "An index theorem for dissipative semiflows," Rocky Mountain Journal of Mathematics, vol. 20, pp. 1017-1031, 1990.
[20] N. G. Lloyd, Degree Theory. Cambridge University Press, 1978.