# CoEWC - Co-Expression Weighted by Clustering Coefficient

#### Definition

CoEWC integrates the topological properties of the protein-protein interaction (PPI) network with the co-expression of interacting proteins. A protein’s essentiality is determined by the number of its neighbors, the probability that it is co-expressed with those neighbors, and whether each of those neighbors takes part in densely connected clusters.

CoEWC requires two preliminary quantities, the Pearson correlation coefficient (PCC) and the clustering coefficient (CC), which are calculated as follows:

$$PCC(X,Y)=\frac{1}{s-1} \sum_{i=1}^{s} \left(\frac{g(X,i)-\bar g(X)}{\sigma(X)}\right)\cdot\left(\frac{g(Y,i)-\bar g(Y)}{\sigma(Y)}\right) \qquad (1)$$

Where $s$ is the number of samples in the gene expression data; $g(X,i)$ (or $g(Y,i)$) is the expression level of gene $X$ (or $Y$) in sample $i$ under a specific condition; $\bar g(X)$ (or $\bar g(Y)$) is the mean expression level of gene $X$ (or $Y$); and $\sigma (X)$ (or $\sigma (Y)$) is the standard deviation of the expression level of gene $X$ (or $Y$).
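As a rough illustration of Equation (1), the following Python sketch computes the PCC of two expression profiles. The function name `pcc`, the list-based input, and the choice to return 0 for a constant profile are assumptions made for this example. It also assumes that $\sigma$ is the sample standard deviation (divisor $s-1$), which keeps the result within $[-1, 1]$.

```python
import math


def pcc(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    s = len(x)
    if s != len(y) or s < 2:
        raise ValueError("profiles must have the same length, at least 2")
    mean_x = sum(x) / s
    mean_y = sum(y) / s
    # Sample standard deviations (divisor s - 1), an assumption of this sketch.
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (s - 1))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (s - 1))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # assumption: a constant profile is treated as uncorrelated
    # Average product of the standardized expression levels, as in Equation (1).
    return sum(
        ((a - mean_x) / sd_x) * ((b - mean_y) / sd_y) for a, b in zip(x, y)
    ) / (s - 1)
```

For instance, `pcc([1, 2, 3], [2, 4, 6])` is 1 up to floating-point rounding, because the second profile is a positive multiple of the first.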

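$$CC(u)=\frac{\left|\{e(v,w)\mid v,w\in N_u,\ e(v,w)\in E\}\right|}{k_u\times (k_u-1)/2}$$

Where $N_u$ is the set of neighbors of protein $u$, $k_u$ denotes the number of immediately connected neighbors of $u$, and $E$ is the edge set of the PPI network. The numerator counts the edges that connect neighbors of $u$ to each other, and the denominator is the maximum possible number of such edges.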
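The clustering coefficient can be sketched in Python under the usual reading of the definition, namely the fraction of pairs of neighbors of $u$ that are themselves linked. The adjacency representation (a dictionary mapping each protein to the set of its neighbors), the function name, and the convention of returning 0 when $k_u < 2$ are assumptions of this example.

```python
from itertools import combinations


def clustering_coefficient(adj, u):
    """Fraction of neighbor pairs of u that are directly connected."""
    neighbors = adj[u]
    k = len(neighbors)
    if k < 2:
        return 0.0  # assumption: undefined case (no neighbor pairs) maps to 0
    # Count edges between neighbors of u in the undirected network.
    links = sum(1 for v, w in combinations(neighbors, 2) if w in adj[v])
    return links / (k * (k - 1) / 2)


# Hypothetical toy network: a triangle A-B-C plus a pendant protein D on A.
toy_adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
cc_a = clustering_coefficient(toy_adj, "A")
```

In this toy network only one of the three neighbor pairs of `A` is linked (`B` and `C`), so `cc_a` is 1/3, while `B` and `C` each have a coefficient of 1.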

$$CoEWC(u)= \sum_{v\in N_u} PCC(u,v)\times CC(v)$$

Where $N_u$ denotes the set of all immediately connected neighbors of node $u$ in the PPI network.
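Putting the pieces together, the score can be sketched as a sum over the neighbors of $u$. This is an illustrative sketch rather than the authors' implementation: the helper names, the dictionary-based inputs, the handling of degenerate cases, and the toy data are all assumptions. The PCC helper uses the sample standard deviation from Python's `statistics` module.

```python
from itertools import combinations
from statistics import mean, stdev


def pcc(x, y):
    """Pearson correlation using sample standard deviations."""
    mx, my, sx, sy = mean(x), mean(y), stdev(x), stdev(y)
    if sx == 0 or sy == 0:
        return 0.0  # assumption: constant profiles contribute nothing
    cov_sum = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov_sum / ((len(x) - 1) * sx * sy)


def cc(adj, u):
    """Clustering coefficient: linked neighbor pairs over all neighbor pairs."""
    k = len(adj[u])
    if k < 2:
        return 0.0  # assumption: no neighbor pairs means a coefficient of 0
    links = sum(1 for v, w in combinations(adj[u], 2) if w in adj[v])
    return links / (k * (k - 1) / 2)


def coewc(adj, expr, u):
    """Sum over neighbors v of PCC(u, v) weighted by CC(v)."""
    return sum(pcc(expr[u], expr[v]) * cc(adj, v) for v in adj[u])


# Hypothetical toy data: a triangle A-B-C plus a pendant protein D on A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
expr = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [2, 3, 4, 5],
    "D": [1, 3, 2, 4],
}

scores = {u: coewc(adj, expr, u) for u in adj}
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this toy example `A` ranks first: it is perfectly co-expressed with `B` and `C`, and both of those neighbors sit in a fully connected neighborhood. `D` ranks last because its only neighbor, `A`, has a low clustering coefficient.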

Unlike SoECC and PeC, which both emphasize the co-clustering relationship between a protein and its neighbors, CoEWC pays more attention to the clustering properties of the protein’s neighbors than to those of the protein itself.

#### References

- Zhang X., Xu J., Xiao W.x., 2013. A New Method for the Discovery of Essential Proteins. PLoS ONE, 8(3). DOI: 10.1371/journal.pone.0058763