forked from BaderLab/CBW_Pathways_2023
-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy path3.05-Module3_Cytoscape_Primer.Rmd
293 lines (205 loc) · 16.8 KB
/
3.05-Module3_Cytoscape_Primer.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
# Module 3 Lab Primer: Cytoscape Primer {#cytoscape_mod3}
**This work is licensed under a [Creative Commons Attribution-ShareAlike 3.0 Unported License](http://creativecommons.org/licenses/by-sa/3.0/deed.en_US). This means that you are able to copy, share and modify the work, as long as the result is distributed under the same license.**
By Gary Bader, Ruth Isserlin, Chaitra Sarathy, Veronique Voisin
## Goal of the exercise
**Create a network and customize it.**
The goal of this exercise is to learn how to create a network in Cytoscape and customize id. In this example, the proteins are the entities represented as nodes in the network and known physical interactions are the connections between the proteins that are represented as edges. We will overlay 2 additional pieces of information about these proteins, mutation information per protein as node color and mutation expression as node size.
## Data
* The data used in this exercise is a set of protein - protein interactions and associated attributes.
## Start the exercise
To start the lab practical section, first create a cytoscape_primer_files directoty on your computer and download the files below.
```{block, type="rmd-datadownload"}
Right click on link below and select "Save Link As...".
Place it in the corresponding module directory of your CBW work directory.
```
<br>
<br>
Two files are needed for this exercise:
* [networktable.txt](./Module3/cytoscape_primer/data/network_table.txt)
* [nodeattribute.txt](./Module3/cytoscape_primer/data/node_attribute.txt)
## Exercise 1a - Create Network from table
1. Launch Cytoscape
1. Locate the top menu bar and select **File**,--> **Import**, --> **Network from File…**.
<img src="./Module3/cytoscape_primer/images/cyto_menu_import.png" width="75%" />
<ol start=3 type="1">
<li> Browse your computer and select the file [networktable.txt](./Module3/cytoscape_primer/data/network_table.txt)
</ol>
<ol start=4 type="1">
<li> An **Import Network from Table** dialog box opens. The 3 columns of the table should be set as “source”, “interaction” and “target” respectively. </ol>
<img src="./Module3/cytoscape_primer/images/cyto_table_import.png" width="75%" />
```{block, type="rmd-tip"}
Cytoscape will assume, by default, will look for the column names that start with "source", "interaction" and "target". It will assume that any other column is an interaction attribute (edge attribute)
* This is just an example file. You can import files with any number of additional columns and choose to ignore all columns except for the ones that you want to import or import all of them. Although Cytoscape tries to guess the data type of each column and the type (ie. is it an attribute associated with source node, target nodes or the interaction) you are able to fine tune everything.
<img src="./Module3/cytoscape_primer/images/cyto_table_import_options.png" width="75%"/>
```
<br>
<br>
<ol start=5 type="1">
<li> Click “Ok”.
</ol>
<img src="./Module3/cytoscape_primer/images/cyto_basic_network.png" width="75%"/>
* A network containing the proteins as blue square nodes and interaction as edges should be displayed in the main Cytoscape window.
## Exercise 1b - Load node attributes
1. Locate the Cytoscape top menu bar and select **File**,--> **Import**,--> **Table from File…**.
<img src="./Module3/cytoscape_primer/images/cyto_menu_import_table.png" width="75%" />
1. Browse your computer and select the file [nodeattribute.txt](./Module3/cytoscape_primer/data/node_attribute.txt)
1. click “Open”.
1. An “Import Table from Columns” dialog appears.
<img src="./Module3/cytoscape_primer/images/cyto_import_attribute_table.png" width="75%" />
1. Click on “OK”.
1. You should be able to see the imported attributes in the node table.
<img src="./Module3/cytoscape_primer/images/cyto_import_attribute_loaded.png" width="75%" />
<br>
<br>
```{block, type="rmd-note"}
i. The key column is assumed to be the first column in your table.
i. The key is the column in the loaded attribute file used to match your attributes to your network.
i. **key colum for Network** is the column in the Network that the key is matched to. (In this network there isn't the ability to set this value because that is the only attribute associated with the nodes in our network but normally this drop box will be selectable)
i. The key and matching column need to match perfectly (unless you have specifid that case does not matter).
<img src="./Module3/cytoscape_primer/images/cyto_import_attribute_table_key.png" width="75%" />
```
<br>
<br>
```{block, type="rmd-tip"}
Similiar to the **Import Network from Table**, everything about the import is customizable. Cytoscape does its best to guess the datatypes of each column but you are able to fine tune it.
<img src="./Module3/cytoscape_primer/images/cyto_import_attribute_table_options.png" width="75%" />
There are also advanced options if you want to:
* change the file delimiter
* skip lines
* specify the header column
<img src="./Module3/cytoscape_primer/images/cyto_import_attribute_table_advanced.png" width="75%" />
```
## Exercise 1c - Map node attributes to Visual Style
1. Go to “Control Panel” on the left side and select the “Style” tab. Make sure that you are in the “Node” tab.<br>
<img src="./Module3/cytoscape_primer/images/cyto_style_panel.png" width="75%" />
1. Select the “Fill Color” field
1. expand it by clicking on the right arrow.
<img src="./Module3/cytoscape_primer/images/cyto_fill_colour.png" width="75%" />
1. Set “Column” to “expression” and “Mapping Type” to “Continuous Mapping”.
<img src="./Module3/cytoscape_primer/images/cyto_fill_colour_expanded.png" width="75%" />
1. This will change the colours of the nodes to the default colour coding.
<img src="./Module3/cytoscape_primer/images/cyto_fill_colour_network_default.png" width="75%" />
1. Double click on the continuos mapping colour box to manually adjust the colour and other settings.
1. At the bottom of the “Style” tab, check the box “Lock node width and height”.
<img src="./Module3/cytoscape_primer/images/cyto_lock_nodes.png" width="75%" />
1. Select the “Size” field and
1. expand it by clicking on the right arrow.
1. Set “Column” to “mutation” and “Mapping Type” to “Continuous Mapping”.
<img src="./Module3/cytoscape_primer/images/cyto_size_nodes.png" width="75%" />
1. Your resulting network maps expression to the colour of the node and the size of the node to the number of mutations.
<br>
<br>
```{block, type="rmd-bonus"}
1. Adjust the setting on the colour mapping. Change the colour scheme. Change the maximum and minimum values.
1. Adjust the setting on the size mapping. Make the nodes bigger with higher values.
1. Eventhough the network is small, play around with the layouts.
```
## Exercise 2 - Work with larger networks
Cytoscape supplies a few demo networks that you can play around with. When you open cytoscape you are presented with a Start Panel where you can choose to reload a previous session or load in one of the sample networks.
<img src="./Module3/cytoscape_primer/images/cyto_starter_panel.png" width="75%" />
1. You do not need to re-open cytoscape to open the starter panel. Locate the Cytoscape top menu bar and select **View**,--> **Show Starter panel**.
<img src="./Module3/cytoscape_primer/images/launch_starter_panel.png" width="75%" />
1. Double click on the **Affinity Purification Network** to open it.
1. If you already have a session open then you will recieve a warning that the current session will be lost. Before proceeding make sure your current session is saved. (Click on cancel. Then, **File** --> **Save as**)<br>
<img src="./Module3/cytoscape_primer/images/session_overwrite_warning.png" width="75%" />
1. Once the network has loaded you will see a network of protein interactions derived from an affinity purification experiment. Bait proteins are reprsented as pink hexagons and their corresponing prey proteins blue boxes.
<img src="./Module3/cytoscape_primer/images/affinity_purification_network.png" width="75%" />
1. Using this larger network play around with the different layouts
<br>
<br>
```{block, type="rmd-bonus"}
1. Search for the node "VPR"
1. select all of the prey proteins associated with "VPR"
```
## Exercise 3 - Perform basic enrichment analysis using EnrichmentTable
In Module 2 we performed detailed enrichment analysis with g:profiler and GSEA. We supplied gene lists and ranked expression sets in order to perform the analysis. What if you want to run a quick enrichment analysis with a given network or a given subset of the network? The easiest way to do this is to use the cytoscape app EnrichmentTable. EnrichmentTable will query g:profiler directly with the given network or subnetwork. Not all of the parameters that are available in the web version can be tweaked from the enrichmentmap table app but it can be an easy way to quickly see enrichment results.
We will select the bait protein VPR and all its associated prey proteins to use for an enrichment analysis.
<br>
<br>
```{block, type="rmd-note"}
**Bait Protein** - Is the labelled protein in an affinity purification experiment that is pulled down.<br>
**Prey Protein** - are the proteins that are associated with the bait protein when it is pulled down and are assumed to interact with the bait protein.<br>
**First neighbor** - are all the nodes that are directly connected to the given node
```
<br>
<br>
1. In the search bar enter "VPR". Press enter.
<img src="./Module3/cytoscape_primer/images/affinity_purification_vpr_search.png" width="75%" />
1. VPR is now the only highlighted node in the network. In order to select all its associated preys we need to select all the nodes that are connected to VPR, all of VPR's first neighbours. There are two ways to select the first neighbours:
i. In the top menu bar click on **Select** --> **Nodes** --> **First neighbors of selected nodes** --> **undirected**
i. Click on the **first neighbor** button, <img src="./Module3/cytoscape_primer/images/first_neighbour.png" width="5%"/>, in the quick links button set.
<ol start=3 type="1">
<li>Click on the "Enrichment Table" in the Table Panel.</li>
</ol>
<ol start=4 type="1">
<li>Click on the cog icon in the top right hand corner of the Enrichment Table panel
<img src="./Module3/cytoscape_primer/images/enrichment_table_panel.png" width="75%" />
</li>
</ol>
<ol start=5 type="1">
<li>This will bring up a panel with the adjustable settings. There are only 5 adjustable parameters- </li>
</ol>
i. **Organism** - This shows a list of organisms that are available on the g:Profiler site.
i. **Gene ID column** - the column in the current network that you want to use to search g:Profiler with. Ideally this should be a column specifying the Gene Name or other identifier.
i. **Multiple testing correction** - change to fdr.
i. **Adjusted p-value threshold (min 0 max 1)** - leave as 0.05. If you are getting too many results you can make this value smaller.
i. **Include inferred GO annotations (IEA)** - by default the search will exclude inferred from electonic annotation GO terms. If you want to include them, select this option. <br>
<img src="./Module3/cytoscape_primer/images/enrichment_table_params.png" width="75%" />
<br>
<br>
```{block, type="rmd-tip"}
By default, EnrichmentTable automatically uses all the databases available on the g:Profiler site. There is no way to filter prior to running the analysis. You need to filter the results after the analysis has been run. This **will** change the results because you end up filtering the results after the multiple correction and the multiple correction is dependent on the number of genesets you are testing with.
```
<br>
<br>
<ol start=6 type="1">
<li>Filter the EnrichmentTable results to show only GO:BP, Reactome and Wikipathway, similiar to what we used in Module 2. </li>
</ol>
i. Click on the filter icon in the top left hand corner of the enrichment table results. <br> <img src="./Module3/cytoscape_primer/images/enrichment_table_filter.png" width="75%" />
i. Next to **Select Categories** select *Gene Ontology Biological Process*, *Reactome*, *Wikipathways*. To select multiple options click and hold *command* key on Mac or *Shift* on Windows. <br> <img src="./Module3/cytoscape_primer/images/enrichment_table_filter_options.png" width="75%" />
i. click on **OK**
i. The EnrichmentTable will update to only include the sets from *Gene Ontology Biological Process*, *Reactome*, *Wikipathways*.<br> <img src="./Module3/cytoscape_primer/images/enrichment_table_filter_results.png" width="75%" />
## Exercise 3B - create Enrichment Map and Enhanced graphics nodes from EnrichmentTable {#enrichmenttabl-features}
<ol start=1 type="1">
<li> To create an Enrichment Map from the EnrichmentTable results, Click on the EM logo in the top left hand bar in the ErichmentTable Panel.<br> <img src="./Module3/cytoscape_primer/images/enrichment_table_em_button.png" width="75%" /> </li>
</ol>
i. This will bring up an EM options panel with very limited parameter adjustments. You can only change the name of the network and the connectivity threshold. You have already specified the p-value threshold when you originally performed the analysis. If you want to create your network with a more permissive q-value you need to go back to the EnrichmentTable search panel. Click on **OK**<br> <img src="./Module3/cytoscape_primer/images/enrichment_table_em_params.png" width="75%" />
i. This will create an Enrichment Map in a new network and represents all the *Gene Ontology Biological Process*, *Reactome*, *Wikipathways* terms enriched for the VPR and its prey protein set. <br> <img src="./Module3/cytoscape_primer/images/enrichment_table_em.png" width="75%" />
## Exercise 4 - Load network from NDex
[NDex](https://www.ndexbio.org/) is an open-source repository where scientists can store, share, manipulate and publish biological network data. Networks are viewable on the web through their webapp but can also be downloaded directly into cytoscape so you can search, manipulate, integrate and analyze the given network for yourselves.
For the purpose of this exercise we are going to load in a network from the publication [A protein landscape of Breast Cancer](https://www.science.org/doi/10.1126/science.abf3066?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed). This publication is associated with multiple networks the the authors of this paper created and shared in NDex - https://www.ndexbio.org/index.html#/networkset/4423340d-e8e3-11eb-b666-0ac135e8bacf
1. Start a new session. **File** --> **Close**
1. In the Network Search bar (located at the top of the control panel) make sure that the search provider is set to NDex. <br> <img src="./Module3/cytoscape_primer/images/ndex_db_choice.png" width="75%" />
<br>
<br>
```{block, type="rmd-tip"}
It should be set to NDex by default but click on down arrow to see the different data sources you can search for. Later in the workshop we will be using this bar to query GeneMania.
```
<br>
<br>
<ol start=3 type="1">
<li> Enter *MCF7_All_PPI>=0.9* into the search box, Click on the search icon. <br> <img src="./Module3/cytoscape_primer/images/ndex_search_term.png" width="75%" /></li></ol>
<ol start=4 type="1">
<li> A search results box will appear. The *MCF7_All_PPI>=0.9* network is just one of the networks associated with this publication. Eventhough you are searching for this specific network, other networks associated with the original paper will also show up in the search results as well as others.<br> <img src="./Module3/cytoscape_primer/images/ndex_search_results.png" width="75%" /></li>
</ol>
<ol start=5 type="1">
<li> Click on the green down arrow next to *MCF7_All_PPI>=0.9*, the network will start to import.</li></ol>
<ol start=6 type="1">
<li> Once the network has been loaded, click on **Close Dialog** </li></ol>
<ol start=7 type="1">
<li> Resulting network loaded into cytoccape.<br> <img src="./Module3/cytoscape_primer/images/ndex_resulting_network.png" width="75%" /> </li></ol>
<br>
<br>
```{block, type="rmd-note"}
**Description taken from [NDex record](https://www.ndexbio.org/viewer/networks/84e36f91-ecb7-11eb-b666-0ac135e8bacf)**<br>
* Baits are shown as yellow box, and
* preys as grey circle.
* Size of each node represents number of patients with alterations in each protein.
* Dotted line represents the physical protein-protein association (validated in other studies) with high Integrated Association Stringency score.
```
<br>
<br>
```{block, type="rmd-bonus"}
1. Change the edge width to reflect the number of patients the associations is found in instead of the PPI score.
1. Change the default node colour to blue.
```