Merge pull request #4152 from shiltemann/ucsc-ss

hexylena · web-flow · commit 5b091b3dc7a3 · 2023-05-19T10:14:22.000+02:00
[Intro to Galaxy and Genomics] update ucsc display link screenshot
diff --git a/topics/introduction/tutorials/galaxy-intro-strands/tutorial.md b/topics/introduction/tutorials/galaxy-intro-strands/tutorial.md
@@ -159,7 +159,7 @@ The Table Browser has a daunting number of options. Fortunately, they are all se
 **track** has a bewildering list of options. UCSC suggests `GENCODE v41`.  A web search leads us to the [GENCODE web site](https://www.gencodegenes.org/) which prominently states:
 
 > <warning-title>ALL GENCODE is different from GENCODE</warning-title>
-> **Warning** The Table browser only provides the most recent release of GENCODE which is updated several times per year. ALL GENCODE does not contain the same data as GENCODE and you should select the GENCODE track even if the version number is wrong. 
+> **Warning** The Table browser only provides the most recent release of GENCODE which is updated several times per year. ALL GENCODE does not contain the same data as GENCODE and you should select the GENCODE track even if the version number is wrong.
 {: .warning}
 
 >The goal of the GENCODE project is to identify and classify all gene features in the human and mouse genomes with high accuracy based on biological evidence...
@@ -317,7 +317,7 @@ Here's how we'll answer this question:
 
 It turns out that all of these steps are easy in Galaxy!
 
-### Split the genes into forward and reverse datasets
+## Split the genes into forward and reverse datasets
 
 How might we do this?  Column 6 contains the strand information.  Can we split genes into two datasets based on the value of Column 6?  How?  Let's take a look at our available tools.  And *whoa! There are over 40 toolboxes, and several hundred tools.* How are we going to find a tool that can do the split?
 
@@ -396,7 +396,7 @@ Your history should now have (at least) 3 datasets in it, with names like:
 
 The number of genes in the `forward` plus `reverse` datasets should be the same as in the `Genes chr22` dataset.  If they aren't, can you figure out why?
 
-### Check for overlaps
+## Check for overlaps
 
 Genes are an example of a *genomic interval*.
 
@@ -424,7 +424,7 @@ Of the tools in the **Operate on Genomic Intervals** toolbox, **Join** and parti
 >     - {% icon param-files %}*"of"*:  `Genes, forward strand` (the first dataset)
 >     - {% icon param-files %} *"that intersect"* : `Genes, reverse strand` (the second dataset)
 >     - *"for at least"*: `1`
->       
+>
 >       This will return genes with even just one position overlapping.
 >     - *Click* **Run Tool**.
 >
@@ -436,7 +436,7 @@ Of the tools in the **Operate on Genomic Intervals** toolbox, **Join** and parti
 {: .hands_on}
 
 
-## Results and final steps.
+# Results and final steps.
 
 At this point we *could* say that we have answered our question. Using dataset previews in the history panel, we can compare the number of genes in the `Overlapping forward` and `Overlapping reverse` datasets with the number of genes in the full `Genes chr22` dataset, and *conclude that overlapping genes on opposite strands are actually pretty common.*
 
@@ -445,7 +445,7 @@ However, before we rush off to publish our conclusions, let's
 1. Get both the forward and reverse overlapping genes into a single dataset (one link will look better in our publication), and
 2. *Visualize* our new dataset, just to double-check our results.
 
-### Combine forward and reverse overlapping genes into one dataset.
+## Combine forward and reverse overlapping genes into one dataset.
 
 What tool can we use to combine the two datasets into one?  Try *searching* for `combine` or `join` or `stack` in the tool search box.  You'll find lots of tools, but none of them do what we want to do. *Sometimes you just have to manually look through toolboxes to find what you need.*  Where should we look?  It's probably not **Get Data** or **Send Data**, but it could easily be in any of the next 4 toolboxes: **Lift-Over, Collection Operations, Text Manipulation, or Datamash**.
 
@@ -459,18 +459,18 @@ It turns out that **Lift-Over** and **Collection Operations** are not what we wa
 >   - {% icon param-files %} *"Concatenate Dataset"*: `Overlapping reverse genes`.
 >   - *"Dataset"*
 >      - Click on {% icon param-repeat %} *"Insert Dataset"*
->      
+>
 >         This adds a second dataset pull-down menu to the form.
 >
->      - In *"1: Dataset"*      
+>      - In *"1: Dataset"*
 >         - {% icon param-files %} *"Select"*: `Overlapping forward genes` as the second dataset.
 > 4. *Click* **Run Tool**
 > 5. *Rename* the resulting dataset something informative like `Overlapping genes`
 {: .hands_on}
 
 Once the concatenate operation is finished, preview the dataset in your history panel.  Does it have the expected number of genes in it?  If not, see if you can figure out what happened.
 
-### Visualize the overlapping genes
+## Visualize the overlapping genes
 
 Galaxy knows about several visualization options for lots of different dataset types, including BED.  Whenever you preview a dataset in the history panel, Galaxy provides links to these visualizations.  For BED files (which is the format we have), options include **IGB, IGV,** and **UCSC main.**  IGB and IGV are widely used desktop applications and eventually you may want to install one or both of them.  For now, let's visualize the data at UCSC, using the UCSC *Genome* Browser.
 
@@ -479,7 +479,8 @@ Galaxy knows about several visualization options for lots of different dataset t
 > 1. Click on your `Overlapping genes` dataset in your history panel. This will show the dataset preview in the history panel.
 > 2. Click to expand the dataset, if it isn't already, so that you can see the dataset metadata and additional actions like Visualize.
 > 3. Click on the {% icon galaxy-barchart %} (**Visualize**) icon
-> 3. Click on the **display at UCSC main** link.
+> 4. Click on the **display at UCSC (main)** link that appears in the blue box at the top of the screen.
+>    ![visualisation options are shown in Galaxy's middle panel]({% link topics/introduction/images/101_displayucsc.png %})
 >
 > This will launch a new window, showing UCSC's Genome Browser with our dataset shown right at the top.  UCSC figures out that our first overlapping gene is ~11 million bases into chromosome 22, and it has landed us there.
 >
@@ -548,7 +549,7 @@ Let's refine our question slightly
 
 Run the [Create a reusable workflow from a history]({% link topics/galaxy-interface/tutorials/history-to-workflow/tutorial.md %}) tutorial for how to do this, *and then come back here to run your newly created workflow with the exon data.*
 
-## Rerun analysis with exon data
+# Rerun analysis with exon data
 
 We want to run the same analysis, but this time only look for overlaps that happen in *exons*, the parts of genes that produce stuff our body uses.  Before we start looking at exons, let's start a new history, one that contains only the genes file we got from UCSC. We could go back to UCSC and refetch the file, but there is an easier way.
 
@@ -567,7 +568,7 @@ We want to run the same analysis, but this time only look for overlaps that happ
 > 6. The history name is a link.  *Click* on it.
 {: .hands_on}
 
-### Get the exon data
+## Get the exon data
 
 And your new history appears in the history panel with the copied *genes* dataset.  What we need is *exons.* How can we get the exon information?  There are two relatively easy ways to get this information, one of which will be very familiar.
 
@@ -588,7 +589,7 @@ If you got the data from UCSC it will look something like this:
 
 Your history should now have two datasets: one describing entire genes, and one describing just the exons.
 
-### Rerun the analysis, this time on exons.
+## Rerun the analysis, this time on exons.
 
 When you did the *History to Workflow* tutorial you created a new workflow that was then added to your list of defined workflows.