Set up a Data Studio

Data Studio setup

Create a data studio

1. Add a data studio

To create a data studio, select Add data studio and select a template. Currently, templates for Jupyter, VS Code, and RStudio are available.


Add a data studio

2. Select a compute environment

Currently, only AWS Batch is supported.

3. Mount data using Data Explorer

Create a data link

To enable access to data in a studio, create a custom data link pointing to the directory in the AWS S3 bucket where the results are saved. This lets you read and write only the data you need from cloud storage, from within the studio.

Select the Add cloud bucket button in Data Explorer and specify the path to the output directory:

Create data link
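As an illustration of what the output directory path looks like, the sketch below splits an S3 URI into its bucket and directory prefix. The function name `split_s3_uri` and the bucket name in the usage note are hypothetical and are not part of Seqera Platform; the snippet only shows the general shape of an S3 path.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an S3 URI into (bucket, prefix).

    Hypothetical helper for illustration only; it does not call AWS
    or Seqera Platform, it just parses the string.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    # The bucket is the network location; the prefix is the path
    # without leading or trailing slashes.
    return parsed.netloc, parsed.path.strip("/")
```

For example, `split_s3_uri("s3://example-bucket/results/rnaseq/")` separates the assumed bucket `example-bucket` from the directory prefix `results/rnaseq`.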

Mount the data link into the studio

In Data Explorer, select the newly created data link to mount it into your data studio environment. The data is mounted using the Fusion file system.

This data will be available at /workspace/data/<dataset>.
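Once the studio is running, one way to confirm that the mount is visible is to list a few files under the mount path from a notebook or terminal. This is a minimal sketch: `list_mounted_files` is a hypothetical helper, and `<dataset>` must be replaced with the name of your own data link.

```python
from pathlib import Path


def list_mounted_files(mount_root, limit=20):
    """Return up to `limit` file paths under a mounted data link.

    Paths are returned relative to `mount_root`, sorted alphabetically.
    Hypothetical helper; the mount path is supplied by the caller.
    """
    root = Path(mount_root)
    if not root.is_dir():
        raise FileNotFoundError(f"Mount point not found: {root}")
    files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    return files[:limit]
```

Inside a studio, this could be called as `list_mounted_files("/workspace/data/<dataset>")`. Walking a large bucket recursively can be slow, so a narrower subdirectory may be a better starting point.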


Mount data into studio

4. Resources for environment

Optionally, enter a CPU and memory allocation for your data studio environment. The default is 2 CPUs and 8192 MB of memory.

Then, select Add.

The data studio environment will be available on the Data Studios landing page with the status 'stopped'. Select the three dots, then Start, to begin running the studio.


Start a studio

Connect to a studio

Connect to a data studio

To connect to a running data studio session, select the three dots next to the status message and choose Connect. A new browser tab will open, displaying the status of the data studio session. Select Connect.

Collaborate in a data studio

Collaborators can also join a data studio session in your workspace. For example, to share the results of the nf-core/rnaseq pipeline, select the three dots next to the status message for the data studio you want to share, then select Copy data studio URL. Using this link, other authenticated users with the "Connect" role (at minimum) can access the session directly.

Stop a studio session

Stop a data studio

To stop a running session, select the three dots next to the status and select Stop. Any unsaved analyses or results will be lost.
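To avoid losing results when a session stops, one option is to copy them into the mounted data link first. The sketch below assumes the mounted directory is writable from within the studio and that files written there are persisted to cloud storage; `save_results` and both paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def save_results(src_dir, dest_dir):
    """Copy every file under src_dir into dest_dir, keeping relative paths.

    Returns the list of destination paths that were written.
    Hypothetical helper; assumes dest_dir is a writable mounted data link.
    """
    src, dest = Path(src_dir), Path(dest_dir)
    copied = []
    for path in sorted(src.rglob("*")):
        if path.is_file():
            target = dest / path.relative_to(src)
            # Create any missing subdirectories before copying.
            target.parent.mkdir(parents=True, exist_ok=True)
            # Copy file contents only, without metadata.
            shutil.copyfile(path, target)
            copied.append(target)
    return copied
```

For example, `save_results("analysis_output", "/workspace/data/<dataset>/analysis_output")` would copy a local `analysis_output` directory into the mounted data link, with `<dataset>` replaced by your own data link name.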

Advanced

For a more detailed use case of performing tertiary analysis with the results of the nf-core/rnaseq pipeline in an RStudio/RShiny app environment, see Tertiary analysis with Data Studios.

Checkpoints in Data Studios

When you start a data studio, a checkpoint is created. This checkpoint allows you to restart the data studio with previously installed software and with changes made to the root filesystem of the container. Note that if you stop a data studio and restart it, it restarts from the latest checkpoint. To go back to a specific earlier configuration of a data studio session, restart it from the corresponding checkpoint, as highlighted in the screenshot below:


More information

For a detailed explanation about specific concepts of Data Studios and the tools preinstalled in Data Studios images, see the Seqera Platform docs.

Advanced

For additional details on Data Studios based on a demonstration from Rob Newman, see Data Studios deep dive.