Set up a Data Studio
Data Studio setup
Create a data studio
1. Add a data studio
To create a data studio, select Add data studio and select a template. Currently, templates for Jupyter, VS Code, and RStudio are available.
2. Select a compute environment
Currently, only AWS Batch is supported.
3. Mount data using Data Explorer
Create a data link
To enable access to data in a studio, create a custom data link pointing to the directory in the AWS S3 bucket where the results are saved. This lets you read and write only the data you need from cloud storage, from within your studio.
Select the Add cloud bucket button in Data Explorer and specify the path to the output directory:
Mount the data link into the studio
Select the data to mount into your data studio environment with the Fusion file system in Data Explorer. In Data Explorer, select the newly created data link to mount it.
This data will be available at /workspace/data/<dataset>.
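Once the studio is running, you can confirm that the mount is visible from a notebook or terminal. The following is a minimal Python sketch, not part of the product: the helper name list_mounted_files and its limit parameter are hypothetical, and the only detail taken from this guide is the /workspace/data/<dataset> mount path.

```python
from pathlib import Path
from typing import List

# Mount root as stated in this guide; pass a different root when testing elsewhere.
DATA_ROOT = Path("/workspace/data")


def list_mounted_files(dataset: str, root: Path = DATA_ROOT, limit: int = 20) -> List[str]:
    """Return up to `limit` file paths under <root>/<dataset>, sorted, relative to that directory."""
    base = root / dataset
    if not base.is_dir():
        # The data link is either not mounted or the dataset name is wrong.
        raise FileNotFoundError(f"No mounted data link at {base}")
    # Walk the whole tree, keep regular files only, and report POSIX-style relative paths.
    files = sorted(p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file())
    return files[:limit]
```

Call it with the name of your own data link in place of <dataset>. The helper walks the whole directory tree before applying the limit, so on a very large bucket it may be quicker to point it at a subdirectory.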
4. Resources for environment
Enter a CPU or memory allocation for your data studio environment (optional). The default is 2 CPUs and 8192 MB of memory.
Then, select Add.
The data studio environment will be available in the Data Studios landing page with the status 'stopped'. Select the three dots and Start to begin running the studio.
Connect to a data studio
To connect to a running data studio session, select the three dots next to the status message and choose Connect. A new browser tab will open, displaying the status of the data studio session. Select Connect.
Collaborate in a data studio
Collaborators can also join a data studio session in your workspace. For example, to share the results of the nf-core/rnaseq pipeline, select the three dots next to the status message for the data studio you want to share, then select Copy data studio URL. Using this link, other authenticated users with the "Connect" role (at minimum) can access the session directly.
Stop a data studio
To stop a running session, select the three dots next to the status and select Stop. Any unsaved analyses or results will be lost.
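One way to avoid losing results is to copy them into the mounted data link before stopping the session. The sketch below assumes the data link is mounted with write access at /workspace/data/<dataset>, as described in the mounting step. The helper save_to_data_link and the studio-outputs subdirectory are hypothetical names chosen for illustration.

```python
import shutil
from pathlib import Path

# Mount root as stated in this guide; pass a different root when testing elsewhere.
DATA_ROOT = Path("/workspace/data")


def save_to_data_link(local_file: str, dataset: str,
                      subdir: str = "studio-outputs",
                      root: Path = DATA_ROOT) -> Path:
    """Copy one local file into <root>/<dataset>/<subdir>/ and return the destination path."""
    src = Path(local_file)
    if not src.is_file():
        raise FileNotFoundError(f"Nothing to save: {src}")
    dest_dir = root / dataset / subdir
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    # Copy file contents only; permissions and timestamps are not carried over.
    shutil.copyfile(src, dest)
    return dest
```

If the copy raises a permission error, the data link is probably mounted read-only, and the results need to be saved another way before you select Stop.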
Advanced
For a more detailed use case of performing tertiary analysis with the results of the nf-core/rnaseq pipeline in an RStudio/RShiny app environment, see Tertiary analysis with Data Studios.
Checkpoints in Data Studios
When you start a data studio, a checkpoint is created. This checkpoint allows you to restart the data studio with previously installed software and any changes made to the root filesystem of the container. Note that if you stop a data studio and restart it, it restarts from the latest checkpoint. To return to a specific earlier configuration of a data studio session, restart it from the corresponding checkpoint, as highlighted in the screenshot below:
More information
For a detailed explanation of Data Studios concepts and the tools preinstalled in Data Studios images, see the Seqera Platform docs.
Advanced
For additional details on Data Studios based on a demonstration from Rob Newman, see Data Studios deep dive.