Tag Archives: roadrunner

After a string of issues (running out of hard drive space multiple times, config file problems, typos), I’ve finally given up on running meraculous. It failed again, this time saying it couldn’t find a file in a directory that meraculous itself created! I’ve emailed the authors; if they have an easy fix, I’ll implement it and see what happens.

Anyway, it’s all documented in the Jupyter Notebook below.

One good thing did come out of all this: I had to run kmergenie to identify an appropriate kmer size to use for assembly, as well as an estimated genome size (this info is needed for both meraculous and SOAPdenovo, which I’ll be trying next):

kmergenie output folder: http://owl.fish.washington.edu/Athaliana/20180125_geoduck_novaseq/20180206_kmergenie/
kmergenie HTML report (doesn’t display histograms for some reason): 20180206_kmergenie/histograms_report.html
kmer size: 117
Est. genome size: 2.17Gbp

Jupyter Notebook (GitHub): 20180205_roadrunner_meraculous_geoduck_novaseq.ipynb
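The exact kmergenie command is in the notebook. As a rough sketch only, a run of this kind might look like the following. The input directory, the placeholder file names, the thread count, and the k range are all assumptions for illustration, not the values used for the geoduck data. KmerGenie accepts a text file that lists one read file per line, so the sketch builds that list first.

```shell
# Hypothetical input directory: a temp dir with two empty placeholder files
# stands in for the real location of the trimmed NovaSeq FASTQ files.
reads_dir=$(mktemp -d)
touch "$reads_dir/sample_R1.fastq.gz" "$reads_dir/sample_R2.fastq.gz"

# Build the list of read files (one path per line) that kmergenie takes as input.
fastq_list="$reads_dir/fastq_list.txt"
find "$reads_dir" -name '*.fastq.gz' | sort > "$fastq_list"

# Only attempt the run when kmergenie is actually on the PATH.
# -l / -k: smallest / largest k to test, -s: step between k values,
# -t: threads, -o: prefix for the output files.
if command -v kmergenie >/dev/null 2>&1; then
  kmergenie "$fastq_list" -l 15 -k 121 -s 10 -t 4 -o "$reads_dir/histograms"
else
  echo "kmergenie not found; skipping run" >&2
fi
```

To use it for real, point `reads_dir` at the directory holding the trimmed reads and drop the `touch` line.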

Software Installation – ALPACA on Roadrunner

0000-0002-2747-368X

List of software that needed installing to run ALPACA:

Installed all software in:

/home/shared/

Had to change permissions on /home/shared/. Used the following to recursively (-R) change the group ownership to the sudo group, so that all admin (i.e. sudo group) users can read/write in this directory:

$ sudo chown -R :sudo /home/shared
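Changing the group only helps if the group permission bits actually allow writing. A minimal sketch for checking and fixing that is below; it works on a throwaway directory (an assumption for safe testing) rather than on /home/shared itself.

```shell
# Demonstrate on a throwaway directory rather than /home/shared.
shared_dir=$(mktemp -d)
mkdir -p "$shared_dir/tool/bin"
touch "$shared_dir/tool/bin/run.sh"
chmod -R go-w "$shared_dir"

# g+rwX: read/write for the group; the capital X adds execute only to
# directories and to files that are already executable by someone.
chmod -R g+rwX "$shared_dir"

# List anything the group still cannot write to (empty when all is well).
not_group_writable=$(find "$shared_dir" ! -perm -g+w)
echo "entries without group write: ${not_group_writable:-none}"
```

On the real directory the equivalent would be `sudo chmod -R g+rwX /home/shared`, followed by the same `find` check.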

Compiled Celera Assembler from source (per the ALPACA requirements). This is the source file that I used: https://sourceforge.net/projects/wgs-assembler/files/wgs-assembler/wgs-8.3/wgs-8.3rc2.tar.bz2/download

Added all software to my system PATH by adding the following to my ~/.bashrc file:

## Add bioinformatics software to PATH

export PATH=${PATH}:\
/home/shared/alpaca:\
/home/shared/Bismark:\
/home/shared/bowtie2-2.3.3.1-linux-x86_64:\
/home/shared/ectools-0.1:\
/home/shared/PBSuite_15.8.24/bin:\
/home/shared/pecan/bin:\
/home/shared/samtools-1.6/bin:\
/home/shared/wgs-assembler/Linux-amd64/bin

After adding that info to the bottom of my ~/.bashrc file, I reloaded it into the current shell by sourcing the file:

$ source ~/.bashrc
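A slightly more defensive variant of the PATH setup above is sketched here. It is not what I actually put in my ~/.bashrc; the helper name `add_to_path` is made up for illustration. It skips directories that don't exist and avoids adding the same entry twice when the file is sourced repeatedly.

```shell
# Append a directory to PATH only if it exists and is not already listed.
add_to_path() {
  dir=$1
  if [ ! -d "$dir" ]; then
    echo "skipping missing directory: $dir" >&2
    return 0
  fi
  case ":$PATH:" in
    *":$dir:"*) ;;               # already present, nothing to do
    *) PATH="$PATH:$dir" ;;
  esac
}

# Same directories as in the export statement above.
for d in \
  /home/shared/alpaca \
  /home/shared/Bismark \
  /home/shared/bowtie2-2.3.3.1-linux-x86_64 \
  /home/shared/ectools-0.1 \
  /home/shared/PBSuite_15.8.24/bin \
  /home/shared/pecan/bin \
  /home/shared/samtools-1.6/bin \
  /home/shared/wgs-assembler/Linux-amd64/bin
do
  add_to_path "$d"
done
export PATH
```

The warnings about missing directories are a quick way to catch a typo in a path or a tool that was moved or renamed.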

Followed the ALPACA test instructions to confirm proper installation. More specific test instructions are actually located at the top of this file: /home/shared/alpaca/scripts/run_example.sh

Changed Celera Assembler directory name:

$ mv /home/shared/wgs-8.3rc2 /home/shared/wgs-assembler

Step 1.

$ mkdir /home/shared/test

Step 2.

$ cd /home/shared/test/

Step 3.

$ ../alpaca/scripts/run_example.sh

Step 3 (which executes the run_example.sh script) failed due to a permissions problem.

Realized the script file didn’t have execute permissions, so I added them with the following command:

$ sudo chmod +x /home/shared/alpaca/scripts/run_example.sh
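To catch this kind of failure up front, a small wrapper can check the execute bit before running a script. This is just a sketch; the function name `run_script` is hypothetical, and for files under /home/shared the `chmod` would likely need sudo.

```shell
# Run a script, adding the execute bit first if it is missing.
run_script() {
  script=$1
  if [ ! -f "$script" ]; then
    echo "no such file: $script" >&2
    return 1
  fi
  if [ ! -x "$script" ]; then
    echo "adding execute permission to $script" >&2
    chmod +x "$script" || return 1
  fi
  "$script"
}
```

Usage would be something like `run_script /home/shared/alpaca/scripts/run_example.sh`.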

Step 4. Continued with ALPACA Tests 2 & 3.

Everything tested successfully. Will try to get an assembly running with our PacBio and Illumina data.

Computer Management – Additional Configurations for Reformatted Xserves

Sean got the remaining Xserves configured to run independently from the master node of the cluster they belonged to and installed OS X 10.11 (El Capitan).

The new computer names are Ostrich (formerly node004) and Emu (formerly node002).

He enabled remote screen sharing and remote access for them.

Sean also installed a working hard drive on Roadrunner and got that back up and running.

I went through this morning and configured the computers with some other changes (some for my user account, others for the entire computer):

Renamed computers to reflect just the corresponding bird name (hostnames had been labeled as “bird name’s Xserve”)
Created srlab user accounts
Changed srlab user accounts to Standard instead of Administrative
Created steven user account
Turned on Firewalls
Granted remote login access to all users (instead of just Administrators)
Installed Docker Toolbox
Changed power settings to start automatically after power failure
Added computer name to login screen via Terminal:

sudo defaults write /Library/Preferences/com.apple.loginwindow LoginwindowText "TEXT GOES HERE"

Changed computer HostName via Terminal so that Terminal displays computer name:

sudo scutil --set HostName "TEXT GOES HERE"

Installed Mac Homebrew (I don’t know if installation of Homebrew is “global” – i.e. installs for all users)
Used Mac Homebrew to install wget
Used Mac Homebrew to install tmux

Docker – VirtualBox Defaults on OS X

I noticed a discrepancy between what system info is detected natively on Roadrunner (Apple Xserve) and what was being shown when I started a Docker container.

Here’s what Roadrunner’s system info looks like outside of a Docker container:

However, here’s what is seen when running a Docker container:

It’s important to notice that the Docker container is only seeing 2 CPUs. Ideally, the Docker container would see that this system has 8 cores available. By default, however, it does not. To remedy this, the user has to adjust settings in VirtualBox. VirtualBox is virtualization software that gets installed with Docker Toolbox for OS X. Docker actually runs inside a VirtualBox virtual machine, but this is not really apparent to a beginner Docker user on OS X.

To give VirtualBox (and, in turn, Docker) access to the full system hardware, you must launch the VirtualBox application (if you installed Docker using Docker Toolbox, you should be able to find this in your Applications folder). Once you’ve launched VirtualBox, you’ll have to shut down the virtual machine that’s currently running. Once that’s been accomplished, you can make changes and then restart the virtual machine.

Shut down the VirtualBox machine before making changes:

Here are the default CPU settings that VirtualBox is using:

Maxed out the CPU slider:

Here are the default RAM settings that VirtualBox is using:

Changed RAM slider to 24GB:

Now, let’s see what the Docker container reports for system info after making these changes:

Looking at the CPUs now, we see it has 8 listed (as opposed to only 2 initially). I think this means that Docker now has full access to the hardware on this machine.

This situation is a weird shortcoming of Docker (and/or VirtualBox). Additionally, I think this issue might only exist on the OS X and Windows versions of Docker, since they require the installation of the Docker Toolbox (which installs VirtualBox). I don’t think Linux installations suffer from this issue.
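I made these changes through the VirtualBox GUI, but the same thing should be possible from the command line. The sketch below is untested on Roadrunner and assumes the Docker Toolbox virtual machine has the usual name `default`; the function names are made up for illustration. It also includes a quick way to print the CPU count from inside or outside a container.

```shell
# Number of CPUs visible to the current environment (host or container).
visible_cpus() {
  getconf _NPROCESSORS_ONLN
}

# Resize the Docker Toolbox VM from the command line instead of the GUI.
# Assumes the VM is named "default" unless told otherwise; memory is in MB.
resize_docker_vm() {
  vm=${1:-default}
  cpus=$2
  mem_mb=$3
  docker-machine stop "$vm" &&
    VBoxManage modifyvm "$vm" --cpus "$cpus" --memory "$mem_mb" &&
    docker-machine start "$vm"
}

echo "CPUs visible here: $(visible_cpus)"
```

For the settings described above (8 CPUs, 24GB of RAM), the call would be `resize_docker_vm default 8 24576`.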

Docker – One liner to create Docker container

One liner to create Docker container for Jupyter notebook usage and data analysis on roadrunner (Xserve):

docker run -p 8888:8888 -v /Users/sam/gitrepos/LabDocs/jupyter_nbs/sam/:/notebooks -v /Users/sam/data/:/data -v /Users/sam/analysis/:/analysis -it kubu4/bioinformatics:v11 /bin/bash

This does the following:

Maps roadrunner port 8888 to Docker container port 8888 (for Jupyter notebook access outside of the Docker container)
Mounts my local Jupyter notebooks directory to the `/notebooks` directory in the Docker container
Mounts my local data directory to the `/data` directory in the Docker container
Mounts my local analysis directory to the `/analysis` directory in the Docker container

These mounts allow me to work with data stored outside of the Docker container.
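For readability, the same command can be assembled one option per line and printed for review before running it. This is only a sketch of the one-liner above; the variable names are made up for illustration.

```shell
# Image and port are the ones used in the one-liner above.
image="kubu4/bioinformatics:v11"
port=8888

# Build the command as positional parameters, one option per line:
# -p maps host port to container port, each -v maps host dir to container dir.
set -- docker run \
  -p "${port}:${port}" \
  -v /Users/sam/gitrepos/LabDocs/jupyter_nbs/sam/:/notebooks \
  -v /Users/sam/data/:/data \
  -v /Users/sam/analysis/:/analysis \
  -it "$image" /bin/bash

# Print the assembled command for review instead of executing it.
docker_cmd="$*"
echo "$docker_cmd"
```

Once the printed command looks right, running `"$@"` in the same shell executes it.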

RAM Upgrade – Roadrunner (Apple Xserve) to 48GB RAM

We received the new 48GB RAM set we ordered from Other World Computing for the Apple Xserve (roadrunner) that I installed El Capitan (OS X 10.11.5) on two weeks ago.

I installed it and this computer (which was plenty quick before) is extremely responsive now!

Below are some pics from the installation:

Computer Setup – Cluster Node003 Conversion

Here’s an overview of some of the struggles getting node003 converted/upgraded to function as an independent computer (as opposed to a slave node in the Apple computer cluster).

6TB HDD
Only 2.2TB recognized when connected to Hummingbird via Firewire (internet suggests that is the max for Xserve; USB might recognize the full drive) – Hummingbird is a converted Xserve running Mavericks
Reformatted on different Mac and full drive size recognized
Connected to Hummingbird (via USB) and full 6TB recognized
Connected to Mac Mini to install OS X
Tried installing OS X 10.8.5 (Mountain Lion) via CMD+r at boot, but failed partway through installation
Tried and couldn’t reformat drive through CMD+r at boot with Disk Utility
Broken partition tables identified on Linux, used GParted to establish partition table, back to Mac Mini and OS X (Mountain Lion) install worked
Upgraded to OS X 10.11.5 (El Capitan)
Inserted drive to Mac cluster node003 – wouldn’t boot all the way – Apple icon, progress bar > Do Not Enter symbol
Removed drive, put original back in, connected 6TB HDD via USB, but booting from USB not an option (when booting and holding Option key)
Probably due to node003 being part of cluster – reformatted original node003 drive with clean install of OS X Server.
Booting from USB now an option and worked with 6TB HDD!
Put 6TB HDD w/El Capitan in internal sled and won’t boot! Apple icon, progress bar > Do Not Enter symbol
Installed OS X 10.11.5 (El Capitan) on old 1TB drive and inserted into node003 – worked perfectly!
Will just use 1TB boot drive and figure out another use for 6TB HDD
Renamed node003 to roadrunner
Current plan is to upgrade from 12GB to 48GB of RAM and then automate moving data off this drive to long-term storage on Owl (Synology server).

Sam's Notebook

University of Washington – Fishery Sciences – Roberts Lab

FastQC – RRBS Geoduck BS-seq FASTQ data

Jupyter Notebook:

Results:

FastQC output folder:

MultiQC output folder:

MultiQC report (HTML):

TrimGalore/FastQC/MultiQC – TrimGalore! RRBS Geoduck BS-seq FASTQ data

20180516 – UPDATE!!

THIS WAS RUN WITH THE INCORRECT SETTING IN TRIMGALORE! `--non-directional`

WILL RE-RUN

Results:

TrimGalore! output folder:

FastQC output folder:

MultiQC output folder:

MultiQC report (HTML):

NovaSeq Assembly – Trimmed Geoduck NovaSeq with Meraculous
