Knowledge Discovery Lab

Illumina BaseSpace Project

10/24/2016 Group meeting. (Attendance: Prof. Jill Macoska, Prof. Wei Ding, Zheyun Xiao, Tong Wang); We discussed the details of the project. The goals of this project include:

  1. Test Illumina BaseSpace: check whether it is user-friendly and whether there are any issues with the user interface.
  2. Evaluate RNA-seq data in BaseSpace and compare the results with the Tuxedo pipeline.

One user account has been registered in BaseSpace.

A GHPCC account has been registered, and we can access the umb_triley folder.

Cell line data (about 30 GB) has been downloaded from GHPCC to a laptop.

11/18/2016 Meeting. (Attendance: Prof. Zarringhalam, Zheyun Xiao, Tong Wang); Prof. Zarringhalam showed us:

  1. The basic background about DNAs, RNAs, expression levels, etc.
  2. Cell line data, the naming convention, and the samples.
  3. The steps to test BaseSpace.

The 30 GB of data are grouped into Sample folders. Each folder contains 4 .zip files, which form two sets. We uploaded them to the website set by set. We ran the data through FASTQ Toolkit v2.2.0 and obtained the analysis reports: ana_jrn008cgatgtl001-basespace-sequence-hub, ana_-jrn008cgatgtl002-basespace-sequence-hub

11/30/2016 Meeting. (Attendance: Prof. Zarringhalam, Tong Wang); Discussed FASTQ trimming and TopHat alignment.

Tong registered a new account in BaseSpace.

Finished FASTQ trimming and TopHat alignment.

Detailed steps:

  1. Create a new project.
  2. From the created project, click Import and drag and drop one pair (left and right) of cell line data files. The pair will show as one file in the project.
  3. Launch FASTQ Toolkit from Apps.
    1. Select samples: choose the uploaded file. Select project: choose the project that contains that data file.
    2. In adapter trimming, select TruSeq HT/LT Common Sequence (AGATCGGAAGAGC).
    3. Run it. It takes around 15 minutes and generates a result file in the project.
  4. Launch TopHat Alignment.
    1. Select samples: choose the file generated by FASTQ Toolkit. Select project: choose the project that contains that data file.
    2. Check the "Trim TruSeq Adapters" option.
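
To make the adapter trimming in step 3.2 more concrete, here is a minimal Python sketch of what removing the TruSeq common sequence from the 3' end of a read could look like. It is only an illustration under simple assumptions (exact matching, no mismatches, no quality handling) and is not how FASTQ Toolkit is implemented; the function name `trim_adapter` and the `min_overlap` parameter are hypothetical.

```python
# Illustrative 3' adapter trimming with exact matching only.
# Assumption: real tools also tolerate mismatches and trim the quality
# string to the same length; this sketch does neither.

ADAPTER = "AGATCGGAAGAGC"  # TruSeq HT/LT common sequence from step 3.2


def trim_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 3) -> str:
    """Return the read with a 3' adapter (full or partial) removed."""
    # Case 1: the full adapter occurs somewhere in the read.
    # Everything from the adapter start onward is discarded.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends with a prefix of the adapter, i.e. the
    # adapter was only partially sequenced. Try the longest prefix first.
    longest = min(len(adapter) - 1, len(read))
    for k in range(longest, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: return the read unchanged.
    return read


if __name__ == "__main__":
    print(trim_adapter("ACGTACGT" + ADAPTER + "TTTT"))  # full adapter inside the read
    print(trim_adapter("ACGTACGT" + ADAPTER[:5]))       # partial adapter at the 3' end
    print(trim_adapter("ACGTACGT"))                     # no adapter present
```

The `min_overlap` threshold is a design choice: a very short suffix such as a single `A` matches the adapter prefix by chance too often, so short overlaps are left untrimmed.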