Consensus sequences (573331) from 9 independent fosmid assemblies run through cd-hit-est. No initial limits on coverage or size.
robertsmac:cd-hit-v4.5.4-2011-03-07 sr320$ ./cd-hit-est -i /Volumes/Bay4\ scratch/BIGFOSMID.fa -o /Volumes/Bay4\ scratch/BIGFOSMID_cdhit -M 2500
================================================================
Program: CD-HIT, V4.5.4, Feb 23 2012, 11:03:06
Command: ./cd-hit-est -i /Volumes/Bay4 scratch/BIGFOSMID.fa -o /Volumes/Bay4 scratch/BIGFOSMID_cdhit -M 2500
Started: Sat Feb 25 10:04:34 2012
================================================================
Output
----------------------------------------------------------------
total seq: 573331
longest and shortest : 45747 and 876
Total letters: 1520457838
Sequences have been sorted
Approximated minimal memory consumption:
Sequence : 1592M
Buffer : 1 X 28M = 28M
Table : 1 X 25M = 25M
Miscellaneous : 11M
Total : 1659M
Table limit with the given memory limit:
Max number of representatives: 4194304
Max number of word counting entries: 105119873
comparing sequences from 0 to 8048
comparing sequences from 8048 to 33677
.......... 10000 finished 6565 clusters
.......... 30000 finished 17621 clusters
comparing sequences from 30470 to 78377
.......... 50000 finished 27961 clusters
.......... 60000 finished 32752 clusters
comparing sequences from 66272 to 147166
.......... 80000 finished 42042 clusters
.......... 100000 finished 50760 clusters
.......... 120000 finished 59254 clusters
comparing sequences from 121616 to 260978
.......... 130000 finished 63410 clusters
.......... 140000 finished 67390 clusters
.......... 170000 finished 79059 clusters
.......... 180000 finished 82911 clusters
.......... 200000 finished 90239 clusters
comparing sequences from 208854 to 475563
.......... 240000 finished 104516 clusters
.......... 250000 finished 107875 clusters
.......... 270000 finished 114614 clusters
.......... 290000 finished 121203 clusters
.......... 320000 finished 130854 clusters
.......... 350000 finished 140267 clusters
comparing sequences from 357117 to 573331
.......... 360000 finished 143323 clusters
.......... 370000 finished 146454 clusters
.......... 380000 finished 149489 clusters
.......... 390000 finished 152532 clusters
.......... 450000 finished 170317 clusters
.......... 460000 finished 173215 clusters
.......... 500000 finished 184701 clusters
.......... 540000 finished 196292 clusters
.......... 560000 finished 201868 clusters
......
573331 finished 205903 clusters
Apprixmated maximum memory consumption: 2500M
writing new database
writing clustering information
program completed !
Total CPU time 27885
robertsmac:cd-hit-v4.5.4-2011-03-07 sr320$
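Quick summary of the run, as a rough sketch. The figures are copied from the log above. Treating "Total CPU time" as seconds is an assumption, and the variable and function names are only illustrative. No -c flag was given, so the run presumably used the cd-hit-est default identity threshold.

```python
# Figures copied from the cd-hit-est log above.
TOTAL_SEQS = 573_331           # "total seq"
TOTAL_LETTERS = 1_520_457_838  # "Total letters"
CLUSTERS = 205_903             # final "finished ... clusters" line
CPU_TIME = 27_885              # "Total CPU time"; assumed to be seconds


def summarize(total_seqs, total_letters, clusters, cpu_seconds):
    """Derive a few headline numbers from the raw log values."""
    return {
        "mean_length_bp": total_letters / total_seqs,
        "fraction_kept": clusters / total_seqs,
        "seqs_absorbed": total_seqs - clusters,
        "cpu_hours": cpu_seconds / 3600,
    }


summary = summarize(TOTAL_SEQS, TOTAL_LETTERS, CLUSTERS, CPU_TIME)
for name, value in summary.items():
    print(f"{name}: {value:,.3f}")
```

So roughly 36% of the input consensus sequences remain as cluster representatives, and the rest were absorbed into those clusters.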
FASTA FILE: http://main.g2.bx.psu.edu/datasets/83ae21f9d999536d/display?username=sr320&to_ext=fasta&slug=repository (618MB)
-- Should move to CLC for BLAST searches.
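Before moving to CLC it may be worth checking how the clusters are distributed. A minimal sketch for tallying cluster sizes from the .clstr file that CD-HIT writes next to the output FASTA (here presumably BIGFOSMID_cdhit.clstr). The function names and the tiny example records are hypothetical. The parser only assumes that each cluster starts with a ">Cluster" header line followed by one line per member.

```python
from collections import Counter


def cluster_sizes(lines):
    """Return the number of member sequences in each cluster, in file order."""
    sizes = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            sizes.append(0)   # a header line opens a new cluster
        elif sizes:
            sizes[-1] += 1    # any other line is a member of the current cluster
    return sizes


def size_histogram(sizes):
    """Map cluster size -> number of clusters with that size."""
    return Counter(sizes)


# Made-up records in the .clstr layout; the sequence names are hypothetical.
example = """>Cluster 0
0\t45747nt, >contig_a... *
1\t1200nt, >contig_b... at +/98.50%
>Cluster 1
0\t876nt, >contig_c... *
"""

sizes = cluster_sizes(example.splitlines())
print(sizes)
print(size_histogram(sizes))
```

On the real file, passing the open file handle to cluster_sizes should give a list whose length matches the 205903 clusters reported in the log, and the histogram shows how many of those are singletons.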