medp_colfo-20110601
--
Q1, Q2, and Q3 are the quartiles of the reported contig lengths. B1000 and B2000 indicate the percentage of bases in contigs of at least 1000 bp and 2000 bp, respectively.
--
stats.txt:
    Summary statistics for contigs.fa. Includes the total number of sequences and bases in the contig set, N50, etc. See the note above on Q1, Q2, Q3, B1000, and B2000.

contigs.fa:
    Contigs from the assembly, minimum length 100 bp. Possibly includes UTRs. Sequences may contain IUPAC ambiguity codes representing ambiguous bases (see http://www.bioinformatics.org/sms/iupac.html).

peptides.fa:
    Protein products predicted by ESTScan. A product does not necessarily include the initial methionine. A contig may yield more than one polypeptide.

readcounts/*.dat:
    Read counts obtained by post hoc alignment of reads to the reported contigs, per sample, via gsnap with default parameters. Tab-delimited columns in the format

        sample  contig_id  all_aligned  unique_aligned  paired_aligned  contig_len

    where
        sample          is the sample, library, tissue, etc.;
        contig_id       is the contig identifier, for example medp_colfo-20110601|1234;
        all_aligned     is the number of reads aligned to this contig;
        unique_aligned  is the number of reads that aligned uniquely to this contig;
        paired_aligned  is the number of pairs aligned to this contig; and
        contig_len      is the length of the contig in bp.
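
The six-column layout of readcounts/*.dat can be read with a few lines of Python. The following is a minimal sketch, not part of the original distribution: the names ReadCount and parse_readcounts are hypothetical, the example row (sample name "leaf" and all counts) is invented for illustration, and the code assumes the files have no header row or comment lines, which this README does not state.

```python
import io
from dataclasses import dataclass
from typing import Iterator, TextIO


@dataclass(frozen=True)
class ReadCount:
    """One row of a readcounts/*.dat file (hypothetical helper type)."""
    sample: str
    contig_id: str
    all_aligned: int
    unique_aligned: int
    paired_aligned: int
    contig_len: int


def parse_readcounts(handle: TextIO) -> Iterator[ReadCount]:
    """Yield one ReadCount per non-empty, tab-delimited line.

    Assumption: no header row and no comment lines are present.
    """
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        sample, contig_id, all_a, uniq_a, paired_a, length = line.split("\t")
        yield ReadCount(
            sample=sample,
            contig_id=contig_id,
            all_aligned=int(all_a),
            unique_aligned=int(uniq_a),
            paired_aligned=int(paired_a),
            contig_len=int(length),
        )


# Invented example row; real files live under readcounts/.
example = io.StringIO("leaf\tmedp_colfo-20110601|1234\t120\t95\t40\t1500\n")
for rec in parse_readcounts(example):
    # Uniquely aligned reads per kilobase of contig, as a simple derived value.
    print(rec.sample, rec.contig_id, rec.unique_aligned / (rec.contig_len / 1000))
```

For a real file, pass an open file handle instead of the in-memory example, for instance open("readcounts/some_sample.dat"), where the file name is a placeholder.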