Referring to a list of names using Python


I'm new to Python, so please bear with me.

I cannot figure out why this bit of script does not work properly:

    genome = open('refT.txt', 'r')

The data file is a reference genome with a bunch (2 million) of contigs:

    Contig_01
    TGCAGGTAAAAAACTGTCACCTGCTGGT
    Contig_02
    TGCAGGTCTTCCCACTTTATGATCCCTTA
    Contig_03
    TGCAGTGTGTCACTGGCCAAGCCCAGCGC
    Contig_04
    TGCAGTGAGCAGACCCCAAAGGGAACCAT
    Contig_05
    TGCAGTAAGGGTAAGATTTGCTTGACCTA

The file is opened:

   

A list of contigs to extract from the dataset listed above:

    Contig_01
    Contig_02
    Contig_03
    Contig_05

My disappointing script:

    for line in cont_list:
        if not line in genome.readline():
            continue
        else:
            a = genome.readline()
            data_out = open('output', 'a')
            data_out.write("%s" % a)
            data_out.close()

The script successfully writes the first three contigs to the output file, but for some reason it does not seem able to skip "Contig_04", which is not in the list, and move on to "Contig_05".

I may seem like a lazy bastard for posting this, but I am stuck on this small bit of code -_-

I would first try to write a generator that yields a tuple, (contig, genome):

    def pair(file_obj):
        # Yield (contig name, sequence) from alternating lines, newlines stripped.
        for line in file_obj:
            yield line.strip(), next(file_obj, '').strip()
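To see what such a generator produces, here is a minimal sketch run on an in-memory copy of the first two sample contigs. It assumes contig names and sequences sit on alternating lines; `io.StringIO` stands in for the real file, and `pair_stripped` is a hypothetical name used only for this illustration.

```python
import io

def pair_stripped(file_obj):
    """Yield (name, sequence) tuples from a file with alternating lines."""
    for line in file_obj:
        # next() consumes the sequence line that follows the name line;
        # the "" default avoids an error if the file ends on a name line.
        yield line.strip(), next(file_obj, "").strip()

# In-memory stand-in for the genome file (assumed layout: name, then sequence).
sample = io.StringIO(
    "Contig_01\nTGCAGGTAAAAAACTGTCACCTGCTGGT\n"
    "Contig_02\nTGCAGGTCTTCCCACTTTATGATCCCTTA\n"
)
result = list(pair_stripped(sample))
print(result)
```

Stripping the trailing newline matters here, because a name read straight from the file ends in `"\n"` and would otherwise never compare equal to a bare name such as `'Contig_01'`.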

Now, I will use it to get the desired elements:

    wanted = {'Contig_01', 'Contig_02', 'Contig_03', 'Contig_05'}

    with open('filename') as fin:
        pairs = pair(fin)
        while wanted:
            p = next(pairs, None)
            if p is None:
                break  # end of file: some wanted contigs were not found
            name = p[0].strip()  # drop any trailing newline before comparing
            if name in wanted:
                # write to the output file, store in a list, or dict, ...
                wanted.remove(name)
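Putting the pieces together, the whole task might look like the sketch below. It is only one possible arrangement, under the assumptions that the genome file alternates name and sequence lines and that the list file holds one contig name per line. The function name `extract_contigs` and its three path parameters are hypothetical.

```python
def extract_contigs(genome_path, names_path, out_path):
    """Copy the wanted contigs (name line plus sequence line) to out_path.

    Assumes names and sequences alternate line by line in genome_path
    and that names_path holds one contig name per line.
    Returns the number of contigs written.
    """
    # Load the wanted names into a set for fast membership tests.
    with open(names_path) as names:
        wanted = {line.strip() for line in names if line.strip()}

    written = 0
    with open(genome_path) as genome, open(out_path, "w") as out:
        for name_line in genome:
            # Always consume the sequence line, even for unwanted contigs,
            # so that names and sequences stay in step.
            seq_line = next(genome, "")
            name = name_line.strip()
            if name in wanted:
                out.write(name + "\n" + seq_line.strip() + "\n")
                written += 1
    return written
```

The difference from the script in the question is that every contig's sequence line is read whether or not the contig is wanted, so an unwanted entry such as Contig_04 is skipped without throwing the two files out of step. The output file is also opened once rather than once per contig, which should help with 2 million entries.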
