Contig Extension with Python
I want to add a function to a program that creates dictionaries from DNA sequences. The function receives a contig (incon = initial contig; a DNA sequence) and extends it to the right by finding overlapping parts in the form of keys in the dictionaries, then concatenating the matching values with the "+" operator.
I'll give a quick example:
gatttgaagc is the initial contig
atttgaagc:a is one of many entries in the dictionary
I want the function to search for such an overlapping part as a key in the dictionary (I asked about this here yesterday and it worked fine, but only with specific values and not with function variables), concatenate the value of that key to the initial sequence (extending the contig to the right), and save the new sequence in incon.
It should then delete the dictionary entry and repeat until there are no entries left (this part I haven't tried yet).
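The steps described above (find a key that overlaps the end of the contig, append its value, delete the entry, repeat) could be sketched roughly as follows. This is only a sketch under the assumption that the dictionary is a flat `dict` mapping each key to its extension; the names `extend_once` and `extend_until_stuck` are made up for illustration and are not part of the original program.

```python
def extend_once(contig, suffixes):
    """Try to extend contig to the right using one dictionary entry.

    Returns the (possibly extended) contig and a flag telling whether
    an entry was used.
    """
    # Iterate over a copy of the keys because the dict is modified below.
    for key in list(suffixes):
        if contig.endswith(key):
            # pop() returns the value and deletes the entry in one step.
            return contig + suffixes.pop(key), True
    return contig, False


def extend_until_stuck(contig, suffixes):
    """Repeat extend_once until no remaining key overlaps the contig end."""
    extended = True
    while extended:
        contig, extended = extend_once(contig, suffixes)
    return contig
```

Note that the loop stops as soon as no remaining key overlaps the end of the contig, so entries that never match stay in the dictionary instead of causing an endless loop.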
First I want the function to search for keys of length 9 with values of length 1 (atttgaagc:a), and if there are no overlapping parts, for keys of length 8 with values of length 2 (e.g. atttgaag:tg), and so on.
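The "longest key first, then shorter keys" search could look something like the sketch below. It assumes a flat `dict` of key to extension and looks up the end of the contig directly instead of looping over every key; `find_smallest_extension`, `max_key_len` and `min_key_len` are hypothetical names chosen for this example.

```python
def find_smallest_extension(contig, suffixes, max_key_len=9, min_key_len=1):
    """Return (key, value) for the longest key matching the contig end.

    Keys are tried from max_key_len down to min_key_len, so a long
    overlap is preferred over a short one. Returns None if nothing
    overlaps.
    """
    longest = min(max_key_len, len(contig))
    for key_len in range(longest, min_key_len - 1, -1):
        tail = contig[-key_len:]  # last key_len bases of the contig
        if tail in suffixes:
            return tail, suffixes[tail]
    return None
```

If key length plus value length is the same for every entry, the longest matching key is also the one with the shortest extension.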
Additional info: the dictionary "suffixdicts" has such entries, with values from length 1 (key has length 14) up to length 10 (key has length 5).
"reads" is the list in which the sequences are stored.
When I try the steps one after another, they work (like the search), but when I tried to build a function out of them, literally nothing happens. The function is supposed to return the smallest possible extension.
def extendcontig (incon, reads, suffixdicts):
    incon = reads[0]
    for x in range(1,len(incon)):
        for key in suffixdicts.keys():
            if incon[x:] == key:
                incon = incon+suffixdicts['key']
                print(incon)
            else:
                print("n")
    return()
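A few things in the function above stand out: `incon = reads[0]` overwrites the contig that was passed in, `suffixdicts['key']` looks up the literal string 'key' rather than the variable `key`, and `return()` returns an empty tuple instead of the extended sequence. A minimal corrected sketch, assuming `suffixdicts` is a single flat `dict` of key to extension and dropping the `reads` parameter because it is no longer needed, might be:

```python
def extendcontig(incon, suffixdicts):
    """Extend incon to the right by the first matching dictionary entry.

    Suffixes are tried from longest to shortest. If key length plus
    value length is constant, this yields the smallest extension.
    """
    # Start at 1 so at least one base of the contig is outside the overlap.
    for x in range(1, len(incon)):
        key = incon[x:]  # suffix of the contig starting at position x
        if key in suffixdicts:
            # Use the variable key, not the string 'key'.
            return incon + suffixdicts[key]
    # Nothing overlapped: hand back the contig unchanged.
    return incon
```

Calling `extendcontig("gatttgaagc", {"atttgaagc": "a"})` with the example from the question gives back the contig with the extra base appended.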
I'm new to Python, so there are probably some dire mistakes in there, and I'd like to have them pointed out. I know I'm in way over my head; I understand parts of the existing code but still have problems implementing it myself, mostly due to incorrect syntax. I know there are programs I could use, but I want to understand the whole thing behind it.
Edit: I was asked to add the given functions. Some of them were given, and parts I wrote based on the given code (basically copied with tweaks). Warning: it's quite a lot:
Reading the FASTA file: additional info: the FASTA file contains large amounts of sequences in the form:
> read 1
ttatgaatattacgcaatggacgtccaaggtacagcgtatttgtacgcta
> read 2
aactgctatctttcttgtccactcgaaaatccataacgtagcccataacg
> read 3
tcagttatcctatatactggatcccgactttaatcggcgtcggaattact
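Since the original reading code is not shown, here is a hedged sketch of how a file in this form could be turned into the "reads" list using only plain Python. The function name `read_fasta` is an assumption for illustration, and it treats every line starting with ">" as a header and joins the lines in between into one sequence.

```python
def read_fasta(lines):
    """Return the sequences from an iterable of FASTA lines, in file order."""
    reads = []
    current = []  # pieces of the sequence currently being collected
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header starts a new record; store the previous one, if any.
            if current:
                reads.append("".join(current))
                current = []
        else:
            current.append(line)
    # Store the last record, which has no header after it.
    if current:
        reads.append("".join(current))
    return reads
```

Because the function accepts any iterable of lines, an open file object can be passed to it directly, and a list of strings works equally well for testing.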
I uploaded the file here: http://s000.tinyupload.com/?file_id=52090273537190816031
Edit: I edited the large blocks of code out, as they don't seem necessary.