Python 3.x - Counting Frequencies
I am trying to figure out how to count how often the word tags i-gene and o appear in a file.
Here is an example of the file I am trying to process:
    45 wordtag o cortex
    2 wordtag i-gene cdc33
    4 wordtag o ppre
    4 wordtag o how
    44 wordtag o if
I am trying to compute the sum of words[0] (column 1) over all lines in the same category, once for i-gene and once for o.
In this example:
the sum for the category i-gene is 2,
and the sum for the category o is 97 (45 + 4 + 4 + 44).
My code:

    import os

    def reading_files(path):
        counter = 0
        for root, dirs, files in os.walk(path):
            for file in files:
                if file != ".ds_store":
                    if file == "gene.counts":
                        open_file = open(root+file, 'r', encoding="iso-8859-1")
                        for line in open_file:
                            tmp = line.split(' ')
                            for words in tmp:
                                for word in words:
                                    if (words[2] == 'i-gene'):
                                        sum = sum + int(words[0])
                                    if (words[2] == 'o'):
                                        sum = sum + int(words[0])
                                    else:
                                        print('nothing')
        print(sum)
I think you should delete the word loop, since you don't use it:

    for word in words:
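To illustrate the point about the extra loop, here is a minimal sketch of handling one line without iterating over its characters. The helper name parse_line is hypothetical (it is not in the original code), and it assumes each line has the form "count wordtag tag word" as in the example file.

```python
def parse_line(line):
    """Split one count line into (count, tag); return None if it is too short."""
    # split() with no argument splits on any whitespace and drops the newline
    fields = line.split()
    if len(fields) < 3:
        return None
    # fields[0] is the count, fields[2] is the tag (e.g. 'i-gene' or 'o')
    return int(fields[0]), fields[2]


print(parse_line("2 wordtag i-gene cdc33"))  # (2, 'i-gene')
```

Splitting the line once and indexing the resulting list is enough; no inner loop over the individual fields or characters is needed.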
I would use a dictionary if you want to solve this generally. While you read the file, fill the dictionary like this:
- if the key is already in the dict, increase its value;
- if it is a new key, add it to the dict and set its value to the count from that line.
    def reading_files(path):
        freqdict = dict()
        ...
        for words in tmp:
            # start a new tag at 0, then add this line's count
            if words[2] not in freqdict:
                freqdict[words[2]] = 0
            freqdict[words[2]] += int(words[0])
After you have created the dictionary, you can return it and look up a keyword in it, or you can pass the keyword to the function and return the value or print it. I prefer the first option: do as few file I/O operations as possible, and then work with the collected data in memory.
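The dictionary approach described above could look roughly like the following sketch. The function name count_tags and the sample list are assumptions made for illustration; the function takes any iterable of lines, so it can be fed an open file or, as here, the example data from the question.

```python
def count_tags(lines):
    """Sum the first column per tag (third column) and return the totals."""
    freq = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        count, tag = int(fields[0]), fields[2]
        # dict.get returns 0 for a tag that has not been seen yet
        freq[tag] = freq.get(tag, 0) + count
    return freq


# Sample data copied from the question
sample = [
    "45 wordtag o cortex",
    "2 wordtag i-gene cdc33",
    "4 wordtag o ppre",
    "4 wordtag o how",
    "44 wordtag o if",
]
totals = count_tags(sample)
print(totals)  # {'o': 97, 'i-gene': 2}
```

Because the function returns the whole dictionary, the file only has to be read once, and any tag can be looked up afterwards.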
For this solution I wrote a wrapper:

    def getvalue(fdict, key):
        if key not in fdict:
            return "nothing"
        return str(fdict[key])
So it behaves like your example.
It is not strictly necessary, but it is good practice: close the file when you are not using it anymore.
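One common way to make sure the file is closed is a with block, which closes it automatically when the block ends. The sketch below is only an illustration: the helper name read_lines is hypothetical, and the temporary directory stands in for the real data folder. It also uses os.path.join, because root+file from the question would leave out the path separator between the directory and the file name.

```python
import os
import tempfile


def read_lines(directory, filename="gene.counts", encoding="iso-8859-1"):
    """Read all lines of directory/filename; the file is closed on exit."""
    path = os.path.join(directory, filename)
    with open(path, "r", encoding=encoding) as handle:
        return [line.rstrip("\n") for line in handle]


# Demo with a throwaway file so the example runs anywhere
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "gene.counts")
    with open(demo_path, "w", encoding="iso-8859-1") as out:
        out.write("2 wordtag i-gene cdc33\n4 wordtag o ppre\n")
    lines = read_lines(tmp_dir)

print(lines)  # ['2 wordtag i-gene cdc33', '4 wordtag o ppre']
```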