
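python_pickle (serialization and deserialization example): define a function named "wordcount" that counts how many times a keyword occurs in a Chinese text, and save the function to a file with the pickle module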

Source: Internet   Collected by: 自由互联   Published: 2022-06-14


Contents

  • problem:
  • result:
  • code

problem:

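First, define a function named "wordcount" that counts how many times a given keyword occurs in a Chinese text file. Its prototype takes two arguments, w and txtfile, both of which are strings.
Second, in the folder that holds the materials for this lab, use os.mkdir() to create a new folder named "mydir". Then automatically identify every text file whose name starts with "news_" and move it into the new directory "mydir" (note: the move must be done programmatically).
Next, use the pickle module to combine the function wordcount and the names of all the text files starting with "news_" into a list, and save that list permanently to the file "wc.pkl" inside the folder "mydir".
Finally, use the pickle module again to load the list stored in "wc.pkl", retrieve the function wordcount from it, and call it to count how often the four keywords "中国" (China), "美国" (United States), "科技" (technology) and "芯片" (chip) occur in each text file starting with "news_". Print the counts in the format shown in the result section.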
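
The second step (creating "mydir" and moving the "news_" files into it) can be tried out safely in a throwaway directory before touching the real lab folder. The following is only a minimal sketch, not the article's own code: the helper name move_news_files and the sample file names are hypothetical.

```python
import os
import shutil
import tempfile


def move_news_files(src_dir, dest_name="mydir", prefix="news_"):
    """Create src_dir/dest_name and move every file starting with prefix into it."""
    dest_dir = os.path.join(src_dir, dest_name)
    if not os.path.exists(dest_dir):
        os.mkdir(dest_dir)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        full_path = os.path.join(src_dir, name)
        # Only regular files are moved; the new folder itself is skipped.
        if name.startswith(prefix) and os.path.isfile(full_path):
            shutil.move(full_path, dest_dir)
            moved.append(name)
    return moved


# Try it in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["news_1.txt", "news_2.txt", "other.txt"]:
        with open(os.path.join(tmp, name), "w", encoding="gbk") as f:
            f.write("中国科技")
    moved = move_news_files(tmp)
    remaining = sorted(os.listdir(tmp))
    print(moved)
    print(remaining)
```

Building paths with os.path.join avoids having to remember a trailing "/" on the folder name, which the string concatenation used in the listing below depends on.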

result:

[Two screenshots of the program output; the images are not preserved.]
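
The output is a table with one row per file and one column per keyword. A minimal sketch of that layout, built with str.center, might look like this. The file name and the counts are made up purely to illustrate the layout, and format_table is a hypothetical helper.

```python
words = ["中国", "美国", "科技", "芯片"]
# Hypothetical counts, for layout illustration only.
rows = {"news_a.txt": [3, 1, 0, 2]}


def format_table(words, rows, width=20):
    """Return the table as text: a header row, then one row per file."""
    # The header starts with an empty corner cell above the file names.
    lines = ["".join(cell.center(width) for cell in [""] + words)]
    for name, counts in rows.items():
        cells = [name] + [str(c) for c in counts]
        lines.append("".join(cell.center(width) for cell in cells))
    return "\n".join(lines)


print(format_table(words, rows))
```

Note that str.center counts characters, not display columns. Chinese characters usually occupy two columns in a terminal, so the header cells may look slightly shifted relative to the numbers below them.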

code

import os
import pickle
import shutil

# Assumption: path_string_fix is never defined in the original listing. It should be
# the folder that holds the lab materials, ending with "/"; "./" is only a placeholder.
path_string_fix = "./"
path_src = path_string_fix
path_des = path_string_fix + "mydir/"
# Create the directory mydir under the source path if it does not exist yet.
if not os.path.exists(path_des):
    os.mkdir(path_src + "mydir")
# List the entries in the source path.
files_list = os.listdir(path_src)
# Collect the names of the files starting with "news_", whether they are still in
# the source path or were already moved to mydir by an earlier run.
file_news_list = sorted({f for f in files_list + os.listdir(path_des) if f.startswith("news_")})

# Move the news_ files from the source path to the destination path.
def move_news():
    for file_name in files_list:
        if file_name.startswith("news_"):
            shutil.move(path_src + file_name, path_des)

# Count a word in the specified file.
def wordcount(w, txt_file):
    """Return how many times the word w appears in the file txt_file.

    The file is read as GBK-encoded text (encoding="gb18030" works too).

    Args:
        w (str): the keyword to count.
        txt_file (str): full path of the text file.
    """
    with open(txt_file, "r", encoding="gbk") as file_input_stream:
        string = file_input_stream.read()
    return string.count(w)

# Use the pickle module to store (dump) the objects, then load them back.
def pickle_deal():
    with open(path_des + "wc.pkl", "wb") as file_output_stream:
        pickle.dump([wordcount, file_news_list], file_output_stream)
    with open(path_des + "wc.pkl", "rb") as file_input_stream:
        return pickle.load(file_input_stream)

def print_head(word_list):
    # Print the header row: an empty corner cell followed by the keywords.
    for i in [""] + word_list:
        print(i.center(20), end="")
    print()

def print_result(word_list):
    print_head(word_list)
    # obj_list[1] is the list of news_ file names loaded from wc.pkl.
    for file in obj_list[1]:
        print(file.center(20), end="")
        file_full_path = path_des + file
        for word in word_list:
            frequency = wordcount(word, file_full_path)
            print(str(frequency).center(20), end="")
        print()

word_list = ["中国", "美国", "科技", "芯片"]
move_news()
obj_list = pickle_deal()
# Get the function back from the pickled file.
wordcount = obj_list[0]
print_result(word_list)
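
One detail is worth keeping in mind when pickling a function such as wordcount: pickle stores a function by reference (its module name and function name), not its code. So "wc.pkl" can only be loaded where a function of that name is still defined or importable. The short sketch below illustrates this with the standard-library function shutil.move, chosen only because it is importable everywhere; it is not part of the original exercise.

```python
import pickle
import shutil

# Pickling a function records where to find it, not what it does.
data = pickle.dumps(shutil.move)
print(b"shutil" in data, b"move" in data)

# Loading looks the name up again and returns the very same function object.
restored = pickle.loads(data)
print(restored is shutil.move)
```

For the same reason, a pickle file containing a function is not a way to ship code to a machine that lacks the function's definition.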

