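Automated testing with VBS
I've had a course project to do these past few days, with teammates @LTL-77 and Lao Lu: a code plagiarism checker based on cosine similarity. I'm mainly in charge of debugging and analysis, so I Ctrl+C / Ctrl+V'd a hundred or so samples from LeetCode (why is that site so laggy?). First I used a certain method (AI) to pull down all the problem names. The first page looks like this (there is too much to show in full):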
    "exam-room",
    "two-sum",
    "add-two-numbers",
    ..........................................
    "minimum-knight-moves",
    "how-many-apples-can-you-put-into-the-basket",
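For context on what the checker under test is scoring, here is a minimal sketch of cosine similarity over token counts. It is not our project's actual implementation: the `tokenize` helper and its regex are hypothetical, chosen only to make the idea concrete.

```python
import math
import re
from collections import Counter


def tokenize(source):
    # Hypothetical tokenizer: identifiers/keywords, integers, single punctuation marks
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)


def cosine_similarity(code_a, code_b):
    """Cosine of the angle between the two token-frequency vectors (0.0 to 1.0)."""
    va, vb = Counter(tokenize(code_a)), Counter(tokenize(code_b))
    # Dot product over the tokens both sources share
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty file is treated as similar to nothing
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    print(cosine_similarity("int a = 1;", "int b = 1;"))
```

Because only token frequencies are compared, reordering statements does not change the score, which is one reason a one-to-many comparison per problem is a useful test.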
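Then came the batch download. Everything goes into the temp folder so it is easy to find.
Some authors' code is too terse, or the code block on the page could not be located, so some downloads contain a fetch error message and others are empty files. A script is therefore needed to filter by file size and keep only the large files (read the code before using it):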
    import os


    def delete_small_files(directory, min_size):
        """
        Check the size of every file under a directory (subdirectories included)
        and delete the ones smaller than min_size bytes.

        :param directory: path of the directory to check
        :param min_size: minimum file size in bytes
        """
        # Walk every file in the directory tree
        for dirpath, dirnames, filenames in os.walk(directory):
            for filename in filenames:
                file_path = os.path.join(dirpath, filename)

                if os.path.isfile(file_path):
                    file_size = os.path.getsize(file_path)

                    # Delete the file if it is below the threshold
                    if file_size < min_size:
                        print(f"Deleting file: {file_path} (size: {file_size} bytes)")
                        os.remove(file_path)
                    else:
                        print(f"Keeping file: {file_path} (size: {file_size} bytes)")


    if __name__ == "__main__":
        # "." is the current directory, so this script itself is also a candidate
        directory = "."
        min_size = int(input("Enter the minimum file size (in bytes): "))

        if os.path.isdir(directory):
            delete_small_files(directory, min_size)
        else:
            print(f"The directory {directory} does not exist.")
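Since the script above deletes files irreversibly, it may help to preview what would be removed first. The following is a minimal dry-run sketch using the same walk and the same size test. The function name `find_small_files` and the 50-byte threshold in the usage lines are illustrative choices, not part of the original script.

```python
import os


def find_small_files(directory, min_size):
    """Return sorted (path, size) pairs for files under min_size bytes. Deletes nothing."""
    small = []
    for dirpath, _dirnames, filenames in os.walk(directory):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            if os.path.isfile(path):
                size = os.path.getsize(path)
                if size < min_size:
                    small.append((path, size))
    return sorted(small)


if __name__ == "__main__":
    # Preview only: list the files the deletion script would remove
    for path, size in find_small_files("temp", 50):
        print(f"{size:>6} bytes  {path}")
```

Checking this list first makes it easier to pick a threshold that removes the error pages and empty files without touching short but valid solutions.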
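With the code cleaned up, automated testing can begin. My VBS skills are really weak, and some of my Python library environments have blown up, so I had to preprocess the files once more. The main goal is to pull out the first solution for each problem so that one-to-many comparisons are easier later: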
    import os
    import shutil


    def copy_and_rename_file_to_parent(subfolder_path):
        # List the regular files in the subfolder
        files = os.listdir(subfolder_path)
        files = [f for f in files if os.path.isfile(os.path.join(subfolder_path, f))]

        if files:
            # The subfolder name becomes the new file name
            folder_name = os.path.basename(subfolder_path)

            # Take the first file in the listing
            first_file = files[0]
            source_file = os.path.join(subfolder_path, first_file)

            # Target: <parent>/<folder_name><original extension>
            parent_dir = os.path.dirname(subfolder_path)
            extension = os.path.splitext(first_file)[1]
            target_file = os.path.join(parent_dir, f"{folder_name}{extension}")

            # Copy the file into the parent directory under the new name
            shutil.copy(source_file, target_file)

            # Remove the original so it is not compared against itself later
            try:
                os.remove(source_file)
                print(f"File {first_file} was copied to the parent directory as "
                      f"{folder_name}{extension}; the original was deleted.")
            except Exception as e:
                print(f"Error while deleting {source_file}: {e}")
        else:
            print(f"Folder {subfolder_path} contains no files!")


    def get_all_subfolders(directory):
        # Names of all immediate subfolders
        subfolders = [f for f in os.listdir(directory)
                      if os.path.isdir(os.path.join(directory, f))]
        return subfolders


    if __name__ == "__main__":
        current_directory = os.getcwd()

        subfolders = get_all_subfolders(current_directory)

        if subfolders:
            for folder in subfolders:
                subfolder_path = os.path.join(current_directory, folder)
                copy_and_rename_file_to_parent(subfolder_path)
        else:
            print("No folders found in the current directory!")
Once the first part of the code has finished running, it is time for the VBS test. The code is as follows:
    ' Locate the ./code folder and create the shell object
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFolder = objFSO.GetFolder("./code")
    Set WshShell = CreateObject("WScript.Shell")

    ' Start the checker and give its window time to open
    WshShell.Run "course_design_end2.0.exe"
    WScript.Sleep 500

    ' Answer "no" to the external keyword library prompt
    WshShell.SendKeys "no"
    WScript.Sleep 100
    WshShell.SendKeys "{ENTER}"

    ' Each subfolder holds the remaining solutions for one problem
    For Each objSubFolder In objFolder.SubFolders
        subFolderName = objSubFolder.Name
        WScript.Sleep 1000

        ' Reference solution extracted in the previous step: ./code/<problem>.c
        cFilePath = "./code/" & subFolderName & ".c"

        If objFSO.FileExists(cFilePath) Then
            For Each objFile In objSubFolder.Files
                If LCase(objFSO.GetExtensionName(objFile.Name)) = "c" Then
                    ' code1: the reference solution
                    WshShell.SendKeys cFilePath
                    WScript.Sleep 100
                    WshShell.SendKeys "{ENTER}"
                    WScript.Sleep 200

                    ' code2: one of the other solutions (absolute path)
                    WshShell.SendKeys objFile.Path
                    WScript.Sleep 100
                    WshShell.SendKeys "{ENTER}"
                    WScript.Sleep 400
                End If
            Next
        End If
    Next

    WScript.Sleep 2000

    ' Select everything in the console window and copy it
    WshShell.SendKeys "^a"
    WScript.Sleep 100
    WshShell.SendKeys "^c"

    ' The script stops at the clipboard; the text still has to be pasted into a file
    WScript.Echo "Console output copied to the clipboard!"
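SendKeys depends on window focus and on the sleep timings, so a run can break if another window pops up. As a hedged alternative, the same sequence of answers could be built in Python with only the standard library and piped to the checker. This sketch assumes the checker reads its answers from standard input, which I have not verified. The function names are made up for illustration, and the closing `exit` line follows the checker's own prompt text.

```python
import os
import subprocess


def build_input_lines(code_dir):
    """Build the answers the VBS script types: 'no', then (reference, candidate) pairs."""
    lines = ["no"]  # decline the external keyword library
    for name in sorted(os.listdir(code_dir)):
        subfolder = os.path.join(code_dir, name)
        reference = os.path.join(code_dir, name + ".c")
        # Same condition as the VBS loop: a subfolder with a matching reference file
        if not (os.path.isdir(subfolder) and os.path.isfile(reference)):
            continue
        for fname in sorted(os.listdir(subfolder)):
            candidate = os.path.join(subfolder, fname)
            if fname.lower().endswith(".c") and os.path.isfile(candidate):
                lines.append(reference)                   # code1
                lines.append(os.path.abspath(candidate))  # code2
    lines.append("exit")  # the prompt says entering exit ends the loop
    return lines


def run_checker(exe_path, code_dir, out_path="zzz_output.txt"):
    # Assumption: the checker accepts piped stdin and UTF-8 encoded paths
    payload = ("\n".join(build_input_lines(code_dir)) + "\n").encode("utf-8")
    result = subprocess.run([exe_path], input=payload, capture_output=True)
    with open(out_path, "wb") as f:
        f.write(result.stdout)
```

One caveat: piped input is not echoed, so the captured text would contain the `code1:` and `code2:` prompts without the paths typed after them. A regex that relies on those paths would need to be paired with the list from `build_input_lines` instead.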
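The output data then goes into zzz_output.txt. It needs regex extraction, followed by an Excel sheet and a chart, so re, openpyxl, and matplotlib are used.
The work is split into two programs: regex extraction into Excel (for some reason reading the file kept failing, so I put the contents directly into the code; there is a lot of it, so only one test's data is excerpted here), and reading the Excel contents to make the plot.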
    import re
    from openpyxl import Workbook

    # Captured console output, pasted inline. The Chinese lines are the checker's
    # literal output and must stay unchanged because the pattern below matches them.
    text = r"""Whether to import external keyword library(It shouldn't be more than 500 words)
    yes or no:no
    Please enter the two copies of the code to be checked again (if you do not need to check again, please enter exit)
    code1:./code/3sum.c
    code2:D:\code\design\niubi\code\3sum\code_3sum_10.c
    两段代码的相似度为:ovo0.7740ovo
    两份代码相似度高,可能存在抄袭行为
    """

    # Groups: code1 path, code2 absolute path, code2 file name, similarity
    pattern = r"code1:\s*([^\\]+\.c)\s*code2:([^\s]+\\([^\\]+\.c))\s*两段代码的相似度为:ovo(\d+\.\d+)ovo"

    # Find every match in the captured output
    matches = re.findall(pattern, text)
    print(matches)

    # Create a new workbook and name the active sheet
    wb = Workbook()
    ws = wb.active
    ws.title = "Similarity results"

    # Header row
    ws.append(["No.", "Code1", "Absolute path", "Code2", "Similarity"])

    i = 1
    if matches:
        for match in matches:
            code1, path, code2, similarity = match
            # One row per compared pair
            ws.append([str(i), code1, path, code2, similarity])
            i = i + 1
    else:
        print("No matching content found")

    # Save the workbook
    wb.save("similarity_results.xlsx")
    print("Data written to similarity_results.xlsx")
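I never found out why reading the file failed. One common cause on a Chinese-locale Windows machine is an encoding mismatch between how the text was saved and how Python opens it, but that is only a guess here. Under that assumption, a minimal sketch of a reader that tries several encodings could look like this. The function name and the list of candidate encodings are illustrative.

```python
import codecs


def read_text_with_fallback(path, fallbacks=("utf-8", "gbk")):
    """Read a text file whose encoding is unknown, trying the likely candidates in order."""
    with open(path, "rb") as f:
        raw = f.read()

    # A byte-order mark identifies the encoding unambiguously
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return raw.decode("utf-16")
    if raw.startswith(codecs.BOM_UTF8):
        return raw.decode("utf-8-sig")

    # Otherwise try each candidate until one decodes cleanly
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    # Last resort: keep going with replacement characters
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    text = read_text_with_fallback("zzz_output.txt")
    print(text[:200])
```

Reading from a file this way also avoids the backslash problem that the raw string literal works around, since file contents are never parsed for escape sequences.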
    import openpyxl
    import matplotlib.pyplot as plt
    import numpy as np

    # Open the workbook written by the extraction script
    file_path = 'similarity_results.xlsx'
    workbook = openpyxl.load_workbook(file_path)
    sheet = workbook.active

    x_data = []
    y_data = []

    # Skip the header row; column 1 is the index, column 5 the similarity
    for row in sheet.iter_rows(min_row=2, max_col=5, values_only=True):
        try:
            similarity = float(row[4])
        except (TypeError, ValueError):
            print(f"Warning: skipping row with a non-numeric similarity: {row[4]}")
            continue
        # Append x and y together so the two lists always have the same length
        x_data.append(row[0])
        y_data.append(similarity)

    # Line plot with a marker at every data point
    plt.plot(x_data, y_data, marker='o')

    plt.title('plot')
    plt.xlabel('num')
    plt.ylabel('similarity')

    plt.grid(True)

    # y-axis ticks from 0.6 up to (not including) 1.0 in steps of 0.05
    ticks = np.arange(0.6, 1, step=0.05)
    plt.yticks(ticks)
    plt.xticks(ticks=[0, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500])

    plt.show()