Modify snape.nj: output the domain and url columns in the correct order; correct summary_juin.md and save it as PDF
parent a0bc1dc810
commit 637bbd3e17
@@ -69,7 +69,7 @@ def process_page(driver, url, visited_pages, start_domain, data):
         hyperlink_content_text = hyperlink_content_element.text
         print(hyperlink_content_text)
         # Add URL, Domain, and Content of the hyperlink to the data list
-        data.append([href, start_domain, hyperlink_content_text])
+        data.append([start_domain, href, hyperlink_content_text])
         # Recursively process the page and follow hyperlinks
         process_page(driver, href, visited_pages, start_domain, data)
     except Exception as e:
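The fix in the hunk above depends on the positional order of a list matching the output columns. A minimal sketch of one way to make that order explicit is shown here, assuming the rows end up in a CSV-like table; the column names `domain`, `url`, `content` and the helpers `make_row` and `write_rows` are hypothetical and do not appear in the commit.

```python
import csv
import io

# Hypothetical column names; the commit only says the domain column
# should come before the url column.
FIELDNAMES = ["domain", "url", "content"]


def make_row(start_domain, href, text):
    """Build one record keyed by column name instead of by list position."""
    return {"domain": start_domain, "url": href, "content": text}


def write_rows(rows, stream):
    """Write the header and rows; column order comes from FIELDNAMES alone."""
    writer = csv.DictWriter(stream, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


# Usage: write to an in-memory buffer so the sketch has no side effects.
buffer = io.StringIO()
write_rows([make_row("example.com", "https://example.com/a", "About")], buffer)
print(buffer.getvalue())
```

With named fields, swapping two arguments at the call site can no longer silently swap two columns in the output, which might be worth considering if the column order changes again.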
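@@ -10,7 +10,7 @@
 
 ## Analysis Results
 
-Analysis results were obtained according to the analysis requirements; see the results table for details.
+Analyzing the crawl results and comparing them against the standard documents revealed 100 errors across 27,876 web pages and 33 errors across 4,153 official-account articles; see the results table for details.
 
 ## Existing Problems
 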
Binary file not shown.