I'm clearly a bit newer to JavaScript than I'd care to admit. I'm trying to pull a webpage using Node.js and save the contents as a variable, so I can parse it however I feel like.
In Python, I would do this:
    from bs4 import BeautifulSoup  # for parsing
    import urllib

    text = urllib.urlopen("http://www.myawesomepage.com/").read()
    parse_my_awesome_html(text)
How would I do this in Node?
I've gotten as far as:
    var request = require("request");

    request("http://www.myawesomepage.com/", function (error, response, body) {
      /*
        Something here that lets me access the text outside of the closure
        This doesn't work:
        this.text = body;
      */
    });
    var request = require("request");

    var parseMyAwesomeHtml = function(html) {
      // Have at it
    };

    request("http://www.myawesomepage.com/", function (error, response, body) {
      if (!error) {
        parseMyAwesomeHtml(body);
      } else {
        console.log(error);
      }
    });
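The callback is needed because the request is asynchronous: the body does not exist yet when the lines after `request(...)` run, which is why assigning it to an outer variable appears not to work. If you would rather write code that reads top to bottom like the Python version, one option is to wrap the download in a Promise and `await` it. The following is only a minimal sketch under some assumptions: it uses Node's built-in `http` module instead of `request`, the helper name `fetchText` is made up for illustration, and it handles plain `http://` URLs only (no HTTPS, no redirects).

```javascript
const http = require("http");

// Hypothetical helper (not part of any library): resolves with the
// complete response body as a string, or rejects on error.
function fetchText(url) {
  return new Promise((resolve, reject) => {
    http
      .get(url, (res) => {
        if (res.statusCode < 200 || res.statusCode >= 300) {
          res.resume(); // drain the response so the socket is released
          reject(new Error("Request failed with status " + res.statusCode));
          return;
        }
        res.setEncoding("utf8");
        let body = "";
        // The body arrives in chunks; collect them until the stream ends.
        res.on("data", (chunk) => {
          body += chunk;
        });
        res.on("end", () => resolve(body));
      })
      .on("error", reject);
  });
}
```

Inside an `async` function you could then write `const text = await fetchText("http://www.myawesomepage.com/");` followed by `parseMyAwesomeHtml(text);`. The text is still only available after the request completes, but the code no longer has to be nested inside the callback.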
Edit: As Kishore noted, there are plenty of good options for the parsing itself. If you run into python/gyp issues with jsdom on Windows, take a look at cheerio (Cheerio on GitHub).