Basically, building this toy comes down to two steps: use axios to make the HTTP request, then use cheerio to parse the response and extract the info you need.
What is axios
axios is a Promise-based library that lets you make HTTP requests.
Why choose it? It simplifies the process of sending a network request.
What is cheerio
Cheerio is a fast and lightweight library for manipulation of HTML and XML documents.
It’s designed for use in Node.js environments, allowing you to easily parse, traverse, and manipulate the structure of web pages.
With Cheerio, you can load HTML content, select elements using familiar jQuery selectors, and extract or modify data.
Write the JS code
const axios = require("axios");
const cheerio = require("cheerio");
Use these require calls to import the two libraries.
Design an async function to wrap the request. Why async? The function makes an HTTP request, which can take time (for example, waiting for the server to respond). Declaring it async
lets the function pause at await until the request completes, without blocking other operations.
You can add console.log("waiting"); right after you call the function. It will be printed before the response arrives, because the call returns a pending Promise instead of waiting for the request to finish.
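The ordering described above can be seen in a minimal sketch. It assumes a hypothetical fakeRequest helper that stands in for axios.get and resolves after a short delay, so it runs without a network connection. The names fakeRequest, fetchPage, say, and log, and the example.com URL, are illustrative only and are not part of axios.

```javascript
// Hypothetical stand-in for axios.get: resolves after 50 ms with a fake body.
function fakeRequest(url) {
  return new Promise((resolve) => {
    setTimeout(() => resolve({ data: `<h1>Hello from ${url}</h1>` }), 50);
  });
}

const log = []; // records the order in which messages are printed

function say(message) {
  log.push(message);
  console.log(message);
}

async function fetchPage(url) {
  say("request started");
  const { data } = await fakeRequest(url); // fetchPage pauses here
  say("response received");
  return data;
}

// Calling without await returns a pending Promise immediately,
// so the next line runs while the request is still in flight.
const pending = fetchPage("https://example.com");
say("waiting");
```

Note that "request started" still appears before "waiting": an async function runs synchronously until it reaches its first await, and only then hands control back to the caller.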
await axios.get(url);
Here await pauses the function's execution until axios.get has fetched the response.
What does axios.get return
It returns a Promise, which resolves to an object wrapping the HTTP response.
This object has many properties that describe the request and its response,
such as status (200), statusText ("OK"), and headers (the response headers).
And then there is data: the body of the response returned by the server, usually HTML or JSON.
That’s what we need.
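As an illustration of those fields, here is a small sketch. The helper summarizeResponse is a hypothetical name, and sampleResponse is a hand-written object that only imitates the shape of an axios response, so the snippet runs without making a real request.

```javascript
// Hypothetical helper: picks out the fields discussed above from a response object.
function summarizeResponse(response) {
  const { status, statusText, headers, data } = response;
  return {
    status,
    statusText,
    contentType: headers["content-type"],
    bodyLength: data.length,
  };
}

// Hand-written sample imitating the shape of an axios response
// (an assumption for illustration, not real server output).
const sampleResponse = {
  status: 200,
  statusText: "OK",
  headers: { "content-type": "text/html; charset=utf-8" },
  data: "<h1>Hello</h1>",
};

const summary = summarizeResponse(sampleResponse);
console.log(summary);
```

With a real request you would pass the result of await axios.get(url) in place of sampleResponse.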
const { data } = await axios.get(url);
const $ = cheerio.load(data);
The second line loads the HTML data into Cheerio, so you can work with the document using jQuery-style selectors.
$("h1").each((i, element) => {...}); : This line selects every <h1> tag and runs a callback function for each element it finds.
console.log($(element).text()); : Inside the callback, $(element).text() returns the text content of the current <h1> element.