Problem description
While working on a partial answer to this question, I came across a bs4.element.Tag that is a mess of nested dicts and lists (s, below).
Is there a way to return a list of the URLs contained in s without using re.findall? Other comments on the structure of this tag would be helpful too.
What I have tried:
- Randomly perusing methods with tab completion on s.
- Picking through the docs.
My problem is that s has only one attribute (type) and doesn't seem to have any child tags.
Recommended answer
You can use s.text to get the content of the script. It's JSON, so you can then parse it with json.loads. From there, it's simple dictionary access:
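A minimal sketch of that approach, assuming a made-up JSON payload standing in for the real s.text. The keys `images`, `url`, and `meta`, and the helper `collect_urls`, are hypothetical and only illustrate the idea; the actual structure of the script has to be inspected first.

```python
import json


def collect_urls(node):
    """Recursively gather string values that look like URLs from nested dicts and lists.

    Hypothetical helper: it treats any string starting with http:// or https:// as a URL.
    """
    urls = []
    if isinstance(node, dict):
        for value in node.values():
            urls.extend(collect_urls(value))
    elif isinstance(node, list):
        for item in node:
            urls.extend(collect_urls(item))
    elif isinstance(node, str) and node.startswith(("http://", "https://")):
        urls.append(node)
    return urls


# Assumed stand-in for s.text; the real script content will look different.
script_text = (
    '{"images": [{"url": "https://example.com/a.jpg", "width": 100},'
    ' {"url": "https://example.com/b.jpg"}],'
    ' "meta": {"source": "https://example.com/"}}'
)

# Parse the JSON text into ordinary Python dicts and lists.
data = json.loads(script_text)

# Plain dictionary and list access once the keys are known.
first_url = data["images"][0]["url"]
print(first_url)

# Or walk the whole structure when the layout is not known in advance.
print(collect_urls(data))
```

With the tag from the question, `json.loads(s.text)` would take the place of `json.loads(script_text)`. The recursive walk is one way to avoid regular expressions when the nesting is irregular; direct indexing is simpler once the keys are known.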