python BeautifulSoup

JoGeun 2018. 10. 21. 13:03

https://www.crummy.com/software/BeautifulSoup/bs4/doc/

*import

from bs4 import BeautifulSoup 

*Basic usage 1

# resp: the HTML text of a response
soup = BeautifulSoup(resp, 'html.parser')
print(soup.prettify())
# Prints the response content in a readable, indented form

soup.title
soup.title.name
soup.title.string
soup.p
soup.p['class']
soup.a
# Returns only the first <a> tag
soup.find_all('a')
# Returns every <a> tag
soup.find(id="link3")
# Finds and returns the element whose id is "link3"
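The calls above can be tried end to end with a small sketch. The markup below is an assumption for illustration (the post does not show what `resp` contains); it reuses the three links from the next section and is parsed with `html.parser` only so that the example runs without extra packages.

```python
from bs4 import BeautifulSoup

# Sample markup, assumed for illustration; the post's `resp` is not shown.
html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters:
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
</body></html>
"""

soup = BeautifulSoup(html_doc, "html.parser")

print(soup.title.name)                # title
print(soup.title.string)              # The Dormouse's story
print(soup.p["class"])                # ['title']  (first <p> only)
print(soup.a["id"])                   # link1      (first <a> only)
print(len(soup.find_all("a")))        # 3
print(soup.find(id="link3").string)   # Tillie
```

Note that `soup.p` and `soup.a` return only the first matching tag, while `find_all` returns every match.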

*Basic usage 2

#<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
#<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
#<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;

# Find every <a> tag in the markup above and print only its href value
for link in soup.find_all('a'):
    print(link.get('href'))
# http://example.com/elsie
# http://example.com/lacie
# http://example.com/tillie
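One detail of the loop above is worth a small sketch: `link.get('href')` returns `None` for an `<a>` tag that has no `href`, whereas `link['href']` would raise `KeyError`. The markup here is made up for illustration, including one anchor without an `href`.

```python
from bs4 import BeautifulSoup

# Illustrative markup (an assumption); the second <a> has no href on purpose.
html = (
    '<a href="http://example.com/elsie" id="link1">Elsie</a>'
    '<a id="no-href">Anchor without href</a>'
    '<a href="http://example.com/tillie" id="link3">Tillie</a>'
)
soup = BeautifulSoup(html, "html.parser")

# .get() returns None when the attribute is missing.
hrefs = [a.get("href") for a in soup.find_all("a")]
print(hrefs)  # ['http://example.com/elsie', None, 'http://example.com/tillie']

# Passing href=True keeps only the tags that actually have the attribute.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a", href=True)]
print(links)  # [('Elsie', 'http://example.com/elsie'), ('Tillie', 'http://example.com/tillie')]
```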

*Parser libraries

1. Python's html.parser
 - BeautifulSoup(markup, "html.parser")
 - Not used (x): lxml is the better choice (faster)

2. lxml's HTML parser
 - BeautifulSoup(markup, "lxml")

3. lxml's XML parser
 - BeautifulSoup(markup, "lxml-xml")
 - BeautifulSoup(markup, "xml")

4. html5lib
 - BeautifulSoup(markup, "html5lib")
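lxml and html5lib are separate packages, so a script may run where they are not installed. The sketch below is one possible way to prefer lxml and fall back to the built-in parser. `make_soup` is a hypothetical helper, not part of Beautiful Soup; it relies on the `FeatureNotFound` exception that Beautiful Soup raises when the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup, preferred=("lxml", "html5lib", "html.parser")):
    """Hypothetical helper: parse with the first parser that is installed.

    Returns a (soup, parser_name) pair.
    """
    for parser in preferred:
        try:
            return BeautifulSoup(markup, parser), parser
        except FeatureNotFound:
            # This parser library is not installed; try the next one.
            continue
    raise RuntimeError("no usable parser found")


soup, used = make_soup("<p>Hello<b>world")
print(used)           # name of the parser that was actually used
print(soup.b.string)  # world
```

Different parsers can repair invalid markup differently, so it is usually safer to name one parser explicitly once the environment is known.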

*Tag

soup = BeautifulSoup('<b class="boldest">Extremely bold</b>', "lxml")
tag = soup.b
print(type(tag))
# <class 'bs4.element.Tag'>

# Other examples of Tag objects: soup.a, soup.p, soup.title
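A `Tag` object exposes its name, attributes, and text, and both the name and the attributes can be modified. The following is a minimal sketch; the extra `id="intro"` attribute and the new values are made up for illustration, and `html.parser` is used only so that nothing extra needs to be installed.

```python
from bs4 import BeautifulSoup

# The id attribute is added here purely for illustration.
soup = BeautifulSoup('<b class="boldest" id="intro">Extremely bold</b>', "html.parser")
tag = soup.b

print(tag.name)      # b
print(tag["id"])     # intro
print(tag["class"])  # ['boldest']
print(tag.string)    # Extremely bold

# A tag's name and attributes can be changed in place.
tag.name = "strong"
tag["id"] = "lead"
print(tag)  # <strong class="boldest" id="lead">Extremely bold</strong>
```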

*Multi-valued attributes

# Name the parser explicitly; omitting it makes Beautiful Soup emit a warning
css_soup = BeautifulSoup('<p class="body strikeout"></p>', "lxml")
css_soup.p['class']
# ['body', 'strikeout']
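Only attributes that HTML defines as multi-valued, such as `class`, come back as lists. A sketch of the difference follows; the `id` value is made up, and the `multi_valued_attributes=None` option assumes a reasonably recent Beautiful Soup 4 release.

```python
from bs4 import BeautifulSoup

# "my id" looks like two values, but id is not a multi-valued attribute.
soup = BeautifulSoup('<p class="body strikeout" id="my id"></p>', "html.parser")
p = soup.p

print(p["class"])                   # ['body', 'strikeout']
print(p["id"])                      # my id
print(p.get_attribute_list("id"))   # ['my id']  (always a list)

# Assumes a recent bs4: turn off list splitting entirely.
raw = BeautifulSoup(
    '<p class="body strikeout"></p>',
    "html.parser",
    multi_valued_attributes=None,
)
print(raw.p["class"])               # body strikeout
```

`get_attribute_list` is handy when the calling code wants a list regardless of whether the attribute is multi-valued.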
