Evandro Souza
Web Scraping
What is it?
Web harvesting
Web data-extraction
Web crawler
Web Spider Robot
Knowbot
Web Scraping
Web crawling: a technique for mapping and traversing information on websites.
Goals:
• Navigate web pages in an automated way.
• Map all the links of each URL recursively.
• Typically used in search engines.
Web scraping: a technique for extracting information from websites.
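To make the "map all links recursively" idea concrete, here is a minimal sketch of a crawler loop. It is only an illustration: the `Pages` dictionary is a hypothetical in-memory stand-in for a real site, and the link extraction uses a deliberately simple regular expression.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CrawlSketch
{
    // Hypothetical in-memory "site" (URL -> HTML). A real crawler would
    // download each page over HTTP instead of reading this dictionary.
    public static readonly Dictionary<string, string> Pages = new Dictionary<string, string>
    {
        ["/"] = "<a href=\"/a\">A</a> <a href=\"/b\">B</a>",
        ["/a"] = "<a href=\"/b\">B</a> <a href=\"/\">Home</a>",
        ["/b"] = "<p>No links here</p>",
    };

    // Simplified link pattern: captures the value of double-quoted href attributes.
    static readonly Regex Href = new Regex("href=\"([^\"]+)\"", RegexOptions.IgnoreCase);

    // Visits a URL, then recursively visits every link found on that page.
    // The visited set stops the recursion on cyclic links (here "/a" links back to "/").
    public static void Crawl(string url, ISet<string> visited)
    {
        if (!visited.Add(url)) return;                         // already seen
        if (!Pages.TryGetValue(url, out var html)) return;     // unknown page
        foreach (Match m in Href.Matches(html))
            Crawl(m.Groups[1].Value, visited);
    }

    public static void Main()
    {
        var visited = new SortedSet<string>(StringComparer.Ordinal);
        Crawl("/", visited);
        Console.WriteLine(string.Join(", ", visited)); // prints "/, /a, /b"
    }
}
```

A crawler of this kind only discovers pages. Scraping starts once the content of a discovered page is turned into structured data.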
Web scraping: what is it for?
• Turning website data into data structures.
• Automating tasks on websites (simulating a user).
Use cases
Any application that needs data extracted from websites:
• Academic, marketing, and scientific research
• Market analysis
• Price comparison
• Data mining
Tools
• Fiddler
• Firefox/Chrome Dev Tools
Required knowledge
• Regular expressions
• XPath
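As a rough illustration of how the two techniques differ, the sketch below runs an XPath query and a regular expression over the same fragment. The markup is invented for this example and is deliberately well-formed XML so that the built-in `System.Xml.XPath` classes can read it. Real-world HTML is often not well-formed, which is where a dedicated HTML parser comes in.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Xml.XPath;

public static class SelectorSketch
{
    // Hypothetical, well-formed markup invented for this illustration.
    public const string Markup =
        "<table>" +
        "<tr><td class=\"name\">Service A</td><td class=\"price\">10,50</td></tr>" +
        "<tr><td class=\"name\">Service B</td><td class=\"price\">7,00</td></tr>" +
        "</table>";

    // XPath selects nodes by their place in the document tree.
    public static List<string> NamesByXPath(string markup)
    {
        var names = new List<string>();
        XPathNavigator navigator = new XPathDocument(new StringReader(markup)).CreateNavigator();
        XPathNodeIterator it = navigator.Select("//td[@class='name']");
        while (it.MoveNext())
            names.Add(it.Current.Value);
        return names;
    }

    // A regular expression matches a textual pattern and ignores the tree:
    // here, one or more digits, a comma, and exactly two digits.
    public static List<string> PricesByRegex(string markup)
    {
        var prices = new List<string>();
        foreach (Match m in Regex.Matches(markup, @"\d+,\d{2}"))
            prices.Add(m.Value);
        return prices;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", NamesByXPath(Markup)));  // Service A | Service B
        Console.WriteLine(string.Join(" | ", PricesByRegex(Markup))); // 10,50 | 7,00
    }
}
```

XPath tends to be the more robust choice when the value is identified by where it sits in the page, while a regular expression is handy for pulling a known text pattern out of a node that has already been selected.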
The need
https://www.jucerja.rj.gov.br/JucerjaPortalWeb/Paginas/Informacoes/TabelaPrecosPWJ.aspx
The 3 steps
1. Analysis
• Fiddler
• Dev Tools
2. Extract the HTML
• HTTP requests
3. Transform
• HTML parser
• XPath
• Regex
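The three steps could be sketched roughly as follows. Step 1 is manual work in Fiddler or the browser Dev Tools, so it appears only as a comment. The class and method names are invented for this sketch, and extracting the page title is an arbitrary stand-in for whatever data is actually needed. Note that `HttpWebRequest` is marked obsolete in .NET 6 and later in favor of `HttpClient`, although it still works.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class ThreeStepsSketch
{
    // Step 2: extract the HTML with a plain HTTP GET request.
    public static string DownloadHtml(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        request.UserAgent = "Mozilla/5.0 (scraping sketch)"; // assumption: some sites reject requests without one
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Step 3: transform the raw HTML into the value we care about.
    // A regex is enough for a single <title>; larger structures call for an HTML parser.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value).Trim() : string.Empty;
    }

    public static void Main(string[] args)
    {
        // Step 1 (analysis) is done by hand: inspect the page and its requests
        // to learn which URL to call and where the data sits in the response.
        string html = args.Length > 0
            ? DownloadHtml(args[0]) // URL supplied on the command line
            : "<html><head><title> Demo page </title></head></html>"; // offline sample
        Console.WriteLine(ExtractTitle(html));
    }
}
```

Keeping the download and the transformation in separate methods means the transformation can be exercised against saved HTML without touching the network.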
C# examples: HttpWebRequest/HttpWebResponse + Html Agility Pack
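The code shown on the original slides is not part of this transcript. As a stand-in, here is a minimal sketch of the parsing half with Html Agility Pack (NuGet package `HtmlAgilityPack`). It assumes the HTML has already been downloaded, for instance with HttpWebRequest/HttpWebResponse. The sample markup, the `prices` id, and all names are invented for this illustration and do not reflect any real page.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack"

public static class HapSketch
{
    // Hypothetical markup standing in for a downloaded page.
    public const string SampleHtml =
        "<table id='prices'>" +
        "<tr><th>Service</th><th>Price</th></tr>" +
        "<tr><td>Service A</td><td>10,50</td></tr>" +
        "<tr><td>Service B</td><td>7,00</td></tr>" +
        "</table>";

    // Turns each data row of the table into a (service, price) pair.
    public static List<KeyValuePair<string, string>> ParseRows(string html)
    {
        var rows = new List<KeyValuePair<string, string>>();
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // "tr[td]" keeps only rows that contain <td> cells, skipping the header row.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection trs = doc.DocumentNode.SelectNodes("//table[@id='prices']//tr[td]");
        if (trs == null) return rows;

        foreach (HtmlNode tr in trs)
        {
            HtmlNodeCollection cells = tr.SelectNodes("./td");
            if (cells == null || cells.Count < 2) continue;
            rows.Add(new KeyValuePair<string, string>(
                HtmlEntity.DeEntitize(cells[0].InnerText).Trim(),
                HtmlEntity.DeEntitize(cells[1].InnerText).Trim()));
        }
        return rows;
    }

    public static void Main()
    {
        foreach (var row in ParseRows(SampleHtml))
            Console.WriteLine(row.Key + " = " + row.Value);
    }
}
```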
C# examples: ScrapySharp
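The slide code is not included in this transcript, so the following is only a sketch of the general idea. ScrapySharp builds on Html Agility Pack and adds CSS selectors through the `CssSelect` extension method. The sample markup, the class names in it, and the helper method are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;          // NuGet package "HtmlAgilityPack"
using ScrapySharp.Extensions;   // NuGet package "ScrapySharp" (provides CssSelect)

public static class ScrapySharpSketch
{
    // Hypothetical markup standing in for a downloaded page.
    public const string SampleHtml =
        "<table id=\"prices\">" +
        "<tr><td class=\"name\">Service A</td><td class=\"price\">10,50</td></tr>" +
        "<tr><td class=\"name\">Service B</td><td class=\"price\">7,00</td></tr>" +
        "</table>";

    // Returns the trimmed text of every node matching a CSS selector.
    public static List<string> SelectText(string html, string cssSelector)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);
        return doc.DocumentNode
                  .CssSelect(cssSelector)
                  .Select(node => node.InnerText.Trim())
                  .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" | ", SelectText(SampleHtml, "td.name")));
        Console.WriteLine(string.Join(" | ", SelectText(SampleHtml, "td.price")));
    }
}
```

For readers who already know CSS, a selector such as `td.name` is usually shorter than the equivalent XPath expression `//td[@class='name']`.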
C# examples: HttpWebRequest/HttpWebResponse + AngleSharp
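Since the slide code did not survive in this transcript, here is a hedged sketch of how the parsing step might look with AngleSharp instead. It assumes AngleSharp 0.10 or later, where `HtmlParser` lives in `AngleSharp.Html.Parser` and exposes `ParseDocument`, and it assumes the HTML was already downloaded. The sample markup and all names are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using AngleSharp.Dom;          // NuGet package "AngleSharp"
using AngleSharp.Html.Dom;
using AngleSharp.Html.Parser;

public static class AngleSharpSketch
{
    // Hypothetical markup standing in for a downloaded page.
    public const string SampleHtml =
        "<table id=\"prices\">" +
        "<tr><th>Service</th><th>Price</th></tr>" +
        "<tr><td>Service A</td><td>10,50</td></tr>" +
        "<tr><td>Service B</td><td>7,00</td></tr>" +
        "</table>";

    // Turns each data row of the table into a (Name, Price) tuple.
    public static List<(string Name, string Price)> ParseRows(string html)
    {
        var rows = new List<(string Name, string Price)>();
        IHtmlDocument document = new HtmlParser().ParseDocument(html);

        // The descendant selector still matches if the parser inserts a <tbody>.
        foreach (IElement tr in document.QuerySelectorAll("table#prices tr"))
        {
            IHtmlCollection<IElement> cells = tr.QuerySelectorAll("td");
            if (cells.Length < 2) continue; // header row has <th> cells only
            rows.Add((cells[0].TextContent.Trim(), cells[1].TextContent.Trim()));
        }
        return rows;
    }

    public static void Main()
    {
        foreach (var row in ParseRows(SampleHtml))
            Console.WriteLine(row.Name + " = " + row.Price);
    }
}
```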
Thank you!
evandroferreiras