<pedrocorreia.net ⁄>
 

<Web scraping with PHP and XPath ⁄ >




clicks: 3510 3510 2009-02-18 2009-02-18 goto programacao myNews programacao  Bookmark This Bookmark This


When I was writing about how I use web scraping, I was still hadn't tried using Xpath (shame on me). sssscripting blog responded to my article with very good and rich post about all sorts of different techniques for scraping (with Ruby examples) and after reading this post in Kore Nordmann blog I finally decided to try making something with Xpath.

It turned out, that using Xpath is extremely easy, really. When you master it, you can do everything in seconds. Yes, you need to know how XML works and how to write correct Xpath queries (brief explanation of Xpath syntax is available at W3Schools), but hey - these topics are in 1st year of university.

Also, there are good tools like XPath checker for Firefox which allows you to debug and test your queries without writing any code. Stupid to say, but XPath queries looks a lot like CSS selectors, but with much more power and flexibility. Without further talking, lets look at example (idea from Kore's article):



este é só um excerto do artigo, para aceder ao artigo completo, clique no link em baixo:
this is just a small excerpt from the article, to access the full article please click in the link below:

http://dev.juokaz.com/php/web-scraping-with-php-and-xpath




Subscribe News RSS  Subscribe News Updates by E-mail





myNews <myNews show="rand" cat="programacao" ⁄>

RouterJs: easy routing for your ajax Web applications new ...

RouterJs is a simple router for your ajax web apps. It's build upon History.js which means that Rout (...)

clicks: 17957 17957 2012-05-14 2012-05-14 goto url (new window) haithembelhaj.g... goto myNews programacao


Backbone computed properties new ...

This gist shows one way to implement read- and write-enabled computed properties on a Backbone Model (...)

clicks: 17663 17663 2012-05-13 2012-05-13 goto url (new window) https://gist.gi... goto myNews programacao


Create Instagram Filters With PHP new ...

In this tutorial, I'll demonstrate how to create vintage (just like Instagram does) photos with PHP (...)

clicks: 17619 17619 2012-05-12 2012-05-12 goto url (new window) net.tutsplus.co... goto myNews programacao


HTML5 jQuery Paint Plugin new ...

Websanova Paint is a HTML5 canvas based jQuery plugin. It allows you to free paint on a canvas area (...)

clicks: 28711 28711 2012-05-12 2012-05-12 goto url (new window) websanova.com/t... goto myNews programacao


Android Query new ...

Android-Query (AQuery) is a light-weight library for doing asynchronous tasks and manipulating UI el (...)

clicks: 17632 17632 2012-05-12 2012-05-12 goto url (new window) code.google.com... goto myNews programacao


Real-time Applications With Node.js and Socket.IO new ...

Hey everyone! Sorry about the long pause since the last blog post, life has been quite hectic for th (...)

clicks: 17960 17960 2012-05-11 2012-05-11 goto url (new window) codingcookies.c... goto myNews programacao


Sass vs. LESS vs. Stylus: Preprocessor Shootout new ...

CSS3 preprocessors are languages written for the sole purpose of adding cool, inventive features to (...)

clicks: 17333 17333 2012-05-11 2012-05-11 goto url (new window) net.tutsplus.co... goto myNews programacao


Gettings to know Backbone.ks new ...

In this series, we're going to learn how to build a fully functional contacts manager using Backbone (...)

clicks: 16585 16585 2012-05-10 2012-05-10 goto url (new window) net.tutsplus.co... goto myNews programacao