Project overview:
Cookie Crawler


Details


  • Client
    Netvlies Internetdiensten
  • Date
    2012
  • Type
    Commandline
  • Services
    Front-end, back-end
  • Tags
    Commandline, front-end, back-end, PhantomJS, JavaScript, PHP

Due to the large number of cookie-law-related requests and Cookie Consent implementations on clients' websites, it became difficult to quickly track down which pages were setting cookies, especially on websites with thousands of pages.

For this, I created a command-line script that developers can use to check whether a site still sets cookies. It's a crawler/spider much like Google's, but instead of indexing content it gathers all links belonging to the website and checks whether cookies are being set on each of them. In the background, the script sends commands to a headless (interface-less) browser, which actually requests each link and renders the page just like a regular browser would, and can therefore detect the cookies being set.
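The crawl loop described above can be sketched roughly as follows. This is not the original script (which drove PhantomJS); it is a minimal Python illustration, and every name in it is an assumption made for the example: `crawl_for_cookies`, the `load_page` callback that stands in for the headless browser, the `max_pages` limit, and the fake `example.test` site used to exercise it.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def crawl_for_cookies(start_url, load_page, max_pages=1000):
    """Breadth-first crawl of one site.

    Returns {url: [cookie names]} for every visited page that set cookies.
    load_page(url) is a hypothetical stand-in for the headless browser: it
    must return (links, cookie_names) for the rendered page.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages_with_cookies = {}
    visited = 0

    while queue and visited < max_pages:
        url = queue.popleft()
        visited += 1
        links, cookies = load_page(url)
        if cookies:
            pages_with_cookies[url] = sorted(cookies)
        for link in links:
            # Resolve relative links and drop "#fragment" parts.
            absolute, _fragment = urldefrag(urljoin(url, link))
            # Only follow links that belong to the same website.
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages_with_cookies


# Made-up site used instead of a real browser: url -> (links, cookies).
FAKE_SITE = {
    "http://example.test/": (["/about", "/shop#top", "http://other.test/"], []),
    "http://example.test/about": (["/"], []),
    "http://example.test/shop": (["/about"], ["session_id", "_ga"]),
}


def fake_load_page(url):
    return FAKE_SITE.get(url, ([], []))


report = crawl_for_cookies("http://example.test/", fake_load_page)
for page, names in report.items():
    print(page, names)
```

Passing the page loader in as a callback keeps the crawl logic separate from the browser, so in a real setup `load_page` would be replaced by whatever code asks the headless browser to render a URL and report its links and cookies.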

The downside is that it doesn't yet check external content (such as images, banners, and JavaScript), which could potentially track users as well. This might be a nice improvement for the future.
