indexing - How to crawl / index pages behind a login?


Is it possible (are there tools out there) to crawl pages (not the content, just the URLs) that are behind a login? I'm looking at creating a new site, and I need to index each page on the old site in order to capture content, content types, map URLs to the new site, etc. I have a login, and I'm not looking to add it to Google or anything.

Screaming Frog won't do it. Also, I can't involve the dev guys of the current site, so putting a script on the server won't work either. Is there any other way to do this?

Yes, you can: integrate your crawler with Selenium. Provide the login credentials to it and you can get the work done. A few links that may help you:

How to use Selenium with Python?

http://www.quora.com/is-it-possible-to-write-a-python-script-for-opening-a-browser-and-logging-into-a-website-how-could-you-do-it

https://selenium-python.readthedocs.org/en/latest/getting-started.html

It may take time and research, but yes, it can be done. Take care to skip the logout page while crawling, since following that link would end the logged-in session.
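As a rough illustration of the approach, here is a minimal Python sketch, not a definitive implementation. It assumes Selenium 4 with Firefox and geckodriver installed. The form field names (`username`, `password`) and the submit-button selector are hypothetical, so replace them with whatever the old site's login form actually uses. The logout check is only a heuristic based on the URL path.

```python
from collections import deque
from urllib.parse import urldefrag, urljoin, urlparse


def is_logout_url(url):
    """Heuristic (an assumption): any path mentioning logout/signout is off-limits."""
    path = urlparse(url).path.lower()
    return any(word in path for word in ("logout", "log-out", "signout", "sign-out"))


def crawl(start_url, get_links, max_pages=500):
    """Breadth-first crawl; returns the same-host URLs visited, in visit order.

    get_links(url) must load the page and return the href values found on it.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        for href in get_links(url):
            if not href:
                continue  # anchors without an href
            link, _fragment = urldefrag(urljoin(url, href))
            if urlparse(link).netloc != host:
                continue  # stay on the old site
            if link in seen or is_logout_url(link):
                continue  # never follow the logout link; it would end the session
            seen.add(link)
            queue.append(link)
    return visited


def run_selenium_crawl(login_url, start_url, username, password):
    """Log in through a real browser, then crawl with the authenticated session."""
    # Imported here so the crawl logic above can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()
    try:
        driver.get(login_url)
        # Hypothetical field names and selector; inspect the real login form.
        driver.find_element(By.NAME, "username").send_keys(username)
        driver.find_element(By.NAME, "password").send_keys(password)
        driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
        # Wait until the browser has left the login page before crawling.
        WebDriverWait(driver, 10).until(EC.url_changes(login_url))

        def get_links(url):
            driver.get(url)
            anchors = driver.find_elements(By.TAG_NAME, "a")
            return [a.get_attribute("href") for a in anchors]

        return crawl(start_url, get_links)
    finally:
        driver.quit()
```

Calling `run_selenium_crawl("https://old-site.example/login", "https://old-site.example/", "my-user", "my-password")` (placeholder URLs and credentials) would return the list of URLs reached, which can then be saved and mapped to the new site. Keeping `crawl` separate from the browser code means the link-following logic can be checked against a fake `get_links` before pointing it at the real site.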

