Clicking to download hundreds of files
30 November 2012, by Rupert Wood
As part of testing some changes I made recently, I needed to check that a page of report spreadsheet downloads still worked, which meant clicking and downloading all of the 300+ reports in the system.
Of course I considered automating this process, but in the end I decided it would be simplest and quickest to just click down the list of links by hand (you can see my reasoning at the end of this post). Still, the manual route was not exactly straightforward – and I thought I’d share with you the best approach I found in case you’re unlucky enough to find yourself in the same situation!
Each report download takes a few moments to generate, so the quickest way to work through the list of reports is to download many at once. The best way to do this is to open each report download link in a new tab; when the server responds with a spreadsheet file the tab disappears and becomes a file download in the browser instead. This is easy to do: the two standard ways to open a clicked link in a new tab on Windows are either to hold down ‘control’ as you click the link or to click it with the middle mouse button.
The problem then became clicking the links accurately: making sure I actually tested every report, i.e. that I clicked every download link on the page rather than missing and clicking in the white space between links in my haste, and making sure I couldn’t slip and lose my place when scrolling down to the next page of links in the browser.
And of the three browsers I tried – Internet Explorer 9, the latest Firefox and Chrome – this works best in Internet Explorer. IE has a visual marker for the last link clicked – the dotted line around the link under the cursor:
which neatly solves both my confirm-the-click and paging-marker problems. Unfortunately IE adds a different problem: it requires a confirmation to download each of the 300+ files:
I tried letting a stack of these build up and then rapidly clicking the ‘save’ button to clear them, but that didn’t work – the button doesn’t always respond, perhaps as a deliberate user-safety measure or perhaps because the internal switching IE does between dialogs takes too long. However, I did successfully test my app changes in IE by kicking off batches of 30–50 downloads and clearing the confirmations between batches – which worked, but wasn’t particularly quick.
Firefox and Chrome
Chrome will save most downloads without confirmation out of the box, and Firefox can be configured to do the same:
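(For reference, the same behaviour can also be set directly via Firefox preferences rather than through the download dialog. This is only a sketch: the preference names are real Firefox `about:config` entries, but the MIME types listed are my assumption about what a report server sends for Excel files – check the actual `Content-Type` your server returns.)

```
// user.js — sketch: save spreadsheet downloads without prompting.
// Save to the default download directory automatically…
user_pref("browser.download.useDownloadDir", true);
// …and never ask before saving these MIME types to disk
// (assumed types for .xls and .xlsx files):
user_pref("browser.helperApps.neverAsk.saveToDisk",
    "application/vnd.ms-excel,application/vnd.openxmlformats-officedocument.spreadsheetml.sheet");
```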
so are there other ways I can check my click accuracy? Assuming I have the discipline to remember where I was when I scroll the page, I then just need to make sure every click I make results in a successful download. Every successful click opens a new tab, so I could watch for a new tab after each click – but I couldn’t follow this accurately because:
- tabs were disappearing as well as opening as the server responded to requests
- all three browsers had a slight delay between the click and the new tab opening: IE and Chrome spawn new processes for their tabs – IE in particular has a noticeable delay as it loads anti-virus and Java libraries etc. into the new process – and Firefox slowed as its memory load increased. I could often click the next link before the previous link’s new tab had opened.
But fortunately there is another indicator I can use: Firefox treats a control-click in a table cell as selecting that cell and adds a blue outline to it:
so I can easily spot failed clicks and correct them. So long as you control-click rather than middle-click to open the links, configure Firefox to automatically save spreadsheet downloads, and remember where you were when you scroll the page, you can click down a list of hundreds of links in Firefox and download all of the linked files quickly. But, slower though it is overall, I might still use IE myself to better guard against mis-clicks and scrolling errors.
But there may be a better way I don’t know about. Please let me know if there is!
(Why didn’t I automate this? The decision is always a trade-off between the time it would take to set up the tools and develop the automation against potentially only saving a small amount of time across a small number of runs. In my case the page was protected by a forms login, which meant trivial web tools (e.g. wget) wouldn’t work, and there were other links on the page I wouldn’t want to trigger with a simple crawl. The alternative would be to set this up in Selenium: write a script that starts Chrome, logs in, navigates to the page and then clicks every ‘Download Report’ link. That wouldn’t take a great deal of time, but likely much longer than just getting on with it and clicking the links by hand!
In addition, Selenium has the drawback – as far as I know – of not being able to detect when a browser download completes; I wanted to keep the number of simultaneous requests down to a level my development machine could cope with, which is easy to do manually but would take delays and guesswork in Selenium.)
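(If I did automate it, one way to work around that download-tracking gap would be to watch the downloads folder from the script itself: Firefox writes a `.part` file alongside each incomplete download, so counting those gives a rough measure of how many downloads are in flight. This is only a sketch under that assumption – the function names are mine, and Chrome would need a different suffix:)

```python
import glob
import os
import time

def active_downloads(download_dir, partial_suffix=".part"):
    """Count in-progress downloads by looking for partial-download
    files (Firefox writes a .part file next to each incomplete file)."""
    return len(glob.glob(os.path.join(download_dir, "*" + partial_suffix)))

def wait_for_free_slot(download_dir, max_parallel=5, poll_seconds=1.0):
    """Block until fewer than max_parallel downloads are in flight,
    so a click-automation loop doesn't overwhelm the machine."""
    while active_downloads(download_dir) >= max_parallel:
        time.sleep(poll_seconds)
```

(A Selenium loop would then call `wait_for_free_slot` before clicking each ‘Download Report’ link, rather than relying on fixed delays.)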