Building a Command Line File Downloader in Python

Building a Command Line File Downloader in Python

ยท

5 min read

Python is one of the most popular general purpose programming language with a wide range of use cases from general coding to complex fields like AI.

One of the reason for such popularity of python as a programming language is the availablility of many bulit-in as well as third party libraries and packages.

Nowadays python is used extensively for web data extraction and data science. You can do web scraping, crawling and automated browsing on a large scale using extremely simple python scripts thanks to the plethora of libraries available.

In this blog, I'm going to build an simple command line file downloader using which you can download a file if you have the download link.

Libraries required

We are going to use the 'requests' library to download file over HTTP.

import requests
import sys
import time

It will show the file size, download speed, time and a progress bar. So, lets see how to implement this.

# dictionary for detecting file size in proper units
units = {
    'B':{'size':1, 'speed':'B/s'},
    'KB':{'size':1024, 'speed':'KB/s'},
    'MB':{'size':1024*1024, 'speed':'MB/s'},
    'GB':{'size':1024*1024*1024, 'speed':'GB/s'}
    }
def check_unit(length): # length in bytes
    if length < units['KB']['size']:
        return 'B'
    elif length >= units['KB']['size'] and length <= units['MB']['size']:
        return 'KB'
    elif length >= units['MB']['size'] and length <= units['GB']['size']:
        return 'MB'
    elif length > units['GB']['size']:
        return 'GB'

We will open a file stream over HTTP using requests and then save the data in chunks to the local file. Let's see how the code will look and then put it all together.

# Opening file streamm
r = requests.get(link_to_file, stream = True)  

# writing file data in chunks.
with open(file_name, 'wb') as f:  
  for chunk in r.iter_content(chunk_size):  
      f.write(chunk)

Output will consists of a dynamic progess bar with downloading details. For this we use stdout.write() and stdout.flush() methods

sys.stdout.write(format_string % (list_of_variables))

sys.stdout.flush()

Format string and variables will be like this


fmt_str = "\r%6.2f %s [%s%s] %7.2f %s/%7.2f %s  %7.2f %s"

set_of_vars = ( 
  float(done), '%',
  '*' * int(done/2),
  '_' * int(50-done/2),
  d/units[check_unit(d)]['size'],
  check_unit(d),
  tl,
  check_unit(total_length), 
  trs/units[check_unit(trs)]['size'],
  speed
  )

Now putting it all together, write the download and the main (driver) function.

def downloadFile(url, directory) :
    localFilename = url.split('/')[-1]
    with open(directory + '/' + localFilename, 'wb') as f:
        print ("Downloading . . .\n")
        start = time.time()
        r = requests.get(url, stream=True)
        total_length = float(r.headers.get('content-length'))
        d = 0
        if total_length is None: # no content length header
            f.write(r.content)
        else:
            for chunk in r.iter_content(8192):
                d += float(len(chunk))
                f.write(chunk)
                tl = total_length / units[check_unit(total_length)]['size']
                trs = d//(time.time() - start)
                speed = units[check_unit(trs)]['speed']

                done = 100 * d / total_length
                sys.stdout.write("\r%6.2f %s [%s%s] %7.2f%s  /  %4.2f %s  %7.2f %s" % (
                    float(done), '%',
                    '*' * int(done/2), 
                    '_' * int(50-done/2), 
                    d/units[check_unit(d)]['size'], check_unit(d), 
                    tl, check_unit(total_length), 
                    trs/units[check_unit(trs)]['size'], speed))

                sys.stdout.flush()
    return (time.time() - start)
def main() :
    directory = '.'
    if len(sys.argv) > 1 :
        url = sys.argv[1]
        if len(sys.argv) > 2:
            directory = sys.argv[2]
        total_time = downloadFile(url, directory)
        print ('')
        print ("Download complete...")
        print ("\rTime Elapsed: %.2fs" % total_time)
    else :
        print("No link found!")

if __name__ == "__main__" :
    main()

Save the code in a python file and use it as follow

>>> python <program_name>.py <file_link> <save_location(by default '.')>

Here is a test run output.

op.png

Although, it is a very simple file downloader but it can be improved for downloading multiple files as well as multiple download connections.

Thanks for reading. Do share any suggestions to improve this and feedback down in the comments. ๐Ÿ™‚